|posted 9/22/2012 23:58|
|After reading up a bit on speech recognition and processing, I noticed an ongoing theme: the majority of effort in the field seems to be focused on writing AI that can untangle all the problems inherent in existing human languages, things like grammatical ambiguity, pronunciation differences between dialects and individuals, etc. I would imagine that some languages with a more logical structure would be easier for a computer to interpret (Japanese comes to mind) and that others would be far more difficult (English).|
Why not design a new, simple language for verbal communication with computers? An Esperanto for machines, if you will. Typed languages for this function, such as C++ and Python, already exist and are actually extremely simple, so why not make one in a verbal form?
I would think the first step would be to determine which phenomes are easiest for a computer to accurately recognize, and then to cross-reference those against a table of phenomes common to most languages. With this, you could make a phonetic alphabet of sorts that would be accessible to more or less everyone. After that, it would basically be like writing a computer language: just design a simple set of rules for grammar/syntax and a vocabulary/function library.
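A minimal sketch of that first step, purely for illustration: both phoneme lists below are made-up placeholders, not real acoustic or typological data, which would have to come from recognizer testing and cross-linguistic surveys.

```python
# Sketch: build a machine-friendly phonetic alphabet as the intersection of
# two sets. Both sets here are illustrative assumptions, not real data.
machine_friendly = {"a", "i", "u", "k", "t", "s", "m", "n", "o", "p"}  # assumed easy to recognize
cross_lingual = {"a", "i", "u", "k", "t", "m", "n", "e", "o", "b"}     # assumed common to most languages

alphabet = sorted(machine_friendly & cross_lingual)
print(alphabet)  # the candidate phonetic alphabet
```

The real work would be in populating the two sets; the combination step itself is trivial.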
Does anyone have a link to any research applicable to this subject, especially regarding phenome differentiation?
|posted 9/23/2012 00:31|
|Then this should be some artificial language, like the Enochian language, hehehehe. Ollor ipamis vran oiad lap ol ds oiad oi ipamis thdoh. Enochian, by the way, is very much based on sounds, because the spells were really combinations of certain sounds. It was believed that certain sounds would entice a certain entity, and there was a certain sound or combination of sounds for every entity. These entities, by the way, were often associated with certain states of mind, so there may be a rational meaning behind these sounds and their combinations.|
| Artificial Consciousness ADS-AC project|
|Last edited by tkorrovi @ 9/23/2012 12:38:00 AM|
|posted 9/23/2012 01:47|
|steyr wrote @ 9/22/2012 11:58:00 PM:|
> I would imagine that some languages that have a more logical structure would be easier for a computer to interpret (Japanese comes to mind) and that others would be far more difficult (English).
Not Japanese, but Sanskrit. Sanskrit has been proposed in NASA-associated work as the natural language best suited for artificial intelligence due to its lack of ambiguity. (A post in this forum about a year ago made me aware of that.)
By the way, Japanese is often very ambiguous in several ways, and English is far from the most difficult language, contrary to popular belief. It is mostly grammar that determines the difficulty of a language, and there are several languages with grammar more difficult than English's, such as (in approximately increasing order of grammatical complexity) German, Latin, Russian, and Finnish.
> The typed languages for this function, such as C++ and python, already exist and are actually extremely simple - so why not make one in a verbal form?
Basically what you're asking for is to represent an infinite, complex, noisy, ambiguous, analog, ever-changing universe by using only a manageable, finite set of terms that use crisp set boundaries. It should be clear you can't accomplish this task in any simple way. Semantic Web languages would love to do that, as would lawyers and patent offices, but they're up against fundamental difficulties. Computer languages deal with a very narrow, clearly-defined virtual world of numbers and text, so any computer language is very unsuitable for use in the real world.
One problem with using even constrained natural language is that natural language is said to *require* ambiguity since the real world is ambiguous. I haven't studied Sanskrit and its implications, but I strongly suspect that language also has plenty of ambiguities, especially in multiple meanings of words, or in boundaries of concepts.
"all natural languages are highly ambiguous"
"The grammar for natural languages is ambiguous and typical sentences have multiple possible analyses. In fact, perhaps surprisingly, for a typical sentence there may be thousands of potential parses (most of which will seem completely nonsensical to a human)."
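To put a rough number on that quoted claim: the count of distinct binary bracketings of an n-word sentence grows as the Catalan numbers, so even modest sentences admit enormous numbers of structural analyses. A quick sketch (pure combinatorics, no linguistic data):

```python
from math import comb

def catalan(n):
    """Catalan number C(n): the number of distinct binary
    bracketings (parse shapes) of a sequence of n+1 items."""
    return comb(2 * n, n) // (n + 1)

# Possible binary parse shapes for sentences of various lengths:
for words in (2, 6, 12):
    print(words, "words ->", catalan(words - 1), "bracketings")
```

A 12-word sentence already has tens of thousands of possible binary bracketings, before word-sense ambiguity multiplies things further.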
As I mentioned in another thread, no matter what AI problem you try to solve, you run into the same set of problems, whether you're dealing with language, vision, sound, games, the Blocks World, chatbots, or whatever. Some of these basic problems have names, like the Knowledge Acquisition Bottleneck, Combinatorial Explosion, Credit-Assignment Problem, Frame Problem, the Binding Problem, the Aperture Problem, and so on.
So here are some things that would likely happen if you tried to create an artificial language for a computer to understand:
(1) You'd eventually need a huge vocabulary, which runs into the Knowledge Acquisition Bottleneck.
(2) To solve the Knowledge Acquisition Bottleneck, you'd consider having the system learn by itself, say from examples, but then you'd run into learning problems such as the Credit-Assignment Problem.
(3) To avoid ambiguous terms, your words would need to greatly increase in size.
(4) To solve the large-word and large-vocabulary problem, you'd consider creating hierarchies of context, but then you'd run into the Frame Problem.
(5) To define the meanings of words, you'd have to try to describe objects and shapes in the real world, which form an uncountably infinite set.
(6) To fight these problems of volume, you'd try to limit your system to a specific domain, whereupon your system would not scale to useful problems in the real world, so you'd almost be back where you started.
(7) To define the edges of fuzzy sets, you'd have to start integrating mathematical functions with definitions, and those functions are uncountably infinite in number.
(8) Once you start integrating math functions with concepts, you have to deal with how to combine math functions for combined concepts, which is a huge research topic in itself.
As a last resort, then you'd want to consider the opinions of experts, but then you'd run into the I-Know-But-I-Ain't-Talkin' Problem. :-) That's where experts who have good ideas as to how to solve major AI problems aren't going to disclose potential solutions since they're trying to get them published or commercialized.
So basically you just run into the well-known hard problems of AI all over again. So as not to discourage you too much, let me state that I think that this idea is a good one, but it would need to have some clever things done in combination to make it work well. For starters, I would look at the ten grammatical forms of English, generalize those as much as possible, then implement an approach that is almost universally eschewed by the Semantic Web crowd, namely creating a master list of definitions. That means you'll run into yet another problem, which is non-acceptance from the Semantic Web community. You could get everybody contributing to the project, Wikipedia style, but then you'd run into data provenance problems, which is itself a huge research project. AI's a bitch, ain't it?
> Does anyone have a link to any research applicable to this subject, especially regarding phenome differentiation?
I think you mean "phoneme", not "phenome".
|Last edited by simnia @ 9/24/2012 7:11:00 PM|
|posted 9/24/2012 05:06|
|I'll make a stronger statement than I did before...|
I think this is an *excellent* idea. Why? Because I came up with the same idea in 2009. I thought the idea was good enough that I spent 1-2 weeks writing an SBIR proposal for it, with additional clever twists on this basic idea. The idea is based on a great insight: tackling the human-computer language barrier at its source. As far as I know, my proposal didn't win, though. The reason I never found out for certain is that I was laid off shortly after submitting it.
What you can infer from my story is that my manager thought the idea was good enough to pay me for 1-2 weeks at a professional salary to work on it, and that there was commercial promise in it, or else the government wouldn't have been soliciting proposals on that general topic. For example, you could sell software that would help a user to convert natural language into this artificial language, similar to software that writes OWL/RDF from a graphical interface. Or you might also infer that it was such a loser idea that it was rejected and possibly cost me my job, but I have so little respect for the managers and reviewer involved in that incident that I believe it would be very foolish to credit them with good decisions. Whatever you want to believe.
If you're really interested in pursuing this idea, one possibility is that we and others could develop it publicly in a devoted thread in this forum. One reason I don't mind giving away some of my ideas on this topic is that they're not close enough to my main area of interest to be a disclosure threat for me. If the architecture started cutting in on my own domain, I would simply cease to work on that part and let you try to develop your own insights.
|posted 9/24/2012 15:12|
To be honest, I don't give a dmn about your manager, or software managers in general. Or about other people you may like to worship, such as physicists. You tell them you can make a miraculous thing that will make them huge sums of money, and they will always give you some *short time* to do it. Because they believe in miracles: that one programmer can make something which would make hundreds of millions. But even if one programmer makes a thing which could potentially bring such money (certainly not in two weeks), they are too stupid to be able to sell it, because the only thing they don't know how to do is their own job, so even that would be completely useless to them. If I had such an idea, and I could also sell the thing I make, then why in hell would I have to work for these managers? This is the question which the managers never understand.
simnia wrote @ 9/24/2012 5:06:00 AM:
my manager thought the idea was good enough to pay me for 1-2 weeks at a professional salary to work on it
My managers were not even that generous, by the way. They too liked some of my ideas, and said that they would allow me to work on them. All the other things I had to do still had to be done, of course, and those were exactly as much as I was able to do. So yes, many thanks; I could never have believed they would give me a great opportunity like that.
This is not a new idea; Winograd used a special language to describe actions in a simple world of geometric objects. The problem is that human expression is complex, and simplifying the language does not simplify it much. The most difficult part is modeling human thinking, not processing the formal structures of the language. So it is a good enough idea to make stupid software managers believe, but not necessarily a good idea for AI research (it depends very much on the task the AI is made for).
If hunt were here, he would say yes, this is a good idea, and others would say yes, this is a good idea, and the physicist would say yes yes yes. If someone dared to say something different, all would talk reason: how dare you say that, when the physicist said differently. The discussions which took place here then were such that for some people they could smother the soul.
| Artificial Consciousness ADS-AC project|
|Last edited by tkorrovi @ 9/24/2012 3:29:00 PM|
|posted 9/24/2012 20:03|
|tkorrovi wrote @ 9/24/2012 3:12:00 PM:|
> The problem is, the human expression is complex, simplifying language does not simplify it much.
Let's look at what we're doing when we write something in a natural language, say on a web page that others will want to access, read, understand, store, and use. We express ourselves using undefined or ambiguous terms, and in an ambiguous grammar besides. Garbage in.
Then, as Steyr noted, we invest huge amounts of time and money trying to get our machines/software to understand the garbage that we just wrote. Ultimately, due to theoretical issues, our parsers cannot reinterpret our text with 100% accuracy, and even if they could, we likely still wouldn't know exactly what the original author intended because the author used ambiguous terms and grammar. Garbage out.
The situation places the impossible burden on AI of interpreting our own garbage. The sensible thing to do is not to input garbage in the first place. That's a relatively simple task, involving things like using unambiguous words, using unambiguous grammatical structures (such as providing the intended parse tree), removing ambiguous references, and maybe a few other simple clarifications. I emphasize that these are very simple things to do, and their absence is the primary cause of ongoing irritation with bad writers. If only the U.S. Bill of Rights had defined "militia", and if only Jesus hadn't used the ambiguous reference "this" in "this generation will not pass away before...", the world might be a much better place. Clearly these problems have plagued humankind for all of recorded history, and have cost outrageous amounts of money and time and fueled political battles, religious battles, and more, just because of unclear expression. Therefore this is definitely a project worth tackling.
I believe the key issue in such a project will be representation. We would need to create a standard form of representation, such as a parse tree in LISP-like parenthesized form, or tag words like Japanese uses, or other ideas I have in mind. That's largely an artistic decision, since it involves finding a compromise between representations that humans find convenient and representations that digital computers find convenient.
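A toy illustration of the parenthesized-parse-tree idea: the writer supplies the intended structure up front, and the machine only has to walk or print it. The sentence, tree shape, and category labels below are invented for the example, not a proposed standard.

```python
# A parse tree as nested tuples: (label, child, child, ...); leaves are words.
# The classic "I saw the man with the telescope", with one attachment chosen
# explicitly so there is nothing left for a parser to guess.
tree = ("S",
        ("NP", "I"),
        ("VP", ("V", "saw"),
               ("NP", ("Det", "the"), ("N", "man"),
                      ("PP", ("P", "with"),
                             ("NP", ("Det", "the"), ("N", "telescope"))))))

def to_sexpr(node):
    """Render the tree in LISP-like parenthesized form."""
    if isinstance(node, str):
        return node
    label, *children = node
    return "(" + " ".join([label] + [to_sexpr(c) for c in children]) + ")"

print(to_sexpr(tree))
```

Reading such trees back is equally mechanical, which is the point: the ambiguity is resolved at writing time, not parsing time.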
One would want to study numerous natural languages to make sure that the proposed representation didn't miss any intended nuances, like politeness, the type of past tense intended, the number of people intended by "you", etc. And yes, this involves creating and promoting a new natural language, which is a formidable chore in itself, especially combined with the realization that we might have to relearn that language as we discover mistakes in it and need to refine it. One would also want to consider whether this new language should exist as ordinary text or only in graphical form, such as a parse tree or directed acyclic graph (DAG), so that computers would be required for humans to read and write it. It may get to the stage that most of what we read won't be lines of text but rather parse trees or semantic nets.
Ultimately all those problems I mentioned, such as fuzzy boundaries, will still exist, but the vast majority of language usage, meaning outside of technical and legal usage, doesn't need such nuances of expression. Usually language is used to quickly convey well-known concepts in novel combinations that qualify as "news".
I have several other ideas for mitigating the problems in such a system, but we'll see what Steyr wants to do with this idea, if anything. Although there is money in it, I'm not sure there's enough to interest me; I suspect people writing ontology language (OWL/RDF) software aren't making that much money. The real payoff would be indirect, for everybody and everything on the planet that uses language for communication, including governments and computers.
|Last edited by simnia @ 9/24/2012 8:18:00 PM|
|posted 9/24/2012 21:30|
|Have you thought about how many possibilities we have to analyze before we say something? Like, "She visited you, did she like your living room?" Now we have to think about whether anything she said expresses that she liked something or not, considering her personality and the way she usually expresses herself; what kinds of things she likes and so likely expected to see, so that we can understand her opinion. There is an endless number of possibilities, even if there are things we know which significantly restrict their number, and we have to analyze all of them before we can say anything. But all this analysis often happens without us noticing; we just get the answer, which we find to be self-evident, seemingly effortlessly. And none of this analysis of possibilities is expressed as language; some form of representation used for doing it must be translated into language.|
| Artificial Consciousness ADS-AC project|
|Last edited by tkorrovi @ 9/24/2012 9:32:00 PM|
|posted 9/24/2012 22:02|
We agree almost exactly on this, I'm surprised to say.
tkorrovi wrote @ 9/24/2012 9:30:00 PM:
There is an endless number of possibilities, even if there are things we know which significantly restrict their number, and we have to analyze all of them before we can say anything. But all this analysis often happens without us noticing; we just get the answer, which we find to be self-evident, seemingly effortlessly.
This topic is called "reflexive reasoning"...
Reflexive reasoning is one of those heavy problems of the type I mentioned that plague all AI domains in different guises. But my/Steyr's suggestion isn't intended to solve all such problems, only the most common, annoying ones that could have been corrected with a little more diligence in word choice and phrasing. I claim that the issues of reflexive reasoning don't need to be built into a language; we're not trying to solve the whole world here. This is a case where a lot can be accomplished with a little ("high leverage"), so this proposed approach is potentially very valuable to pursue, even if it doesn't generalize to every aspect of language.
|Last edited by simnia @ 9/24/2012 10:05:00 PM|
|posted 9/28/2012 09:09|
Oh yeah, what appreciation from such an important person as you: I can even sometimes say something rightly. I guess I can now write it on my tombstone as the achievement of my life.
simnia wrote @ 9/24/2012 10:02:00 PM:
We agree almost exactly on this, I'm surprised to say.
|posted 9/28/2012 12:59|
|I fully agree. There is much scope to make a new logical language. Who wants to help me have a try at it?|
|posted 10/7/2012 08:49|
|Why reinvent the wheel? The grammatical structure is already there. Take a look at bash script, C++, Java, Python, etc. The only reason you can't use those languages as they are is that they are not designed to be spoken; "semicolon" and "close parenthesis" are a mouthful to say, and it could be tough for the voice interpreter to tell the difference between an "&" and a "&&". I'd start by making an alphabet of easily recognizable sounds, and then use that alphabet to make standard console commands and functions.|
For example, "(" could be pronounced "Ka" and ")" could be "Ko". "," could be "Na" and ";" could be "Jo". Quotation marks could be "Ta". Even if you keep the standard pronunciation of everything else, the following becomes a lot easier to say:
String message1 = "hello ";
String message2 = "world";
print(message1 + message2);
With a simple interpretation layer and a defined pronunciation of functions, operators, and your different brackets and things, you could start writing programs by speaking with your computer instead of typing on it. With some control words to do things like navigate between and edit sections of code, compile, and execute, you could actually do it with some efficiency. And once you have that in place, you could modify the language structure without having to touch your keyboard - simply by using the language to modify the interpretation layer. In this sense, the language could actually evolve somewhat naturally.
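A minimal sketch of such an interpretation layer, using the made-up pronunciations from above ("Ka", "Ko", "Na", "Jo", "Ta"); the token list and spacing rules are illustrative assumptions, not a real speech front end:

```python
# Map spoken tokens to symbols; anything unmapped passes through as a word.
# These pronunciations are the invented ones from the post, not a standard.
SPOKEN = {"Ka": "(", "Ko": ")", "Na": ",", "Jo": ";", "Ta": '"'}

def transcribe(tokens):
    """Turn a stream of recognized spoken tokens into a line of code."""
    out = []
    for tok in tokens:
        sym = SPOKEN.get(tok)
        if sym is not None:
            out.append(sym)        # punctuation: attach without a leading space
        else:
            out.append(" " + tok)  # ordinary word or identifier
    return "".join(out).strip()

# Saying: "print Ka message1 + message2 Ko Jo"
print(transcribe("print Ka message1 + message2 Ko Jo".split()))
```

A real system would sit between the speech recognizer and an editor buffer, with extra control words for navigation and compilation as described above.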
|posted 1/9/2014 14:55|
Have you thought about how many possibilities we have to analyze before we say something? Like, "She visited you, did she like your living room?"
One! tkorrovi ;)
"She didn't mention anything as such" can be a good bot answer, if you're not training it to lie, of course... heheh! Anyway, not disagreeing with you, trav. :)
As for you, steyr, I think it's all in your perspective.
Language doesn't solve the problems, steyr; we do! Language all by itself is just another problem for us. ;)
...if this simple or out-of-context statement means something to you... I mean, can you improvise on it, on how it's related to your matter? ;)