When we first tried to make a computer talk, we realized how incredibly complicated language is. Perhaps because children learn to talk simply through exposure to language, we assumed that the process was straightforward. As fifty years of A.I. research has shown, language appears to be one of our most intensely complicated inventions, one which even our modern computers cannot begin to match.
Of course, language and technology influence each other in equal measure; for instance, “Gutenberg’s invention of metallic movable type elevated writing into a central position in the culture” (Kelley). As we discover how complicated language is, we are also moving into a new form of language, a visual literacy, that has arrived with the ubiquity of computer screens. Language modeling and visual literacy offer two interesting approaches to the ways that language and technology influence each other.
The hope of language comprehension software is a simple one: “By giving computers the ability to understand speech, humankind would marry its two greatest technologies: language and toolmaking” (Seabrook). But language comprehension is a far cry from our current experience with speech recognition software. Anyone who has called a major corporation about a bill knows one thing: computers have been given some ability to recognize the human voice.
Judging by the level of customer satisfaction with these services, as Seabrook mentions, there is still considerable room for improvement. Speech recognition, moreover, is based in part upon probabilistic algorithms. These algorithms stand in for real speech comprehension, even though they play no part in how people actually use language. As the cognitive scientist Steven Pinker said in Seabrook’s article, “the consensus as far as I have experienced it among A.I. researchers is that natural-language processing is extraordinarily difficult, as it could involve the entirety of a person’s knowledge, which of course is extraordinarily difficult to model on a computer.”
Furthermore, there are many other factors in language that the current models do not take into consideration: the non-verbal aspects of language. While aggression can be modeled and recognized on a computer, “The current technology can capture neither the play of emphasis, rhythm, and intonation in spoken language (which linguists call prosody) nor the emotional experience of speaking and understanding language” (Kelley).
Emotional content is another aspect that completely confuses computers. Consider for a moment that people themselves cannot always tell whether a speaker means what he or she is saying. How would we program a computer to understand sarcasm? There is, perhaps, nothing more confusing in emails and other written communication than a sarcastic tone. But this is just one of the myriad ways in which computers cannot match the linguistic comprehension of an adolescent, let alone an adult.
While speech comprehension technology shows the limits of our contemporary computers, we can turn to visual literacy as a way to convey emotion. As Kelley points out with regard to the website Flickr, “more than three billion photos posted to the site so far cover any subject you can imagine.” What computers need for increased visual literacy is a huge library to work with, which every person with a camera phone helps to provide.
The technology is not quite there yet. As Kelley notes, “With full-blown visuality, I should be able to annotate any object, frame or scene in a motion picture with any other object, frame or motion-picture clip. I should be able to search the visual index of a film, or peruse a visual table of contents, or scan a visual abstract of its full length.” Unlike speech comprehension, though, the path to visual literacy is at least understood, and there is much to ponder about the direction our language may take in an image-oriented society.
One of the least interesting questions about the technological revolution is whether it is supposedly harming language. A quick internet search demonstrates that there are always people declaring that the language of a new generation means the death of language. Google, for one, is apparently attempting to keep new text-talk abbreviations out of its advertising: “Is Google an Internet incarnation of the grammar prescriptivist, insisting that language has rules and that communication without those rules leads to confusion and the decay of civility?” (Lefton).
Perhaps there is some use for prescriptivist grammar, but advertising does not seem to be the most vital place for it. If an advertisement contains words and symbols that are unknown to a certain person, that person is most likely not part of the advertisement’s target audience. In wanting its ads to be understandable to the widest possible audience, Google is perhaps overstepping the bounds of what advertising is about. In Bob Garfield’s interview Generation Text, the linguist Geoffrey Nunberg points out the various effects that language and technology have had on each other:
There’s a complicated relationship between technology and language. People used to say that because of the telegraph, reporters were wiring their stories in — stories and sentences both became shorter, and that affected the English language, and, and it’s certainly true that between the late 19th Century and the 1960s or ’70s, the average sentence got a lot shorter, whether you’re looking at the New York Times or best-selling novels.
As Nunberg points out, language did not completely devolve in the wake of the telegraph. We still have prose stylists who grew up without anything resembling a telegraph and can drag a sentence out for pages. There are always those who resist current language trends, even within the generation in which the trends supposedly take place. So even if some people start writing novels in text-talk, making feature-length movies on an iPhone, or placing the value of an image above all else, there will still be room for written language in whatever form it takes. Somebody has to tell a computer what a bowler hat is if it is going to run a visual search on one.