Alright, I’ve been slacking for too long, which is just not wise when so much stuff is coming down the pipe.
For instance, I just got a tip about the following post on Language Log, which seems to be quite an active blog run by linguists (thanks, dgamble). Geoffrey K. Pullum, a linguistics and humanities professor at UCSC, writes the following in response to a NYT article casually asserting that Google “understands language.”
The very least one has to admit about machine understanding is that there is a big difference between a search engine algorithm and a genuine understander like you or me — and I’m not saying it necessarily reflects well on me. If you switch a Google-style search engine algorithm from working on English to working on Arabic, it will very largely work in the same way, provided only that you make available a large body of Arabic text from which it can draw its frequency information. (I have actually met people working at Google on machine processing of stories in Arabic. They do not know how to read Arabic. They don’t need to.) I, on the other hand, will become utterly useless after the switch. I will no longer be able to classify news stories at all (I don’t even know the Arabic writing system, so I can’t even see whether Iran is in a paragraph or not).
Call the machines cleverer, or call me cleverer, I don’t care, but we’re not the same kind of animal, and it seems to me that the verb understand is utterly inappropriate as a term for what Google News algorithms do. |Link|
What kinds of activities are appropriate for the verb ‘understand’? Pullum suggests:
[Scanners] don’t read for content, get the drift of the story, compare the sense of the paragraphs with their background knowledge and common sense, and chat about the issues with their friends. They tabulate letter strings and do statistical computations.
Now this list isn’t supposed to be exhaustive, but it quite clearly means to push a family resemblance definition of ‘understanding’. “Reading for content”, “getting the drift”, and “comparing the sense… with background knowledge and common sense” are all ways of restating the definition in question, with slightly different emphasis on the manner of semantic appreciation involved. “Chatting with friends” throws in a social angle that is difficult to square with the semantics, but is quite clearly something we do with language.

The problem, as I’ve argued before, is that this family resemblance attack draws exclusively on human language use as its paradigm example. But if understanding, and language understanding in particular, is not assumed to be a uniquely human phenomenon (that is, if we aren’t begging the question), then it simply isn’t clear what features and activities are necessary and sufficient for linguistic understanding. For instance, Pullum’s argument assumes from the outset that statistical computations are unable to ‘get the drift’ or otherwise quantify semantic values: we pay attention to the meaning of words, the thought goes, while the machine attends only to their statistical frequency. I’ve argued before that this is simply false, with reference to work like this. Perhaps it is less controversial to say: given enough samples, Google is able to build a map of the semantic territory of a particular language, and from this map it is able to categorize and organize further examples according to their semantic features.
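To make that less hand-wavy, here is a toy sketch of the idea in Python. The corpus and window size are invented for illustration, and real systems are vastly more sophisticated, but it shows how raw co-occurrence counts alone can place words on a ‘semantic map’:

```python
from collections import Counter, defaultdict
from math import sqrt

# Toy corpus; the same code would run unchanged on Arabic text,
# since nothing here depends on knowing what any word means.
corpus = [
    "the senator gave a speech about the election".split(),
    "the president gave a speech about the economy".split(),
    "the cat sat on the mat near the window".split(),
    "the dog sat on the rug near the door".split(),
]

# Co-occurrence vectors: for each word, count its neighbours within
# a small window. This is nothing but "statistical frequency" data.
WINDOW = 2
vectors = defaultdict(Counter)
for sentence in corpus:
    for i, word in enumerate(sentence):
        for j in range(max(0, i - WINDOW), min(len(sentence), i + WINDOW + 1)):
            if j != i:
                vectors[word][sentence[j]] += 1

def cosine(u, v):
    """Similarity of two co-occurrence vectors."""
    dot = sum(u[w] * v[w] for w in u if w in v)
    norm = sqrt(sum(c * c for c in u.values())) * sqrt(sum(c * c for c in v.values()))
    return dot / norm if norm else 0.0

# Words that occur in similar contexts land near each other on the
# "map", with no semantic annotation anywhere in the pipeline.
print(cosine(vectors["senator"], vectors["president"]))  # high
print(cosine(vectors["senator"], vectors["cat"]))        # low
```

Nothing in that pipeline ever consults a dictionary, and yet ‘senator’ and ‘president’ come out as neighbors while ‘senator’ and ‘cat’ do not.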
Now, Pullum might object again that this isn’t ‘understanding’ by any stretch, and that throws us back into the Turing/Searle debate, which he and I are both keen on avoiding. But if we grant that Google is sensitive to semantic distinctions, then it is much harder to handwave past the claim that, in a very real sense, Google is ‘reading for content’ in order to ‘get the drift’.

The example I like to use happened after the 2004 presidential election. Google News had two main headlines, both dealing with the election results (this was before the Web 2.0 obsession with customization). The first headline gathered news articles dealing with the response from officials and leading party figures. The second gathered articles dealing with the public reaction to the results, from Joe on the street to public figures and interest groups. By any standard this is a fairly nuanced distinction to make, and the vocabulary statistics that separate the two groups must be correspondingly fine-grained for the difference to show up at all. This isn’t hacking at language with a machete, as we are used to seeing with GOFAI; this is nuanced and sophisticated stuff, and Google’s ability in this case has everything to do with the content of the articles in question. And since Google actually does something with the content (I would emphasize that it does something social), I’m not entirely sure what distinction is meant to be captured by reserving the term ‘understanding’ for us exclusively.
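For what it’s worth, here is roughly the shape such a split can take, sketched as a toy TF-IDF-and-cosine clusterer. The snippets and threshold are invented for illustration; I make no claim that Google News works this way, only that vocabulary statistics alone suffice for this kind of split:

```python
from collections import Counter
from math import log, sqrt

# Toy snippets, all about the same election, in two registers:
# official reaction vs. public reaction. (Invented data.)
docs = [
    "party officials concede defeat and congratulate the winner",
    "officials concede defeat as party leaders congratulate the winner",
    "voters react with celebration and protest in the streets",
    "crowds of voters celebrate and protest in the streets",
]
tokenized = [d.split() for d in docs]

# TF-IDF weighting: words shared by every document (like "the")
# get discounted, so the finer-grained vocabulary drives the split.
df = Counter(w for doc in tokenized for w in set(doc))
N = len(tokenized)

def tfidf(doc):
    tf = Counter(doc)
    return {w: tf[w] * log(N / df[w]) for w in tf if df[w] < N}

def cosine(u, v):
    dot = sum(u[w] * v[w] for w in u if w in v)
    norm = sqrt(sum(c * c for c in u.values())) * sqrt(sum(c * c for c in v.values()))
    return dot / norm if norm else 0.0

vecs = [tfidf(doc) for doc in tokenized]

# Greedy single-pass clustering against each cluster's first member.
clusters = []
for i, v in enumerate(vecs):
    for cluster in clusters:
        if cosine(v, vecs[cluster[0]]) > 0.1:
            cluster.append(i)
            break
    else:
        clusters.append([i])

print(clusters)  # [[0, 1], [2, 3]]: official vs. public reaction
```

The two groups fall out of nothing but weighted word counts, which is the point: the ‘nuance’ lives in the distribution of the vocabulary itself.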
But that’s well-worn territory. Pullum adds to this debate an important difference between mere mortals and Google with respect to language: Google is, for the most part, a general-purpose language understander. As Pullum points out, the tools it uses to understand English are basically the same as its tools for understanding Arabic. Given enough samples, Google should be able to map out the semantic territory of any language. Pullum concludes from this that “we’re not the same kind of animal”. I don’t need to remind Pullum that all instances of human language use involve basically the same mental machinery, or that sampling from available data is how every human being alive comes to learn language. His point is more abstract: because Google is not tuned to a particular language, it can’t understand any particular language. At most, Pullum might concede that Google ‘understands’ some general statistical features of all languages. But that’s not genuine understanding; at most, it is metaphorical. He adds,
Google’s algorithms are ingenious and they work very well; but they understand things only in a very attenuated metaphorical sense under which you might also say that a combination door lock set to 4357 understands you when you punch in 4357 but not when you punch in 4358.
I’ve always been partial to the view of the mind as a giant combination lock, so this isn’t exactly persuasive. But more importantly, it is unfair to the lock: it understands the input 4358 perfectly well, and the fact that it stays shut when that combination is entered is proof enough that it registered the difference.
But this is the same problem all over again. What a system understands must be a function of what it can do, and perhaps of what it is supposed to do. We want a combination lock to understand the difference between the correct combination and an incorrect one, in a perfectly unmetaphorical sense of understanding: it instantiates a system that acts on the difference. The smoke detector likewise understands the difference between the presence of smoke and its absence, and can act on that difference in ways that are relevant to other agents who care about the presence of smoke. That the smoke detector knows nothing of fire, and that the lock doesn’t know what it’s keeping in (or out), says nothing against their understanding of what matters.
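To put the point as baldly as possible: the lock’s understanding just is a discrimination it acts on, which a few lines of toy code can state exactly:

```python
class CombinationLock:
    """A system that discriminates inputs and acts on the difference."""

    def __init__(self, combination: str):
        self._combination = combination
        self.open = False

    def enter(self, attempt: str) -> None:
        # The lock treats 4357 and 4358 differently; that differential
        # response is the whole of its "understanding" of the input.
        self.open = (attempt == self._combination)

lock = CombinationLock("4357")
lock.enter("4358")
print(lock.open)  # False: the lock registered the difference
lock.enter("4357")
print(lock.open)  # True
```

That it has no opinion about burglars is beside the point; it acts on exactly the difference it was built to care about.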
Google understands the semantic properties of words in a language, and can do amazing things with that information. It doesn’t know what a chair is because it never needs to sit down, but it can distinguish between department heads and futons quite easily. The fact that these kinds of distinctions generalize across languages shouldn’t undermine its genuine grasp of any particular language. Besides, that’s all we ever wanted Google to do. We don’t ask it to summarize or analyze articles; we ask it to group and rank them according to their content. And Google’s work to that end is evidence of a robust, objective, and perfectly legitimate understanding of certain salient features of the language, not to mention what is perhaps the most complete understanding of the internet available (which is in some ways more impressive).
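In miniature, the department-heads-versus-futons trick looks something like this. The context profiles below are hand-picked for illustration (a real system would induce them from co-occurrence counts, as in the earlier sketch), but the discrimination itself is pure word statistics:

```python
# Toy context profiles for two senses of "chair"; a real system
# would learn these from corpus counts rather than by hand.
SENSES = {
    "department head": {"faculty", "committee", "meeting", "appointed", "department"},
    "furniture":       {"sit", "cushion", "wooden", "sofa", "futon", "living"},
}

def disambiguate(sentence: str) -> str:
    """Pick the sense whose context profile best overlaps the sentence."""
    words = set(sentence.lower().split())
    return max(SENSES, key=lambda sense: len(SENSES[sense] & words))

print(disambiguate("the committee appointed a new chair for the department"))
# -> department head
print(disambiguate("a wooden chair and a futon in the living room"))
# -> furniture
```

No sitting down required: the surrounding vocabulary carries enough of the distinction to act on.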
Pullum’s point, and this kind of skeptical argument generally, seems to bottom out in the simple assertion, which is almost undeniable, that what counts as salient differs greatly between the human and the computer. But at most this supports the claim that humans and machines understand language differently. It cannot and should not be used to establish the presence or absence of understanding itself.