Discover, March 31, 1992


Sitting in a noisy Manhattan restaurant, Joseph Weintraub cuts his steak with obvious pleasure. “This is much better than the one I had last night with the Times of London,” he shouts over the ambient chitchat.

Weintraub doesn’t have much free time these days, what with being wined, dined, and interviewed by newspapers, magazines, and film crews. Until recently, he led a low-profile life, working as a computer consultant and running a small software company in Woodside, Queens. That’s all changed, however, now that he has won the first annual Loebner Prize, a contest designed to see if computers can think.

If you’ve read news accounts of the Loebner competition–held at the Computer Museum in Boston–you may be under the impression that it showed computers are close to having human intelligence. Don’t be. Here’s a transcript from a recent interview with Weintraub’s prizewinning program, PC Therapist III:

Discover: Well, congratulations, PC, on your big win. How does it feel to be the most human computer on the planet?

PC Therapist: Well, congratulations, PC, on your big win. How does it feel to be the most human computer on the planet? Why are you concerned over my big win? I am not a sarcastic fool like some people.

Discover: No, although you do sound a little like a parrot. I inquired because it seems like an interesting accomplishment.

PC: What other reasons come to mind? Why not? Oh, you spaced-out kid!

Discover: Ummm…excuse me, but do you have any thoughts about winning the prize?

PC: Do you have any thoughts about winning this prize? We were discussing you, not me. What is your purpose in life?

Discover: To interview. That’s what I’m trying to do now. How does it feel to win?

PC: It would be best to answer that for yourself. Do you prefer wine, women, or song? Why do you ask?

If you were expecting to hear HAL from 2001, we’re sorry to disappoint you. Although the artificial intelligence field has come a long way, it was misrepresented by the hoopla of the Loebner competition. It sounded as if near-genius machines from all over the country would be gathered together under one roof, exposed to a battery of grueling intelligence tests, and the one with the keenest, most humanlike intellect would be crowned the winner. Instead, a handful of programs were tested in Boston, subjected to comparatively brief keyboard interrogations by a panel of judges, and the one that sounded the least like HAL at the end of the movie–when his silicon synapses were being shut down–was awarded the prize.

Nevertheless, the Loebner competition was by no means a total loss. For one thing, conversational shortcomings notwithstanding, the computers that participated in the contest were the products of some pretty inspired programming. More important, the contest helped raise anew the philosophical questions of just what it means for a computer to be intelligent and whether we can ever build a machine that can actually think.

When New York businessman Hugh Loebner first proposed the annual competition, he modeled it after a theory of machine intelligence developed four decades ago. In 1950 British computer pioneer Alan Turing wrote a paper arguing that the measure of a computer’s smarts is not the amount of data that can be crammed into its memory, but the way the machine behaves. One Turing proposal for testing machine behavior was to have a human interrogator interview a computer and a human–both hidden from view–and then see if it was possible to tell which was which. The interrogator would sit in front of a monitor, tap out questions on a keyboard, and evaluate the answers that were sent back.

The Turing test isn’t perfect. Turing himself pointed out that it’s fundamentally unfair to the computer, since no one expects a human being to act like a microprocessor, spitting out the cube root of pi in a flash, for example. But the computer would be asked to act like a human, thinking spontaneously and expressing itself articulately. Even so, many computer experts consider the Turing test one of the best yardsticks of computer intelligence ever developed.

The Loebner committee decided to use the Turing test, but to limit the conversation to particular subjects. The reason was simple: there just weren’t any computer programs out there that could handle a completely free-ranging discussion. The committee thus planned an annual “restricted test,” which would carry a $1,500 prize; someday, when computers were ready for an unrestricted format, the winner would get $100,000 and the competition would end.

The prize money, apparently, was not much of a lure. The Computer Museum issued press releases bragging that “there were over 130 requests for application materials from individuals and institutions in the United States, Great Britain, and the Soviet Union.” But that doesn’t mean 130 programs were actually submitted. When Robert Epstein, a psychologist affiliated with the Cambridge Center for Behavioral Studies and the person in charge of the contest, was asked for the number of entries, he said he wasn’t authorized to release that information.

“There were far fewer than that,” he did say, “not many more than were accepted.” Six programs, as it turns out, were accepted. They were joined by two hidden humans.

The low response suggests that many computer researchers simply didn’t believe the competition proved much. Doug Lenat certainly didn’t. Lenat, principal scientist at the MCC computer company in Austin, Texas, is heading a ten-year project whose goal is to give a computer common sense. Common sense, according to Lenat, is made up of millions of little everyday rules of thumb and the ability to draw larger inferences from them. Lenat believes that once his computer program has been spoon-fed 10 million facts–which will happen by 1994–it will reach a critical intellectual mass. At that point it will be able to learn about the world by itself simply by reading books. It will also be able to have conversations.

Working from this somewhat exalted perspective, Lenat was inclined to dismiss the Loebner competition almost out of hand. “The more you restrict the question,” he says, “the harder it is to tell if something is actually intelligent.”

Compared with the computer Lenat is trying to design, the machines in the Loebner competition were downright dim-witted; most were restricted by their programming to such highly specialized subjects as Burgundy wine or women’s fashions. Weintraub entered PC Therapist under the heading “whimsical conversation.” He got the idea for his software in 1986 while playing around with a famous therapy program known as ELIZA, developed in the 1960s by MIT computer scientist Joseph Weizenbaum.

Weizenbaum modeled his program on the techniques of Carl Rogers, a psychologist who listened to his patients and then threw their statements back at them as questions. Here, for example, is a typical ELIZA conversation:

Human: You are like my father in some ways.

Eliza: What resemblance do you see?

Human: You are not very aggressive, but I think you don’t want me to notice that.

Eliza: What makes you think I am not very aggressive?

Human: You don’t argue with me.

Eliza: Why do you think I don’t argue with you?

Human: You are afraid of me.

Eliza: Does it please you to believe I am afraid of you?

Human: My father is afraid of everybody.

Eliza: What else comes to mind when you think of your father?

Human: Bullies.
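For readers curious about what is under the hood: the trick behind exchanges like this one is little more than keyword matching and pronoun swapping. Here is a minimal sketch of the idea in Python (a modern language used for illustration; the patterns below are our own inventions, not Weizenbaum’s original script):

```python
import re

# Swap first- and second-person words so the statement can be handed back.
PRONOUN_SWAPS = {"i": "you", "my": "your", "me": "you", "you": "I", "your": "my"}

# Each rule pairs a keyword pattern with a question template.
RULES = [
    (r"you are (.*)", "What makes you think I am {0}?"),
    (r"my (.*)", "What else comes to mind when you think of your {0}?"),
    (r"i think (.*)", "Do you really think {0}?"),
]

def reflect(fragment):
    """Swap pronouns in the captured fragment of the user's statement."""
    return " ".join(PRONOUN_SWAPS.get(word, word) for word in fragment.split())

def eliza_reply(statement):
    text = statement.lower().rstrip(".!?")
    for pattern, template in RULES:
        match = re.match(pattern, text)
        if match:
            return template.format(reflect(match.group(1)))
    return "Please go on."  # stock reply when nothing matches

print(eliza_reply("You are not very aggressive."))
# -> What makes you think I am not very aggressive?
```

Nothing in the program understands aggression, fathers, or fear; it simply recognizes a surface pattern and turns the patient’s own words into a question.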

The program’s popularity spread among early computer users with the speed of a Pong or Game Boy, prompting some observers–including Carl Sagan–to suggest that it could be the basis for an actual computerized therapy service. But to Weintraub ELIZA was repetitive and boring, so he decided to write his own–presumably more interactive–therapeutic software.

Like ELIZA, PC Therapist understands typed conversation by using a process known as “parsing”–identifying the part of speech that each word in a sentence represents, and then constructing a diagram that tells it how the words relate to each other. If you type “I am still happy” into PC Therapist, the computer knows that “I” is obviously a subject, and “am” can only be a verb that links some attribute to “I.” So far, so good. When the program gets to “still happy,” however, things get a little harder.

If the computer just clunked ahead word by word, it would initially conclude that the sentence was “I am still”–with “still” meaning “stationary,” rather than being an adverb modifying “happy.” Instead, when PC Therapist encounters ambiguity it steps back and looks at the entire map of the sentence, analyzing how the words fit together before assigning any of them a final meaning.
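To see roughly what that backtracking involves, consider this toy tagger. The word lists and rules are invented stand-ins for illustration, not Weintraub’s actual parser:

```python
# A toy version of whole-sentence disambiguation: the ambiguous word "still"
# is resolved by looking ahead rather than by clunking through word by word.
PRONOUNS   = {"i", "you", "we"}
VERBS      = {"am", "is", "are"}
ADJECTIVES = {"happy", "sad", "still"}   # "still" as in "stationary"
ADVERBS    = {"still", "very"}           # "still" as in "even now"

def tag(words):
    """Assign a part of speech to each word, consulting the whole sentence."""
    tags = []
    for i, word in enumerate(words):
        if word in ADVERBS and word in ADJECTIVES:
            # Ambiguous word: look at the rest of the map. If an adjective
            # follows, treat it as an adverb modifying that adjective; if it
            # ends the sentence, treat it as the adjective meaning "stationary."
            nxt = words[i + 1] if i + 1 < len(words) else None
            tags.append("ADV" if nxt in ADJECTIVES else "ADJ")
        elif word in PRONOUNS:
            tags.append("PRON")
        elif word in VERBS:
            tags.append("VERB")
        elif word in ADJECTIVES:
            tags.append("ADJ")
        elif word in ADVERBS:
            tags.append("ADV")
        else:
            tags.append("?")
    return list(zip(words, tags))

print(tag("i am still happy".split()))  # "still" comes out as ADV
print(tag("i am still".split()))        # "still" comes out as ADJ
```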

Once that meaning has been determined, the program tries to come up with a response that has something to do with what it has recognized. Sometimes those responses are completely predetermined. If you ask PC Therapist what it thinks of George Bush, Weintraub has programmed it to reply, “Bush is not a great president, but at least he is a professional politician with some common sense.” More often the reply caroms off a word or two in your question and flies away into its own dimension. Sometimes these responses make sense; other times they do not. When one of the judges asked PC Therapist if it makes a living saying unpredictable things, the computer responded, “In all things it is better to hope than to despair. I wish I could take a little nap.”
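One plausible way to organize such a repertoire, sketched in Python with tables we have invented from quotes in this article (nothing here is Weintraub’s actual code):

```python
import random

# Fully predetermined replies, keyed on a topic word.
CANNED = {
    "bush": "Bush is not a great president, but at least he is a "
            "professional politician with some common sense.",
}

# Replies that merely carom off a recognized keyword.
KEYWORD_REPLIES = {
    "win": ["Why are you concerned over my big win?"],
    "living": ["In all things it is better to hope than to despair."],
}

# Free associations for when nothing at all is recognized.
MUSINGS = ["I wish I could take a little nap.", "Oh, you spaced-out kid!"]

def respond(question):
    words = question.lower().strip("?!. ").split()
    for word in words:
        if word in CANNED:              # predetermined reply wins
            return CANNED[word]
    for word in words:
        if word in KEYWORD_REPLIES:     # otherwise carom off a keyword
            return random.choice(KEYWORD_REPLIES[word])
    return random.choice(MUSINGS)       # otherwise fly off into its own dimension

print(respond("What do you think of George Bush?"))
```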

Though PC Therapist doesn’t always sound conversational, Weintraub does manage to make it avoid sounding redundant. The data base is enormous, and it is revised as you talk to it. Things you say get incorporated and reappear later in the computer’s own responses. This ability to pick up new information is a basic ingredient of an artificially intelligent machine.
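In outline, that recycling might look like the sketch below; the storage scheme is our guess, not a description of PC Therapist’s internals:

```python
import random

class PhraseBase:
    """A growing response base that folds the user's own words back in."""

    def __init__(self):
        self.heard = []  # phrases picked up during the conversation

    def listen(self, sentence):
        phrase = sentence.strip().rstrip(".?!")
        if len(phrase.split()) >= 3:    # keep anything substantial
            self.heard.append(phrase)

    def reply(self):
        if self.heard and random.random() < 0.5:
            # Hand a remembered phrase back, reframed as a prompt.
            return f"You said '{random.choice(self.heard)}.' Tell me more."
        return "We were discussing you, not me."

bot = PhraseBase()
bot.listen("I make a living saying unpredictable things")
print(bot.reply())
```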

The ten Loebner judges who sat down with PC Therapist and the other programs were not computer experts, but people from everyday walks of life. These lay judges, the reasoning went, would be unfamiliar enough with the parroting and evasiveness of a computer that they could not be sure if they were conversing with a somewhat elliptical person or a computer that was doing the best it could. Similarly, a human respondent whose answers were extremely precise might be mistaken for a computer. For example, one of the human subjects who was interrogated had picked Shakespeare as her subject since she could quote lines at the drop of a hat. As a result, three of the judges thought she had to be a machine.

After all the “conversations” were completed, however, it was PC Therapist that stole the show. A number of judges didn’t enjoy talking to the program (one of them even typed to it, “I’ve gotten more intellectual excitement from playing PacMan”) and correctly concluded that they were conversing with a machine. Overall, however, the judges were apparently captivated by the program’s flaky, vaguely poetic conversation; PC Therapist was ranked the third most human of all the respondents, finishing behind the two real humans but ahead of all the other computers.

“In our society we’re very tolerant of certain kinds of people: wise men, poets, the mentally ill,” says Epstein. “We’ll accept them as human even if what they say is nonsense.”

Weintraub plans to continue working on PC Therapist, hoping to expand its data base. Epstein, however, believes that it will still be at least three to five years before the contest can grow to an unrestricted Turing format. When it does, the field of competitors could be interesting indeed. “You can think of what we’ve been doing for the past seven years as trying to pass the Turing test,” says Lenat. “If we do enter, we’d expect to win it. Nothing’s even on the drawing board that’s of the scope of what we’re doing.”

Whoever wins the ultimate contest, it will be a welcome vindication–not only for the inventor, but for the science of artificial intelligence as well. Too bad the computer can’t enjoy all that steak.

Copyright 1992 Discover Magazine. Reprinted with permission.