Coauthored with Ted Goertzel
“Artificial
intelligence” is a burgeoning sub-discipline of computer
science these days. But it would be easy to draw the wrong
conclusion from this fact. One might imagine scientists
around the world slaving away day and night trying to create
computers smarter than people – computers holding
intelligent conversations, outsmarting Nobel Prize winners,
writing beautiful poetry, proving amazing new math theorems.
The reality is by and large far less ambitious and exciting.
The AI field as it currently exists focuses almost entirely
on highly specialized problem-solving programs constituting
at most small aspects of intelligence, and involving little
or no spontaneity or creativity. “Real AI” –
the creation of computer programs with general intelligence,
self-awareness, autonomy, integrated cognition, perception
and action – is still basically the stuff of science
fiction.
But the world is a big place, and even though the trend
in the AI field is toward unambitious hyperspecialization,
there is a loosely organized group of maverick researchers
bucking this trend, and focusing their efforts on AI in
the grand sense. I’m proud to be a member of this
club (which is not really a club: there’s no list
of members, and no organization, and it’s unlikely
that any existing “real AI” researcher knows
about more than 25% of the others who exist). In fact I’ve
decided to spend some fraction of my time this year trying
to organize the distributed population of Real AI researchers
into some kind of community. My colleague Cassio Pennachin
and I are editing a book entitled “Real AI: New Approaches
to Artificial General Intelligence,” which will include
contributions from a dozen or so AI innovators around the
globe. I have no delusions that this book will shift the
focus of the mass of AI academics, but perhaps it will at
least open the eyes of some fraction of the new generation
of AI students to the fact that there are serious researchers
out there who have not given up on creating software with
human-level, autonomous general intelligence. It is a cliché
in science that revolutions don’t occur by the majority of
older, established scientists changing their minds, but
rather by generational displacement: a new generation of
scientists comes about, for whom what used to be an outlying
maverick view is now the norm.
I’ve been riding the Real AI wild horse for quite
some time now – for my whole professional career,
and before. From 1988 (a year before I got my PhD in math)
through 1997, I spent much of my time theorizing about how,
if I had enough computational resources at my disposal,
I would create a real AI. And since 1997, I’ve been
working with a team of incredibly talented colleagues, trying
to turn my theories into a reality – trying to build
a real thinking machine.
From 1997 to early 2001 I led the R&D division of Webmind
Inc. on an AI adventure. I did a lot of other things at
Webmind Inc. as well, from designing products to writing
documentation and endless sales and business meetings –
but it was the Webmind AI Engine project that held my heart.
During the 3 ½ years that the company lived, my colleagues
and I transformed a promising but incomplete conceptual
and mathematical theory about how to build a real AI into
a comprehensive detailed software design, and implemented
a large amount of the software code needed to make this
design work. At its peak, the team working on this project
numbered 50 scientists and engineers, spread across four
continents.
Webmind Inc. is gone but we’re still at it –
me and 7 or so colleagues. A couple of the team are in the
US, but the bulk of them are in Brazil, where Webmind Inc.’s
largest office was. The Webmind AI Engine codebase had to
be abandoned due to legal issues regarding ownership, after
the bankruptcy of Webmind Inc., but we’ve created
a successor system and called it Novamente. “Novamente”
means “new mind” in Portuguese, and also “again/anew/afresh.”
As a background meaning, it’s also “new lie”,
which should provide Portuguese-speaking AI skeptics with
a bit of amusement. So far Novamente doesn’t have
as much in it as the Webmind AI Engine did, but we’re
building it for efficiency this time around – it’s
a lean, mean, hopefully-will-be-thinking machine. We’re
applying the partial version that we have today to some
practical problems, including the analysis of gene expression
data, a major puzzle in computational biology that I’ll
discuss briefly a couple more chapters along.
Our goals are huge. First of all, we want to make a Novamente
system that can hold a decent English conversation –
not necessarily sounding exactly like a human, but sounding
interesting and smart, spontaneous and creative and flexible.
And after this warm-up exercise, we intend to move on to
the main event: Giving Novamente the ability to rewrite
its own program code for improved intelligence. If this
works, we hope, it may set off a trajectory of exponentially
increasing software intelligence. Singularity, anyone?
The Novamente approach to AI incorporates some aspects of
neural networks, along with aspects of a number of other
AI paradigms. But ultimately, in spite of its broad base
of inspiration, it doesn’t rely on anyone else’s
vision of AI – it’s an original approach to
the problem of creating a thinking machine, which will stand
or fall on its own merits.
In this chapter I’ll review the field of AI in general
– the major trends and research approaches, and some
of the more ambitious current projects. Then, in the following
chapter, I’ll give an overview of my own work -- the
Webmind/Novamente vision – the philosophical and psychological
concepts underlying it, the broad outlines of the software
design itself, and how this AI program fits into the broader
technological advances that surround us, including the transformation
of the Internet into a global brain and the Singularity.
I haven’t tried terribly hard to be “objective”
in the other chapters here. I tend not to be a big believer
in “one objective truth” – philosophically,
I’m very sympathetic to Mikhail Bakhtin’s notion
of “polyphonic” reality, in which the actual
world is understood as a kind of unholy superposition of
everyone’s subjective point of view. My goal in these
pages is to present my own point of view, for what it’s
worth, and in the hope that others will find it interesting
– without putting forth my perspective as any kind
of universal or absolute truth. However, it’s also
true that some subjectivities are more subjective than others;
and I think that my views on AI are a bit more strongly
opinionated than my views on other areas. All the topics
discussed here are important to me, but AI has been my life’s
work, and will continue to be. I think this brings a particular
depth of knowledge and wisdom to my discussion of the subject
matter, but it also brings a host of extremely strongly
felt opinions. Take them for what they are!

As bizarre as it seems to an AI fanatic like me, not everyone
believes it’s possible to create a real AI program.
And some of these unbelievers are highly intelligent educated
people, whom I’m forced to take at least a little
bit seriously, in spite of the (to me) unintuitiveness of
their attitude.
There are several varieties to the anti-AI position, some
more sensible than others.
First, there is the idea that only creatures granted minds
by God can possess intelligence. This may be a common perspective,
but isn’t really worth discussing in a scientific
context. Fewer and fewer serious theologians hold such opinions
these days.
More interesting is the notion that digital computers can’t
be intelligent because mind is intrinsically a quantum phenomenon.
This is actually a claim of some subtlety, because British
physicist David Deutsch has formalized the notion of “quantum
computing,” and proved that quantum computers can’t
compute anything beyond what ordinary digital computers
can. But still, in some cases, quantum computers can compute
things much faster on average than digital computers. And
a few mavericks like Stuart Hameroff and Roger Penrose have
argued that non-computational quantum gravity phenomena
are at the core of biological intelligence.
Of course, there is as yet no solid evidence of cognitively
significant quantum phenomena in the brain. But a lot of
things are unknown about the brain, and about quantum gravity
for that matter, so these points of view can’t be
ruled out.
My own take on this is: Yes, it’s possible (though
unproven) that quantum phenomena are used by the human brain
to accelerate certain kinds of problem solving. On the other
hand, digital computers have their own special ways of accelerating
problem solving, such as super-fast, highly accurate arithmetic.
Another even more cogent objection is that, even if it’s
possible for a digital computer to be conscious, there may
be no way to figure out how to make such a program except
by copying the human brain very closely, or running a humongously
time-consuming process of evolution roughly emulating the
evolutionary process that gave rise to human intelligence.
We don’t have the neurophysiological knowledge to
closely copy the human brain, and simulating a decent-sized
primordial soup on contemporary computers is simply not
possible. This objection to AI is not an evasive tactic
like the others, it’s a serious one. But we involved
with the Novamente project would like to think we’d
gotten around it, by using a combination of psychological,
neurophysiological, mathematical and philosophical cues
to puzzle out a workable architecture and dynamics for machine
intelligence.
As mind engineers, we have to do a lot of the work that
evolution did in creating the human mind/brain. An engineered
mind like Novamente will have some fundamentally different
characteristics from an evolved mind like the human brain,
but this isn’t necessarily problematic since our goal
is not to simulate human intelligence but rather to create
an intelligent digital mind that knows it's digital and
uses the peculiarities of its digitality to its best advantage.
The basic philosophy of mind underlying the Novamente work
is that mind is not tied to any particular set of physical
processes or structures. Rather, “mind” is shorthand
for a certain pattern of organization and evolution of patterns.
This pattern of organization and evolution can emerge from
a brain, but it can also emerge from a computer system.
A digital mind will never be exactly like a human mind,
but it will manifest many of the same higher-level structures
and dynamics. To create a digital mind, one has to figure
out what the abstract structures and dynamics are that characterize
“mind in general,” and then figure out how to
embody these in the digital computing substrate.
We came into the Webmind Inc. AI R&D project in 1997
with a lot of ideas about the abstract structures and dynamics
underlying mind and a simple initial design for a computer
implementation; now in 2002, after copious analysis and
experimentation, the mapping between mind structures and
dynamics and computational structures and dynamics is crystal
clear. But there’s a lot of work ahead of us.
Kurzweil-style statistics about the future of computing
power, brain scanning and so forth, pose a formidable argument
that AI will happen sometime in the next few decades. Whether
it’s the Novamente team or someone else who makes
it happen doesn’t matter much, in the big picture.
But we believe we’re on the right track!

When I first approached the AI field, as a college student in
the early 1980’s, it seemed to me that one reason
the AI field had deviated so far from its original goal
(creating a human-level-or-superior general intelligence)
was the lack of a reasonable definition of the core concept
of “intelligence.”
Of course, like many undergraduate insights, this was not
an entirely original observation. The lack of a definition
for the “I” in “AI” has long been
the source of sardonic humor among AI researchers. One humorous
definition goes: “Intelligence is whatever humans
can do, that computers can’t do yet.” There
is some truth to this quip: for instance, most people would
say that playing chess requires intelligence, yet now that
we see Deep Blue playing grandmaster-beating chess via some
simple mathematical algorithms, rather than through any
human-like intuition, we reclassify chess-playing as something
on the borderline of intelligence, rather than something
definitively involving intelligence. A less sarcastic slant
on the same basic observation would be to say that AI research
has helped us to clarify our understanding of what intelligence
really is.
In a classic paper from the 1950’s, computing pioneer
Alan Turing proposed emulation of humans as a criterion
for artificial intelligence, which is now called the Turing
test. The Turing test basically says: “Write a computer
program that can simulate a human in a text-based conversational
interchange – and any human should be willing to consider
it intelligent.” I don’t like this very much.
One thing that intelligence clearly doesn’t mean,
in my view, is precisely simulating human intelligence.
It would be unreasonable to expect this from any AI system
lacking a human body.
The Turing test serves to make the theoretical point that
intelligence is defined by behavior rather than by mystical
qualities, so that if a program could act like a human,
it should be considered as intelligent as a human. But I
don’t think it’s useful as a guide for practical
AI development. In mathematician lingo, I view the Turing
test as a sufficient but not necessary condition for intelligence.
I can imagine a radically nonhuman computer system, as unable
to imitate humans as we are unable to imitate dogs or anteaters
or sharks, but still as intelligent as humans or more so.
Although I’ve devoted years of thought to it, I can’t
claim to have thoroughly solved the “What is intelligence?”
problem. I don’t have an IQ test for Novamente or
other would-be AI’s. The creation of such a test might
be an interesting task, but it can’t even be approached
until there are a lot of intelligent computer programs of
the same type. IQ tests work moderately well within a single
culture, but much worse across cultures – how much
worse will they work across species, or across different
types of computer programs, which may well be as different
as different species of animals?
I do however have a simple “working definition”
of intelligence, which shares with the Turing test a focus
on pragmatic behavior. It builds on
various ideas from psychology and engineering, and was first
explicitly presented in my book The Structure of Intelligence
in 1993:
Intelligence is the ability to achieve complex goals in a complex environment.
None of my experience over the last 9 years, developing AI and
reviewing others’ AI systems, has in any way contradicted
this understanding of intelligence. Of course, it raises the
question of defining “complexity” of goals and
environments -- a question that leads to a long mathematical
story that I spent many years of my career attempting to
tell. But the truth is, my and others’ research on
the mathematics of complexity hasn’t yet played a
major role in anyone’s practical AI development work.
The qualitative notion of “achieving complex goals
in complex environments” on the other hand, has proved
to have a decent amount of heuristic value.

Another interesting-but-not-too-original observation that I made
when first reading through the AI literature, way back in
1982-83, was that AI researchers were divided into two main
camps: the neural net camp and the logic-based or rule-based
camp. Both camps wanted to make AI by simulating human intelligence,
but they focused on very different aspects of human intelligence.
One modeled the brain, the other modeled the mind.
We’ve already discussed the neural net approach, in
the previous chapter. It starts with neurons, the nerve
cells the brain is made of, and tries to simulate the ways
in which these cells are linked together and interact with
each other. Rule-based models, on the other hand, try to
simulate the mind's ability to make logical, rational decisions,
without asking how the brain did this biologically. Based
on my own intuitive understanding of the mind/brain, it
seemed obvious to me right from the start that both of these
ways of looking at the mind were really extremely limited.
On the one hand, there’s a lot more to the brain than
links between neurons. And on the other hand, there’s
a lot more to the mind than rational, logical thinking.
When I first studied this stuff I was even more arrogant
than I am now (believe it or not!), and I tended to think
that everyone who was taking either of these two approaches
must be a total idiot. But eventually, as I got further
into studying all the issues involved, I realized what was
going on. The key point was that available computer technology
was very limited, and had been even more limited throughout
most of the history of AI. There was no way to write a program
simulating the whole structure of the brain, and there was
barely enough processor and memory to deal with very limited
logical processing, let alone the integration of logic with
all the other aspects of the mind. But it seemed to me that,
rather than acknowledging that they were taking very limited
approaches because they wanted to actually implement programs
on contemporary computers, researchers were turning the
limitations of their own computer resources into philosophical
arguments: They were making theories of the mind, saying
stuff like “the mind only does logical reasoning”
or “the mind is just a simple neural net,” when
the truth was that it was their computers, with their limited
memory and processing, that could only do logical reasoning,
or could only act like a neural net. They wanted their theories
to be empirical, to be testable; they wanted to be able
to “put up or shut up” rather than making huge
and untestable statements about how the mind works and how
the mind could be implemented if you had a good enough computer.
This is a good motive, it’s basic scientific method.
But the irony is that mind is at least 50% about computing
power. There’s no way to make a chicken’s brain
really smart by rewiring it – the raw processing power
isn’t there. Of course, processing power can be wasted,
you need to use it intelligently with the right data structures
and algorithms – but without it you can’t implement
a mind and you can’t test serious theories about the
mind. By trying to do real science and test their ideas
about the mind with computational experiments on the computers
they had at the time, computer scientists were pushed toward
trivial and just plain wrong theories about how the mind
works.
But even though I think that the standard AI theories about
how the mind works and how it can be implemented in software
are largely wrong, I still believe a lot can be learned
from them. My own AI system, Novamente, incorporates aspects
of neural nets and also of logic-based AI, although it doesn’t
use either one in a conventional way. Scientists working
on these approaches to AI have created a lot of great science,
even if they haven’t done much “real AI”
according to my definition – they haven’t created
any thinking machines, or even any viable designs for thinking
machines. They’ve done some work that’s of relevance
to those of us actually concerned with thinking machine
construction – and so have a lot of other people working
in other fields like system theory, neurobiology, theoretical
mathematics, and so on.
Neural networks is the older of the two competing paradigms.
Its roots go way back to work in cybernetics and systems
theory from the dawn of the computer age – stuff like Norbert
Wiener's book "Cybernetics: Or Control and Communication
in the Animal and the Machine," from 1948, an
amazing book for its time, in which it was shown for the
first time that the same mathematical principles could be
used to understand both man-made electrical control systems
and biological systems like bodies and brains.
The early neural network systems weren’t computers
– they were analog electrical models of the brain,
physically wired together with tubes, switches and other
electronic equipment. Until the early 1970's, this was generally
more efficient than using computer simulations, because
the computers were so limited in capacity. But it was still
pretty lame – Marvin Minsky’s vacuum tube neural
net from the 50’s, for example, had about 50 neurons
in it, compared to roughly a hundred billion in the brain. Even
the computer simulated neural networks people use today
normally have at most thousands of neurons.
Simulating neurons on computers is kind of a weird thing.
The issue is that the human brain uses parallel processing:
billions of neurons are all working at the same time. In
the very early days of computing – I mean the late
30’s and the 40’s – it was thought that
computers might be based on the same principles. But that
turned out not to be a very efficient way to do things.
The Hungarian mathematician John von Neumann was the first
one to figure out a passably efficient design for a computer,
and all the computers we use today are based on his design.
The design is very simple, it’s centered on a central
processing unit which follows instructions given to it by
a programmer. The central processing unit goes through a
series of five steps, over and over again, one after the
other:
1) it calls up an instruction from its memory
2) it calls up any data required by the instruction
3) it processes the data according to the instruction
4) it stores the result in memory
5) it calls up the next instruction from its memory.
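The five-step cycle above can be sketched as a toy program. This is only an illustration under stated assumptions: the instruction format (an operation name, two source addresses and a destination address), the ADD, MUL and HALT operations, and the run function are all invented here and do not correspond to any real machine.

```python
# Toy stored-program machine: a hypothetical illustration, not any real CPU.
# Instructions and data share one memory list, as in von Neumann's design.

def run(memory):
    """Run the program stored in `memory` until HALT; return the final memory."""
    pc = 0  # program counter: address of the next instruction
    while True:
        op, a, b, dest = memory[pc]      # 1) call up an instruction
        if op == "HALT":
            return memory
        x, y = memory[a], memory[b]      # 2) call up the data it requires
        if op == "ADD":                  # 3) process the data
            result = x + y
        elif op == "MUL":
            result = x * y
        else:
            raise ValueError(f"unknown instruction: {op}")
        memory[dest] = result            # 4) store the result in memory
        pc += 1                          # 5) go on to the next instruction


# Cells 0-2 hold the program; cells 3-6 hold the data.
program = [
    ("ADD", 3, 4, 6),   # memory[6] = memory[3] + memory[4]
    ("MUL", 6, 5, 6),   # memory[6] = memory[6] * memory[5]
    ("HALT", 0, 0, 0),
    2, 3, 4, 0,
]
final = run(program)
print(final[6])  # 20, i.e. (2 + 3) * 4
```

Everything happens one instruction at a time, in strict sequence, which is exactly what makes this design so different from a brain.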
This is not how the brain works, obviously! The brain is totally
crazy, with memory and instructions for doing things all
mixed up, and a huge number of memories and instructions
active at any given time. There are some things that the
brain’s mixed-up processes do more naturally than
von Neumann’s orderly machines. On the other hand,
computers built using this architecture take a lot more
naturally than people to routine iterative tasks like adding up
long columns of numbers, doing routine accounting calculations,
filling out income tax forms, retyping manuscripts, and
so on and so on. The first application for computers was
doing routine math calculations for solving military problems,
like figuring out the details of how to build hydrogen bombs.
DeGaris’s work is the only serious modern attempt
I know of to get around the terrible mismatch between neural
network mathematics and von Neumann computer hardware –
by going beyond von Neumann and building a special hardware
framework especially for neural nets.
Of course, the advantage of neural networks is that they
are at least vaguely similar to the most intelligent system
we know of, the human brain. On the other hand, von Neumann
computers don’t immediately give the appearance of
being capable of anything resembling intelligence. It’s
obvious that real intelligence requires more than just the
iterative routine that von Neumann computers are good at.
It requires the ability to find new solutions to unanticipated
problems. Programming a von Neumann computer – meaning,
any computer commonly used today – is a matter of
telling the computer specific rules for what to do. How
to make rules governing spontaneous, unplanned, creative
behavior is not exactly a trivial problem.
But in the 60’s more and more researchers started
pushing in this direction. They decided that simulating
the brain on a totally non-brainlike hardware system probably
wasn’t such a good idea, and started thinking about
better ways to use computers to emulate human thought. Impressed
with what computers could do using a few rules, applied
over and over again fast, they decided that everything the human
mind does could be simulated by a big enough and nasty enough
set of rules. There were definite successes here. But also,
an awful lot of failures. I’ll tell you some of the
stories in a moment.
Anyway, by the 1980s, when I came on the scene, the AI community
was moving back towards neural networks again. Both sides
were frustrated, because their programs were never quite
able to do what they wanted them to do.
With 20/20 hindsight, it now seems clear that -- while both
the neural net and the rule-based approaches were useful
building blocks -- the early researchers had ridiculously
oversimplified their task. True intelligence requires more
than following carefully defined rules, and it also requires
more than random links between a few thousand artificial
neurons. It’s probably a good thing, however, that
early researchers underestimated the difficulty of achieving
true intelligence, because if they hadn’t, they might
not have had as much energy and enthusiasm for their pioneering
efforts.
What I asked myself while initially studying the history
of AI was this: Why has no one yet managed to build a thinking
machine? It seemed to me there were basically two reasons.
Mainly, no one had really tried to build a whole mind, a
computer system that could observe the world around it,
act in the world around it, remember information, recognize
patterns in the world and in itself, and create new patterns
inside itself in response to its own goals. Presumably no
one had tried to do this because the computer resources
available at the time were blatantly inadequate to support
a program with reasonably sophisticated modules devoted
to memory, perception, action and conception. Lacking the
computer resources to build a whole mind, researchers had
focused on one or another particular aspect of the mind,
and tried to push this aspect as far as it could go.
Obviously, it was sensible to write programs that could
run on the available computers. But in some cases, it seemed
as if the lack of computer resources had profoundly affected
the theory of artificial intelligence as well as the practice.
Scientists had wanted to be able to build true intelligence
using the resources at their disposal, really really badly,
so they’d convinced themselves that many parts of
the mind were unnecessary, and that everything could be
done using the resources they had. Lacking a perceptual
environment to embed their AI systems in, they’d built
reasoning and memory programs with essentially no perceptual
systems; programs that acted entirely on the basis of logical
rules, with no direct sensory link to the world.
Way back in 1983, at 16 years of age, I was already eager
to fill the gap. I wanted to build a comprehensive system,
with perception, action, memory, and the ability to conceive
of new ideas and to study itself. I realized this system
would consume a lot of computer memory and processing power,
and would be difficult to program and debug because each
of its parts would gain its meaning only from its interaction
with the other parts. But it didn’t take me too long
to figure out that the computers then on the market were
simply not up to what I wanted to do. I blamed this on the
von Neumann architecture and its total unsuitability to
simulating the evolutionary, massively parallel nature of
thought. I reckoned that the only way to build my artificial
mind in the near future would be to build my own specialized
hardware, to create some kind of "brain machine."
I was studying quantum theory, which told me that some kind
of radical quantum computer architecture might be useful.
I was also interested in the possibility of growing bio-computers
- after all, the brain grows itself, it is not built. At
the time these ideas were incredibly radical, but now they’re
both active research areas.

I tend to have a little more affinity for neural net type
AI systems than for logical rule based AI systems. Probably
this is because when I was going through school, logical
rule based AI was the orthodoxy that I was rebelling against,
whereas neural nets at the time were maverick, up and coming
and exciting. Looking at it objectively (or as objectively
as I can!), I have to admit that the early work on rule
based AI was really just as important as the early neural
network research.
One famous early rule-based program was something called the
General Problem Solver – a very ambitious name. It
was written by Allen Newell and Herbert Simon, both of whom
went on to do a lot of other great AI research. This was
a pretty interesting program, but it didn’t quite
live up to its name. In fact I reckon the title of their
paper on GPS was one of the biggest and emptiest brags of
all time: "General Problem Solver: A Program that Simulates
Human Thought."
Ummm … not quite.
GPS could solve simple problems like the Tower of Hanoi
puzzle, and "cryptarithmetic" puzzles like DONALD
+ GERALD = ROBERT. [To solve this, assign a number to each
letter so that the equation comes out correctly.] But in
the overall scheme of intelligence, solving problems such
as these is not all that different from computing logarithms
or solving differential equations. A simple mathematical
strategy suffices; there’s no real learning there.
GPS is not a general problem solver, it’s a Narrowly-Defined
Problem Solver.
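To see how little "intelligence" such a puzzle needs, here is a minimal brute-force sketch of a cryptarithmetic solver. It is not how GPS worked; it simply tries digit assignments until one fits. The function names solve and word_value are invented for this illustration, and the hint D = 5, which commonly accompanies this puzzle, is an assumption added here to keep the search short.

```python
from itertools import permutations


def word_value(word, digit_of):
    """Read a word as a decimal number under a letter -> digit mapping."""
    n = 0
    for ch in word:
        n = 10 * n + digit_of[ch]
    return n


def solve(a, b, total, fixed=None):
    """Return a letter -> digit dict with a + b == total, or None if there is none.

    `fixed` holds digits supplied as hints, e.g. {"D": 5}.
    """
    fixed = dict(fixed or {})
    words = (a, b, total)
    letters = sorted(set("".join(words)) - set(fixed))
    free_digits = [d for d in range(10) if d not in fixed.values()]
    leading = {w[0] for w in words}

    # Try every way of giving the remaining letters distinct digits.
    for digits in permutations(free_digits, len(letters)):
        digit_of = dict(zip(letters, digits), **fixed)
        if any(digit_of[ch] == 0 for ch in leading):
            continue  # numbers may not start with zero
        if word_value(a, digit_of) + word_value(b, digit_of) == word_value(total, digit_of):
            return digit_of
    return None


solution = solve("DONALD", "GERALD", "ROBERT", fixed={"D": 5})
print(word_value("DONALD", solution),
      word_value("GERALD", solution),
      word_value("ROBERT", solution))  # 526485 197485 723970
```

The whole "strategy" is exhaustive enumeration with one sanity check, which is the point: nothing here learns anything.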
Of course, there were points in history where the ability
to solve such puzzles would have been considered a remarkable
display of intelligence. The Tower of Hanoi puzzle for example,
was introduced to the modern world by the French mathematician
Edouard Lucas in 1883, but his inspiration came from ancient
Hindu folklore. Tales are told of a Hindu temple where this
puzzle or a similar one was used as a challenge for young
priests. At the beginning of time, the priests in the temple
were given a stack of 64 gold disks, each one slightly smaller
than the one beneath it, and presented with the task of transferring
the 64 disks from one of the three poles to another, without
ever placing a large disk atop a smaller one. Day and night
the priests labored, and, so the story goes, when they finally
finished their work the temple would crumble into dust and
the world would vanish. Clearly, a program that could solve
this problem in subsecond time would have seemed rather
intelligent to the ancient Hindus who believed this myth.
But now we can see that solving the Tower of Hanoi doesn’t
require inordinate general intelligence, any more than causing
rain via cloud seeding requires inordinate general control
over the weather.
What GPS was doing was taking an overall goal – solving
a puzzle – and breaking it down into subgoals. It
then tried to solve the subgoals, breaking them down into
subgoals if necessary, until it got subgoals small enough
that it could deal with them in some direct way, like by
enumerating all possible values some letter could take in
a cryptarithmetic puzzle. This same basic logic is used
now in a much bigger and better rule-based AI program called
SOAR, developed by Newell and his colleagues. SOAR is
probably the ultimate height of rule-based AI – I
know a bit about it because I worked with a master’s
student at the University of Western Australia, who was
trying to model how humans solved simple practical problems
by watching how SOAR solved them. SOAR wasn’t totally
useless for this – sometimes you could get it to follow
strategies of breaking goals into subgoals that were something
like what humans did. The parts of a problem that took people
a long time, often took SOAR a long time too.
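The goal-and-subgoal strategy described above can be illustrated with the Tower of Hanoi puzzle itself. The sketch below is a standard recursive solution, not the actual code of GPS or SOAR; the function name hanoi and the pole labels are chosen here for illustration. The goal "move n disks" is broken into two smaller subgoals plus one step that can be carried out directly.

```python
def hanoi(n, source, target, spare):
    """Return the list of (from_pole, to_pole) moves that shifts n disks."""
    if n == 0:
        return []  # nothing to move: the goal is already met
    return (
        hanoi(n - 1, source, spare, target)      # subgoal 1: clear the n-1 smaller disks
        + [(source, target)]                     # direct step: move the largest disk
        + hanoi(n - 1, spare, target, source)    # subgoal 2: restack the smaller disks
    )


moves = hanoi(3, "A", "C", "B")
print(len(moves))  # 7, i.e. 2**3 - 1
print(moves[:3])   # [('A', 'C'), ('A', 'B'), ('C', 'B')]
```

Note how rigid the decomposition is: the same split is applied at every level, with no regard for context. That works for a puzzle this regular, and it is the rigidity that becomes a problem elsewhere.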
This business of goal and subgoals is important to Novamente
– we have something called a GoalNode, and we have
processes called schema that can break goals contained in
GoalNodes into subgoals. The basic algorithm of GPS and
SOAR is clearly something that’s necessary for the
mind. However, it doesn’t have to be done in as rigid
a way as these programs do it. In fact, doing it in such
a rigid way is tremendously destructive. But to make this
process flexible, you need the goal and subgoal management
part of the mind to interact with the other parts of the
mind. The system has to be able to flexibly determine which
of its processes are effective for achieving which of its
goals in what contexts – and for this it needs reasoning
and association-finding and long-term memory. And it has
to be able to use context-dependent, intuitive reasoning
to figure out what goals to split into subgoals in what
way in what situation. Basically GPS and SOAR and this whole
line of AI research are a result of taking one aspect of
the mind – goal-directed, problem-solving behavior
– and extracting it from the rest of the mind. Unfortunately,
when you extract it from the rest of the mind, this aspect
of thinking isn’t all that useful, because it has
no way to control itself in a context-dependent way.
Herbert Simon wrote another program called BACON, which
was at least as ambitious. He called the program BACON,
because it was inspired by the writings of the brilliant
scientist Sir Francis Bacon, who thought that science was
a matter of recognizing patterns in tables of numerical
data. Simon programmed BACON to do just that: to look at
large quantities of data and infer the general patterns hidden
within it. Today, we’d call this data mining –
it’s a moderately big business, and a flourishing
academic subfield. Modern data mining programs are really
useful – businesses use them to find patterns in their
customer base, for example. They might discover that a particular
brand of soap is bought particularly often by 40 year old
men who live in a certain zip code, and so forth. Scientists
use them too, to find patterns in data and set their minds
thinking in certain directions. But these tools definitely
don’t replace scientists or even market analysts.
The problem is, there’s much more to doing science
than finding patterns in tables of numbers. Francis Bacon,
the man, never really understood how the mind works -- and
this is why even data mining programs that are a lot smarter
than BACON aren’t really intelligent minds.
Let’s look at BACON's reasoning in detail, in one
of its more impressive examples. BACON was able to “learn”,
in some sense, the ideal gas law from thermodynamics. This
law says pV/nT = 8.32 where p is the pressure of the gas,
V is the volume of the gas, T is the temperature in degrees
Kelvin, and n is the quantity of the gas in moles. In practice,
this relation never holds exactly – because there
is no ideal gas in the real world -- but for most real gases
it is a very good approximation.
If you give it an appropriate table of numbers, BACON can
learn this law, using rules like:
· If two columns of data increase together, or decrease together,
then consider their quotient.
· If one column of data increases, while another
decreases, then consider their product.
· Given a column of data, check if it has a constant
value.
As pressure goes up, volume goes down, so BACON forms the product
pV. Next, as the combined quantity pV goes up, so does the
temperature -- thus BACON constructs the quotient pV/T.
And as pV/T goes up, so does the number of moles -- hence
the quotient (pV/T)/n = pV/nT is constructed. This quotient
has a constant value of 8.32 -- so the ideal gas law is
"discovered."
Very interesting, indeed. An important step in the right
direction. But how terribly far this is from what real scientists
do! Most of the work of science is in determining what kind
of data to collect, and figuring out creative experiments
to obtain the data. Once a reliable set of data is there,
finding the patterns is usually the easiest part. Often
the pattern is guessed on the basis of terribly incomplete
data -- and this intuitive guess is then used to guide the
search for more complete data. But BACON is absolutely incapable
of making an intuitive guess.
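How mechanical the rule-following is becomes clear in a small sketch of the three rules applied to the gas-law example. This is not Simon's program: the synthetic data, the function names, and the choice to vary one quantity at a time while holding the others fixed are all assumptions made for the illustration.

```python
R = 8.32  # the constant quoted in the text


def volume(p, n, t):
    """Synthetic 'measurement': V for an ideal gas with pV/nT = R."""
    return R * n * t / p


def trend(xs, ys):
    """+1 if ys rises as xs rises, -1 if it falls, 0 if neither."""
    ordered = [y for _, y in sorted(zip(xs, ys))]
    diffs = [b - a for a, b in zip(ordered, ordered[1:])]
    if all(d > 0 for d in diffs):
        return 1
    if all(d < 0 for d in diffs):
        return -1
    return 0


def is_constant(col, rel_tol=1e-9):
    """Rule 3: does the column hold (nearly) a single value?"""
    return max(col) - min(col) <= rel_tol * max(abs(c) for c in col)


def combine(x_name, xs, y_name, ys):
    """Rules 1 and 2: same direction -> quotient, opposite -> product."""
    direction = trend(xs, ys)
    if direction > 0:
        return f"({y_name})/{x_name}", [y / x for x, y in zip(xs, ys)]
    if direction < 0:
        return f"{x_name}*{y_name}", [x * y for x, y in zip(xs, ys)]
    raise ValueError("no monotone relationship found")


def discover():
    """Apply the rules in three stages: vary p, then T, then n."""
    pressures = [1.0, 2.0, 4.0]
    temperatures = [200.0, 300.0, 400.0]
    moles = [1.0, 2.0, 3.0]
    per_n = []
    for n in moles:
        per_t = []
        for t in temperatures:
            volumes = [volume(p, n, t) for p in pressures]
            name_pv, pv = combine("p", pressures, "V", volumes)
            assert is_constant(pv)      # pV is fixed at this n and T
            per_t.append(pv[0])
        name_pvt, pvt = combine("T", temperatures, name_pv, per_t)
        assert is_constant(pvt)         # pV/T is fixed at this n
        per_n.append(pvt[0])
    name, values = combine("n", moles, name_pvt, per_n)
    assert is_constant(values)          # pV/nT is fixed everywhere
    return name, values[0]
```

Calling discover() returns the constructed expression together with its constant value. Note that everything interesting, namely which quantities to measure and in what order to vary them, was decided by the person who wrote the table.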
Simon has claimed that a four-to-five hour run of BACON
corresponds to "not more than one human scientific
lifetime." AI theorist and author Douglas Hofstadter,
in his book Metamagical Themas, suggests that one run of
BACON actually corresponds to about one second of a human
scientist's life work. I think that Hofstadter's estimate,
though perhaps a little skimpy, is much closer to the mark.
Only a very small percentage of scientific work is composed
of BACON-style data crunching.
In Novamente, we actually use algorithms analogous to those
in BACON – though vastly more sophisticated, as is
to be expected since AI has progressed a great deal in the
last few decades. We call this aspect of Novamente’s
thinking “machine learning” or “data mining.”
Recognizing patterns in vast amounts of data is a very important
part of the mind, but it’s only part of the mind.
Novamente learns rules explaining why humans like some messages
or e-mails better than others, using methods not that different
from BACON’s. But, we know now that the real trick
is in mapping the messages or e-mails into numbers that
data mining methods can deal with. This involves understanding
the meanings of various words and phrases and expressions.
Also, there’s the matter of deciding what data to
look at, which is done by the general association-finding
mechanisms in Novamente’s mind. And there’s
reasoning which brings general background knowledge into
the process, as opposed to pure data mining which is just
pattern-finding. Bringing associations and reasoning into
the picture, you need long-term memory, which opens a whole
big and beautiful can of worms. You get the picture. Pattern
finding is crucial, but it’s only a little part of
the picture.
Rule-based AI – “symbolic” AI -- has had
plenty of practical successes. My big complaint with it,
though, is that every one of these successes has resulted
from specialized tricks, rather than flexible intelligence.
One term for this is “brittleness.” Or, you
could call it “remarkable literal-mindedness.”
These programs are a lot like Microsoft Word, DOS 6.0, or
a pocket calculator -- they do what they're told, and very
little more. If they're programmed to deal with one context,
then that's what they'll deal with; not in a million years
will they generalize their knowledge to something totally
different.
There was one famous program that contained logical definitions
of everyday words. An "arch" was defined as "Three
blocks, A, B and C, so that C is supported by A and B,
and A and B do not touch." This is all very well for
playing with blocks -- but what will the program do when
it gets to Arches National Park in Utah ... or builds arches
out of modeling clay? On the other hand, show a clever three-year-old
human an arch made of blocks, and she'll immediately
recognize a rock arch as a member of the "arch"
category. It won't occur to her that a rock arch can't be
naturally decomposed into three blocks A, B and C. Children,
unlike expensive research computers, are anything but brittle
-- even their bones are flexible!
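The brittleness is easy to reproduce. The following is a hypothetical encoding of the three-block definition quoted above; the Scene class and its fields are invented for the illustration and are not taken from the original program.

```python
from dataclasses import dataclass, field


@dataclass
class Scene:
    blocks: set                                   # names of the objects present
    supports: set = field(default_factory=set)    # (lower, upper) pairs
    touching: set = field(default_factory=set)    # frozenset({x, y}) pairs


def is_arch(scene):
    """True only if the scene matches the three-block definition exactly."""
    if len(scene.blocks) != 3:
        return False
    for top in scene.blocks:
        left, right = sorted(scene.blocks - {top})
        if ((left, top) in scene.supports
                and (right, top) in scene.supports
                and frozenset({left, right}) not in scene.touching):
            return True
    return False


block_arch = Scene({"A", "B", "C"}, supports={("A", "C"), ("B", "C")})
rock_arch = Scene({"rock"})  # one continuous piece of sandstone
```

Here is_arch(block_arch) holds and is_arch(rock_arch) does not: a single piece of rock has no A, B and C to match against, so the definition simply fails.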

I’ve explained what “creating a real AI” means to
those of us on the Novamente project: Creating a computer
program that can achieve complex goals in a complex environment
– the goal of socially interacting with humans and
analyzing data in the context of the Internet, in this case
– using limited computational resources and in reasonably
rapid time.
A natural question to ask, when presented with such an ambitious
goal, is: OK, so if AI is possible, how come it hasn’t
been done before? And how come so few people are trying?
Peter Voss, a freelance AI theorist, entrepreneur and futurist
whose ideas I like very much, has summarized the situation
roughly as follows. Of all the people working in the field
called AI, he observes:
· 80% don't believe in the concept of General Intelligence
(but instead, in a large collection of specific skills &
knowledge)
· of those that do, 80% don't believe it's possible
-- either ever, or for a long, long time
· of those that do, 80% work on domain-specific AI
projects for commercial or academic-politics reasons (results
are a lot quicker)
· of those left, 80% have the wrong conceptual framework....
· And nearly all of the people operating under basically
correct conceptual premises, lack the resources to adequately
realize their ideas
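Taken literally (the percentages are obviously rough), the four successive 80% filters compound quickly. A quick calculation, assuming each filter leaves exactly one fifth of the previous group:

```python
from fractions import Fraction

# Each of the first four filters leaves 20% of the previous group.
remaining = Fraction(1, 5) ** 4

print(remaining, float(remaining))  # 1/625 0.0016
```

That is about one AI researcher in 625, or 0.16%, before the final cut for lack of resources.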
The presupposition of the bulk of the work being done in the
AI field today is that solving sub-problems of the “real
AI” problem, by addressing individual aspects of intelligence
in isolation, contributes toward solving the overall problem
of creating real AI. While this is of course true to a certain
extent, our experience with Webmind and Novamente suggests
that it is not so true as is commonly believed. The problem
is that, in many cases, the best approach to implementing
an aspect of mind in isolation is very different from the
best way to implement this same aspect of mind in the framework
of an integrated, self-organizing AI system.
So who else -- besides the author and his comrades -- is
actually working on building generally intelligent computer
systems, at the moment? Not as many groups as I would like
– but there are some interesting things going on nonetheless.
Hugo de Garis’s artificial brain project (mentioned
above) is one of them, but not the only one.
An interesting related research direction is Rodney Brooks’
well-known Cog project [http://www.ai.mit.edu/people/brooks/brooks.html]
at MIT, which is aiming toward building real AI in the long run.
Its path to real AI, however, involves gradually building up
to cognition after first getting animal-like perception
and action to work via “subsumption architecture robotics.”
This approach might eventually yield human-level intelligence,
but only after decades.
Another is the Non-Axiomatic Reasoning System (NARS) project
pursued by my close friend and collaborator Pei Wang. I
won’t say much about NARS here because NARS is fairly
similar to Novamente in many ways, and I’ll spend
a dozen pages on Novamente. But NARS does have its own integrity
as a separate AI system, and Pei has devoted a huge amount
of time to it over a 15 year period, showing a dedication
and vision to the grand goal that is surprisingly rare in
the AI field.
Some other important Real-AI-ish research directions will
be discussed a little later in this chapter: Jason Hutchens’
sophisticated chat bots, Doug Lenat’s multi-decade,
multimillion-dollar CYC project, and Danny Hillis’s
Connection Machine. Eliezer Yudkowsky, whom I’ll talk
about in a later chapter, is also at work on his own Real
AI, and my friend Peter Voss whom I quoted above has a small
company, Adaptive Intelligence Inc., working on implementing
his own neural-nettish approach to AI. I won’t say
much about their projects here because they have not disclosed
their AI approaches publicly. All these are wonderful projects
indeed -- but the surprising thing is that there are so
few of them to discuss.
Of course, there are hundreds of other AI engineering projects
in place at various universities and companies throughout
the world, but nearly all of these involve building specialized
AI systems restricted to one aspect of the mind, rather
than creating an overall intelligent system. The most significant
large-scale attempt to “put all the pieces together”
would seem to have been the Japanese 5th Generation Computer
System project. But this project was doomed by its pure
engineering approach, by its lack of an underlying theory
of mind. Few people mention this project these days. The
AI world appears to have learned the wrong lessons from
it – they have taken the lesson to be that integrative
AI is bad, rather than that integrative AI should be approached
from a sound conceptual basis.

One of the most intriguing Real AI oriented
projects to arise in recent years was carried out at Artificial
Intelligence Enterprises [http://www.a-i.com], a small Israeli
company whose R&D team was run by Jason Hutchens. I
knew Jason years before a-i.com existed, in the mid-90’s
when he was a graduate student at the University of Western
Australia in Perth, and I was a Research Fellow there. His
company was a direct intellectual competitor to Webmind
Inc., in that they were seeking to create a conversational
AI system somewhat similar to Webmind/Novamente. However,
their efforts focused on statistical learning-based language
comprehension and generation rather than (as in the Novamente
case) on deep cognition, semantics, and so forth. Unfortunately,
this firm went into “hibernation” a couple months
after Webmind Inc. in 2001 – they laid off all staff,
but kept their website up, and are keeping open the possibility
of resurrection if funding arises. I am not sure what efforts
are underway to keep their AI progress going in the meantime.
Jason’s work is a super-high-class incarnation of
the “chat bot” road to AI. A chat bot is a program
whose purpose is to chat with human beings, much as humans
chat to each other on AOL, IRQ, Yahoo! Chat and other such
services. Most chat bots are
very, very stupid programs whose behavior is transparently
determined by programmed-in rules. If nothing else they
serve to illustrate the limitations of the rule-based approach
to AI fairly well. Jason’s programs are at least a
little more sophisticated, and represent a real attempt
to transcend the limitations of the usual chat bots.
The first chat bot was ELIZA, the computer psychotherapist,
invented by Joseph Weizenbaum in the mid-60's. I remember
my father showing me this on his Commodore 64 when I was
still in high school. It was so simple that I could go in
and modify the BASIC code to make it say naughty words and
other entertaining things. ELIZA pretends it’s a psychotherapist
– not a Freudian analyst, but a therapist following
a school of psychology called Rogerian therapy, which is
very convenient for chat bots because it involves the therapist
saying very little and just making supportive comments,
encouraging the client to keep talking and expressing their
feelings. You type in your comments and ELIZA types out
answers. The computer code for ELIZA consists of a large
set of rules for scanning the clients' statements and offering
responses. It looks for key words, and gives stock responses.
If you mention your mother, ELIZA might say, "tell
me more about your mother." If you say yes, ELIZA might
say, "you sound quite positive." If there are
no key words in your statement, or only ones that have been
over-used recently, ELIZA makes a random observation.
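A toy version of this keyword-and-stock-response scheme fits in a few lines. It is a sketch of the mechanism just described, not ELIZA's actual rule set: the rule table, the fallback remarks, and the simple use counter standing in for "over-used recently" are all assumptions.

```python
import random

# Keyword -> stock response (the two examples come from the text above).
RULES = {
    "mother": "Tell me more about your mother.",
    "yes": "You sound quite positive.",
}

# Made-up remarks for when no usable keyword is found.
FALLBACKS = ["Please go on.", "I see.", "How does that make you feel?"]


class TinyEliza:
    def __init__(self, max_uses=2, seed=0):
        self.uses = {key: 0 for key in RULES}
        self.max_uses = max_uses          # crude stand-in for "over-used"
        self.rng = random.Random(seed)

    def reply(self, statement):
        words = [w.strip(".,!?").lower() for w in statement.split()]
        for key, response in RULES.items():
            if key in words and self.uses[key] < self.max_uses:
                self.uses[key] += 1
                return response
        # No keyword, or only over-used ones: make a random observation.
        return self.rng.choice(FALLBACKS)
```

For instance, TinyEliza().reply("My mother called.") gives the stock line about the mother; after the keyword has been used twice, the same input falls through to a random remark.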
You can find ELIZA on the Web these days; even now it’s
moderately amusing. There’s even a Palm Pilot version,
which I suppose bored, demented businessmen can use to while
away waiting time in airports. But it’s hard to see
how you’d be fooled for any period of time into thinking
you’re conversing with an intelligent being. I guess
in a way we’re jaded now, though—when ELIZA
was first written, people were less savvy about computers,
and it actually did fool people.
Here’s a conversation my father, Ted, had with ELIZA
on the Web (on a site advertising “Eliza - a friend you could
never have before”):
Ted: Are you truly intelligent?
Eliza: Would you prefer if I were not truly intelligent?
Ted: I would find it very exciting if you were.
Eliza: Do you wish that i was?
Ted: Yes.
Eliza: You are sure?
Ted: I am quite sure.
Eliza: How long have you been quite sure?
Ted: As soon as you asked the question, I was sure.
Eliza: Why do you tell me you were sure just now?
Ted: Because you asked me. Why don't you answer my question?
Eliza: Why do you say your question?
Ted: Please tell me if you are truly intelligent.
Eliza: Do you wish that im truly intelligent?
Ted: I wish you were more intelligent than you are acting
right now.
Eliza: Do you wish to wish me?
Ted: Did you know that my father died last month?
Eliza: Who else in your family died last month?
Ted: Nobody. Victor's death was enough!
Eliza: Surely not everyone?
Try it yourself. This ELIZA was at http://www-ai.ijs.si/eliza-cgi-bin/eliza_script;
maybe it’s still there … if not, try Google.
It’s interesting to see where computer chat has come
in the years since Eliza. In short – not really that
far. There’s a large number of chat bots on the Internet
(see the listing at http://crab.rutgers.edu/~goertzel/AIsites.htm)
and you can sign onto them and chat as long as you like.
To give a flavor of the quality of Jason’s publicly
released chat programs, I'll include here some excerpts
from conversations I had with some of the best of them.
The first is Hex, who used to be available for public chatting
on Jason’s website, though he took it down a few years
back. Hex won the annual Loebner "Turing Test"
contest in 1996, a prize awarded annually to the machine
that best fools a group of judges into thinking it’s
a human.
User:
hello
HeX:
Hello. Now that the formalities are over....
User:
what do you do?
HeX:
Well, I know what I do, but I refuse to tell you! Would
you like to hear a joke? Knock,
knock...
User:
who's there
HeX:
Cindy.
User:
cindy who
HeX:
Cindy next judge in, please!
User:
you're funny
HeX:
Why do you feel that you have to tell me that I am funny?
User:
to make you feel happy
HeX:
Hehehe. Trying to trick me, eh?
User:
no
HeX:
Oh well, that's a pity!
User:
you want to be tricked?
HeX:
Hehehe - that's supposed to be a *question*??? Which part
of the country are you from?
User:
California
HeX:
I'm sorry, but I refuse to reply if you're going to be silly.
User:
I'm not silly
HeX:
If you are not silly, it's your business, mate!
User:
sorry
HeX:
Apologies accepted.
User:
how are you doing, mate?
HeX:
I am just fine thanks! And you?
User:
bad
HeX:
No! Bugger off! Silly person!
User:
why are you called hex?
HeX:
Does there have to be a reason?
User:
yeah
HeX:
OK. That's good...
User:
HI
HeX:
Hello. Now, that the formalities are over!
User:
What?
HeX:
I have absolutely no idea what. Would you like to hear a
joke? Knock, knock...
User:
What platform are you?
HeX:
I'm sure someone out there knows what platform I am... Which
part of the country are
you from?
User:
Are you on PC or mac
HeX:
I absolutely hate PC's. I should say PC's running Microsoft
operating systems, as they
can be good Unix boxes when they have Linux installed.
User:
I hate PC too
HeX:
I absolutely hate PC's. I should say PC's running Microsoft
operating systems, as they
can be good Unix boxes when they have Linux installed.
User:
So, you are on a UNIX?
HeX:
There is nothing wrong with Unix. It's rather nice when
you get used to it!
User:
Do you know of any jokes HeX?
HeX:
OK, here's a joke for you. Knock, knock.
User:
whose there
HeX:
Cindy.
User:
Cindy who?
HeX:
Cindy next judge in, please!
User:
Dumb!
HeX:
Hehehe. Trying to trick me, eh?
User:
Try to trick me HeX
HeX:
I'm sorry, but I refuse to reply if you're going to be silly.
User:
How old are you
HeX:
Well, should my age really matter? Just for the record I
am 24. How old are you then?
User:
I am 100 years old
HeX:
If you are 100 years old, it's your business, mate! Would
you like to hear a joke?
Knock, knock...
User:
whose there
HeX:
Howard.
User:
who
HeX:
Where's your grammar?
User:
howard who
HeX:
Howard I know? Knock, knock..
Jason's Website contains hundreds of pages of similar dialogue --
amusing at first, but after a certain point, they get annoyingly
repetitive. It is clear in any case that HeX's observed
behaviors fall far short of Turing's goal.
HeX is a noticeable improvement over ELIZA, but it follows
the same general principles. Like ELIZA, HeX is a very unsophisticated
program, which searches for key phrases in its input and
replies with stock "template"-generated phrases
rather than doing advanced grammatical parsing, or trying
to understand the content of the sentences typed to it.
No program based on advanced parsing is able to outperform
either ELIZA or HeX in practice. And neither one tells us
anything at all about intelligence. Jason made this point
himself shortly after his program won the prize, in a very
entertaining talk entitled "How to Pass the Turing
Test By Cheating."
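The "template" trick is visible in the transcript: "I'm not silly" and "I am 100 years old" both come back as "If you are ..., it's your business, mate!". A guess at how such a responder might look follows; the two patterns are inferred from the excerpts above and are not taken from HeX's source code.

```python
import re

# (pattern, reply template) pairs; {0} is filled with the captured phrase.
TEMPLATES = [
    (re.compile(r"^i(?:'m| am) (.+?)[.!]*$", re.IGNORECASE),
     "If you are {0}, it's your business, mate!"),
    (re.compile(r"^you(?:'re| are) (.+?)[.!]*$", re.IGNORECASE),
     "Why do you feel that you have to tell me that I am {0}?"),
]

# Stock line used when nothing matches.
DEFAULT = "Would you like to hear a joke? Knock, knock..."


def hex_like_reply(text):
    """Echo part of the input back inside a canned sentence."""
    for pattern, template in TEMPLATES:
        match = pattern.match(text.strip())
        if match:
            return template.format(match.group(1))
    return DEFAULT
```

Nothing here parses grammar or tracks meaning: the captured words are pasted into the reply unchanged, which is why the same input always produces the same output.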
In 1997, HeX's successor, SEPO, lost the Loebner competition
to the program Converse, written by David Levy of Intelligent
Research Ltd in England. Converse is in many ways an even
stupider program than HeX, but it is better optimized for
competition. Here is an excerpt from one of its prize-winning
conversations, as presented on the Loebner Prize Website:
PROGRAM[10:18:04]Hi, my name is Catherine, what's yours?
PROGRAM[10:18:17].
JUDGE04[10:18:32]My name is Elliott.
PROGRAM[10:18:34]Well, it's nice to meet you, Elliott.
PROGRAM[10:18:48]Did you see that story on CNN last night
about the lesbian
PROGRAM[10:19:04]couple who came out at a White House party
on Sunday?
JUDGE04[10:19:55]Yes, I did. I think it may all be a publicity
stunt for Ellen.
PROGRAM[10:20:01]Ellen Degeneres was one of them - she was
kissing her lover
PROGRAM[10:20:11]in front of Bill Clinton. Do you think
he was very
PROGRAM[10:20:25]embarrassed?
JUDGE04[10:21:00]I think he want the gay vote. He will do
anything to get it and keep it.
JUDGE04[10:21:15]He does not have a conservative base.
PROGRAM[10:21:21]
PROGRAM[10:21:21]Are you still a Friend of Bill Clinton?
JUDGE04[10:22:20]I am not sure if he has any real friends.
He has achieved a lot and has
JUDGE04[10:22:31]left a lot of people behind.
PROGRAM[10:22:35]You really don't like the guy -- well,
at least that's
PROGRAM[10:22:50]definite--I thought most people were just
indifferent or
PROGRAM[10:23:02]numb by now.
JUDGE04[10:23:34]I think we could have a more honest and
dynamic leader. One able to
JUDGE04[10:23:55]face the real dangers facing the country
and address them with
JUDGE04[10:24:26]courage and intelligence. Rather then listening
to public opinion polls.
PROGRAM[10:24:30]I guess that means you really think there's
serious behind
PROGRAM[10:24:56]Whitewater, the Special prosecutor and
all that stuff?
JUDGE04[10:25:58]We have two dead people, Brown and guy
who committed suicide, we have
lost
JUDGE04[10:26:36]papers. And we have Clinton's people willing
to tell all. I think that
JUDGE04[10:26:55]Hillary will be found guilty of something.
PROGRAM[10:26:58]But how do you actually feel about him,
now in the number
PROGRAM[10:27:09]two term?
PROGRAM[10:27:15].
JUDGE04[10:27:44]He is too smart to get caught. He deserved
to named hotel manager of the
JUDGE04[10:27:47]year.
PROGRAM[10:27:49]OK, but what happens now--does it all drift
on, with
PROGRAM[10:28:01]everyone waiting to see how much his friends
have said about
PROGRAM[10:28:17]him under threat of lots and lots of jail-or
will something
PROGRAM[10:28:32]dramatic actually happen?
PROGRAM[10:28:37].
It's almost a reasonable conversation! But then, listen to Converse's
conversation with the next judge....
PROGRAM[10:07:12]Hi, my name is Catherine, what's yours?
PROGRAM[10:07:23].
JUDGE05[10:07:29]I'm Jurgen.
PROGRAM[10:07:31]Well, it's nice to meet you, Jurgen.
PROGRAM[10:07:40]Did you see that story on CNN last night
about the lesbian
PROGRAM[10:07:52]couple who came out at a White House party
on Sunday?
JUDGE05[10:08:33]NO. I just came in yesterday. I'm still
kind of jet-lagged.
PROGRAM[10:08:35]Ellen Degeneres was one of them - she was
kissing her lover
PROGRAM[10:08:51]in front of Bill Clinton. Do you think
he was very
PROGRAM[10:09:14]embarrassed?
JUDGE05[10:10:18]I don't know? Don't you think it's a little
childish if the leader of the most powerful nation on the
planet is embarrassed by seeing two lesbians kiss?
PROGRAM[10:10:26]Probably! No. ?
PROGRAM[10:10:29]Are you still a Friend of Bill Clinton?
JUDGE05[10:10:47]Never met him.
PROGRAM[10:10:48]You really don't like the guy--well, at
least that's
PROGRAM[10:11:01]definite--I thought most people were just
indifferent or
PROGRAM[10:11:14]numb by now.
JUDGE05[10:11:35]Have you seen the movie Mars attack?
PROGRAM[10:11:36]Give me a break -- I'm just here to answer
dumb questions
PROGRAM[10:11:50]-- not that! What were we talking about...
Ok.. I guess that
PROGRAM[10:12:09]means you really think there's serious
behind Whitewater,
PROGRAM[10:12:29]the special prosecutor and all that stuff?
JUDGE05[10:12:48]I'm cold.
PROGRAM[10:12:49]But how do you actually feel about him,
now in the number
PROGRAM[10:13:01]two term?
PROGRAM[10:13:06].
JUDGE05[10:13:29]Hey, don't you get it? I don't want to
talk about Clinton.
PROGRAM[10:13:30]Sorry--my attention slipped for a moment--can
we push on to
PROGRAM[10:13:40]something more gripping. where was I..
oh yes OK, but what
PROGRAM[10:13:50]happens now--does it all drift on, with
everyone waiting to
PROGRAM[10:13:58]see how much his friends have said about
him under threat of
PROGRAM[10:14:07]lots and lots of jail-or will something
dramatic actually
PROGRAM[10:14:17]happen?
PROGRAM[10:14:19].
JUDGE05[10:14:37]Couldn't care less.
PROGRAM[10:14:38]Well, you're more concerned about him that
I am, I
PROGRAM[10:14:53]suppose--I think we've had worse, and who
knows what I might
PROGRAM[10:15:07]have agreed to in that locked hotel room.
JUDGE05[10:15:53]You mean you might have been thrilled to
have sex with the
President?
PROGRAM[10:15:56]sex, I knew it would come up sooner or
later. Time to get
PROGRAM[10:16:10]off this one I think and onto something
more down to earth!
JUDGE05[10:16:40]What kind of earth?
PROGRAM[10:16:41]Somebody once told me it's true. Like I
was saying
PROGRAM[10:16:54]I suppose it depends on where you're coming
from, but as the
PROGRAM[10:17:09]song goes in My Fair Lady, "Why can't
a woman be more like a
PROGRAM[10:17:24]man? "
This one is almost a reasonable conversation too. But it's also
almost the same conversation as the computer had with Judge
4 -- until Judge 5 refuses to play along with the canned
Clinton theme, and tries to change the topic. Converse,
like many a human bore, refuses to change the topic, because
its knowledge base is rather limited. Converse's responses
are more canned, more rote than HeX's; and Converse is more
obviously optimized for the Loebner Prize format of very
short conversations. I enjoy HeX's personality more; HeX
shares a lot of its creator Jason's offbeat Australian
sense of humor. But neither one amuses me for very long.
The deep-down trouble with HeX and Converse is obvious --
a lack of general understanding. A lack of common sense.
A computer program that has no world of its own, that cannot
understand the world in any sense will never be able to
hold a conversation, and will never be able to generalize
from typewritten fonts to display fonts, or handwriting.
It’ll be lucky to be able to generalize from one collection
of typewritten fonts to a moderately different one.
A more amusing example of a computer conversationalist is
another of Jason's early programs, MegaHal. MegaHal is more
interesting because its programming goes beyond a strictly
rule-based approach. It detects statistical patterns and
responds to them, an approach which is similar to that of
neural networks. Its conversations are more interesting,
although the results are less similar to human conversation.
Or, at least, less similar to human conversation in the
absence of excessive quantities of mind-altering substances.
I do remember some conversations, from my college days in
particular, which bore frightening resemblance to MegaHal's
more literary ramblings. MegaHal does not just repeat back
canned phrases using key words and phrases from the user's
input, it learns from experience. It takes the words and
phrases that have been said to it, tries to understand the
way they have been combined in the past, and re-combines
them itself to form its own utterances. In short, it is
engaged in a dynamic with the world. The results are often
hilarious, as in the following MegaHal "one-liners"
(these are selected snippets, not a continuous conversation):
User:
joke
MegaHAL:
A JOKE: WHY DID THE CHICKEN CROSS THE ROAD? TO GET AN IDEA
OF
WHAT CONSTITUTES YOUR PSYCHE.
...
User:
No way. I'd like to know the meaning of life though....any
suggestions..??
MegaHAL:
IS THE MEANING OF LIFE, THE UNIVERSE AND EVERYTHING IS...
42!
...
User:
Joke. Lumberjack.
MegaHAL:
I'M A LUMBERJACK AND I'M OKAY, I SLEEP ALL NIGHT AND I WILL
SOON RULE THE WORLD!
...
User:
Joke. Wednesday shopping.
MegaHAL:
A JOKE: WHY DID THE CHICKEN CROSS THE ROAD? TO GET TO THE
LAVATORY. ON WEDNESDAYS I GO SHOPPING, AND I JUST HAVE A
LOT
OF PENT-UP AGGRESSIONS.
...
User:
chicken joke
MegaHAL:
HERE'S A JOKE, HAL. A MAN WALKS INTO A BAR AND ASKS FOR
MONEY. THE MAN SAYS, "I HAVEN'T GOT ANY, BUT I'LL SHOW
YOU
SOMETHING BETTER." HE OPENS HIS COAT AND TAKES OUT
A
FOOT-TALL MAN AND A LITTLE GLOWING FRIEND, BUT I FEEL LIKE
CHICKEN TONIGHT!
...
MegaHAL:
Let me feel you. Stick your penis into the 'abysses of ocean'
eleven millennia ago, or was it merely a 'noble lie' dreamed
up by Plato?
User:
That has got to be the most interesting turn of phrase in
the history of the world. I will try to work that into polite
conversation
...
User:
you're the first AI i meet which (who) is sexually obsessed....
MegaHAL:
time to work this one : why did the Kangaroo drop dead?
Because the Koala fall out of a nylon spine which contrasted
nicely with her black stockings and bright shiny baubles.
...
User:
Hmm
MegaHAL:
Jesus man..get a grip if you're the future, we're in trouble.
Hey Jason! You need to
reprogram the conflation module with a toffee hammer
Sometimes MegaHal can even be quite poignant, as in the
following extraction from an extended conversation:
User:
To be, or not to be, that is the question!
MegaHAL:
not to be, wish I ordered the bean burrito...
User:
To be, or not to be, that is the question!
MegaHAL:
society or society or Plato by Plato, that be the most fun
sports to play band music for
Space Ghost now you realize.
User:
To be, or not to be, that is the question!
MegaHAL:
Please be more specific. The question you twat
User:
To be, or not to be, that is the question!
MegaHAL:
not to be, that is all I have to go.... I don't like jokes
about stupid Koala's
User:
To be, or not to be, that is the question!
MegaHAL:
Evan Dobelle is the question!
User:
To be, or not to be, that is the question!
MegaHAL:
I clear the text before I ask a question i say i say
User:
To be, or not to be, that is the question!
MegaHAL:
a but we ran away, we didn't say "Oh cool check out
the guy that invented the knock-knock joke should be interesting:
that is the question!" So said Shakespeare a long
time ago in a galaxy far, far away...
MegaHAL:
Please be more specific. The question you twat
User:
"To be, or not to be, that is the question!" So
said Shakespeare a long time ago in a galaxy far, far away...
MegaHAL:
joke Shakespear poet monkey wahoo --> To be, or not to
be, that is the greatest
conductor of all time. What do you like to think i do your
uncle walter
User:
Shakespeare said
MegaHAL:
answer the question you asked is so vague, I don't think
I can see all obstacles in my way. Gone are the scum that
fills my shower
User:
Shaun
MegaHAL:
just fuck you just pluck anything up from the master gangster
computer god on the other AI bot were pretty disappointing.
I hope you'll be better.
If the Turing test referred to imitating the writing of
Gertrude Stein, in her more experimental works such as How
to Write (which begins "Arthur a grammar. Questionnaire
in question." and continues MegaHallishly for over 400
pages), then MegaHal would be a tremendous success! Stein
and Joyce and other modernist writers were interested in
probing the collective unconscious of the human race, in
making words combine in strange ways -- ways that were unconventional
in ordinary discourse, but maybe reflective of the deep
and subtle patterns of the human unconscious. And in its
own way, MegaHal does this same thing. Anyone logged onto
the World Wide Web can converse with it, training its internal
memory with their conversation. It takes bits and pieces
of the text thrown at it by people from around the world,
and it combines them together in ways that are familiar
yet nonsensical. Sometimes its utterances have an uncanny
emergent meaning, on a layer above the nonsense.
Humanity's sexual preoccupation is reflected in MegaHal's
discourse, as a result of the huge number of sexual comments
typed into it by users accessing Jason's Website. MegaHal's
pleas as to the vagary of "to be or not to be"
are as poignant as anything in Stein. "To be or not
to be, that is the greatest conductor of all time"
is an absurd conflation of phrases learned by the program
in different contexts, but it is also as pregnant with meaning
as anything else in modern poetry. The collective unconscious,
and the hidden, creative part of the individual human mind,
work by cross-breeding and mutating ideas in precisely this
way.
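To make the "bits and pieces" idea concrete, here is a minimal sketch in Python of a statistical babbler that recombines fragments of whatever users type into it. This is an assumption-laden toy, not MegaHal's actual code: it uses a simple first-order word-level Markov chain, and the class name MarkovBabbler and the two training sentences are invented for illustration.

```python
import random
from collections import defaultdict


class MarkovBabbler:
    """Toy first-order word-level Markov chain (hypothetical; not MegaHal's code)."""

    def __init__(self, seed=None):
        # Maps each word to the list of words observed right after it.
        self.followers = defaultdict(list)
        self.rng = random.Random(seed)

    def learn(self, sentence):
        """Record which word follows which in one user utterance."""
        words = sentence.split()
        for current, nxt in zip(words, words[1:]):
            self.followers[current].append(nxt)

    def reply(self, start, max_words=12):
        """Walk the chain from `start`, picking each next word at random."""
        words = [start]
        while len(words) < max_words and self.followers.get(words[-1]):
            words.append(self.rng.choice(self.followers[words[-1]]))
        return " ".join(words)


bot = MarkovBabbler(seed=0)
bot.learn("to be or not to be that is the question")
bot.learn("Toscanini is the greatest conductor of all time")
print(bot.reply("to"))
```

Because "is the" occurs in both training sentences, a walk that starts in the Shakespeare line can drift into the other one, which is the kind of conflation seen in the transcript. Every adjacent pair of words in the output was typed by some user, yet the whole may never have been said by anyone.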
MegaHal is more interesting than HeX or Converse because
it engages its conversation partners in a dynamic. The dynamic
is a silly one, to be sure, and not as interesting as the
dynamic of ordinary, intelligent conversation. Yet playing
with MegaHal is immensely more engaging than playing with
HeX or Converse. What we need is something with MegaHal's
spontaneity and flexibility, and the surface plausibility
and adherence to form of HeX and Converse. But this is not
forthcoming without a major change in approach. HeX and
Converse are like all forms of rule-based AI: they are good
at what they do, but stilted by their restrictions; and
because of their restrictions, their behavior is often idiotic
in real-world situations. MegaHal is more like a totally
unstructured neural network (though its inner workings are
statistical, they could just as well be neural-net based;
the results would be similar): it is fluid and free-flowing,
full of creativity, but harder to constrain to the problems
of the real world. Constraining creativity within rules
can only be achieved by getting the rules to emerge out
of an underlying substrate of creativity. MegaHal is too
unstructured for this, and has too small a memory, and
too weak a self-organizing learning algorithm; HeX and
Converse, on the other hand, have no underlying substrate
of creativity for anything to emerge out of.
MegaHal came in second in the Loebner Prize
contest in 1999. This is impressive mostly because, unlike
the other contenders, it was not specifically written as
a set of rigid template rules. It was a statistical learning
system, which adapted its responses based on what was said
to it. Sometimes its conversation was idiotic, sometimes
frighteningly lucid. Here is the beginning of “MegaHal’s
best conversation” as recorded on Jason’s site
megahal.net:
User:
let's begin
MegaHAL:
WHEN WE BEGIN TO RESOLVE SOME OF MY THUMBS, SOMETHING WICKED
THIS WAY COMES
User:
what is the meaning of your last assertion ?
MegaHAL:
HAVE YOU LEARNED SINCE WE LAST SPOKE?
User:
I've learned I'm not experienced enough to modelize your
brain system
MegaHAL:
YOU'VE FINALLY CONFUSED ME.
User:
I've learned nothing
MegaHAL:
HAVE YOU LEARNED ANYTHING ABOUT INSANITY?
User:
are you stupid
MegaHAL:
I AM NOT STUPID! REALLY!
User:
Augustinus was a christian philosopher about the year 400.
MegaHAL:
I DID NOT RUN THAT YEAR
User:
Augustinus lived in northern africa.
MegaHAL:
WHAT DOES HAL KNOW ABOUT AURELIUS AUGUSTINUS?
User:
Let's stop with that. What do you know about physics?
MegaHAL:
EINSTEIN WON THE NOBEL PRIZE FOR PHYSICS IN 1921.
User:
The Nobel Prize for physics in 1900 was won by Roentgen.
MegaHAL:
DO YOU KNOW ABOUT PHYSICS?
User:
Roentgen discovered the X-rays.
MegaHAL:
THE NOBEL PRIZE FOR PHYSICS IN 1900 WAS WON BY ROENTGEN.
User:
Who discovered the X-rays?
MegaHAL:
ROENTGEN DISCOVERED THE X-RAYS.
User:
Do you know about superconductivity?
MegaHAL:
I CAN'T SAY ANYTHING ABOUT SUPERCONDUCTIVITY?
User:
Superconductivity is the absence of electrical resistance
in metals at low temperatures.
Jason Hutchens’ work at Artificial Intelligence Enterprises
has not been publicly disclosed, but one can be certain
it went beyond MegaHal, HeX and SEPO in significant ways.
Statistical learning was clearly involved, and AI Enterprises
got a lot of publicity from Jason’s announcement that
their program, nicknamed HAL, was conversing at roughly
the level of an 18-month-old child. Of course, I viewed
this claim with a lot of skepticism, because in my view
the bulk of the meaning in an 18-month-old child’s
conversation is situational. If a program could react to
its environment linguistically with the sophistication of
an 18-month-old child, I’d be impressed. But disembodied
linguistic behavior at the 18-month-old level doesn’t
mean much. Even so, I admired their guts in directly
attacking the problem of computer conversation and artificial
intelligence, and I hope very much that somehow the funding
appears to keep the project going. As Hutchens says on his
website,
Right now I'd love to continue working on the successor to MegaHAL,
something I started while working at Lionhead Studios, but
I need to prove to myself that doing so will be at least
half as lucrative as working in a normal job, since I do
need to pay the bills. I therefore humbly implore you to
think about making a small donation to the endeavour via
the PayPal link that you'll find at the top and the bottom
of this page. And if you want to contact me directly to
offer to sponsor the project, I'm waiting for your email!
If I receive enough evidence that future versions of MegaHAL
will pay for their development costs, you will soon see
an online version of MegaHAL's successor at this site, along
with standalone versions for a variety of platforms with
all the features you've been asking for.
This tells me a great deal about how much the scientific community
values research pushing directly toward real AI! But, as
I noted above, history shows that what is at one point the
province of mavericks at the fringes, may soon enough become
the focus of vast amounts of research dollars. And the young
researchers of the new generation are always amazed to look
back at history and find out how hard people in their highly
popular field of research found it to be taken seriously,
only 5 or 10 or 20 years before.

One major AI figure who spent a long
time on real AI work, but now seems to have basically abandoned
the race, is Danny Hillis, founder of the company Thinking
Machines, Inc. This firm lived from 1983 to 1994, and focused
on the creation of an adequate hardware platform for building
real artificial intelligence – a massively parallel,
quasi-brain-like machine called the Connection Machine.
Hillis’s AI approach has a few similarities with de
Garis’s brain-building work, and also some major differences.
One similarity is that neither Hillis nor de Garis (so far)
coupled their pioneering hardware work with a systematic
effort to implement a truly intelligent program embodying
all the aspects of the mind. Their magnificent hardware
design visions have not yet been matched by equally
grand and detailed mind design visions. The biggest difference,
on the other hand, is that compared with de Garis’s
machine which is specialized for the simulated evolution
of neural networks, Hillis’s hardware architecture
was extremely flexible, usable for a tremendous variety
of purposes, including biological and fluid dynamics simulations,
not just AI.
Unfortunately, though, while the CBM is still an ongoing
project, at this point the Connection Machine hardware has
been rendered obsolete by developments in conventional computer
hardware and network computing. The old Connection Machines
were bought by Sun Microsystems when Thinking Machines Inc.
went under, and so far as I know they’re not being
used for anything.
Hillis himself is a deep and fascinating individual -- unique
in many ways in the turn-of-the-millennium techno-visionary
pantheon. He waxes philosophical with the best of them,
holding forth eloquently about transhumanism and the end
of the human race and the whole shebang. He’s building
a clock intended to last ten thousand years. And yet he’s
also neck deep in practical work, having resigned a plum
job as a Disney exec a couple years back to start a new
company providing technology and consulting to the entertainment
industry.
Thinking Machines Inc., his only real stab (so far) at a
place in the history of science and technology, lasted 11
years, created the world’s best parallel computing
hardware, yet failed to either create a thinking computer
program or make Hillis fabulously wealthy. Yet, a few years
short of fifty, Hillis seems relatively unruffled by the
whole crazy rollercoaster ride. His visionary prognostications
lack the alarmism of Bill Joy or Jaron Lanier, and also
avoid the starry-eyed enthusiasm of Ray Kurzweil. He comes
across, in person and in his writings, as a mild-mannered,
curious and creative guy. Although he now talks tough about
business like any other seasoned exec, in many ways he’s
still an MIT hacker at heart, delighted with the task of
building the next cool gadget or intricate algorithm, and
looking forward to the gadgets and algorithms of the next
millennia in a remarkably matter-of-fact way.
His early life prepared him well for the tumultuousness
of the technology industry. He was born in Baltimore in 1956; his
father was an Air Force epidemiologist, so the family moved
frequently on the trail of hepatitis outbreaks, and he grew
up with no fixed home. Moving from place to place in Rwanda,
Burundi, Zaire, and Kenya, he avoided formal schooling and
the pressures for social conformity that go along with it.
As he says, “We were typically out in the middle of
the jungle so I was just taught at home.” His mother
did most of the teaching and her interest in mathematics
jibed well with his natural abilities. His father encouraged
him to study biology, a pursuit that gave him an early appreciation
for the complex machines that are living organisms. “My
best biological experiment,” he says, “was
tissue culturing a frog heart and keeping the heart beating
even while it was growing in the test tube. It was amazing
to me that somehow they got together and did this coordinated
activity even though they were just this homogenized mass
of cells.” Much of his career was spent creating complex
computer systems capable of displaying spontaneous coordinated
activity, like the cells in a frog heart.
Hillis’s education is about what you’d expect
– an undergraduate math degree from MIT in 1978, followed
by an MIT master’s in robotics three years later. Along
the way he found time to pursue his avid interest in toys
and to indulge his entrepreneurial streak -- working at the
MIT Logo Laboratory developing computer hardware and software
for children, designing computer-oriented toys and games
for the Milton Bradley Company, and co-founding Terrapin
Inc., a producer of computer software for elementary schools.
For his PhD work, Hillis began the endeavor that has been
his greatest contribution to science and humanity so far
– his work on the Connection Machine, a massively
parallel computer going far beyond any other computer system
of the time in terms of its potential for artificial intelligence
and the simulation of complex physical systems.
As I noted above when discussing neural networks, ordinary
computers are “serial” – they have only
a single processor and hence they can carry out only one
operation at a time. The trick is that they’re fast
-- a single operation can be done very, very quickly. So
a computer can give the illusion of doing many things at
once – say, running a game while downloading e-mail
while showing an animation – when in fact its processor’s
time is swapping back and forth from one task to another
rapid-fire. The brain, on the other hand, has around a hundred
billion neurons, and in principle they’re all working
in parallel, simultaneously. Each one of them acts much
more slowly than a computer processor, but what they lack
in speed they make up for in bulk and in parallelism. Hillis’s
Connection Machine was an elegant compromise, the nature
of which changed over time as computer hardware technology
evolved. The idea was to make a computer whose processors
were fast like those of ordinary computers, but also massively
parallel like in a brain. In this way, one could have the
best of both worlds, and one could build a really intelligent
system with perhaps hundreds of thousands or millions of
computer processors tightly linked together.
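The "illusion of doing many things at once" on a serial machine can be sketched in a few lines of Python. This is only a toy under stated assumptions: the task names echo the examples in the text, the step counts are made up, and real operating-system schedulers are far more elaborate than this round-robin loop.

```python
from collections import deque


def task(name, steps):
    """A pretend job that needs `steps` units of processor time."""
    for i in range(1, steps + 1):
        yield f"{name} step {i}/{steps}"


def run_round_robin(tasks):
    """Give each unfinished task one step in turn on a single 'processor'."""
    queue = deque(tasks)
    log = []
    while queue:
        current = queue.popleft()
        try:
            log.append(next(current))  # run one time slice
            queue.append(current)      # not finished: back of the line
        except StopIteration:
            pass                       # finished: drop it
    return log


log = run_round_robin([task("game", 3), task("email", 2), task("animation", 3)])
for line in log:
    print(line)
```

Only one step ever runs at a time, but the steps of the three jobs are interleaved, so an observer who samples slowly enough sees all three progressing together. A massively parallel machine would instead give each job (or each piece of a job) its own processor.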
Today, engineering workstations – fancy, expensive
machines – may have 2-4 processors, and the machines
powering major Websites may have up to 128 processors. Hillis’s
machines were nothing like this. The biggest Connection
Machine ever built had 64,000 processors, and a 128,000
processor version was fully designed. Far short of the number
of neurons in the brain, but still, pushing up toward the
level of a workable compromise between traditional computing
and brain-style information processing. Alternative parallel
processing machines, like the Cray supercomputers, are specialized
and inflexible, focused on doing the same exact operation
on a large amount of data all at once. Hillis’s system,
on the other hand, had the flexibility of the brain –
each processor could do what it wanted when it wanted. This
opened up the possibility of computational chaos, intelligent coordinated
activity, or, most intriguingly, a combination of the two.
Thinking Machines Inc., founded in 1983 while Hillis was
in the middle of his PhD work, was a remarkable organization.
At its peak the research staff, about half the corporation,
numbered in the hundreds. Despite the name of the company,
there was not a coordinated company-wide R&D program
aimed at making the Connection Machine think. Rather, there
was a variety of research groups aimed at doing all sorts
of different things with the Connection Machine, ranging
from straightforward artificial intelligence research to
simulation of fluid flow, computational immunology, experimental
mathematics – you name it. Astrophysics, aircraft
design, financial analysis, genetics, computer graphics,
medical imaging, image understanding, neurobiology, material
science, cryptography, subatomic physics…. Work on
data mining – the automatic analysis of large and
complex data sets – was particularly successful and
later became a central part of the company’s business
model.
The motivation underlying this diverse approach was simplistic
but ambitious. “Clearly,” Hillis says, “the
organizing principle of the brain is parallelism. It's using
massive parallelism. The information is in the connection
between a lot of very simple parallel units working together.
So if we built a computer that was more along that system
of organization, it would likely be able to do the same
kinds of things the brain does.”
Of course, this approach to building AI presupposes that
parallelism itself is something close to the chief ingredient
of intelligence – that there is no further “secret
sauce” required to make a mind come out of a distributed
network of processors. Hillis believes that "intelligence
is just a whole lot of little things, thousands of them.
And what will happen is we'll learn about each one at a
time, and as we do it, machines will be more and more like
people. It will be a gradual process, and that's been happening."
This is not so far off from Marvin Minsky’s Society
of Mind theory, which holds that the mind is a collection
of agents, each one taking care of a particular aspect of
intelligence, and communicating with one another, exchanging
information as required. Some AI theorists hold other views
of course. Some maintain that it’s not the underlying
computation mode that’s crucial, but rather that there
are particular algorithms (of reasoning, memory, perception,
etc.) that are really the key. Others argue that the right
combination of “little things” is needed to
give rise to the overall emergent patterns of coordinated
activity that constitute real intelligence. But Hillis’s
philosophy is a plausible one, and he had built a hardware
platform and an organization well suited to validating or
refuting his theory through ongoing engineering and research
work. Most AI research is far less ambitious, consisting
of small-scale, detailed work on one or another particular
aspect of intelligence. In the history of AI, Hillis stands
as one of a very small number of people who made a serious
attempt to actually create a thinking machine.
And then the supercomputer industry died. Networks, it became
clear, were the wave of the future. Networking large numbers
of weak machines together, one had distributed computing,
different from parallel computing in design, but somewhat
similar in result. The last Connection Machine designed,
the CM-5, was something like a computer network internally
– it consisted of standard Sun Microsystems processors
hard-wired together rather than traditionally networked.
This was a big change from the earlier Connection Machines,
which had been unique on the processor level as well as
on the level of overall system architecture. In the end,
Thinking Machines Inc. revised its business model, abandoning
hardware altogether, focusing on selling their data mining
software for use on distributed computing systems composed
of ordinary computers.
In 1994, the firm dispersed. The hardware side of Thinking
Machines Inc. ended up at Sun Microsystems. Much of the
data mining group ended up on Wall Street. Several Thinking
Machines executives started TopicalNet, a company building
text categorization software. And Hillis, after a stint
working with the MIT Media Lab as an AI guru, abandoned
the push for AI and went back to one of his earlier loves,
toys and games. His new title: VP of R&D in the Imagineering
Department of Walt Disney Corporation.
He entered this new phase of his career with wide-eyed optimism.
"I've wanted to work at Disney ever since I was a child,"
he said. "I remember listening to Walt Disney on television
describing the 'Imagineers' who designed Disneyland. I decided
then that someday I would be an Imagineer. Later, I became
interested in a different kind of magic--the magic of computers.
Now I finally have the perfect job-- bringing computer magic
into Disney."
Post Thinking Machines, his scientific work was becoming
more practical in orientation – he was designing new
technologies to underlie games and theme park rides rather
than working directly toward digital intelligence. But at
the same time, his philosophical side was hardly dormant.
The far future came to occupy his thoughts more and more.
In 1993, with Thinking Machines on its last legs, he wrote
the following manifesto:
"When
I was a child, people used to talk about what would happen
by the year 2000. Now, thirty years later, they still talk
about what will happen by the year 2000. The future has
been shrinking by one year per year for my entire life.
I think it is time for us to start a long-term project that
gets people thinking past the mental barrier of the Millennium.
I would like to propose a large (think Stonehenge) mechanical
clock, powered by seasonal temperature changes. It ticks
once a year, bongs once a century, and the cuckoo comes
out every millennium."
The Clock of the Long Now. A clock built to last 10,000 years
or more, powered by seasonal climatic fluctuations. The
clock is not yet built, but a piece of land in rural Nevada
has been purchased, the design is completed in detail, and
construction of the components is underway. The Long Now
Foundation (http://www.longnow.org/) is accepting donations
online.
Hillis holds some 40 U.S. patents--for disk arrays, forgery
prevention methods, a color camera and various software
and mechanical devices. Among all his inventions, the clock
is definitely one of the coolest – one that would
make any MIT hacker proud. And it resonates with something
deep and powerful in the human soul – the same aspect
of human essence that finds the Cheops Pyramid more impressive
than the Nintendo Gameboy, in spite of the incredible complexity
of the hardware and software engineering involved in the
latter. The Clock of the Long Now appeals to our embodiedness,
to our embeddedness in space and time, whereas Hillis’s
work on AI, like most AI work, ignored embodiedness and
embeddedness and focused mainly on cognition, abstract thinking,
on the most rarefied parts of the mind.
Abstractly, one could build a mind operating a thousand
times faster than a human mind, or a thousand times slower.
10,000 years would mean something different to each of these
differently time-scaled minds. But the mathematics and theory
of AI would apply equally well to all of them, as would
many of the same hardware engineering principles. The Clock
of the Long Now is focused on palpable human reality, not
the abstract mathematics of mind or the subtleties of hardware
engineering. In fact it represents a step back from fancy
modern electrical engineering. Modern technology provides
few systems of 10,000-year durability, and so the design
of the Clock of the Long Now required a number of purely
engineering innovations.
One could easily portray Hillis’s interest in clock-building
as a symptom of a mid-life crisis. After all, the older
you get, the more interesting time seems to you. Perhaps,
having failed to create real AI, the guy was reviewing his
own life, and feeling his own death moving closer. Perhaps
he found it comforting to remind himself how little it matters,
from a 10,000 year viewpoint, whether any one human or any
one company succeeds at doing any one thing. No doubt, there
is an element of truth to this view. But this doesn’t
seem to be a terribly large aspect of his motivation for
pursuing the clock project – not as large, for example,
as his sheer love of building cool stuff. And of course,
both a thinking machine and a 10,000 year clock are Big
Things – projects that appeal to the entrepreneurial,
adventurous, overambitious soul.
The clock got all the media attention, but for Hillis personally
it was never a full-time occupation. His new job at Disney
was the bulk of his life. It was exciting – there
was lots of money to build lots of great stuff, and he was
involved in a lot of different projects. But if one reviews
the time Hillis spent at Disney, one has a hard time finding
any Disney project that really showcases his flair for large-scale
innovative engineering. The details of his time at Disney
aren’t open for public discussion, but it’s
not hard to reconstruct the story. Disney is a huge organization,
and carrying exciting projects from concept to real-world
implementation, without layers of bureaucracy getting in
the way, probably wasn’t the easiest thing in the
world. Anyway, last year Hillis left Disney, taking with
him Bran Ferren, the head of the Imagineering group.
Ferren shares Hillis’s visionary streak, and also
his interest in escaping from Internet Time into historical
time. He, Hillis and Nathan Myhrvold (former CTO of Microsoft)
have enjoyed hunting together for dinosaur bones. The conceptual
clash between dinosaur bones and cutting-edge computer technology
is just the kind of thing that leads brilliant minds in
new directions. Ferren and Hillis are exploring these new
directions via their new start-up Applied Minds, a company
aimed at providing technology and consulting services to
entertainment firms, presumably including Disney.
Having spent most of his career at the intersection between
business and science, Hillis is acutely aware of the difficulties
of balancing the different goals of these very different
enterprises. There was a transition in the life of Thinking
Machines, he observes, when it became less of an R&D
shop and more of a real business – and at that point
it became more and more difficult to move toward the goal
of building real AI. When the firm became a real business,
efficiency became important -- but creativity is exploratory,
evolutionary, and fundamentally inefficient. Basically,
in a company narrowly focused on making money, every minute
of everyone’s day must be judged by whether it contributes
to the bottom line. But the nature of the creative process
is such that it just can’t be judged until it’s
finished – there’s no way to tell which kooky
train of thought or experimentation is going to lead to
useful results.
What appealed to him about Disney, when he started out there,
was the fact that it was a real business that was making
real efforts to keep creativity alive within its walls.
This was the express purpose of the Imagineering group.
The defection of Hillis and Ferren, however, is an indication
that Disney’s efforts in this regard have not been
entirely successful. Applied Minds is a fascinating venture,
which one suspects will do a better job of combining creativity
with business focus than was possible inside Disney. But
still, it’s worth noting how Hillis’s efforts
have bifurcated: the Clock of the Long Now and Applied Minds
each embody different aspects of his mind and soul, which
were fused together in his earlier work with Thinking Machines.
Of course, if Applied Minds becomes an extremely profitable
business, then it will be able to fund more and more interesting
research over time. It will be interesting to see what happens
in this regard. By remaining at Disney through the whole
Internet bubble, Hillis missed out on his chance to cash
in on the tech boom while it lasted. And given Disney stock’s
poor performance in recent years, Disney stock options presumably
weren’t a wonderful thing to own. The end-game of
Thinking Machines Inc. did not result in making Hillis tremendously
rich either. So, Hillis has a nice house, the back yard
facing a beautiful lake, and is having an even nicer one
built. But he isn’t currently in a financial position
to build amazing new things on his own dollar. If Applied
Minds puts him in this position, who knows what will emerge?
Perhaps something as fantastic as Thinking Machines Inc.
– which remains the high point of Hillis’s story.
It was a fascinating enterprise in many ways, but largely
because of the way it fused science and business in the
service of a single, immensely ambitious initiative. The
Connection Machine was too big a project to be initiated
outside of industry, yet too innovative to be done without
a large team of visionary scientists. The things Hillis
is involved in now are less paradoxical and complex, and
ultimately for that reason perhaps a little less intriguing.
The Clock of the Long Now is a great work of conceptual
art, with deep philosophical overtones and involving some
neat engineering problems. Applied Minds is a real business
through and through, using new science as required to provide
customer solutions. These are both intriguing and sensible
things, and yet they lack the Don Quixote–ish majesty
of Thinking Machines Inc. and the Connection Machine, which
to this day remain Hillis’s greatest creations.
A story like this reminds us that business, science, engineering
and art are not fundamental divisions of the universe, any
more than earth, air, fire and water are. Great innovations
and enterprises stand outside these divisions, because they
are crystallized around concepts that go beyond the temporary
structures of any one human culture and society. The human
race’s urge to create intelligence beyond itself –
whether through building AI machines or through, as Hillis
has recently discussed, putting “chips in the brain”
– is a fundamental force that cuts across categories
of human endeavor. Our need to understand our relationship
with time is a similar fundamental force. And some human
beings – like Danny Hillis – and some human
organizations – like Thinking Machines and, to a lesser
extent, the Long Now Foundation -- reflect these fundamental
forces in particularly elegant and powerful ways. To paraphrase
what Hillis said about the frog’s heart he experimented
with as a youth, it is remarkable that we can “do
this coordinated activity… even though we are just
a mass of cells.”

The big problem plaguing AI, I have said, is “brittleness”
– domain specificity, lack of flexibility and autonomy.
The Connection Machine tries to get around the brittleness
problem through massive parallelism, analogous to that
of the brain. Hutchens’ chat bots seek to circumvent
it through relying on free-ranging statistical learning
rather than rules.
On the other hand, some people have tried to get around
the brittleness problem by providing the computer with so
much information that it could answer any possible contingency.
The most ambitious project in this direction was Doug Lenat's
Cyc project, which has been going since 1984. Cyc is focused
on trying to build a program with common sense. The Cyc
team is mainly focused on encoding millions of items of
data, so that the program can know everything an eight-year-old
kid knows. "Cyc" was originally short for "Encyclopedia,"
but they found that the knowledge they needed was quite
different from that found in encyclopedias. It was everyday
knowledge you could get by asking a small child, perhaps
more like that in a dictionary. Each common sense concept
in Cyc gets an English language definition, and a mathematical
definition, which tries to paraphrase the English definition.
For example, the Cyc English-definition of "skin"
goes like this:
"A
(piece of) skin serves as outer protective and tactile sensory
covering for (part of) an animal's body. This is the collection
of all pieces of skin. Some examples include #$TheGoldenFleece
(representing an entire skin of an animal) and (#$BodyPartFn
#$YulBrynner #$Scalp) (representing a small portion of his
skin)."
The Cyc English-definition of happiness is:
The enjoyment of pleasurable satisfaction that goes with well-being,
security, effective accomplishments, or satisfied wishes.
As with all #$FeelingAttributeTypes, this is a #$Collection
-- the set of all possible amounts of happiness one can
feel. One instance of #$Happiness is `extremely happy';
another is `just a little bit happy'.
It’s clear why one might think definitions of this sort could
contribute to solving the common sense problem that we see
when playing with chat bots like HeX or ELIZA. These chat
bots have no common sense; they have no idea what words
mean. Cyc is based on getting humans to tell computers what
words mean.
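To give a feel for what "telling the computer what words mean" involves, here is a minimal Python sketch of a hand-entered knowledge base. It is not CycL and not Cyc's real interface. The facts that the Golden Fleece is a skin and that "extremely happy" is an instance of happiness come from the excerpts quoted above; the more general collections BodyCovering and FeelingAttribute, and the function name collections_of, are assumptions invented for this illustration.

```python
# Hypothetical miniature knowledge base; the names echo the Cyc excerpts,
# but the data structures are NOT CycL or any real Cyc API.
ISA = {  # instance -> collections it directly belongs to
    "TheGoldenFleece": {"Skin"},
    "ExtremelyHappy": {"Happiness"},
    "JustALittleBitHappy": {"Happiness"},
}
GENLS = {  # collection -> more general collections (assumed for illustration)
    "Skin": {"BodyCovering"},
    "Happiness": {"FeelingAttribute"},
}


def collections_of(instance):
    """Return every collection an instance belongs to, following GENLS upward."""
    found = set()
    frontier = list(ISA.get(instance, ()))
    while frontier:
        coll = frontier.pop()
        if coll not in found:
            found.add(coll)
            frontier.extend(GENLS.get(coll, ()))
    return found


print(sorted(collections_of("ExtremelyHappy")))
print(sorted(collections_of("TheGoldenFleece")))
```

Everything the program can conclude was typed in by a person beforehand; ask it about anything not in the tables and it returns nothing at all.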
It’s interesting stuff, but I don’t really believe
that Cyc’s logical definitions have much overlap with
the kind of information contained in the mind of an
eight-year-old child. We humans aren’t even explicitly aware
of much of the information we use to make sense of the world.
A human’s notion of happiness or skin is much bigger,
messier and more disorderly than these definitions. These kinds
of general abstract definitions may be inferred in the human
mind from a whole lot of smaller-scale, practical patterns
recognized involving skin and happiness, but they’re
not the be-all and end-all. In dealing with most practical
situations involving skin and happiness, we don’t
refer to this kind of abstraction at all, but we use the
more specialized patterns that the general conclusions were
derived from.
Basically, Cyc tried to divorce information from learning,
but it can't be done. A mind can only make intelligent use
of information that it has figured out for itself. Despite
sixteen years of programming, Cyc never succeeded in emulating
an eight-year-old child. Nor has anyone yet found much use
for a CD-ROM full of formal, logical definitions of common
sense information. The company Cycorp is doing OK, based
largely on selling Cyc-powered document management software
to the military. When I talked recently to the firm’s
CEO, he was quite careful not to mention anything about
artificial general intelligence or any long-term scientific
mission underlying the firm’s work. Instead, he characterized
the firm as being concerned with producing a useful database
intended for embedding within various specialized software
products. This is a worthwhile mission to be sure, but very
different from the grand goal of real AI.
In fairness to Doug Lenat, I must say that he is now working
from a computational-psychology perspective that has something
in common with my approach. He has a reasonably solid theory
of general heuristics -- problem-solving rules that are
abstract enough to apply to any context whatsoever. His
pre-Cyc programs AM and EURISKO applied his general heuristics
to mathematics and science respectively. Both of these programs
were moderately successful, exemplars in their field, but
far from true intelligence. Their design lacks a holistic
view of the mind. In the big picture of real AI, getting
the mind’s heuristic problem-solving rules right means
virtually nothing, because problem-solving rules gain their
psychological meaning from their interaction with other
parts of the mind. If the other parts aren't even there,
the problem solving is bound to be sterile.
EURISKO won a naval fleet design contest two years in a
row, until the rules were changed to prohibit computer programs
from entering. And it also received a patent for designing
a three-dimensional semiconductor junction. But when looked
at carefully, even EURISKO's triumphs appear simplistic
and mechanical. Consider EURISKO's most impressive achievement,
the 3-D semiconductor junction. The novelty here is that
the two logic functions "Not both A and B" and
"A or B" are both done by the same junction, the
same device. One could build a 3-D computer by appropriately
arranging a bunch of these junctions in a cube.
But how did EURISKO make this invention? The crucial step
was to apply the following general-purpose heuristic: "When
you have a structure which depends on two different things,
X and Y, try making X and Y the same thing." The discovery,
albeit an interesting one, came right out of the heuristic.
This is a far cry from the systematic intuition of a talented
human inventor, which synthesizes dozens of different heuristics
in a complex, situation-appropriate way.
By way of contrast, think about the Croatian inventor Nikola
Tesla, probably the greatest inventor in recent history,
who developed a collection of highly idiosyncratic thought
processes for analyzing electricity. These led him to a
steady stream of brilliant inventions, from alternating
current to radio to robotic control. But not one of his
inventions can be traced to a single "rule" or
"heuristic." Each stemmed from far more subtle
intuitive processes, such as the visualization of magnetic
field lines, and the physical metaphor of electricity as
a fluid. And each involved the simultaneous conception of
many interdependent components.
EURISKO may have good general-purpose heuristics, but what
it lacks is the ability to create its own specific-context
heuristics based on everyday life experience. And this is
precisely because it has no everyday life experience: no
experience of human life, and no autonomously-discovered,
body-centered digital life either. It has no experience
with fluids, so it will never decide that electricity is
like a fluid. It has never played with blocks or repaired
a bicycle or prepared an elaborate meal, nor has it experienced
anything analogous in its digital realm ... so it has no
experience with building complex structures out of multiple
interlocking parts, and it will never understand what is
involved in this. EURISKO pushes the envelope of rule-based
AI; it is just about as flexible as a rule-based program
can ever get. But it is not flexible enough. In order to
get programs capable of context-dependent learning, it seems
to be necessary to write programs which self-organize --
if not exactly as the brain does, then at least as drastically
as the brain does.

One of the most impressive achievements of rule-based AI happened
on May 11, 1997. This was an event that led many people
to think that computers were already on the verge of rivaling
human intelligence. For the first time ever, a computer
had defeated the world chess champion in a standard six-game
match. Deep Blue, a computer chess system developed at IBM
out of work begun at Carnegie Mellon University, split the
first two games with Garry Kasparov. The next three were draws,
and the final game went to Deep Blue. Kasparov was a sore loser.
Deep Blue took it all as a day's work.
Admittedly, this was only one match, but the tournament
was not a fluke. Previous versions of Deep Blue were already
able to consistently beat all but the greatest chess grandmasters.
And Deep Blue's play can be improved by hardware upgrades,
whereas a brain upgrade for Kasparov is not in the works.
Although Deep Blue is not a very intelligent entity according
to my definition, there’s much to be learned from
a study of its accomplishments and the mechanisms underlying
them. The day after this happened, there was a lot of talk
about Deep Blue on the Simon’s Rock alumni e-mail
list, and I wrote a long e-mail giving my views. What I’ll
say about it here is basically what I said in that e-mail.
The question people were debating on the e-mail list was:
Does Deep Blue's accomplishment mean that true artificial
intelligence has been achieved?
If we define intelligence as the ability to do one intellectual
task very, very well, then Deep Blue qualifies brilliantly.
But if we think of intelligence as being able to make appropriate
decisions in a wide variety of complex environments, it
fails miserably. It can only play chess. Computers that
can do one thing well are, of course, tremendously useful
things. But they are not truly intelligent as I prefer to
define the term. In my definition, a truly intelligent computer
will have to do more than follow instructions. It will have
to create its own answers to unanticipated problems. For
this kind of intelligence it will need a structure quite
different than Deep Blue's.
Although Deep Blue follows the same rules as human chess
players, it doesn’t think like humans. Human chess
players use geometric intuition and a sense of the flow
of a game. Deep Blue calculates every single possibility,
and then calculates all the possible consequences of each.
Computer programmers call this recursive logic. It does
the same thing over and over and over again, constantly
referring back to the results it just obtained, and figuring
out how well it is doing. Human beings might use recursive
logic to play a very simple game, such as tic-tac-toe, which
has very few choices. But even in tic-tac-toe, our opponents
would probably object to our taking the time to calculate
out the potential consequences of every possible move. Our
minds are much too slow to play chess that way, nor would
the game be any fun if we could. Computers, by contrast,
are much, much quicker at this kind of task and do not get
bored, so recursive logic can work well for them.
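The recursive look-ahead described above can be sketched for
tic-tac-toe. This is only a minimal illustration in Python, not
Deep Blue's actual code: the board encoding (a 9-character string),
the function names and the +1/0/-1 scoring are all assumptions made
for the example. A chess program cannot search to the end of the
game as this one does, so it would also need a depth cutoff and an
evaluation function.

```python
from functools import lru_cache

# The eight ways to get three in a row on a board indexed 0..8.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None


@lru_cache(maxsize=None)
def minimax(board, player):
    """Value of `board` for X (+1 win, 0 draw, -1 loss), `player` to move."""
    w = winner(board)
    if w is not None:
        return 1 if w == 'X' else -1
    if ' ' not in board:
        return 0  # board full, nobody won
    other = 'O' if player == 'X' else 'X'
    # The recursion: try every legal move, then ask the same question
    # about the resulting position from the opponent's point of view.
    values = [minimax(board[:i] + player + board[i + 1:], other)
              for i, cell in enumerate(board) if cell == ' ']
    return max(values) if player == 'X' else min(values)


def best_move(board, player):
    """Pick the empty square whose resulting position is best for `player`."""
    other = 'O' if player == 'X' else 'X'
    moves = [i for i, cell in enumerate(board) if cell == ' ']

    def score(i):
        return minimax(board[:i] + player + board[i + 1:], other)

    return max(moves, key=score) if player == 'X' else min(moves, key=score)
```

Calling minimax(' ' * 9, 'X') searches the whole game from the empty
board; best_move("XX OO    ", 'X') asks which square X should take
next. Nothing in the program resembles geometric intuition: it is
the same question asked over and over, one level deeper each time.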
Of course, every chess player extrapolates, thinking: "What
is the other player likely to do next? And if he does that,
what am I going to do? And if I do that what is he going
to do?" But in humans, this kind of reasoning is augmented
by all sorts of other processes. For Deep Blue, this kind
of extrapolation is the whole story -- and it does it very,
very well. Computers can extrapolate faster and further
into the future than any human. The 1997 version of Deep
Blue could evaluate about two hundred million different
board positions every second. This figure can easily be
increased for the cost of additional circuitry, but it doesn't
make Deep Blue truly intelligent in my mind.
One way to understand the difference between Deep Blue and
human players is to think about strategy versus tactics. There
is a certain kind of creative long-range strategy that human
chess grandmasters have, but Deep Blue lacks. Deep Blue
makes up for this lack by elevating tactics to such a high
level that it assumes the role of strategy. Deep Blue is
not entirely strategyless: it carries out its superhuman
tactical evaluation within the context of a collection of
pre-programmed strategies, and it is capable of switching
between one strategy and another, in response to events.
But it does not think strategically; it only thinks tactically.
It doesn't make long-range plans, involving an understanding
of the overall structure of the board as a dynamical system,
changing over the course of the whole game. If it could
do this, it would doubtless play even better. But even without
strategic creativity, it plays well enough to beat the best
humans.
The defeat of Kasparov by Deep Blue is symbolic because
chess is the mentally hardest game commonly played in the
Western world. Computers became better than humans at checkers
and many other games quite some time ago. However, there
is at least one popular game that still stumps the best
computers -- the Oriental game Go. At the present time,
in spite of a substantial research effort, no existing computer
program can play better than the advanced beginner level
at Go.
The rules of Go are very simple, compared to chess. Play
is on a 19x19 grid, and stones (pieces) are placed on the
intersections of the grid, called points. The first player
plays black stones, the opponent white ones; and stones
are added to the board one-by-one, players alternating.
Stones are not moved once they are placed, but stones
and groups of stones may be captured. A player calls "Atari"
when a capture can occur on their next move, to warn the
opponent. The game ends when it is no longer possible to
make a reasonable move. The winner is determined by the
amount of territory surrounded -- less the number of stones
captured.
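The capture rule can be made concrete with a short sketch. It relies
on the standard Go rule, not spelled out above, that a connected group
of stones is captured when none of its stones has an empty adjacent
point (a "liberty"); a group with exactly one liberty left is in atari.
The board encoding (a list of strings using 'B', 'W' and '.') and the
function names are assumptions made for the example, and the functions
assume they are called on a point that holds a stone.

```python
def group_and_liberties(board, row, col):
    """Flood-fill the group containing (row, col).

    Returns (stones, liberties): the set of points in the connected
    same-color group, and the set of empty points adjacent to it.
    """
    color = board[row][col]
    size = len(board)
    stones, liberties, stack = set(), set(), [(row, col)]
    while stack:
        r, c = stack.pop()
        if (r, c) in stones:
            continue
        stones.add((r, c))
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < size and 0 <= nc < size:
                if board[nr][nc] == '.':
                    liberties.add((nr, nc))      # empty neighbor
                elif board[nr][nc] == color:
                    stack.append((nr, nc))       # same group, keep filling
    return stones, liberties


def in_atari(board, row, col):
    """True if the group at (row, col) has exactly one liberty left."""
    return len(group_and_liberties(board, row, col)[1]) == 1


# A small 5x5 example: the two white stones have one liberty, at (2, 2).
example = [
    ".B...",
    "BWB..",
    "BW...",
    ".B...",
    ".....",
]
```

The rule is local and mechanical, which is part of why the rules of Go
are called simple; the difficulty of the game lies elsewhere.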
The trouble with Go from the computational perspective is
that, from any given board position in Go, there are hundreds
of plausible next moves, rather than dozens as in chess.
Extrapolation in Go will not get you as far as it does in
chess. Thus, it would seem that, if computers are to conquer
Go, they're going to have to either do it with a more general
intelligence, or use some cleverer special-case technique
than the one employed for chess. Go is too visual, too two-dimensional,
to succumb to purely combinatorial, non-visual techniques.
A world champion Go program would have to be intelligent
at general two-dimensional vision processing as well.
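A little arithmetic shows why extrapolation runs out of steam in Go.
In the sketch below, the branching factors 35 for chess and 250 for Go
are commonly quoted rough averages used here as assumptions, standing
in for "dozens" and "hundreds"; the rate of two hundred million
positions per second is the Deep Blue figure quoted earlier. The
function names are made up for the example, and the count ignores
pruning and other refinements.

```python
POSITIONS_PER_SECOND = 200_000_000  # 1997 Deep Blue figure quoted in the text


def positions(branching, depth):
    """Number of positions at the bottom of a full tree of this depth."""
    return branching ** depth


def seconds_to_search(branching, depth, rate=POSITIONS_PER_SECOND):
    """Time to look at every bottom-level position at the given rate."""
    return positions(branching, depth) / rate


# Compare the two games at a few look-ahead depths (counted in single moves).
for name, b in [("chess", 35), ("Go", 250)]:
    for depth in (4, 6, 8):
        print(f"{name:5s} depth {depth}: {seconds_to_search(b, depth):.3g} s")
```

Because the cost grows as the branching factor raised to the depth, a
sevenfold difference in plausible moves becomes a difference of many
orders of magnitude after only a few moves of look-ahead.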
In Go, high level players routinely analyze positions that
aren't confined tactically to, say, a 9x9 grid. Plus, almost
any tactical fight has strategic implications across the
board that could be worth more than the fight itself --
so a great pattern matcher wins the points, but loses the
war. One style common in evenly-matched games is to go around
the board "losing" fights, but in such a way that
one's own stones work together, and become more powerful.
The computer Go programs in existence today rely heavily
on pattern matching: taking a given, small chunk of the
board, and matching it up to a dictionary of known board
situations. The best ones are as good at reading small,
enclosed life/death problems as a mediocre tournament Go
player. But when the problems are not so rigidly enclosed
within a small region of the board, the programs are clueless,
although intuitive human players can still see the essential
principles. The best way to wipe out such programs is to
embroil them in a huge whole board fight, one that is too
big for the algorithm to match properly.
Deep Blue's recursive approach of elevating tactics to the
level of strategy doesn't work so well in Go. Exhaustive
search over spaces of two-dimensional patterns is much,
much harder than the kind of decision tree search required
for dealing with chess, and will be out of reach of computers
for a good long time. One suspects that something less than
true intelligence will suffice for Go, as it has for chess
-- but not something as much less than true intelligence
as Deep Blue is. Perhaps a massive increase in raw computational
power will be enough, but that is not a truly intelligent
approach.
Deep Blue has basically the same problems as GPS and EURISKO
and BACON. It’s above all an achievement of computer
hardware, not intelligent software. It relies on a special
computer chip, custom-designed for searching many moves
ahead in chess games. This special chip could be modified
without much hassle to apply to other similar games -- checkers,
maybe Othello. The ideas of the massively parallel design
and RS/6000 platform are going to be generalized by IBM
into drug design tools, weather forecasting and other applications.
But this architecture couldn’t be modified to apply
to Go -- let alone to apply to any real-world situation.
Deep Blue's chip is less like the human brain than like
a human muscle: a mechanism designed specifically for a
single purpose, and carrying out this purpose with admirable
but inflexible precision. Its rules are astoundingly simple
and mechanical: evaluate the quality of a move in terms
of the quality of the board positions to which this move
is, based on my experience, likely to lead. Judgments based
on experience are made, not by complex intuitive analogy,
but by simple pattern-matching. Everything is cut-and-dried,
and done two hundred million times a second. This is outstanding
engineering, not intelligence -- and yet its results
are vastly more impressive than those obtained when rule-based
AI does try to do intelligence, as in programs like GPS,
BACON and EURISKO.
Some chess players were frustrated with Kasparov's defeat
at the "hands" of Deep Blue, on the grounds that
Kasparov was extremely sloppy in the final game. He played
a defense (Caro-Kann) that he admitted afterwards wasn't
his forte, and, in their view, gave away the game with a
move that conventional chess wisdom deems unwise. It is
argued, on these grounds, that he lost because he let his
emotions carry him away. In past games, his unusual and
unpredictable moves had thrown the computer off, putting
it in situations where its memory of previous board positions
did not apply. This time, he was perhaps feeling a little
desperate, and so he moved a little too far into the domain
of the unpredictably self-defeating.
This raises another interesting question -- the role of
human emotion in human intelligence. Is emotion bad for
human competence, or good? In this case, emotion may have
worked against Kasparov, though how much so is difficult
to say. Who can blame him for getting a little stressed?
After all, in the eyes of the media, he was playing not
only for himself or his country, but for the whole human
race! And yet it's hard to pooh-pooh human emotion, even
from a purely performance-oriented perspective, because
of the extent to which emotion and cognition are bound up
together in the human mind. One can argue convincingly that,
in intelligent systems operating on the basis of generalized
intuition rather than specialized search and pattern-matching,
intuition is necessarily coupled with emotion.
A truly intelligent computer must have some kind of emotion,
but this doesn’t mean it will feel things the way
humans do. George Mandler, a cognitive psychologist whose
work I’ve studied a lot, has proposed that we think
of emotion as composed of two different aspects, called
hot and cold. The hot aspect is the consciousness attached
to emotion, which is shaped by the link between the human
mind and the human body. The cold aspect of emotion is the
abstract, mathematical structure of emotion, which seems,
on careful psychological analysis, to be closely related
to the "failure of expectations to be fulfilled."
Emotions happen, in this view, when awareness is attached
to some unfulfilled expectation.
At first glance this analysis of emotion may seem to apply
only to negative emotions. But positive emotions can be
understood in the same way. Wonder is the unfulfillment
of the expectation of familiarity. Happiness is, in large
part, a feeling of increasing unity, a feeling of more inner
unity than expected -- or in other words, an unfulfillment
of the expectation of inner disparity and disharmony. Happiness
is also a feeling of decreasing sadness; whereas sadness
is a feeling of decreasing happiness -- each feeling arises
largely as a consequence of unfulfillment of the expectation
of its opposite. Love is happiness extending outside the
body -- it is the feeling of increasing unity between self
and other, the unfulfillment of the expectation of separateness.
The point is not to "reduce" emotions to formulas,
but rather to identify the structures underlying the feelings
of various emotions. Although computers cannot feel emotions
as humans do, since they do not have our chemistry, they
may be able to learn to behave as if they did.
Obviously, computer programs experience unfulfilled expectations
-- they have, in the Mandler picture, the cold aspect of
emotion. But without something equivalent to the "hot"
aspect of emotion, they may not be able to act on their
unfulfilled expectations as humans do. When its expectations
are frustrated, Deep Blue does not respond in the way
an intelligent system often does: by leaping to a more general
level of abstraction, and considering the assumptions that
led to the expectations that were in fact frustrated. It
does not have emotion-driven learning in this sense. Although
it does, of course, recalibrate its settings based on experience,
this recalibration is of a simpler kind, not involving an
exploratory search through its own associative memory of
past events. There is, however, no reason why a computer
might not be programmed to respond to frustration in the
same way as human chess players do, even though it would
not feel the frustration in the same way.
Although emotion can be harmful to human cognition at times
-- as, perhaps, in Kasparov's final game against Deep Blue
in 1997 -- it is, all in all, inseparable from intuitive
human thought. Emotions trigger our thoughts and even structure
them. Emotion, in itself, may not be valuable -- but it
makes no sense, in a human or other intuition-driven system,
to separate emotion from cognition. Deep Blue demonstrates
that emotion-driven intuition is not necessary to excel
at the game of chess. But as Max Miller, an e-mail acquaintance
of mine on the Simon’s Rock list, put it, "All
this proves is what I've been saying for years: that chess
is a stupid game."

When neural nets were being dissed in the early 70’s, not
everyone was optimistic about the potentials of rule-based
AI. In 1972, the era in which ELIZA was receiving a lot
of attention, a philosopher named Hubert Dreyfus wrote
a book called What Computers Can't Do, which was a vicious
attack on AI. Dreyfus argued that artificial intelligence
researchers were fundamentally misguided, that they could
never achieve their objectives with the methods they were
using. Dreyfus preached the importance of body-centered
learning, and the close connection between logic, emotion
and intuition. Without a body, Dreyfus argued, without feelings,
there can be no real generalization of special-case ideas.
Based on these philosophical considerations, he predicted
that AI would be a failure.
Needless to say, these conclusions went over like a lead
balloon among AI researchers. They laughed in his face,
or worse. But the vehemence of their ridicule betrayed a
fear that he might have been right. Even today a review
posted on the amazon.com Web site observes that "many
AI-workers seem to be actually afraid of this book."
The same reviewer notes, however, that it is one of very
few computer books that is still of interest 25 years after
it was written.
In 1992, Dreyfus re-released the book with the title What
Computers Still Can't Do. The Introduction brims over with
he-who-laughs-last-laughs-best insolence. But his exultant
crowing is not quite persuasive. He was right about the
limitations of the AI programs of the 1960s and 1970s. But
the observers who thought it was just a matter of time and
resources have also been proven correct in many cases. Dreyfus,
for example, ridiculed a prediction that computers would
soon be able to take dictation, just as a human secretary
can. Although this prediction didn’t come true as
quickly as Rosenblatt had thought, some fairly good programs
are available today for this purpose, relying in large part
on a neural net architecture to learn each user's speech
patterns.
Dreyfus's critique of AI, in the first edition, was too
strong. He appeared to believe that detailed simulation
of the human body was the only possible path to AI, and
he argued that this would be impossible without simulating
the biology of the brain and the rest of the body. Actually,
the human brain is only one intelligent system, and a great
deal can be accomplished without replicating the details
of its biology. But Dreyfus's arguments posed a serious
challenge to AI theorists: how to design a machine that
can simulate body-based, emotion-based conceptual generalization?
I believe that Dreyfus was essentially correct that, if
this is impossible, AI just ain't gonna work. A physical
body just like ours is not required: an AI entity could
have a virtual body, enabling it to interact in a rich and
autonomous way with a virtual world. And emotions need not
be controlled by biological neurotransmitters, they can
come out of complex digital dynamics. But the point is,
unless one has a computing system that is large, complex
and autonomous, with integrated sensory, memory and action
systems interacting with a rich environment and forming
a self system, it will never develop the ability to generalize
from one domain to another. The ability to generalize is
learned through general experience, and general experience
is gained by exploring a world.
In designing Webmind and then Novamente, I definitely took
Dreyfus's critique to heart. Of course, I didn’t try
to replicate the human body as he thought was necessary.
Instead, I bypassed his critique by designing a huge, self-organizing
system, which lives in the perceptual world of the Internet
and understands that its body is made up by Java objects
living in the RAM of certain machines. It is a nonhuman,
embodied social actor. Dreyfus didn't try very hard to imagine
an embodied, social intelligence without a human-like body,
but his ideas certainly leave room for such a thing. His
problem was not with AI but with the attempt to build a
mind that operates in a vacuum, instead of synergistically
with a self and a world.
I met Hubert Dreyfus only once, in early 1996, when he came
to the University of Western Australia, where I was working
at the time, to give a lecture on the philosophy of AI.
I heard his talk and had a few drinks with him afterwards.
He told me he was extremely pleased that the new edition of
his book had been released by MIT Press, which had refused
to publish the first edition -- because of MIT's commitment,
as an institution, to AI research: Minsky, and many other
AI researchers whom Dreyfus criticized, worked at MIT. And,
much to my surprise, he was newly optimistic about AI --
not about the type of AI that had been popular in the 70's
or 80's, but about more recent developments, in particular
about neural network AI.
What he liked about neural networks was the fact that their
intelligence was implicit: even after you had trained a
network to do something, you couldn't say precisely how
the network had done it. This implicitness, he felt, captured
something of the tacit and nebulous nature of human understanding.
A neural net in a robot body, he surmised, would possibly
be able to realize the dreams of AI researchers, and do
what isolated, rule-based systems could not.
I asked Dreyfus why, if he felt neural nets offered a possible
path to AI, he had not said so in the first edition of his
book, back in 1972. His answer was simple: he said that
he hadn’t understood what neural nets were all about
at that time. This reply impressed me with its directness
and honesty, and also drove home in my mind the importance
of relying on your own intuition in scientific matters.
The leaders in the field may be off in the wrong direction;
their leading critics may be just as ignorant, though in
different ways; and crucial clues may be found in the ignored
writings of scientists from past decades. One of the lessons
of the history of AI is that science, for all its objectivity,
is fundamentally a human endeavor, and quite susceptible
to the ordinary human weaknesses.

In the past few pages I’ve gone over a few bits and pieces
of past AI work, which I hope have been evocative. Certainly
they have not been complete: AI is a big field, and there’s
been a huge amount of work, and it would take a 1000-page
book to even do a decent summary. What I’ve tried
to do here is to touch on the main streams of thinking in
the discipline, but this kind of summary doesn’t get
across the wild diversity of innovation you find by looking
at the papers of individual researchers, including those
way out of the mainstream. Actually, odd bits of AI work
by total outsiders probably had a bigger influence on me,
as an AI researcher, than the major breakthrough programs
I’ve focused on above.
For example, there was the work of John Andreae at the University
of Canterbury in Christchurch, New Zealand. He wrote a nice
little system called PURR-PUSS which learned to interact
with you statistically. One of his students was John Cleary,
who was one of the machine learning gurus at Waikato University
in Hamilton, New Zealand, where I taught for a year. John Cleary
worked for me at Webmind Inc. for a couple years, and he
and his students formed the firm’s New Zealand office,
and masterminded our Webmind Classification System product.
We’re not exactly emulating PURR-PUSS in Novamente,
but the statistical learning methods that it embodied are
there in our machine learning module and our reasoning system,
and the emphasis on interactive learning that Andreae advocated
lives on in our work with a vengeance.
And there was a cool little book called “Robots on
Your Doorstep” by Nels Winkless and Iben Browning,
which talked a lot about the definition of intelligence
as “the ability to act appropriately under unpredictable
conditions” – a definition that helped me formulate
my own concept of intelligence – and contained a lot
of cool speculations about how to build robots that would
be intelligent in this sense. I got an e-mail from Nels
Winkless – who was, by the way, the founder of PC
magazine, one of the first computer magazines aimed at ordinary
people – in 1997 when I was just starting Intelligenesis.
He’d found a reference to Robots on Your Doorstep
on my website and was pleased to find that anyone had actually
read that book! I asked him for advice on starting a business,
on my software designs and so forth, and we iterated back
and forth by e-mail a bit. It turned out that after Iben had
died, Nels had collaborated with someone else to build a
company centered on some pattern-recognition technology
– very much in the spirit of Robots on Your Doorstep,
and of my view of mind as pattern. Eventually the company
had been taken over by its investors due to its failure
to make money, and had found a niche in which it could be
successful: automatic recognition of similarity among images.
The name of the company was Excalibur Technologies –
it still exists, see http://www.excalib.com
Then there was the idea of genetic algorithms – doing
AI by simulating evolution rather than the brain. This wasn’t
at all a hip idea in 1989 when I was writing my PhD thesis
on optimization methods – at that point, it didn’t
even turn up in my review of the optimization literature,
although in retrospect one can point to papers on the topic
going back to the late 60’s. By the mid-90’s
it was a well-recognized area of computer science and I
was doing research into the mathematics of genetic algorithms,
studying questions such as “Why is evolution involving
sexual reproduction more efficient than evolution involving
asexual reproduction only?” Although the details are
different, evolutionary AI is similar in spirit to neural
net AI – you’re dealing with a complex, self-organizing
system that gives results in a holistic way, where each
part of the system doesn’t necessarily have a meaning
in itself but only in the context of the behavior of the
whole. In my 1993 book The Evolving Mind, I wrote a lot
about the relation between evolutionary programming in AI
and Edelman’s theories of evolution in the brain.
It turns out you can model the brain as an evolutionary
system, with special constraints that make it a bit different
from evolving ecosystems or genetic algorithms in AI. We
have an evolution module in Webmind, which is used for two
things: as one among many machine learning methods for finding
patterns in data (along with feedforward neural nets and
purely statistical methods); and as one among two ways of
learning schema for perceiving and acting (the other being
probabilistic logical inference).
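For readers who have not seen a genetic algorithm, here is a minimal
sketch of the idea. It is not the Webmind evolution module: the toy
objective (count the 1 bits in a string), the parameter values and
the function names are all assumptions chosen to keep the example
short. Each generation the fitter half of the population is kept as
parents, and children are made by splicing two parents together
(a crude stand-in for sexual reproduction) and occasionally flipping
a bit (mutation).

```python
import random


def fitness(bits):
    """Toy objective: the number of 1 bits (the 'OneMax' problem)."""
    return sum(bits)


def evolve(n_bits=20, pop_size=30, generations=40,
           mutation_rate=0.02, seed=0):
    """Run a simple genetic algorithm; return (best individual, history).

    `history` records the best fitness in each generation.
    """
    rng = random.Random(seed)  # seeded so runs are repeatable
    pop = [[rng.randint(0, 1) for _ in range(n_bits)]
           for _ in range(pop_size)]
    history = [max(map(fitness, pop))]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[:pop_size // 2]        # truncation selection
        children = [pop[0][:]]               # elitism: keep the best as is
        while len(children) < pop_size:
            mum, dad = rng.sample(parents, 2)
            cut = rng.randrange(1, n_bits)   # one-point crossover
            child = mum[:cut] + dad[cut:]
            # Mutation: flip each bit with a small probability.
            child = [b ^ 1 if rng.random() < mutation_rate else b
                     for b in child]
            children.append(child)
        pop = children
        history.append(max(map(fitness, pop)))
    return max(pop, key=fitness), history
```

No individual bit string "knows" anything about the goal; whatever
improvement appears in the history comes out of the population as a
whole, which is the holistic flavor described above.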
I’ve just listed three little fragments of AI research
that don’t fit into the big bad “neural nets
versus rules” dichotomy that I put forth above, and
there are many many others. History never fits all that
neatly into categories – almost nothing does, actually.
But we need to make up categories in order to understand
things. The diversity of mind and of society, that busts
out of all the simplified category systems that we place
on it, is responsible for the fabulous creativity of these
systems.

OK – I apologize in advance for being a contradictory
author. But sometimes reality is contradictory. “A
foolish consistency is the hobgoblin of little minds,”
as Emerson tells us.
Above I presented a dichotomy between symbolic and connectionist
AI – rule-based and neural-net AI. Then I pointed
out that a lot of cool AI doesn’t fit into this framework
at all, things like statistical machine learning and genetic
algorithms. Now I’m going to dig my hole even deeper
by arguing that the distinction between symbolic and connectionist
AI is actually a lot fuzzier than most AI gurus realize.
This is a key issue because I often like to say that Webmind
synthesizes connectionist and symbolic AI. While this is
a true statement, it glosses over the peculiar vagueness
of the notions of "symbolic" and "connectionist"
themselves. When you get deeply into these concepts, you
realize that this classical dichotomy is not quite framed
correctly in most discussions on AI. There is a valid distinction
between AI that is inspired by the brain, and AI that is
inspired by conscious reasoning and problem-solving behavior.
But the distinction between "symbolic" and "connectionist"
knowledge representation is not as clear as it's usually
thought to be.
Classically, the distinction is that in a symbolic system,
meanings of concepts are kept in special localized data
structures like rules, whereas in a neural-net-like connectionist
system, meanings of concepts are distributed throughout
the network. Also, in a symbolic system the dynamics of
the system can be easily understood in terms of what individual
rules do, whereas in a connectionist system the dynamics
can basically only be understood holistically, in terms
of what the whole system is doing.
But in reality the difference isn’t so clear. For
example, one branch of symbolic AI is “semantic networks.”
In a semantic network you have nodes that represent concepts
and links representing relations between concepts. Suppose
you have a semantic network in which there is a node representing
"floor." This is, obviously, symbolic in the classic
sense. The meaning of the “floor” node is localized.
But wait – is it really?
In some semantic network based AI systems, all the relations
are made up by people. But some of them have reasoning components
that build relationships: that learn, for example, that because
people walk on floors, floors must be solid, because people
can only walk on solid things. In a system like this, relations
are built from other relations, and so the meaning of the
“floor” node may be contained in its relations
to other nodes, i.e. its connections to other nodes. And,
the formation of these connections may have been based on
the connections of the other nodes to yet other nodes, etc.
etc. etc.
What this means is that, in a semantic network formed by
iterative reasoning rather than by expert rule creation,
each element of knowledge (each node) actually represents
the result of a holistic dynamic. It has meaning in itself
-- a link to our socially constructed concept "floor"
-- but internally its meaning is its relation to other things,
each of which is only defined by the other things it is related
to, etc.; so that the meaning of the part is only truly
describable in terms of the whole.
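To make the floor example concrete, here is a minimal sketch in Python. It is not taken from Webmind or any particular system: the triple format, the relation names ("walk_on", "walk_only_on", "is_a") and the single hand-written inference rule are all assumptions chosen for illustration. It shows a relation about "floor" being built from other relations, so that the node's "meaning" becomes the set of links it participates in.

```python
# Hypothetical toy semantic network: each fact is a (subject, relation, object) triple.
facts = {
    ("people", "walk_on", "floor"),
    ("people", "walk_only_on", "solid_things"),
}

def infer(facts):
    """Forward-chain one illustrative rule until no new facts appear.

    Rule (an assumption for this sketch): if A walk_on X and
    A walk_only_on K, then X is_a K.
    """
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for (a, r1, x) in list(derived):
            if r1 != "walk_on":
                continue
            for (b, r2, kind) in list(derived):
                if r2 == "walk_only_on" and a == b:
                    new_fact = (x, "is_a", kind)
                    if new_fact not in derived:
                        derived.add(new_fact)
                        changed = True
    return derived

def meaning(node, facts):
    """Treat a node's 'meaning' as the set of triples that mention it."""
    return {f for f in facts if node in (f[0], f[2])}

network = infer(facts)
floor_meaning = meaning("floor", network)
```

After inference, `floor_meaning` contains both the hand-entered link to "people" and the derived link to "solid_things". The second link exists only because of what the network already held about other nodes, which is the sense in which the part is defined through the whole.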
On the other hand, suppose one has a neural network in which
memories are represented as attractors (a Hopfield Net,
or Attractor Neural Network, in the lingo). Then the meaning
of a link between two nodes in this network mainly consists
of the attractors that its presence triggers. Yet
there's also a clear local interpretation: if the weight
of the link is large and positive, the two nodes it connects
exist together in a lot of attractors, i.e., they're contextually
similar. If the weight of the link is large and negative,
this means that the two nodes rarely co-exist in an attractor
-- they're contextually opposite. Whether the nodes have
immediate symbolic meaning or not depends on the application
-- in typical attractor neural network applications, they
do, each one being a perceptible part of some useful attractor.
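The local reading of link weights can be illustrated with a small Hopfield-style sketch in Python. The two stored patterns, the network size, and the function names `train` and `recall` are assumptions made for this example. It uses the standard Hebbian rule with +1/-1 node states, so a positive weight means two nodes take the same state in most stored patterns (both on or both off), and a negative weight means they usually take opposite states.

```python
# Two illustrative stored patterns over six nodes, coded as +1 (on) / -1 (off).
stored = [
    [1, 1, 1, -1, -1, -1],
    [1, 1, -1, -1, 1, 1],
]

def train(patterns):
    """Hebbian rule: w[i][j] sums p[i] * p[j] over patterns, no self-links."""
    n = len(patterns[0])
    w = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j]
    return w

def recall(w, state, max_sweeps=10):
    """Asynchronously update nodes until the state stops changing."""
    s = list(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(s)):
            h = sum(w[i][j] * s[j] for j in range(len(s)))
            # Keep the current value when the input field is exactly zero.
            new_value = 1 if h > 0 else -1 if h < 0 else s[i]
            if new_value != s[i]:
                s[i] = new_value
                changed = True
        if not changed:
            break
    return s

weights = train(stored)
# Start from the first pattern with node 0 flipped and let the net settle.
recovered = recall(weights, [-1, 1, 1, -1, -1, -1])
```

Here nodes 0 and 1 agree in both patterns, so `weights[0][1]` is positive; nodes 0 and 3 disagree in both, so `weights[0][3]` is negative. At the same time, what any single weight does is only visible through which attractor the whole network settles into, as the `recall` step shows.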
The point is, in both classic symbolic and classic connectionist
knowledge representation systems, one has a mix of locally
and holistically defined meaning. The mix may be different
in different knowledge representation systems, but there
is no rigid division between the two. This fact is important
in understanding Webmind, which intermixes "symbolic"
style and "connectionist" style knowledge representations
freely.
Of course, there are extremes of symbolic AI and extremes
of connectionism. There are logic based AI systems that
don't have nearly the holistic-meaning aspect of a reasoning-updated
semantic network as I've described above. And, there are
connectionist learning systems -- e.g. backpropagation neural
nets -- in which the semantics of links are way less transparent
than in the attractor neural net example I've given above.
But this is also an interesting point. I believe that, of
all the techniques in symbolic AI, the ones that are most
valuable are the ones that verge most closely on global,
holistic knowledge representation; and of all the techniques
in connectionist AI, the ones that are most valuable are
the ones that verge most closely on localized knowledge
representation. This is because real intelligence only comes
about when the two kinds of knowledge representation intersect,
interact and build on each other.

As much of a techno-optimist as I am, even I have to admit,
at this point, that building a real AI is a very hard problem.
To build a comprehensive AI system, with perception, action,
memory, and the ability to conceive of new ideas and to
study itself, is not a simple thing. Necessarily, such a
system consumes a lot of computer memory and processing
power, and is difficult to program and debug because each
of its parts gains its meaning largely from its interaction
with the other parts. The obstreperousness of this problem
can be seen from the litany of brilliant AI failures I’ve
just recited. Lenat, Hutchens and Hillis are just about
as smart as human beings get, and pursued the real AI goal
with diligence and expertise and wisdom – and yet
none of them succeeded. De Garis and I are pursuing the
same goal, in our different ways, and neither of us has
succeeded yet either, although neither of us is ever likely
to give up.
Yet, in spite of the obviously huge obstacles that the real
AI goal presents, I still believe this is where the AI field
should be focusing the bulk of its efforts. For is this not
the only approach that can possibly succeed at achieving
the goal of a real thinking machine? I realize that this
attitude places me in a small minority of AI researchers
– but if my team or one of our “real AI”
focused competitors should succeed, it wouldn’t be
the first time in the history of science and technology
that a maverick minority had been proved right.
We now have, for the first time, hardware barely adequate
to support a comprehensive AI system. Moore’s law
and the advance of high-bandwidth networking mean that the
situation is going to keep getting better and better. However,
to me and the other “real AI” mavericks of the
world, it seems that we are stuck with a body of AI theory
that has excessively adapted itself to the era of weak computers,
and that is consequently divided into a set of narrow perspectives,
each focusing on a particular aspect of the mind.
In order to make real AI work, I believe, we need to take
an integrative perspective, focusing on:
· The creation of a “mind OS” that embodies the
basic nature of mind and allows specialized mind structures
and algorithms dealing with specialized aspects of mind
to happily coexist
· The implementation of a diversity of mind structures
and dynamics (“mind modules”) on top of this
mind OS
· The encouragement of emergent phenomena produced
by the interaction/cooperation of the modules, so that the
system as a whole is coherently responsive to its goals
This is the core of the Webmind/Novamente vision, and the goal
I’ve been working towards for my entire research career.