Coauthored with Ted Goertzel

I remember heading home from college for spring break in 1983, toward the end of my freshman year. I’d just recently turned 16, and I’d been thinking about AI a hell of a lot – even more than about my new girlfriend, Rachel Gordon, whom I was pretty darn crazy about at the time. A few days before spring break I’d tried to explain my theories on artificial intelligence to my friend Ken Silverman. Ken couldn't understand what I was talking about, so I promised him I'd work on it over spring break, and that when I got back to school I’d explain how it all worked and give him a complete design for a thinking computer program. I had the idea clear in my head, but I was totally unable to articulate it in a way that Ken or anyone else could understand. I spent the whole break working on it, and during those few days I basically worked out the ideas that I’d put in my first book, The Structure of Intelligence, six years later. I went through every aspect of the mind - reason, memory, aesthetics, intuition, emotion, etc. - and convinced myself that every one could be expressed in terms of pattern recognition and pattern formation. The mind, I concluded, was a pattern recognition system that recognized patterns in the world around it, and – very crucially -- also recognized patterns in itself. Recognizing patterns in itself, it formed patterns within itself, continually giving rise to new structures.

After the break, I still wasn’t able to explain my realizations to Ken in a way that made sense to him, but at least things made a little more sense to me. I knew I had to find a mathematical language to make sense of my intuitions, or I’d never be able to communicate them to anyone, let alone program them on a computer. My grasp of software design at this stage was extremely weak; it was formed mainly by programming games in BASIC. I was nowhere near having the skills to design a general pattern recognition system that recognized patterns in itself and adjusted itself accordingly.

Ken's dad was an extremely smart guy and a prolific and successful inventor, mostly in the area of electrical engineering; and in our college years, Ken often fantasized about becoming a rich inventor and building a mansion, with a basement laboratory in which we’d putter around day and night, wiring together intelligent robots and time machines and so forth. So it’s pretty funny that 14 years later when I decided to start an AI company (Intelligenesis Corp., later renamed Webmind Inc.) I somehow happened to turn to Ken when I needed someone to take over the job of programming my AI system, the Webmind AI Engine.

I hadn't spoken to him for years – he’d stayed in the New York area, where he had grown up, whereas I’d moved all over the world, teaching in universities in Las Vegas, New Zealand and Australia. After getting his degree in electrical engineering, he’d done a lot of different things, including real estate and computer programming. He was really psyched to finally get the chance to collaborate with me on my thinking machine project. Finally, after a decade and a half, I had figured out how to express my plan for artificial intelligence in a way Ken could understand! Ken was the lead engineer at Webmind Inc. for its first couple years, and VP of Technology for the entire lifetime of the firm. Now I’m working with a different crew of engineers, and Ken is working on his own advanced pattern recognition software, but we’re still good friends, and he definitely played an important role in the evolution of my work.

To articulate my vision of the mind in a comprehensible form was much much harder than I’d ever thought it would be. It turned out that the vocabulary for expressing what I wanted to say didn’t really exist in the field of computer science. To find the language I needed to express my ideas and to work out the details, I had to step a long way back from the world of computers and get deeply into the philosophy of mind. Although I was very young then, and even more naïve than I am now, I realized intuitively that it was necessary to get the philosophy right before proceeding to the computational details. Now, I’m jaded by a fair amount of practical experience – though I don’t have a head full of gray hair yet – and I see this far more clearly than I did then. In implementing a general vision of how the mind works, it’s very easy to be misled by the nature of contemporary computer hardware and programming languages, and to wind up implementing things that subtly deviate from the vision one started out with. The way to avoid this is to have the conceptual, philosophical vision very firmly fixed in one’s mind as one sets about the detailed design work, which is huge and at times confusing.

What I’m going to give you in this chapter is a fairly sketchy, but hopefully evocative, overview of the process of creating Webmind and then Novamente. The two AI systems are very different on a technical level, but on the level of a popular exposition like this one, the differences are really pretty small. Novamente uses more sophisticated mathematics and more efficient software structures to implement the same basic concepts that Webmind did. To keep things from getting confusing, I’ll write mostly about “Novamente” here, except where I’m talking historically about the creation of Webmind in particular; but most of what I’ll say about Novamente also applies to Webmind.

The Novamente project is far from complete. Just like every other AI researcher, I’m an abject failure so far – I haven’t yet created a software program displaying human-level general intelligence. Unlike most other AI researchers, however, I and my colleagues honestly believe we are on a path that will lead us to success at this ambitious goal. I don’t expect to convince you one way or another in these pages – my hope is merely that the story of our quest may be an interesting one … and that some of the lessons we’ve learned along the way may be of general value.

I knew from the start that I didn’t want to build an artificial idiot savant – an overspecialized, brittle system as was typical in the AI field. I wanted to build a mind.

But what is a mind, anyway?

During that spring break, which I spent trying to figure out how to explain my vision of the mind to Ken, I arrived at a basic working definition of the mind: a mind is the set of patterns in an intelligent system.

Your mind is not your brain, nor is it some disembodied soul somehow exchanging messages with your brain. Your mind is the set of patterns in your brain – the structures and processes in your brain, such that knowing these structures and processes allows you to explain the brain more simply than just listing the parts of the brain and their positions and states over time.

Novamente’s mind is not the C++ code that my engineering team and I type in – that’s just a code for creating the mind, a little like DNA is the code for creating a human. Novamente’s mind is the set of patterns in the billions of 0’s and 1’s existing in RAM while Novamente runs, cycling through the machine’s processors and passing through the network cables. These 0’s and 1’s themselves are not Novamente’s mind – it’s the patterns in these 0’s and 1’s, the static and dynamic patterns, that are mind. Mind is a set of patterns in a system that achieves highly patterned goals in a highly patterned environment. Everything is pattern, pattern, pattern!

Mind recognizes and creates patterns in the world and itself, achieving complex goals, goals whose definition involves a great deal of pattern.

Although these ideas were clear to me intuitively in 1983, it wasn’t till 1990 or so that I was able to write them down in a clear and comprehensible way. This is what I did in the first few chapters of my first book, The Structure of Intelligence. At that point I had gotten my PhD in mathematics and was supposed to be doing mathematical research, but just as I’d always been more interested in my own reading and thinking than in my schoolwork, now I was spending my time thinking about pattern and mind and the nature of the universe, instead of proving math theorems like a good assistant professor. The next step was to ask the question: What are the principles by which a set of patterns, a mind, can actually be intelligent? For sure, the precise structures and dynamics are going to vary from one mind to the next, but are there any general principles, applicable to every kind of intelligent system, be it a human, a dolphin, a computer program, an intelligent gas cloud on Jupiter? It’s not totally obvious that there are such principles, but my belief starting out was that such general principles had to exist. What are the principles by which mind’s core algorithm - pattern recognition and formation in itself and the world -- is self-regulated?

One general principle is what the 19th-century American philosopher Charles Peirce called the “One Law of Mind”: that things in the mind tend to spread attention to other related things in the mind. This is a basic principle for attention allocation, one that we can see in the brain in the diffusion of electricity. Novamente incorporates this via activation spreading similar to that in a neural network. This is what I call a “heterarchical” principle – where a heterarchy just means a sprawling network in which each element connects to a few other elements, without a hierarchical structure. A random network in which each node connects to a set of other nodes at random is a heterarchy.
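As a rough illustration of activation spreading over a random heterarchy, here is a minimal Python sketch. It is not Novamente's actual mechanism: the function names, the `rate` parameter and the rule of splitting activation equally among neighbors are all assumptions made for the example.

```python
import random

def random_heterarchy(n_nodes, links_per_node, seed=0):
    """Build a random network: each node links to a few others, no hierarchy."""
    rng = random.Random(seed)
    graph = {}
    for node in range(n_nodes):
        others = [m for m in range(n_nodes) if m != node]
        graph[node] = rng.sample(others, links_per_node)
    return graph

def spread_activation(graph, activation, rate=0.5):
    """One step: each node keeps (1 - rate) of its activation and
    divides the rest equally among the nodes it links to (assumed rule)."""
    new = {node: (1.0 - rate) * a for node, a in activation.items()}
    for node, a in activation.items():
        neighbors = graph[node]
        if not neighbors:
            new[node] += rate * a  # nothing to spread to, so keep it
            continue
        share = rate * a / len(neighbors)
        for m in neighbors:
            new[m] += share
    return new

graph = random_heterarchy(n_nodes=8, links_per_node=2)
activation = {node: 0.0 for node in graph}
activation[0] = 1.0  # attention starts on a single item
for _ in range(3):
    activation = spread_activation(graph, activation)
```

After a few steps the attention that started on one node has leaked out to the things linked to it, while the total amount of activation in the network stays constant.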

Hierarchy is another important structure of the mind. We see it in the human brain all over the place, most famously in the visual system, where we have a hierarchy of progressively more abstract processes, starting with recognition of lines and edges, then shapes, then 3-D forms, and so forth. Hierarchy in the mind has to do with increasing abstraction, and with control that’s aligned with abstraction, so that processes dealing with more abstract things control related processes dealing with more concrete things.

A general principle that I’ve thought a lot about – and that I wrote about in my second book, The Evolving Mind -- is what I call the “dual network” – this refers to the interpenetration of hierarchy and heterarchy. In the mind, hierarchy and heterarchy overlap each other, and the dynamics of the mind is such that they have to work well together or the mind will be all screwed up. The overlap of hierarchy and heterarchy gives the mind a kind of “dynamic library card catalog” structure, in which topics are linked to other related topics heterarchically, and linked to more general or specific topics hierarchically. The creation of new subtopics or supertopics has to make sense heterarchically, meaning that the things in each topic grouping should have a lot of associative, heterarchical relations with each other. In Novamente, this general “dual network” principle is reflected in many ways, when one gets down into the details of its various dynamical processes.
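One way to picture the “library card catalog” idea is a toy structure with both kinds of links, plus a check of whether a topic grouping “makes sense heterarchically.” The Python sketch below is purely illustrative: the topics, the `cohesion` measure (the fraction of member pairs that are directly associated) and all names are assumptions, not part of the dual network model as the author formalizes it.

```python
from itertools import combinations

# Heterarchical links: undirected associations between topics (made-up data).
associations = {
    frozenset(p) for p in [
        ("cat", "dog"), ("cat", "lion"), ("dog", "wolf"),
        ("lion", "wolf"), ("hammer", "saw"),
    ]
}

# Hierarchical links: each topic points to its more general supertopic.
parent = {"cat": "animal", "dog": "animal", "lion": "animal",
          "wolf": "animal", "hammer": "tool", "saw": "tool"}

def children(topic):
    """All topics filed directly under the given supertopic."""
    return {t for t, p in parent.items() if p == topic}

def cohesion(members):
    """Fraction of member pairs that are directly associated (assumed measure)."""
    pairs = list(combinations(sorted(members), 2))
    if not pairs:
        return 0.0
    linked = sum(1 for p in pairs if frozenset(p) in associations)
    return linked / len(pairs)

animal_score = cohesion(children("animal"))        # 4 of 6 pairs are linked
mixed_score = cohesion({"cat", "hammer", "saw"})   # 1 of 3 pairs is linked
```

A grouping that follows the hierarchy ("animal") scores higher than an arbitrary one, which is the sense in which the two networks have to agree with each other.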

Another general principle is self: that minds contain parts of themselves that mirror the whole. This gives a quasi-fractal structure to the mind.

Another general principle, also discovered by Charles Peirce, is that there are three kinds of reasoning: induction, abduction, and deduction. These are all ways of manipulating hierarchical relationships. Hierarchy is about logic, whereas heterarchy is about the spread of attention and the formation of wholes. Once heterarchy has led to the formation of new wholes, corresponding to clusters of things that all relate to each other, then these new wholes can be dealt with hierarchically; they can be reasoned about. I was very fortunate, a month after Intelligenesis got our seed funding, to get a job application from Pei Wang, who had worked out a neat computational reasoning system (NARS) based on the three forms of reason that I, following Peirce, had identified as essential to the mind.

There are also two dynamics that I believe are generally part of mind. These correspond to the basic philosophical principles of Being and Becoming.

Becoming corresponds to evolution, considered most generally as the survival of the fittest members of a population, and the reproduction of the survivors to form new population elements. Novamente contains explicitly evolutionary components – variations on the computational technique called “genetic programming.” It also contains other components that aren’t traditionally viewed as evolutionary, but really are. For instance, Novamente’s reasoning module involves logical relations (we call them “links”) that combine with each other to create new logical relations. The facts “pigs are fat” and “fat creatures are ugly” combine to create the new relation “pigs are ugly.” And in the reasoning system, unimportant relations are deleted to save memory. Thus, we have survival of the fittest, where fitness means importance to the system, and we have reproduction of the survivors, via the rules of inference. Reasoning is seen to be a form of evolution, in the general sense.
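The “reasoning as evolution” loop can be sketched in a few lines of Python. This is a toy under stated assumptions, not the Novamente or NARS inference rule: link strength stands in for importance, the conclusion's strength is simply the product of the premises' strengths, and the pruning threshold and the extra "pigs are pink" link are invented for the example.

```python
# Each link maps (subject, predicate) to a strength in [0, 1].
links = {
    ("pigs", "fat"): 0.9,
    ("fat", "ugly"): 0.8,
    ("pigs", "pink"): 0.05,  # a weak, unimportant relation (made up)
}

def deduce(links):
    """Reproduction: combine A->B and B->C into a new link A->C."""
    offspring = {}
    for (a, b), s1 in links.items():
        for (b2, c), s2 in links.items():
            if b == b2 and a != c and (a, c) not in links:
                # Assumed rule: conclusion strength is the product.
                offspring[(a, c)] = max(offspring.get((a, c), 0.0), s1 * s2)
    return offspring

def prune(links, threshold=0.1):
    """Selection: forget links whose strength is below the threshold."""
    return {k: s for k, s in links.items() if s >= threshold}

links.update(deduce(links))  # "pigs are ugly" is born
links = prune(links)         # the weakest link is forgotten
```

One generation leaves three links: the two premises survive, the new conclusion ("pigs", "ugly") joins them, and the weak link is deleted to save memory.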

Being corresponds to what system theorists call “autopoiesis” – an obscure word that has a very useful meaning. It means self-production. Every cell in the body is produced by other cells in the body – so the body is a self-producing system. The mind is also a self-producing system. This is basically the theme of my third book, Chaotic Logic. If you remove part of the mind, the other parts of the mind that relate to it will be able to reproduce it, approximately if not exactly. If you take out the logical relation “pigs are ugly” for example, the system may be able to regenerate it by inference from the other relations “pigs are fat” and “fat creatures are ugly.” It may come out with a different strength than it had before, but it will still be reproduced, perhaps lossily. If you take out all memory of the text “War and Peace” from the mind, but retain a lot of related knowledge, this related knowledge will cause the system to want to read War and Peace, which eventually will likely lead the information about the text to be regenerated. In this case, interaction with the environment is part of the mind’s autopoietic dynamics.
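The regeneration of a deleted relation can be illustrated with a small Python sketch. As before, this is an assumption-laden toy: the `regenerate` function, the product rule for strength and the starting strengths are hypothetical, chosen only to show a removed link coming back with a different strength.

```python
def regenerate(links):
    """Re-derive any missing A->C link from surviving A->B and B->C links."""
    restored = dict(links)
    changed = True
    while changed:
        changed = False
        for (a, b), s1 in list(restored.items()):
            for (b2, c), s2 in list(restored.items()):
                if b == b2 and a != c and (a, c) not in restored:
                    restored[(a, c)] = s1 * s2  # assumed strength rule
                    changed = True
    return restored

mind = {("pigs", "fat"): 0.9, ("fat", "ugly"): 0.8, ("pigs", "ugly"): 0.95}

# Remove one relation, then let the remaining relations rebuild it.
damaged = {k: s for k, s in mind.items() if k != ("pigs", "ugly")}
repaired = regenerate(damaged)
```

The repaired system contains the same set of relations as the original, but "pigs are ugly" comes back with the strength implied by its premises rather than the strength it had before: reproduced, but lossily.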

Evolution changes the system in accordance with its goals and its environment; autopoiesis keeps the system the same as it was before. The mind needs both of these forces; they need to be properly balanced. The balance of these leads to productive creativity, and this was the main theme of my fourth book, From Complexity to Creativity.

I arrived at my list of general principles of the mind by a kind of unholy combination of introspection, mathematical analysis, and survey of biology, psychology and computer science. I spent a long time trying to prove mathematically that all these general structures and dynamics, and a few others, were necessary and sufficient for mind – any system having them would have a mind, and any system not having them couldn’t have a mind. But eventually I gave up; I decided that the mathematics of today is not adequate for proving this kind of thing. I gathered my various insights and intuitions and conclusions about how the mind worked, and gave the list a name: the psynet model of mind. Psynet = “mind-network”, a theory of the mind as a network of interacting, intertransforming agents. I realized that the conceptual picture of the mind that I’d developed was of significant value in itself, apart from any mathematical formalization I might give it. No one else working in the AI field seemed to me to have a similarly comprehensive and powerful conceptual analysis of the mind. I still think inventing the needed mathematics to usefully and completely formalize the psynet model is an interesting challenge – but it’s not as interesting to me right now as using my intuitions about the general structures of intelligence to build thinking software.

The general structures and dynamics of the “psynet model” can be manifested in many many different ways, in different systems. The process of building Webmind, and then Novamente, has been in this sense a top-down process. I started out with an idea about what general principles had to emerge from the system to make it intelligent, and this placed a constraint on what the system had to be like. It had to be built so as to make the right general structures and dynamics emerge. Aside from that, I didn’t care very much exactly what the system was like. I had, and still have, an attitude of being willing to learn via experimentation in this regard.

My first serious attempt to build a real AI system (earlier chatbots and abortive experiments not counted) occurred in 1994. I used a programming language called Gofer, which I later benchmarked at 1/10,000 the speed of C (the standard programming language in the commercial world). Gofer was a beautiful language, which matched up nicely to my vision of the mind. This program was called Antimagicians; it was a population of actors called magicians, and antimagician actors that annihilated the magicians in complex patterns. Just about all it ever did was produce a type of error called a “stack overflow.” This was a shame, because my model of mind was very simple and compact in this programming language. But it could only run on one machine, and it ran incredibly slowly; the only thing it did fast was use up all the machine’s memory.

Gofer was a “functional” programming language, meaning not that it performed useful functions (far from it!), but rather that it was based on the mathematical concept of a “function.” Gofer was basically equivalent to mathematics. It appealed to my sense of formal elegance; it was perfect in the sense of a Bach fugue. Unfortunately, though, functional languages do not match well to the von Neumann computer architecture, so it is very hard to make them efficient without special hardware. After the debacle of my stack-overflowing proto-AI system, I abandoned Gofer and turned back to C++, and then to the new programming language Java. But I restricted myself to more modest programming experiments. I made a C-language version of Antimagicians, which was much simpler and less interesting than the Gofer version. In Java, I made a genetic algorithm that ran on multiple machines (coded together with Rosalind Barr at University of Western Australia), and a simple actors-based search engine (coded together with Mark Messenger, also at UWA). I could see from this experience that, while my AI system in Gofer had been small, because Gofer was made for expressing systems that refer to and organize themselves, a comparable system in C++ or Java or any other practical programming language was going to be huge. It took a couple years for me to summon the guts to attempt such a thing.

One thing that occurred to me as I started to think about implementation issues, much more than it had in my days as a pure theorist, was the crucial role of specialization. My Gofer-based mind had been theoretically capable of intelligence; it was a general system for recognizing and forming patterns in itself and its environment. But its generality didn’t allow it to solve any particularly useful problems within practical time and space constraints. In that sense, it had been a miserable failure as an intelligent system. In practice, I concluded, to get reasonably efficient intelligence one needs to code specialized cognition algorithms, aimed at recognizing patterns in particular kinds of data, learning how to carry out particular kinds of actions, and so forth. The brain is very much like this: we have 30% of our brain specialized for visual pattern recognition; regions specialized for language; regions specialized for body sensations; regions specialized for social interaction; etc. etc. And then we have a little bit of general intelligence, which is what makes us uniquely brilliant among the animal kingdom – but this general intelligence relies on all the specialized stuff to give it a meaningful context within which to operate.

Specialization needs to be mediated by rich interaction between specialized parts. The different specialized parts of a system need to learn from each other, and learn about the world together whenever they can. The integration of various specialized pattern recognition subsystems has played a huge role in practical Webmind engineering.

Because of all this specialization, it seemed to me in 1994 and 95 that there was no way to build a thinking computer program on contemporary computer hardware. It seemed to me that some kind of humongous brainlike supercomputer would be necessary. And then I discovered the Internet (unlike Al Gore, I didn’t invent it!). It struck me that the millions, soon billions, of machines around the world, all hooked together on the Net, had enough memory and processor power to create a real computational intelligence. The Java programming language came out in 1995 and it seemed the right tool to use to create a networked AI engine embodying the general principles of mind: recognizing and creating patterns in itself and the world, using a variety of specialized methods integrating together into a whole, an evolving autopoietic whole.

Not only did the Internet give you the computational power to build a thinking machine; it also provided a really rich perceptual environment. A mind can’t exist in isolation; it has to achieve complex goals in a complex environment. The physical world is obviously complex but building a robot body is another huge project, comparable in scope to building a mind. The Internet is arguably rich enough in diverse details to support intelligence, and it’s a lot easier to hook your AI system into the Internet than to build it a robot body. I made up my own complex goal: To build an AI system whose body was part of the Net, and whose perceptual world was the Net itself, the Web. A mind for the Web; a Webmind.

In terms of the conception of intelligence as “achieving complex goals in complex environments,” the goals I had in mind when designing the Webmind system were roughly:



* Conversing with humans in simple English, with the goal not of simulating human conversation, but of expressing its insights and inferences to humans, and gathering information and ideas from them.

* Learning the preferences of humans and AI systems, and providing them with information in accordance with their preferences. Clarifying their preferences by asking them questions about them and responding to their answers.

* Communicating with other AI systems, in a manner similar to its conversations with humans, but using a mixture of human language and a more formalized and precise computerized language we have created, called Sasha.

* Composing knowledge files containing its insights, inferences and discoveries, expressed in Sasha or in simple English.

* Reporting on its own state, and modifying its parameters based on its self-analysis to optimize its achievement of its other goals.



Of course, my ambitions didn’t end there – that would be wimpy. Subsequent versions of the system were intended to offer enhanced conversational fluency, and enhanced abilities at knowledge creation, including theorem proving, scientific discovery and the composition of knowledge files consisting of complex discourses. And then of course the holy grail: progressive self-modification, leading to exponentially accelerating artificial superintelligence!

I remember a particular moment when my diverse ideas about AI crystallized in my mind, with amazing clarity. I could see in my mind exactly how an AI system could be built. Now all that was left was to work out a few pesky details.

At this point, it had been 13 years since I’d first set myself the goal of building a thinking machine. I now had a PhD in math, and had spent countless thousands of hours studying cognitive science, physics, computer science, neurobiology, and philosophy of mind. I’d published four books on the mind, which were idiosyncratic combinations of mathematics, philosophy and science, all pushing in the same direction, toward an understanding of the mind that was both fundamental and precise. I felt I finally had the answer. And it seemed that the hardware was finally getting there too. We had cheap computers with gigabytes of RAM, and we had high-bandwidth Ethernet and Internet, allowing distributed computing among dozens or even millions of these powerful, cheap machines, etc.

It all seemed incredibly clear to me. Mind was exquisitely simple in essence. A mind was a web of patterns, a network of independent mind actors, each one concerned with recognizing patterns in other actors, and patterns emergent between itself and other actors. New actors were created to embody new patterns. The overall network of mind was continually re-making itself via recognizing patterns in itself. The character of a particular sort of mind was determined by the assemblage of pattern recognition/creation actors inside it. The art of mind design – an as yet nonexistent art – would consist of choosing the right assemblage of types of actors so that the emergent self-reconstructing behavior of mind would get into a productive dynamical attractor. From my 13 years of thinking about human and artificial intelligence, I felt I had a good idea how to choose and design the right mind actors, so that when these actors were released to study and transform one another, the self-reconstructing, self-recognizing dynamic characteristic of mind would emerge.

And so in the fall of 1996 I started creating the Webmind AI Engine. As I’ve said, I’d been working on similar things off and on for years; but the actual design of the Webmind system as it is now was something I started in the fall of 1996, when I was in Western Australia, working at UWA as a Research Fellow. Soon enough this got more interesting than anything else I was working on -- I realized that I was on the verge of something really cool, and something that I wasn’t going to be able to implement myself, or with a couple research assistants. John Pritchard, my New York e-mail pal, was convincing me that it was plausible to get funding to start a company building software according to my designs. The idea was appealing.

At the start of 97 I quit my job at UWA and moved to the US to work on Webmind design and coding full time. I didn’t have any clear business plan in mind, but I figured that once I got some clearly intelligent behavior working, the venture capitalists would beat a path to my door. Naively enough, I figured I’d reach that point after a few months’ hard work. I figured that after I got some basic stuff working, I could raise a few hundred thousand dollars to pay perhaps 5 programmers, and then we’d get the whole thing implemented in 6 months’ time – presto! a thinking machine. Fame and fortune, and truckloads of beautiful girls, would be mine.

What I had at the end of summer 1997 was ten thousand lines of Java, largely designed as I went along. This system was never completed, and of the parts that were completed only half worked. There were lots of details I didn't understand. This first serious attempt at Webmind had too much of my theory of mind in it, and not enough computational practicality. It was beautiful as a mathematical and logical statement, but still horrible as a computer program. I was still too close to Gofer, and hadn’t come to grips with what I’d have to do to make a useful, efficient implementation of my model of mind.

But still, the ideas, data structures and dynamics underlying this first Webmind were conceptually about the same as the ones underlying Novamente today. The mathematics and the software design have both changed tremendously, but the underlying vision is the same. Novamente, like Webmind before it, is based on the idea that the mind is a collection of patterns that forms and recognizes patterns in itself and the world, and in this way achieves complex goals in the world. It makes this vision concrete by defining some simple software objects corresponding to patterns and goals.

In the mid-90’s, starting out on Webmind design, I had basically a comprehensive knowledge of what was happening in the AI world. It was a mess. It’s basically the same way today. There’s no well-understood, commonly accepted body of scientific knowledge about AI. Instead, there’s a vast diversity of approaches to various aspects of the relationship between computation and mind. Some of the approaches contradict each other and some of them complement each other. Designing Webmind was a process of assembling information from various different perspectives and disciplinary areas into a coherent whole, guided by a set of governing principles.

Many different subdisciplines within the AI umbrella contributed to the structuring of Webmind, and then Novamente. Some of them I was thinking about when I first started designing Webmind; others emerged as being significant more recently, further along in the design process, in some cases only in the transition from Webmind to Novamente. Table 1 gives an overview of the sorts of things that Novamente draws from various disciplines. It may be a bit opaque to the nontechnical reader, but it will mean something to the reader with some computer science background, and perhaps to others it will be at least generally evocative.

Cognitive Psychology

From cog psych we have taken a number of high-level structural principles, for instance the notions of Long-Term Memory, Episodic Memory (memory of your own history), and Short-Term Memory; and the distinction between procedural knowledge (knowledge of how to do things) and declarative knowledge (factual knowledge).

 

Introspective Psychology

Modern cognitive psychology is experimentally focused, but past traditions in psychology have more openly drawn their inspirations from introspection, from what each mind intuitively knows about itself. The overall structure of Novamente owes something to ideas drawn from these traditions, from Gestaltism to Buddhist psychology and Peircean philosophy.

 

Neuroscience

From neuroscience we have taken the observation that mind can be implemented by a parallel distributed system with activation spreading  around it in complex patterns – i.e. a ‘neural net’, broadly conceived.   We’ve also taken our approach to localization from what’s known about the brain: in Novamente, knowledge is distributed, but not across the whole system; each type of knowledge is distributed across a part of the system, just as is done in the brain.



Complexity Science

The emerging science of complex systems has contributed crucial concepts such as self-organization, evolution, autopoiesis and emergence.  Novamente is a modular system in which the real intelligence emerges from interaction between the modules.  Like many complex systems, it displays behaviors like phase transitions and sensitivity to initial conditions, and evolution-ecology interactions.



Nonlinear Dynamics

One of the more rigorous subsets of complexity science, nonlinear dynamics studies the attractors and transient patterns that emerge as nonlinear systems evolve over time.  Novamente is a highly nonlinear dynamical system whose attention is allocated by complex attractor dynamics, and that specifically studies transients in its own dynamics so as to self-adaptively modify its own structure.



Statistical Pattern Recognition

In its analysis of numerical data (e.g. financial forecasting) and its lower-level linguistic processing, Novamente makes use of statistical pattern recognition tools.  What makes it unique is its ability to integrate statistically recognized patterns with other types of knowledge, and to generalize from this knowledge via inference and other mechanisms.



Multi-Agent Systems

With the advent of distributed and parallel computing, there is a substantial body of knowledge about how to make populations of computational agents cooperate to carry out useful activities.  Novamente is a multi-agent system, albeit a very unusual one, and its system architecture makes use of principles from this area of computer science in many ways.



Computational Linguistics

The last decade’s explosion of knowledge in computational language processing has produced many techniques of use within Novamente.  The challenge has been to get all these tools working together in a common framework focused on extracting, creating and producing meaning rather than on syntax analysis.



Expert Systems

Novamente allows humans to enter expert knowledge into it via XML, Sasha or other special formal languages, similar to standard AI expert systems.  Unlike expert systems, though, it doesn’t take this knowledge as truth: it takes it as information given to it by another mind, and feels free to forget it or modify it as it sees fit.



Machine Learning and Optimization

Machine learning and optimization algorithms are not real AI systems but they do solve problems that are crucial to the mind.   Novamente uses genetic algorithms, genetic programming, and statistical machine learning techniques for various purposes, internally.



Logic

While Novamente is not a logic system in the traditional sense, it makes use of the reduction of general relationships to a simple relational formalism, which was pioneered by mathematical logicians and logic-inspired AI engineers.  It manipulates relationships using uncertainty-robust, self-organizing reasoning techniques different from those used in the logic or AI literature.



Table 1 - Novamente’s Diverse Inspirations

Obviously, this laundry list of component technologies doesn’t really tell you a damn thing about Novamente. That’s because the crux of Novamente lies, not in the component technologies, but in the way these technologies are structured to form a coherent self-organizing system. But still, the presence of all these tools made the process of building Novamente very different than it would have been if none of the tools existed, and you had to build every component technology from scratch. Rather than just “how do you program a mind on current hardware and software?”, the question becomes more like “Given all these wonderful tools, and amazingly powerful distributed hardware on which to implement them, how can we tie them all together in a harmonious and mutually adaptive way to produce a mind?”

Given the general conceptual framework I’ve described, and the practical and conceptual toolbox I’ve listed, the first step toward actually designing Webmind was deciding what the “atomic mental object” should be.

Bigger than a neuron, smaller than a machine, was the first decision. I created a Java object called a Node. A node is the most basic kind of pattern known to Webmind – it’s something Webmind recognizes as a whole. A node says, “This thing is worth distinguishing from its environment as a whole entity. Here it is. It persists and maintains its boundaries over time.” We have some nodes referring to external sensed objects: TextNodes, DataNodes, WordNodes, and so forth. We have some nodes representing patterns recognized in the system itself rather than in the outside world: CategoryNodes of various kinds, AutomatonNodes representing evolved patterns, etc. There are nodes called SubgraphImageNodes that represent parts of the mind, grouped with a boundary drawn around them so as to be considered as a kind of higher-order individual. And so on, and so on, and so on.

But nodes are just the start. Webmind is also wired to recognize certain kinds of patterns involving nodes. Similarity is the most basic kind of pattern: it’s the recognition that two different things, occurring at different points in space or time, are actually a lot like each other, and can be interchanged for many purposes. Inheritance is also basic: it’s the recognition that you can substitute A for B (though maybe not B for A) without substantial loss of information.

How many link types to incorporate was a big question. In the AI systems known as semantic networks, you have a different type of link for every relation in the net – a link type for kick, a link type for eat, and so forth. On the other hand, in a typical neural net model you have only one link type; whereas in the brain, there are many types of neurons and synapses – hundreds of link types, if you identify a link type with a synapse that’s reactive to a certain neurotransmitter.

In designing Webmind, we didn’t want to introduce too many types of links, because this just leads to a network that represents data in ways it doesn’t understand. We chose to use a few dozen link types, representing what I think of as archetypal types of relationships.

What kinds of relationships are “archetypal” for Novamente? Here I’ll just give a few important examples. We have similarity links, representing the belief that one actor is similar to another. There are inheritance links, representing the belief that one actor is a special case of another. There are spatiotemporal links, representing the belief that one actor represents something occurring near the other one in time or space. There are containment links, representing the belief that the entity represented by one actor is contained inside another one. There are associative links, representing simply the fact that Webmind's dynamics tend to associate one actor with another. This chart shows the definitions of these links in a bit more systematic way:

Link Type Pointing from A to B      Meaning of the Link

Similarity                          A is similar to B
Inheritance:
     by Extension                   A is a special case of B
     by Intension                   B is a special case of A
SpatioTemporal                      A occurs at the same time and place as B
Temporal                            A occurs at the same time as B
Before                              A occurs before B
After                               A occurs after B
Containment:
     Part of                        A is a part of B
     Contains                       B is a part of A
Associative                         B is associated with A
     HaloLink                       B is associated with A by Webmind's Dynamics

These link types, and others refining and extending these, are the elemental types of relationships that Webmind “understood.” They are a bit, but not a lot, like the various neurotransmitter receptors in the brain, which make different synapses different. The brain's receptors do not correspond so neatly to logical relations. But Webmind is not a brain; it is a mind that emerges out of digital computer hardware. Digital computer hardware is closer to logic than cells are.
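To make the chart a little more concrete, here is a minimal Java sketch of how such link types might be written down, with each type knowing how to read itself for a link pointing from A to B. This is an illustration only, not the actual Webmind code: the names LinkType, Link, Node and describe are invented for the example, and only the meanings are taken from the chart.

```java
import java.util.ArrayList;
import java.util.List;

/** Link types from the chart; each template reads a link pointing from A (%1$s) to B (%2$s). */
enum LinkType {
    SIMILARITY("%1$s is similar to %2$s"),
    INHERITANCE_BY_EXTENSION("%1$s is a special case of %2$s"),
    INHERITANCE_BY_INTENSION("%2$s is a special case of %1$s"),
    SPATIOTEMPORAL("%1$s occurs at the same time and place as %2$s"),
    TEMPORAL("%1$s occurs at the same time as %2$s"),
    BEFORE("%1$s occurs before %2$s"),
    AFTER("%1$s occurs after %2$s"),
    PART_OF("%1$s is a part of %2$s"),
    CONTAINS("%2$s is a part of %1$s"),
    ASSOCIATIVE("%2$s is associated with %1$s"),
    HALO("%2$s is associated with %1$s by Webmind's dynamics");

    private final String template;

    LinkType(String template) {
        this.template = template;
    }

    /** Spell out what a link of this type from a to b asserts. */
    String describe(String a, String b) {
        return String.format(template, a, b);
    }
}

/** Hypothetical minimal node: a name plus its bundle of outgoing links. */
final class Node {
    final String name;
    final List<Link> links = new ArrayList<>();

    Node(String name) {
        this.name = name;
    }

    Link linkTo(LinkType type, Node target) {
        Link link = new Link(type, this, target);
        links.add(link);
        return link;
    }
}

/** A typed, directed relationship between two nodes. */
final class Link {
    final LinkType type;
    final Node source;
    final Node target;

    Link(LinkType type, Node source, Node target) {
        this.type = type;
        this.source = source;
        this.target = target;
    }

    @Override
    public String toString() {
        return type.describe(source.name, target.name);
    }
}

final class LinkTypeDemo {
    public static void main(String[] args) {
        Node cat = new Node("cat");
        Node animal = new Node("animal");
        Node car = new Node("car");
        Node wheel = new Node("wheel");
        System.out.println(cat.linkTo(LinkType.INHERITANCE_BY_EXTENSION, animal));
        System.out.println(car.linkTo(LinkType.CONTAINS, wheel));
    }
}
```

Note how the paired types (by Extension and by Intension, Part of and Contains) are the same relationship read in opposite directions, which is why their templates simply swap A and B.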

These links are heterarchical in a sense; any node can link to any other node. But they are also organized in hierarchies of composite actors representing, not specific relationships like links, but collections of relationships. Nodes contain links; nodegroups contain nodes; lobes contain nodegroups; and then there is the mother of them all, the Psynet, the whole Webmind, which contains a lobe for each machine in its network. The basis of it all is the node: a node containing a bundle of links expressing its relationship to other nodes, and also some basic data objects and actors and roles. Nodes send out messages of various types -- information gathering and information carrying actors -- to help them build new links to other nodes. The result is a gigantic network of interlinked actors, constantly rebuilding itself, extending across multiple CPUs and multiple machines.

The nitty-gritty engineering needed to make this all work is considerable indeed. But the basic concepts are elementary. It's nothing but Peirce's network of relations, each spreading attention to the other relations that it stands to in a peculiar relation of affectability. It's nothing but Nietzsche's dynamic quanta, each one defined in terms of other dynamic quanta, each one re-creating itself and each other. It's beautiful and primal -- but it's not intelligent, without more detail, more specialization. It’s like the brain of an infant. All the core abilities are there, but intelligence develops as it incorporates and processes specialized information.

It’s easy to see how both nodes and links are patterns in the sense that they allow one to compress information. If two parts of something one is describing are similar, one can save effort by not describing the second one in detail and just describing it approximately by reference to the first one. For instance, to describe a picture consisting of two similar heads, you can draw one head and then just say “imagine two of these next to each other.” If one of the parts of the picture inherits from the other, one can save effort by replacing the more specific one with the more general one. Of course, there is a loss of information here. Suppose half of the picture is a general human shape, and the other half is my shape. My shape inherits from the general human shape, obviously. But if you describe the picture by drawing the general human shape and saying “two of these,” you’re losing a fair bit of information, though certainly not all of it.

Similarity and inheritance are logical relations, logical patterns. We also have purely observational patterns, like temporal relatedness, spatial relatedness, and part-whole relatedness. And we look for general association relations: When the system thinks of X, what Y comes to mind? This Y stands in an associative relation to X.

Nodes in Webmind contain links to other nodes, each link embodying one of these basic inter-node relationships: similarity, inheritance, part/whole, spatial, temporal, associative. Nodes and links are the two levels of pattern that are automatically and instinctively recognized by Webmind: nodes representing perceived wholes carved out of the chaos of the world or mind, and links representing patterns perceived among the nodes.

We then have special methods of building links. The method we used most in Webmind (but have basically abandoned in Novamente) was one I came up with in 1996, inspired by Web spidering, called Wandering: we have actors that move around through the network of nodes, traveling from node to node along links, looking for nodes that are strongly related and should be joined by new links. This particular method of link formation may or may not be the best. The key point is that there is some dynamic by which new and relevant links are continually formed.
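For the flavor of Wandering, here is a rough Java sketch. Everything specific in it is an assumption made for illustration: the walk is random, link strengths lie between 0 and 1, and two nodes count as "strongly related" when the product of the strengths along the path between them reaches a threshold. The original Wanderers are not specified at this level of detail here, and the class and method names are invented.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

/** Hypothetical node: a name and its outgoing links (neighbor -> strength in [0, 1]). */
final class Node {
    final String name;
    // Insertion-ordered so that a seeded walk is repeatable from run to run.
    final Map<Node, Double> links = new LinkedHashMap<>();

    Node(String name) {
        this.name = name;
    }
}

/** An actor that walks from node to node along links and proposes new links back to its start. */
final class Wanderer {
    private final Random random;

    Wanderer(long seed) {
        this.random = new Random(seed);
    }

    /**
     * Walk up to maxHops steps from start. Whenever the walk reaches a node that start
     * is not yet linked to, and the product of link strengths along the path is at least
     * threshold, add a direct link from start to that node. Returns the number of links added.
     */
    int wander(Node start, int maxHops, double threshold) {
        Node current = start;
        double pathStrength = 1.0;
        int created = 0;
        for (int hop = 0; hop < maxHops && !current.links.isEmpty(); hop++) {
            List<Node> choices = new ArrayList<>(current.links.keySet());
            Node next = choices.get(random.nextInt(choices.size()));
            pathStrength *= current.links.get(next);
            current = next;
            boolean alreadyLinked = current == start || start.links.containsKey(current);
            if (!alreadyLinked && pathStrength >= threshold) {
                start.links.put(current, pathStrength);
                created++;
            }
        }
        return created;
    }
}

final class WanderingDemo {
    public static void main(String[] args) {
        Node a = new Node("a");
        Node b = new Node("b");
        Node c = new Node("c");
        a.links.put(b, 0.9);
        b.links.put(c, 0.8);
        int created = new Wanderer(42L).wander(a, 2, 0.5);
        System.out.println("new links from a: " + created);
    }
}
```

In the demo, a reaches c by way of b along two strong links, so the Wanderer joins a directly to c. With a stricter threshold the same walk would leave the network unchanged.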

Relevance is determined by how much “activation” each node has, and activation is spread through the network by Peirce’s Law of Mind, which is to say, by basic neural net activation spreading. The Java object that carries activation through Webmind, we call a Stimulus.

Associative links are built by a process we call “halo spreading,” in which a node gets active and then measures how active other nodes become as a consequence, after a certain period of time. It spreads Stimuli to other nodes and then collects them after a while, observing how stimulated they’d become.
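Activation spreading and halo spreading can be sketched together in a few dozen lines of Java. The only name taken from Webmind here is Stimulus. The rest is assumed for the sake of a runnable example, including the rule that each node gives away a fixed fraction of its activation per step, shared among its links in proportion to their weights.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical node: a name, a current activation, and weighted outgoing links. */
final class Node {
    final String name;
    double activation;
    final Map<Node, Double> out = new LinkedHashMap<>(); // target -> link weight

    Node(String name) {
        this.name = name;
    }

    void connect(Node target, double weight) {
        out.put(target, weight);
    }
}

/** Carries a quantity of activation toward one node. */
final class Stimulus {
    final Node target;
    final double amount;

    Stimulus(Node target, double amount) {
        this.target = target;
        this.amount = amount;
    }
}

final class Network {
    final List<Node> nodes = new ArrayList<>();

    Node add(String name) {
        Node n = new Node(name);
        nodes.add(n);
        return n;
    }

    /** One synchronous step: every node emits its Stimuli first, then all are delivered. */
    void step(double fraction) {
        List<Stimulus> inFlight = new ArrayList<>();
        for (Node n : nodes) {
            double total = 0.0;
            for (double w : n.out.values()) {
                total += w;
            }
            if (total == 0.0) {
                continue; // no outgoing links (or all zero weight): the node keeps its activation
            }
            double budget = n.activation * fraction;
            for (Map.Entry<Node, Double> e : n.out.entrySet()) {
                inFlight.add(new Stimulus(e.getKey(), budget * e.getValue() / total));
            }
            n.activation -= budget;
        }
        for (Stimulus s : inFlight) {
            s.target.activation += s.amount;
        }
    }

    /** Activate source alone, spread for a while, then report how active every other node became. */
    Map<Node, Double> halo(Node source, int steps, double fraction) {
        for (Node n : nodes) {
            n.activation = 0.0;
        }
        source.activation = 1.0;
        for (int i = 0; i < steps; i++) {
            step(fraction);
        }
        Map<Node, Double> result = new LinkedHashMap<>();
        for (Node n : nodes) {
            if (n != source && n.activation > 0.0) {
                result.put(n, n.activation);
            }
        }
        return result;
    }
}

final class HaloDemo {
    public static void main(String[] args) {
        Network net = new Network();
        Node cat = net.add("cat");
        Node pet = net.add("pet");
        Node dog = net.add("dog");
        cat.connect(pet, 1.0);
        pet.connect(dog, 1.0);
        for (Map.Entry<Node, Double> e : net.halo(cat, 2, 0.5).entrySet()) {
            System.out.println("cat is associated with " + e.getKey().name + ": " + e.getValue());
        }
    }
}
```

The map returned by halo is what one would turn into associative links: each entry says how stimulated another node became as a consequence of the source being active, which is a natural candidate for the strength of a HaloLink.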

Again, there are a lot of ways of doing these things, and the current ways may or may not be the best. The exact method of spreading activation or halos is not crucial to Webmind, but rather just the overall character of the patterns being recognized and formed.

Halo spreading and reasoning and wandering form new links, but it’s also crucial to form new nodes, and this is done by combining old nodes in various ways (fusing them, splitting them) and also by explicitly evolving new nodes to satisfy various goals using special nodes called EvolverNodes.

The achieving of goals, crucial to intelligence, is done using nodes that we now call SchemaNodes, which contain little programs that control aspects of perception, action and thought. Perceptions from the outside world come into Webmind and are translated into nodes right away. These nodes link to other nodes representing contexts that the system is operating in, and these contexts link to SchemaNodes, representing things that might be desirable to do. The goals as well as the contexts link to the schema, so that the hottest schema will be the ones that are relevant to the current goals in the current contexts. Schema look into the long-term memory of the system and grab out the various nodes and links contained therein.
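The idea that the hottest schema are the ones relevant to the current goals in the current contexts can be sketched as follows. SchemaNode is a name from the text. The scoring rule (heat as the sum of source activation times link weight), the ActiveNode and SchemaSelector classes, and the use of a Supplier for the "little program" are all assumptions made to get a short runnable example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.Supplier;

/** A goal or a context, with its current level of activation (hypothetical). */
final class ActiveNode {
    final String name;
    final double activation;

    ActiveNode(String name, double activation) {
        this.name = name;
        this.activation = activation;
    }
}

/** A node holding a little program, plus weighted links coming in from goals and contexts. */
final class SchemaNode {
    final String name;
    final Supplier<String> program;
    private final List<ActiveNode> sources = new ArrayList<>();
    private final List<Double> weights = new ArrayList<>();

    SchemaNode(String name, Supplier<String> program) {
        this.name = name;
        this.program = program;
    }

    SchemaNode linkedFrom(ActiveNode source, double weight) {
        sources.add(source);
        weights.add(weight);
        return this;
    }

    /** Assumed rule: heat is the activation flowing in from linked goals and contexts. */
    double heat() {
        double total = 0.0;
        for (int i = 0; i < sources.size(); i++) {
            total += sources.get(i).activation * weights.get(i);
        }
        return total;
    }
}

final class SchemaSelector {
    /** Pick the schema with the greatest heat. */
    static SchemaNode hottest(List<SchemaNode> schemata) {
        return schemata.stream()
                .max(Comparator.comparingDouble(SchemaNode::heat))
                .orElseThrow(() -> new IllegalArgumentException("no schemata to choose from"));
    }
}

final class SchemaDemo {
    public static void main(String[] args) {
        ActiveNode goal = new ActiveNode("answer the user's question", 0.9);
        ActiveNode chat = new ActiveNode("in a conversation", 0.8);
        ActiveNode market = new ActiveNode("watching market data", 0.1);

        SchemaNode reply = new SchemaNode("reply", () -> "replying")
                .linkedFrom(goal, 1.0).linkedFrom(chat, 1.0);
        SchemaNode forecast = new SchemaNode("forecast", () -> "forecasting")
                .linkedFrom(goal, 0.2).linkedFrom(market, 1.0);

        SchemaNode chosen = SchemaSelector.hottest(Arrays.asList(forecast, reply));
        System.out.println(chosen.name + " -> " + chosen.program.get());
    }
}
```

Because both the goal and the conversational context feed the reply schema, it wins over the forecast schema, whose context is barely active at the moment.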

There’s also a SelfNode, recording the history of the system – what psychologists call “episodic memory” – and predicting the future of the system, and selecting the system’s goals according to the metagoal of maximizing system happiness. Yes, we have a Happiness FeelingNode, and nodes for other basic emotions, complex emotions being considered combinations and mutations of simple ones. What makes the system happy – we get to decide at first, until it mutates and modifies its own HappinessNode just like we do. Right now, it likes to answer questions people ask it, it likes to save memory, and it likes to build a lot of high-strength links – i.e., to discover a lot. Schema look into the SelfNode to get their overall motivation.

Many goals involve making others happy, and for this, models of other minds need to be maintained; this is done in UserNodes.

There is a loose mapping between these data structures and things in the brain. Nodes are a bit like neuronal groups – clusters of 10,000 to 100,000 neurons, that sort of act as a unified whole. Links are sort of like bunches of neural connections between one cluster and another. This intuitive mapping onto the brain can be useful, and it’s surely not a complete fluke that the structure of the brain is a lot like the structure of the mind that emerges from the brain. On the other hand, it’s important not to overblow the very loose neural modeling aspect of Webmind. Webmind was supposed to be a mind, not a model of the human brain, and it’s a definite failure at being a model of the human brain, not surprisingly.

There’s a lot of complexity here, just like in the brain. But basically, Webmind's architecture was that of a massively parallel network, a population of many, many different information actors – nodes, links, wanderers, Stimuli spreading activation and collecting halos. The nodes continually recompute their relationships to other nodes. Queries put to the system are transformed into nodes that take advantage of Webmind's self-evolving structure to produce the needed answers.

All this – plus or minus a few critical details, and a lot of non-critical ones -- was outlined roughly and erratically in some documents I wrote during Spring and Summer 1997. Some things were designed in detail, others just hinted at. Because so many details were left out, it wasn’t quite clear to me, at that point, what a humongous system this was going to become.

This was still pre-Webmind Inc.; I was working in loose collaboration with a friend and programmer named John Pritchard, who liked my thinking in a general way, but never really came to grips with my ideas, except on a philosophical level. He wanted to approach things by first building a general Java infrastructure for dealing with AI, and then implementing my particular AI theories – an approach which makes sense, but only if the infrastructure is deeply informed by the AI theories, which wasn’t the case then.

During summer 1997, John and I parted ways, and my friend Lisa Pazer and I started the company that was initially called Intelligenesis Corp., and later changed its name to Webmind Inc. (because American businesspeople seemed to have too much trouble spelling the original name!). At that point I gave up coding 10 hours a day, turning that responsibility over to my newly recruited old friend Ken Silverman, and spending most of my time on design issues. I was still coding a few hours a day at that point, but not like before.

Ken learned Java in a couple weeks, and set to work. We talked on the phone several hours a day, and he coded for the rest of his waking hours. He ended up creating a new Webmind from scratch, based on reading and reinterpreting print-outs of my eccentric, tangled Java code. My first version had been useless, but had followed the concepts of my theory of mind fairly directly. Ken's version followed the structure of Java more so than my theoretical ideas. It was a colossal step backward in conceptual elegance. But it had one fantastic redeeming feature: as of February 1998, it finally worked!

OK, in retrospect, it didn’t really work, but it looked like it worked at the time. It wasn’t made to exploit multiprocessor machines, or networks of machines. It wasn't ready to serve as the infrastructure for the global brain. It was too small to demonstrate any really interesting emergences, any of the structures of mind I’d identified in my theoretical work. But it was our first working prototype, and we rigged it up to do some simple things like read in a bunch of Web pages or numerical data series, and decide which ones were similar to each other. No tremendous intelligence was apparent yet, but we hadn't expected any. We'd built the infrastructure for intelligence, but hadn't put in the specialization that would allow the system to display useful intelligence in particular areas.

It was very simple in concept, but very complex to actually implement. We had a network of mental entities, each one related to other mental entities, and each one constantly revising its collection of relationships. Each node, and each actor, was an "object" in the Java programming language, which proved very well suited to our needs. Writing Webmind meant writing Java "classes" for all the different kinds of nodes, wanderers and other objects we needed. Practical problems kept coming up, problems I had never thought of when I was writing theoretical books and scribbling notes on the back of photocopied research papers. For example, what do you do when the system has recognized too many relationships in itself and has run out of memory? How do you decide which relationships to cull? How does the system manage its time, allocating certain amounts of CPU time to each node to use in building new relationships? How does the system determine how much time to spend loading in new information into new nodes, versus building new relationships among existing nodes? And so on, and so on, and so on.

We also wanted to build up Webmind's thinking power. This meant we had to keep increasing our palette of specialized classes of nodes and links representing particular kinds of relationships and concepts. The real intelligence, I was certain, would then emerge from the interactions of all these specialized nodes and links in the self-organizing network. But before we could get there, there were dozens of mechanical issues to be worked out, debugged, tested, tuned.

In the very early days of Intelligenesis, before we got funding, the work proceeded in pairs, each pair consisting of me and someone else. Lisa and I worked on the business plan and tried to raise money. Ken and I worked on the first Webmind prototype, which ran on a single computer with a single processor; Ken doing nearly all the coding, me giving him designs and suggestions through endless phone calls and meetings. Jeff and I were taking his nonlinear prediction algorithms and making them more intelligent and flexible, integrating them with some of my own AI work. Onar and I were sending back and forth endless e-mails diagramming what would later become the language learning component of Webmind’s natural language system. And Paul, in looser communication with me than the others, was designing and coding the Pods system, a very nice system for doing self-organizing computing on multiple machines and multiprocessor machines.

In the spring of 1998, Ken integrated Webmind with the Pods system, producing the first Webmind that had a prayer of actually running on a lot of machines at once. This was a system which could serve as the foundation for a global mind. It exploited the power of Java even more fully than Ken's first version had -- it was more "object-oriented," and used Java's network-computing facilities more thoroughly.

And then things went completely crazy. In a mostly good way. Lisa finally got us funding, and we started hiring programmers and scientists. People were coding nodes and links embodying specialized kinds of intelligence. The system got smarter, and things got far messier.

The most crucial hire was Pei Wang, a Chinese computer scientist a few years older than Ken and me, who when we hired him had spent the last 12 years developing a system of probabilistic logic called NARS, the Non-Axiomatic Reasoning System. Within a few months, Pei had integrated many of the ideas of his NARS reasoning system into Webmind, providing us with a handy nodes-and-links version of probabilistic logic. He also introduced a lot of ideas into Webmind as a whole, apart from its reasoning component. For instance, it was Pei’s inspiration that every link in Webmind should have four numbers associated with it: a strength telling how significant the pattern represented by the link is; a confidence telling how sure we are of the assessed significance; an importance telling how useful the node is to the system as a whole; and a decay rate telling you how fast importance decays for that particular node.
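As a sketch of what those four numbers might look like in Java, consider the small class below. Only the four quantities themselves come from the description above. The class name, the restriction of each value to the range 0 to 1, the exponential form of the decay, and the reinforce method are assumptions made for illustration.

```java
/** The four numbers attached to a link (hypothetical layout; all values assumed to lie in [0, 1]). */
final class LinkValues {
    final double strength;    // how significant the pattern represented by the link is
    final double confidence;  // how sure we are of that assessed significance
    final double decayRate;   // how fast importance decays
    private double importance; // how useful this is to the system as a whole

    LinkValues(double strength, double confidence, double importance, double decayRate) {
        this.strength = unit("strength", strength);
        this.confidence = unit("confidence", confidence);
        this.importance = unit("importance", importance);
        this.decayRate = unit("decayRate", decayRate);
    }

    private static double unit(String label, double value) {
        if (!(value >= 0.0 && value <= 1.0)) {
            throw new IllegalArgumentException(label + " must be between 0 and 1, got " + value);
        }
        return value;
    }

    double importance() {
        return importance;
    }

    /** Assumed decay rule: importance shrinks by the decay rate on every tick. */
    void tick() {
        importance *= (1.0 - decayRate);
    }

    /** Assumed counterpart to decay: using the link pushes importance back up, capped at 1. */
    void reinforce(double amount) {
        importance = Math.min(1.0, importance + amount);
    }
}

final class LinkValuesDemo {
    public static void main(String[] args) {
        LinkValues values = new LinkValues(0.8, 0.6, 1.0, 0.5);
        values.tick();
        values.tick();
        System.out.println("importance after two ticks: " + values.importance());
    }
}
```

Keeping strength and confidence separate from importance matters: a link can record a very solid pattern that the system simply has no current use for, and only the importance should fade with time.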

Toward the end of summer 1998, we also hired Cassio Pennachin, who at that point was just one among a handful of Java hackers around the world whom I’d recruited through job ads on Usenet. Cassio lived in Belo Horizonte, Brasil, and first took on the job of fixing up some code I’d written for evolving new structures in the mind using a variant of genetic programming. This was the beginning of what’s become an Intelligenesis tradition: Brasilian programmers receive American code by e-mail and respond very politely with comments like “Excuse me, but would you be terribly offended if I made a few changes to this code?” Of course, you say yes, and a few days later you receive a completely new version of the software, containing exactly three lines from your original code, but much better designed and also more efficient.

Cassio proved to be an excellent manager as well as an excellent software engineer, and I let him accumulate assistants until, as of now, we have more than half our engineering staff in an office in Belo Horizonte, with Cassio as our overall Director of Webmind Development. The Brasilians, so far, have not made any big AI innovations, but the disciplined approach to object-oriented design that they’ve brought us has been just as important as our AI innovations, in terms of getting Webmind, this humongous piece of Java code, to actually work. The real importance of this aspect of their work didn’t become apparent until the end of 1999, with their psycore redesign – but I’m getting ahead of myself.

The rapidly increasing size of the Webmind codebase was inevitable because the core code Ken and Paul and I had written wasn't enough for intelligence in any practical context. It was just a generic intelligence mechanism, a self-organizing, relationship-building network. As we introduced more and more specialized nodes into the system, the system as a whole changed. New problems emerged. We should have anticipated that this would happen, but we hadn't really thought about it. We'd been too busy dealing with the challenges of formulating the psynet model in Java in a network-friendly way.

To deal with this blossoming of the Webmind code, in the summer of 1998, Ken and Paul split Webmind into parts. The central part, the one they had been working on, they called Psycore. This contained the generic mechanisms for dealing with nodes, links and wanderers. In a sense, this was Webmind's operating system, the code that enabled all the parts to work together. Then there were the Psymodules, one for each specialized area of intelligence: natural language, reasoning, numerical data analysis, etc. If we were to decode the DNA code that generates the human brain, we might find that it works in a similar way. The "psycore" would be the DNA code that describes the features that are common to all neurons, synapses and neurotransmitters. The "modules" would be the DNA code which describes the distinct features of the specialized types of neurons (there are dozens) and neurotransmitters (there are hundreds), and the particular patterns of neurons, neurotransmitters and synapses that make up different parts of the brain.

The brain has hundreds of specialized parts devoted to tasks such as visual perception, smell, language, episodic memory, and so forth. Each of these parts is composed of neurons which share certain fundamental features, but each also has its unique features and capabilities that scientists are only beginning to understand. Similarly, when a Webmind is running on a computer, different parts of the computer's memory are assigned to different tasks. Each of these parts of the computer's memory draws on the psycore for its basic organizational framework, and on more specialized modules for advanced capabilities.

Each of Webmind’s modules is specialized for recognizing and forming a particular kind of pattern. And all the different kinds of nodes and links can learn from each other -- the real intelligence of Webmind lies here, in the dynamic knowledge that emerges from the interactions of different species of nodes and links. This is how Webmind builds its own self; it’s the essence of Webmind’s mind, of how Webmind’s patterns create and recognize patterns in themselves and the world to achieve their complex goals.

I’ll give a quick laundry list of modules, without going into great detail on any of them.

There was a numerics module, containing data processing actors that recognize patterns in tables of numbers, using a variety of algorithms, some standard, some innovative. DataNode embodies nonlinear data analysis methods and it recognizes subtle patterns that’ll always be missed by ordinary data mining and financial analysis software.

There was a natlang module, which deals with language processing. The natlang module represents texts as TextNodes, linking down to WordNodes representing words in the text, and other nodes representing facts, concepts and ideas in the text. It has text processing actors that recognize key features and concepts in text, drawing relationships between texts and other texts, between texts and people, between texts and numerical data sets. These actors process vast amounts of text with a fair amount of understanding and a lot of speed.

The natlang module also contained reading actors, which are used to study important texts in detail. They proceed through each text slowly, building a mental model of the relationships in the text just like a human reader does. These reading actors really draw Webmind's full set of semantic relationships into play, every time they read a text.

There was a category module, containing actors that group other actors together according to measures of association, and form new nodes representing these groupings. This, remember, is a manifestation of the basic principle of the dual network.

There were learning actors that recognized subtle patterns among other actors and embodied these as new actors. These spanned various modules, including the reason module, containing logical inference wanderers that reasoned according to a form of probabilistic logic based on Pei's Non-Axiomatic Reasoning System; and the automata module, containing AutomatonNodes that carried out evolutionary learning according to genetic programming, a simulation of the way species reproduce and evolve.

In the user module there were actors that model users' minds, observing what users do, and recording and learning from this information – these are UserNodes and their associated Wanderers. There are actors that moderate specific interactions with users, such as conversations, or interactions on a graphical user interface. And in the self module there are self actors, wanderers and stimuli that help the SelfNode study its own structure and dynamics, and set and pursue its own goals.

Each of the actors involved in these modules had in itself only a small amount of intelligence, sometimes no more than you might see in competing AI products. The Webmind core -- “psycore”, as we sometimes called it -- was a platform in which they could all work together, learning from each other and rebuilding each other, creating an intelligence in the whole that is vastly greater than the sum of the intelligences of the parts.
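The split between a generic core and specialized modules can be pictured with a very small Java sketch. The module names and node types are the ones mentioned above. The PsyModule interface, the registry inside Psycore, and every method name are hypothetical, meant only to show the shape of the arrangement rather than how the real psycore was written.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** A specialized area of intelligence that plugs into the core (hypothetical interface). */
interface PsyModule {
    String name();

    List<String> nodeTypes();
}

/** Simple module description: a name and the node types it contributes. */
final class SimpleModule implements PsyModule {
    private final String name;
    private final List<String> nodeTypes;

    SimpleModule(String name, String... nodeTypes) {
        this.name = name;
        this.nodeTypes = Collections.unmodifiableList(Arrays.asList(nodeTypes));
    }

    @Override
    public String name() {
        return name;
    }

    @Override
    public List<String> nodeTypes() {
        return nodeTypes;
    }
}

/** The generic core: it knows nothing about language or numbers, only which module owns what. */
final class Psycore {
    private final Map<String, PsyModule> modules = new LinkedHashMap<>();
    private final Map<String, String> ownerByNodeType = new LinkedHashMap<>();

    void register(PsyModule module) {
        if (modules.containsKey(module.name())) {
            throw new IllegalStateException("module already registered: " + module.name());
        }
        modules.put(module.name(), module);
        for (String type : module.nodeTypes()) {
            ownerByNodeType.put(type, module.name());
        }
    }

    /** Name of the module that contributed the given node type, or null if none did. */
    String ownerOf(String nodeType) {
        return ownerByNodeType.get(nodeType);
    }

    List<String> moduleNames() {
        return new ArrayList<>(modules.keySet());
    }
}

final class PsycoreDemo {
    public static void main(String[] args) {
        Psycore core = new Psycore();
        core.register(new SimpleModule("natlang", "TextNode", "WordNode"));
        core.register(new SimpleModule("numerics", "DataNode"));
        core.register(new SimpleModule("automata", "AutomatonNode"));
        core.register(new SimpleModule("user", "UserNode"));
        System.out.println(core.moduleNames());
        System.out.println("WordNode belongs to " + core.ownerOf("WordNode"));
    }
}
```

The point of the arrangement is that new kinds of nodes can be added by registering another module, without touching the code that all the modules share.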

The version of Webmind we completed in the summer of 1998 – the first multi-module version -- worked fine for about a year. We used it to build the modules essential for Webmind's core intelligence and for several impressive applications. It included a module for text-based market prediction; a natural language module for mapping texts into networks of meanings; several modules for the evolution of concepts according to different methods; a module for Webmind's self-understanding; and so forth. The development of each module was driven by requirements particular to certain application areas. The financial modules were driven by the practical need to predict the markets. The natural language module was driven by the need to parse financial text, and understand human queries. The concept learning modules were driven by the need to learn concepts relevant to financial prediction and to the processing of human queries. The self-understanding module was driven by the need to have the system proactively think about things that humans were likely to ask it about in the future.

At this point, Webmind benefited greatly from the fact that we weren't just implementing a theory, we were hard at work developing practical applications. One of the most profound pieces of advice I’ve ever received about Artificial Intelligence came from Danny Hillis, who I discussed above -- inventor of the Connection Machine parallel processor, founder of Thinking Machines Inc., and an informal advisor for Webmind Inc. throughout its lifetime. As we sat in the South Street Seaport in New York eating dinner one day, he was discussing a major AI company that had worked for 10 years to design an AI system, without considering in detail any particular application of the system. Lo and behold, the system had never done anything useful. Danny’s comment was: “They were brilliant people with good ideas, but they made a serious methodological error. They developed their system for years and years, without any contact with practical applications.” Our software was saved from this fate by the fact that we were committed to producing actual products, simultaneously with working toward the goal of real AI. We were freed up to commit other major errors instead!

The Webmind AI Engine itself was never used inside any production-version software products, but it was used to prototype a number of AI processes that were later re-implemented inside products. One of these products, the Webmind Market Predictor, will be discussed in detail in the following chapter. The reason the Webmind AI Engine wasn’t used directly in products was basically that it was too slow-running, and plagued by hard-to-excise bugs. The Novamente system, as I’ll discuss a little later, is a more mature effort and doesn’t have these problems, and it’s being directly used inside some software products we’re developing for the bioinformatics market.

Working on practical problems in parallel with grandiose long-term goals was valuable – but it had its disadvantages as well. It pushed us to overspecialize the system, hyperdeveloping those portions that were needed for products, rather than developing the whole system in a more evenly-balanced way. Most of our code was good for the specific tasks it specialized in, but we had not gotten to the stage where all the modules, all the different node and link types, were working together in one big multi-machine Webmind. We were producing cool research software, but not the global brain I had dreamed of. We hadn't yet seen the emergence of the dual network, of the self. And we weren’t able to push straight toward it because the particular portions of the system needed for the Webmind Market Predictor – our first product – needed so much attention.

But overspecialization induced by business needs was far from our only problem. The truth, as we sourly discovered, was that our core Java code, implementing the essence of the psynet model of mind, was just barely adequate for building products, let alone building real AI – it had too many bugs and was poorly documented. Ken had implemented this code brilliantly and painstakingly over a year of 15-hour workdays, but, even so, the task had been too big for any one human. We could have fixed up his code to make it product-ready, but we doubted whether we’d ever get it to the point where it could support the global brain.

So, toward the end of summer 1999, we decided to rewrite the Webmind code again. Not the whole system, thankfully -- we were too far along for that -- but this time only psycore, the central core of the system. This time around, Ken was helped out not only by me but by Cassio and several of his colleagues in Belo Horizonte, most notably Andre Senna and Thiago Maia, two masters of data structures and algorithms. At this point, there was a lot of pressure, from some members of staff on both the technical and business sides of Intelligenesis, to give up on the unified AI architecture altogether, and just focus on making individual products as good as they could be, postponing real AI into the future. But Ken and Cassio and I and others focused on building real AI resisted this pressure and plowed ahead with building a new, improved psycore. Among other beloved chunks of code, Paul’s Pods system met its doom in this rewrite.

The reasons for this redesign are somewhat interesting; they reveal a lot about the nasty realities of building big software systems doing complicated, intelligent things. The erratic bugs and lack of documentation in Ken’s code were part of the problem, and made Ken the arch-enemy of the engineering staff for a while. But this stuff was fixable. There was also a more serious problem with the system. It just wasn't flexible enough to enable a huge, multi-module Webmind to be run in a really intelligent way. When the system was only doing one thing – say, reading text, or using text to predict markets – then it was fine. But it was very bad at regulating several activities at once.

For example, when loading in a series of texts, one would see it get slower and slower at reading. The reason was, the more texts it had in it, the more it had to think about. It had no time to read more text because it was so busy thinking about the texts it had already read! I remember once when Mark Watson, one of our Java AI gurus, noticed this problem in a Webmind demonstration he had written. Jim McLoughlin – one of our early hires who built a lot of Webmind’s numerical and financial analysis components -- showed him a way around it. By hacking the code, you could get it to do anything, in any particular situation. But what was needed was intelligent self-control: the system had to know what processes were important to it, and regulate the amount of attention it spent on various things accordingly. Of course, we had always realized this would be necessary. But we hadn’t realized how deeply we’d have to code self-control into the system. Ken’s 1998-99 psycore was built to follow its whims, not to control its dynamics in accordance with goals; and imposing goals on top of this structure was like trying to get a hyper child to sit down and listen to a history lesson.
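To make the idea of regulating attention concrete, here is a minimal Java sketch of importance-weighted scheduling: a fixed budget of processing cycles is split among competing processes in proportion to how important each one currently is. This is not the Webmind code. The class `AttentionAllocator`, its methods, and the process names are all hypothetical, invented purely for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: splits a fixed cycle budget across processes by importance. */
class AttentionAllocator {
    // Importance weight per process name (names are illustrative only).
    private final Map<String, Double> importance = new LinkedHashMap<>();

    void setImportance(String process, double weight) {
        if (weight < 0) {
            throw new IllegalArgumentException("weight must be non-negative");
        }
        importance.put(process, weight);
    }

    /** Returns how many of totalCycles each process receives, rounded down. */
    Map<String, Integer> allocate(int totalCycles) {
        double sum = 0;
        for (double w : importance.values()) {
            sum += w;
        }
        Map<String, Integer> shares = new LinkedHashMap<>();
        for (Map.Entry<String, Double> e : importance.entrySet()) {
            // If nothing is important, nothing gets cycles.
            int cycles = (sum == 0) ? 0 : (int) Math.floor(totalCycles * e.getValue() / sum);
            shares.put(e.getKey(), cycles);
        }
        return shares;
    }

    public static void main(String[] args) {
        AttentionAllocator allocator = new AttentionAllocator();
        // Reading new text is weighted three times as heavily as mulling over old text.
        allocator.setImportance("readNewText", 3.0);
        allocator.setImportance("reflectOnOldText", 1.0);
        System.out.println(allocator.allocate(100));
    }
}
```

With weights like these, reading keeps three quarters of the budget no matter how many texts are already loaded. A real system would also have to adjust the weights itself as its goals and circumstances change, which is the hard part the sketch leaves out.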

The system was so complicated that we couldn’t easily make the simple changes we needed to make to turn it into a real global brain platform. We needed to be able to turn on and off the different capabilities of the nodes and links at will -- and have the system do this automatically, adapting to its circumstances. We needed to be able to take collections of nodes and links that were stable, no longer evolving, and "freeze" them into a state that took up very little memory, providing easy access but no adaptability. We needed to be able to observe what was going on in a particular part of the system, and chart its dynamics, to see what structures were emerging.
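The "freezing" requirement can be sketched in a few lines of Java. The example below assumes a node is little more than a growable list of link targets. Freezing copies that list into a fixed-size array and rejects further changes, trading adaptability for a smaller footprint and cheap reads. The class `FreezableNode` and its methods are hypothetical and are not taken from psycore.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch of a node whose links can be frozen into a compact, read-only form. */
class FreezableNode {
    private List<String> mutableLinks = new ArrayList<>(); // growable while the node still evolves
    private String[] frozenLinks;                           // exact-size array, set once by freeze()

    void addLink(String target) {
        if (isFrozen()) {
            throw new IllegalStateException("node is frozen and can no longer adapt");
        }
        mutableLinks.add(target);
    }

    /** Copies the links into an exact-size array and drops the growable list. */
    void freeze() {
        if (isFrozen()) {
            return;
        }
        frozenLinks = mutableLinks.toArray(new String[0]);
        mutableLinks = null; // let the larger structure be garbage collected
    }

    boolean isFrozen() {
        return frozenLinks != null;
    }

    /** Read-only view of the links, available both before and after freezing. */
    List<String> links() {
        List<String> view = isFrozen() ? Arrays.asList(frozenLinks) : new ArrayList<>(mutableLinks);
        return Collections.unmodifiableList(view);
    }

    public static void main(String[] args) {
        FreezableNode node = new FreezableNode();
        node.addLink("cat");
        node.addLink("animal");
        node.freeze();
        System.out.println(node.links() + " frozen=" + node.isFrozen());
    }
}
```

Deciding automatically when a region of the network is stable enough to freeze is a separate problem that this sketch does not attempt.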

Over the period 1998-99, psycore had evolved incrementally, getting new features whenever a module author needed them. The natural language team needed psycore to do one thing, the finance team needed it to do another, the categorization team needed it to do another, the reasoning team needed a reasoning module, and so on, and so on. None of these requests fundamentally changed the architecture of nodes, links and wanderers -- mental entities relating to each other and dynamically altering their relationships -- but they changed the details of how nodes, links and wanderers worked, and how they could be accessed and changed. The abundance of new features had made the core code more powerful, but it had made it messier too, and harder to control. Many of the new features had similar structures, and in hindsight could be consolidated into simpler structures. Engineers, charged with building specialized components of Webmind, complained that the system offered so many features and possibilities that it was difficult to figure out how to use it. They wanted something simpler, with a few good features rather than a large number of features of varying quality.

Was it really necessary to go through all these revisions? Why not just figure out everything correctly the first time, and avoid all the reworking and re-reworking? One answer is: We should have, we were just inexperienced, so we kept fucking up. But there’s also another answer, that I prefer because it’s more flattering to me! This answer is: evolution doesn't work that way. Webmind, as a software system, is an engineered system, but it is also an evolved system. It went through several incarnations, each one with some fit aspects and some unfit aspects. The fit aspects survived to the next incarnation; the less fit aspects didn’t. All large software projects evolve through multiple generations; Webmind was not unique in this regard. But the evolution of Webmind had unique aspects because what was evolving was mind itself. In this evolution we had to retain both those features that were most useful for practical applications and those that were in accordance with the abstract structure of mind.

Evolution’s good at figuring out how to make a system that can achieve its goals within a certain environment. In this case, the system was Webmind, and the environment included the physical structure of modern computer hardware, the universe of software that has evolved to adapt to it, and the practical applications that Webmind was intended for, like market prediction, news filtering, data analysis, text analysis, and conversation. Java, wonderful as it is, wasn’t designed for mind hacking. The von Neumann architecture was designed for repetitive mathematical calculations, not for intelligence. But, by the same token, the brain was designed for sensing and acting, not for abstract thought. Fiber cells were designed for musculature, not for use as neurons. Mind can emerge from any sufficiently flexible substrate, as the features of the substrate gradually adapt themselves to the requirements imposed on them.

The new psycore had a multi-layered structure, which I invented based on some conversations with Youlian Troianov, a Bulgarian software engineer who believes Webmind can never be truly intelligent because it doesn’t make use of the fundamental quantum symmetries of the universe (but he kept working for us anyway, and even now follows Novamente work very closely). I still don’t completely understand what Youlian meant when he suggested psycore should have many layers, but the idea set off a spark in my mind, and the new psycore did indeed have three layers.

The lowest layer was what we called “abstract actors.” It was a general framework for computational actors that group other actors and transform other actors and send messages to other actors. We chose the word “actors” here instead of “agents” because “agents” seems to mean too many things to too many people. Lots of other possibilities were tossed around, including more interesting ones like “cells,” “psells,” “psions,” “psychons” and so forth. Basically, Layer 1 provided a kind of “mind operating system,” suitable to run on a single machine and a single processor, or else on a massively parallel hardware system in which each actor gets its own processing power, like in the brain.
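As a rough illustration of what an actor layer of this kind might look like, here is a minimal single-threaded Java sketch in which actors hold a mailbox, send messages to one another, and can group other actors. It is an assumption-laden toy, not the psycore design. The names `AbstractActor`, `GroupActor`, `RecordingActor`, and `ActorDemo` are hypothetical, and the "transform other actors" capability is left out for brevity.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Demo driver: steps the actors by hand, standing in for a single-processor scheduler. */
class ActorDemo {
    public static void main(String[] args) {
        RecordingActor a = new RecordingActor();
        RecordingActor b = new RecordingActor();
        GroupActor group = new GroupActor();
        group.add(a);
        group.add(b);

        a.send(group, "hello"); // message lands in the group's mailbox
        group.step();           // group forwards it to every member
        a.step();
        b.step();
        System.out.println(a.received + " " + b.received);
    }
}

/** Hypothetical base class: an actor is a mailbox plus a message handler. */
abstract class AbstractActor {
    private final Queue<String> mailbox = new ArrayDeque<>();

    /** Queues a message for the target; nothing runs until the target is stepped. */
    void send(AbstractActor target, String message) {
        target.mailbox.add(message);
    }

    /** Handles every queued message and returns how many were processed. */
    int step() {
        int handled = 0;
        String message;
        while ((message = mailbox.poll()) != null) {
            onMessage(message);
            handled++;
        }
        return handled;
    }

    protected abstract void onMessage(String message);
}

/** An actor that groups other actors and relays each message to all of them. */
class GroupActor extends AbstractActor {
    private final List<AbstractActor> members = new ArrayList<>();

    void add(AbstractActor member) {
        members.add(member);
    }

    @Override
    protected void onMessage(String message) {
        for (AbstractActor member : members) {
            send(member, message);
        }
    }
}

/** A leaf actor that simply remembers what it was told. */
class RecordingActor extends AbstractActor {
    final List<String> received = new ArrayList<>();

    @Override
    protected void onMessage(String message) {
        received.add(message);
    }
}
```

Because actors interact only through `send` and `step`, the same interface could in principle be driven by one loop on one processor or by many machines in parallel, which is the kind of independence from hardware that the layer is described as providing.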

The second layer was “distributed actors” – this dealt with all the horrible nastiness of implementing a massively parallel system on a collection of multiprocessor machines networked together by TCP/IP. Scheduling of processes, sending of messages from one machine to another, and so forth. Paul’s Pods system was considered as a structuring principle for this layer, but based on extensive testing by the Brazilians, we chose some other ideas instead, which Paul wasn’t terribly happy about.

The third layer, finally, was nodes and links and wanderers and all the good stuff – all the stuff I invented in 1997 and Ken and I coded up in the beginning. This layer comes out very