Kinematic Self-Replicating Machines

© 2004 Robert A. Freitas Jr. and Ralph C. Merkle. All Rights Reserved.

Robert A. Freitas Jr., Ralph C. Merkle, Kinematic Self-Replicating Machines, Landes Bioscience, Georgetown, TX, 2004.


2.2 Subsequent Work on Computational Models of Self-Replication


2.2.1 Cellular Automata Models of Self-Replication

Work on cell-space automata systems in the period following von Neumann’s virtually unique [329, 330] contributions has been both lively and extensive, and has taken many different research directions [331-336, 380]. The underlying cell-space notion of a homogeneous medium with a local transition function that determines global properties has been employed in numerous modeling and simulation projects. For example, weather simulations use the idea of connected cells, with the changes in each cell’s state described by a set of differential equations. Studies of the flow of excitation in heart tissue, the dispersal of medicinal drugs, pattern recognition and cryptography [337], autocatalysis [338], prion infection [1736], biological cell function [339-342], and even the origin of life [343, 344] all have employed the cell-space concept. Cell spaces also have been investigated as abstract mathematical objects where, for instance, one tries to determine whether all other patterns can be attained from a given pattern, whether some patterns are not attainable at all by means of the transition function, and various other specialized questions. Even by the late 1960s, the investigation of cell-space systems as abstract mathematical entities or as vehicles for spatial modeling and simulation (especially cell-space imaging applications [345]) was already vigorous and prolific. Work on cellular automata has been reported under many other names including iterative arrays, tessellation automata, cellular spaces [382], modular arrays, and polyautomata [346].

Much work in cellular automata has attempted to carry forth the von Neumann program of cell machine construction and self-replication. For instance, Codd [347] recapitulated the von Neumann results in a simpler cell space requiring only 8 states rather than 29, using “sheathed loops” [358-362] surrounding the replicating device (believed at the time to be essential for indicating growth direction and for discriminating right from left [348]), but his device required about 100,000,000 cells [372, 373] (later reduced to 94,794 cells by Devore [349]). This produced a machine design recognizably closer to that of present-day computing machines. Myhill [350], trying to mitigate the artificiality of the indefinitely extended pre-existing cell space, designed a system in which componentry was drawn into a cell-grid system and was then employed as machine constituents somewhat as biological cell constituents might be drawn through a membrane to be used at an intracellular work site. Myhill [351] and Ostrand [352] generalized von Neumann’s basic result to other configurations and higher dimensional cellular spaces. Arbib [354], attempting to make the movement of cell machines a less cumbersome matter, designed a cell-space system in which cells and blocks of cells might be joined together by a “welding” operation, thus becoming rigid “co-moving” configurations “as if held together by chemical bonds.” Moore [353] established theoretical upper bounds on the rapidity of growth of a population of self-replicating configurations, and many others [200, 201, 357-380] have continued to examine various fundamental definitional issues of replicating cellular automata, and to explore fault-tolerant self-replicating CA structures [385, 386].

How simple can a self-replicating cellular automaton be? Reggia et al [348] point out that although the von Neumann and Codd structures do self-replicate, they generally consist of tens of thousands or millions of components or active cells and as a result have never actually been completely simulated computationally [387, 388] because of their tremendous size and complexity.* In 1970, Smith [389] and Banks [364] introduced additional simplifications to the cell-space notion, showing that the von Neumann program could be recapitulated in underlying cell spaces of an extremely elementary sort. Indeed, the “Game of Life” created by Conway [390-396] is a cell-space system which, despite its very simple transition rules, has been claimed to be capable of expressing both universal computation and construction. The game involves a checkerboard cell array with cells in one of two states, “0” or “1.” A point whose state is “0” will change to state “1” if exactly three of its eight neighbors are in state “1.” A point whose state is “1” will remain in that state if two or three of its neighbors are also in state “1.” In all other cases, the state becomes or remains “0.” In the Fredkin cell space, defined by the two-state “Parity rule,” each cell goes to “1” if an odd number of its neighbors are “1” and goes to “0” if an even number of its neighbors are “1” – the Fredkin Parity rule will then make copy after copy of whatever starting pattern you give it, with the number of cycles needed to obtain a replica depending upon the seed configuration (the larger the seed, the longer replication takes) [397, 398].
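The Fredkin replication described above is easy to verify computationally. The sketch below assumes the four-neighbor (von Neumann) variant of the Parity rule, since the text does not specify the neighborhood; grid and seed sizes are illustrative. Because the rule is linear over GF(2), after 2^k steps the update operator reduces to the sum of four shifts of size 2^k, so any seed small enough not to overlap its images reappears as four disjoint copies:

```python
import numpy as np

def parity_step(grid):
    """One step of the Fredkin Parity rule: each cell becomes the XOR
    (odd/even count) of its four orthogonal neighbors, on a torus."""
    return (np.roll(grid, 1, 0) ^ np.roll(grid, -1, 0)
            ^ np.roll(grid, 1, 1) ^ np.roll(grid, -1, 1))

# An arbitrary small seed pattern placed mid-grid.
grid = np.zeros((16, 16), dtype=np.uint8)
seed = np.array([[1, 1, 0],
                 [0, 1, 1],
                 [0, 1, 0]], dtype=np.uint8)
grid[6:9, 6:9] = seed

for _ in range(4):      # 2**2 = 4 steps
    grid = parity_step(grid)

# The original position is now empty; four exact copies of the seed
# sit 4 cells up, down, left, and right of where it started.
```

Running the same loop for 8, 16, … steps replicates the copies again, which is why the number of cycles needed for a clean replica grows with the size of the seed.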

* Umberto Pesavento, a young Italian high school student and software prodigy, developed a simulation [388] of von Neumann’s entire universal constructor, under the direction of Renato Nobili who provided “the graphic environment of the program and all ideas on how to pursue the goal” [399]. The computing power available did not allow Pesavento to simulate either the entire self-replication process (the length of the memory tape needed to describe the automaton would have required too large an array to execute in a reasonable period of time) or the Turing machine necessary to implement the universal computer, but he was able to demonstrate the full functionality of the constructor [308]. For example, according to Tim Hutton [399]: “If you wrap the memory around, [the program] will fit in a reasonable amount of space. The problem comes when executing the CA, since the number of active cells grows hugely. Umberto said this to me: ‘Are you trying to run a full self-reproduction? The tape size should not be a problem in the geometrical sense but sending information up and down the arm that reads the tape may be slow for a tape that long.’ Of course, with a highly parallel hardware implementation of the CA this would not be a problem, but for serial implementations, even those which take advantage of the fact that only active cells need to be processed, the computation time is too great. I left the machine running on my 1.7GHz machine for a week and it went from thousands of iterations per second down to one iteration per five seconds, having only got less than half way down the tape. There are ways of simulating the tape to avoid these problems but they require treating some cells differently from others, which is unsatisfactory.”

A breakthrough in cellular automata theory came in 1984, when Langton [359] realized that the capacity for universal construction is a sufficient but not necessary condition for self-replication,* noting that natural biological systems are not capable of universal construction either. He set out to find the logical organization of the smallest possible structure that could accomplish self-replication (and nothing else). Langton discovered that looplike storage devices, previously present only as modules within earlier self-replicating cell automata designs, could be programmed to replicate on their own (Figure 2.6). These devices (e.g., an 86-cell loop embedded in an 8-state 2-D cellular space) typically consist, first, of a string of components that circulate around a rectangle, and, second, a construction arm that protrudes from a corner of the rectangle into the surrounding cellular space [400]. The instructions are used twice, once for translation and again for transcription. In 1989, Byl [362] discovered a much smaller 12-cell self-replicating loop embedded in a 6-state cellular space, and in 1993 Reggia et al [348] found an even smaller “unsheathed” replicating loop, an information pattern comprising only 5 cells embedded in a 6-state space that can replicate in just 17 cycles. This simple replicator was demonstrated in 2001 using chess pieces on a chess board [190].

* In 1969-1971, Alvy Ray Smith provided a proof [417], based on the famous Recursion Theorem of recursive function theory, that computational universality alone will suffice for nontrivial machine replication (as cellular automata) [346]. Von Neumann required both computational universality and construction universality in his self-reproducing machines.

Chou and Reggia [401, 402] and Pargellis [468] observe that a selected cellular automaton grid “primordial soup” will spontaneously give rise to small self-replicating structures without seeding, and that cellular automata can readily support evolution [3, 403-406]. Simple self-replicating loops for reversible cellular automata (e.g., following the pattern of a retractile cascade in a reversible computer [407]) have been investigated by Morita and Imai [408-412]. Others continue to explore interesting extensions of these loops [403-406, 413], including self-replication [414] and evolution [386] implemented in asynchronous cellular automata. For example, Mange et al [415, 416] present a generalized self-replicating loop called the “Tom Thumb algorithm” which is endowed with universal construction and which verifies the six criteria of autopoietic machines (Section 5.1).

Self-replicating computer programs (often called quines [418, 425]) have been written in more than 50 different software languages [419-427, 473].* Various other forms of self-replicating software have served as the basis for games such as “Core Wars” [428-432], Hauser and Buckley’s “Apple Worm” [429], Wator [438], Flibs [439], and many** artificial life simulations [372, 373, 440-485] including most notably Nils Barricelli’s symbioorganisms [445-447] and Thomas Ray’s “Tierra” virtual organisms [442], digital worms [432-437], computer viruses*** [486-495], and studies of the “biology” of digital organisms by Adami and others [503-513]. (Adami [503, 504] notes that “evolving, self-replicating programs behave just like evolving, self-replicating molecules, and their dynamics are indeed well described by Eigen’s theory (Section 4.1.6) of macromolecular evolution” – although computer viruses do not obey the same rules as biological viruses, such as Koch’s Postulates [514].)

* Merkle prefers to quote the following example of the shortest known (one-line) C program that prints out an exact copy of itself without any user input, originally published by professional C programmers Burger, Brill and Machi [422] in 1980, who crowed that their tiny self-replicating program is a mere 101 bytes in length:

main(){char q=34,n=10,*a="main(){char q=34,n=10,*a=%c%s%c;printf(a,q,a,q,n);}%c";printf(a,q,a,q,n);}

Written in English, these instructions would read as follows:

Print the following statement twice, the second time in quotes:
“Print the following statement twice, the second time in quotes:”

As a former aficionado of the now ancient Commodore PET/CBM line of personal computers, Freitas is pleased to pass along the 1981 observation of then-ninth-grade high-school student William Sommerfeld [424] that the simplest known self-replicating program, executable on the PET, technically requires only 2 bytes of storage because of the way the PET BASIC interpreter stores instructions, and may be written simply as: 1 LIST
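The same “print the statement twice, the second time in quotes” trick carries over to modern languages. The following Python analogue (our illustration, not from the original sources) stores the program’s own text in a string s with a %r placeholder that repr() fills in with s itself, so printing s % s reproduces the complete source:

```python
# A Python quine in the spirit of the Burger/Brill/Machi C program:
# s is the whole program with a placeholder for its own quoted copy.
s = 's = %r\nprint(s %% s, end="")'
print(s % s, end="")
```

Executing the printed output again prints the identical text, which is the defining test of a quine.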

** Some artificial life simulations don’t involve self-replicating software directly, but simply data that are copied by an “external” program, as in the case of genetic algorithms.

*** As of 16 May 2003, there were 63,791 computer viruses recognized and catalogued by Norton AntiVirus [515]. According to Harold Thimbleby [516], “a computer virus is a piece of program code that attaches copies of itself to other programs, incorporating itself into them so that the modified programs, while still possibly performing their intended function, surreptitiously do other things. Programs so corrupted seek others to which to attach the virus, and so the ‘infection’ spreads. Successful viruses lie low until they have thoroughly infiltrated the system, and only reveal their presence when they cause damage. The effect of a virus is rarely traceable back to its originator, so viruses make attractive weapons for vandals. Computer viruses generally work by altering files that contain otherwise harmless programs. This is infection. When an infected program is invoked, it seeks other programs stored in files to which it has write permission, and infects them by modifying the files to include a copy of itself (replication) and inserting an instruction to branch to that code at the old program’s starting point. Then the virus starts up the original program so that the user is unaware of its intervention. The exact details vary from virus to virus.”

Thimbleby [516] defines other kinds of malicious software as follows: “A logic bomb or time bomb is a destructive program activated by a certain combination of circumstances, or on a certain date. A Trojan horse is any bug inserted into a computer program that takes advantage of the trusted status of its host by surreptitiously performing unintended functions. A worm is a distributed program that invades computers on a network. It consists of several processes or ‘segments’ that keep in touch through the network; when one is lost (e.g., by a server being rebooted), the others conspire to replace it on another server – they search for an ‘idle’ server and load it with copies of themselves. Like viruses, worms spread by replication; unlike them, they may run as independent processes rather than as part of a host program.”

Thimbleby [516] briefly recounts the prehistory of the computer virus as follows: “In 1962, V.A. Vyssotsky, a researcher at Bell Telephone Laboratories (New Jersey), devised a game called Darwin, played on an IBM 7090 computer. The object of the game was survival, for ‘species’ of programs to fight it out in an arena set up in the 7090’s memory. Successful programs would be able to make more copies of themselves, and perhaps go on to take over the entire arena. The idea was taken up ten years later [517] and converted to run on other computers. Within the somewhat arbitrary rules of the game, Doug McIlroy invented what he called a virus: an unkillable organism that could still fight. The idea of a maliciously self-propagating computer program originated in Gerrold’s 1972 novel When Harlie Was One [518], in which a program called telephone numbers at random until it found other computers into which it could spread. Worms were also presaged in science fiction, by Brunner’s 1975 novel The Shockwave Rider [519]; the first published report of a worm program, done in 1982 by Shoch and Hupp [433] as an experiment in distributed computing, included a quotation from this book.”

Still others have demonstrated, for example, the spontaneous emergence of self-replicating LISP programs [372, 373, 520, 521], although Koza [372, 373] estimates that “the probability of creating a self-reproducing computer program entirely at random must be exceedingly small.”* There are also memetic replicators [522] such as ideas, rumors,** jokes, advertisements as in “viral marketing” [523], sentences [524], musical tunes [602], or poems,*** collectively called “memes,” that spread throughout a population by being copied again and again as people pass them on to their friends, thus replicating themselves on the social substrate of active human minds [525, 526]. Thus memes may be regarded as a species of self-replicating automata.

* Koza [372] estimates the probability of finding one of the classical self-replicating cellular automata programs in a blind random search as follows:

** An amusing example [527]: “In the reformatory at Caserta, Italy, as guards watched a movie, five youthful prisoners escaped. The movie the guards were watching was about guards who watched a movie while youthful prisoners escaped.”

*** There is a short story written by David Moser [528] in which every sentence in the story is self-referential.

Although cellular automata replicators often do not readily lend themselves to physical realization, physical arrays of parallel processors such as the field-programmable gate array (FPGA) and other pieces of specialized hardware [529] have been built that exhibit properties such as self-repair (healing) and self-replication [530]. A good example is the “Embryonics project” [531-542] – e.g., the “Bio Wall” (Figure 2.7) unveiled in February 2002 by the Logical Systems Laboratory of the Swiss Federal Institute of Technology Lausanne (EPFL or École Polytechnique Fédérale de Lausanne) [543-546] for fault-tolerant computing [547]. The main goal of the Embryonics project [540] is to implement, in an integrated circuit, a system inspired by the development of multicellular organisms and capable of self-test and self-repair. In Embryonics, an artificial organism (a group of artificial cells that executes a given task) is realized by a set of cells distributed at the nodes of a regular two-dimensional grid. Each cell contains a small processor coupled with a memory used to store the program (identical for all the cells) that represents the organism’s genome. In the organism, each cell realizes a unique function, defined by a subprogram, called the gene, which is a part of the genome. The gene to be executed in a cell is selected depending on the cell’s position within the organism as defined by a set of X and Y coordinates.

The first kind of self-replication in Embryonics systems is that of the organism. An artificial organism is capable of replicating itself when there is enough free space in the silicon circuit to contain the new daughter organism and if the calculation of the coordinates produces a cycle. In fact, as each cell is configured with the same information (the genome), the repetition of the vertical coordinate pattern causes the repetition of the same pattern of genes and therefore also, in a sufficiently large array, the self-replication of the organism for any number of specimens in the X and/or Y axes.
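This coordinate-driven mechanism can be captured in a toy model (an illustrative sketch only, not the actual EPFL hardware design; the names GENOME and cell_gene are ours). Every cell holds the identical genome and selects its gene from its coordinates taken modulo the organism size, so a sufficiently large array automatically contains many copies of the organism – the first kind of Embryonics self-replication described above:

```python
# Toy model of Embryonics-style coordinate-based differentiation.
ORG_W, ORG_H = 3, 2  # organism size in cells (illustrative)

# The genome: one gene (here just a label) per position in the organism.
# Every cell stores this same table.
GENOME = {(x, y): "gene%d" % (y * ORG_W + x)
          for x in range(ORG_W) for y in range(ORG_H)}

def cell_gene(gx, gy):
    """Gene expressed by the cell at global grid position (gx, gy).

    Each cell carries the full GENOME and selects its own gene from its
    coordinates, computed modulo the organism size (coordinate cycling)."""
    return GENOME[(gx % ORG_W, gy % ORG_H)]

# A 6x4 array of identical cells differentiates into a 2x2 grid of
# identical 3x2 organisms -- self-replication by coordinate repetition.
grid = [[cell_gene(x, y) for x in range(6)] for y in range(4)]
```

Enlarging the array (or the coordinate cycle) yields any number of organism copies along the X and/or Y axes, exactly as the repetition of the coordinate pattern dictates.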

There is also a second replication process, corresponding to the cellular division process in biological entities, that is used to put in place the initial array of cells that will then be differentiated to obtain the organism. The need to build cells of different size and structure depending on the application naturally led to the use of programmable logic (FPGAs) as a physical substrate. Each of the computational elements of the Embryonics FPGA can thus be seen as “molecules” assembled in a precise configuration to form a cell. Since all cells are identical, the development process is analogous to the replication of this configuration for as many iterations as there are cells in the organism.

To implement this replication, Embryonics splits the process into two phases: (1) the structural phase, where a “skeleton” is created in order to divide the physical space in a collection of groups of molecules which are empty cells; and (2) the configuration phase, where the configuration is sent in parallel into all the empty cells created during the structural phase. The structural phase is implemented by a small cellular automaton, integrated into the molecular grid, capable of transforming a one-dimensional string of states (analogous to a configuration bitstream stored in a memory chip) into a two-dimensional structure (i.e., the blocks that will host the cells). This replication process is quite different from biological cellular division because all cells are created in parallel. A new approach [415], designed to be integrated with Embryonics, implements a cellular division much closer to real biology.

In 2000, an FPGA embodying a single 29-state cell of von Neumann’s original cellular automata model was implemented in hardware by Beuchat and Haenni [412]. The Cell Matrix™ [550] architecture represents another physical architecture that could support cell automata self-replication and self-organization, and other novel electronics architectures based on cellular automata have been discussed [548, 549]. It may even be possible to self-assemble FPGAs [550-552].

Wilke and Adami [553] summarize the current state of their field as follows: “The main focus of digital life research is a sort of comparative biology, which attempts to extricate those aspects of simple living systems that are germane to the type of chemistry used, from those that are not [554]. Additionally, digital life can help to refine mathematical theories and aid in developing and quickly testing new hypotheses about ecological and evolutionary processes. Current state-of-the-art digital life research platforms create essentially single-niche ecosystems, and the dynamics that unfold are similar to bacterial or viral evolution in chemostats. However, more complex ecosystem structures can be realized within the paradigm of self-replicating computer programs, and we can expect research to increase in this area. Apart from their usefulness as a tool of understanding evolution in general, it is important to study the biology of artificial life forms in their own right. Recently, Lipson and Pollack [927] showed that the principles of simple self-replicating robots are within reach of current technology (Section 3.20). Eventually, such robots, and the software that directs them, might evolve without human interaction, at which point they would become part of the ecosystem in which we live.”

It is worth noting that robots cannot really be considered part of the ecosystem unless they can forage for raw materials (in said ecosystem) as well as replicate themselves. Such an ability to forage is unlikely to be developed, deployed, or even made legal, because it would violate the Foresight Guidelines [271] and could threaten public safety (Section 5.11).


Last updated on 1 August 2005