Kinematic Self-Replicating Machines

Robert A. Freitas Jr., Ralph C. Merkle, Kinematic Self-Replicating Machines, Landes Bioscience, Georgetown, TX, 2004.

5.1.9.B Replication Information

B1. Replication Information Autonomy. The most autonomous case is the autocatalytic replication of spreading fire [264] and other reaction-diffusion systems [2463], which require no stored information to be present anywhere, although the information might be regarded as being embedded, or implicit, in the physical process.

B2. Replicator Information Redundancy. How redundant is the information stored in the data cache? In biology, the radiation-resistant bacterium Deinococcus radiodurans includes a great deal of genetic redundancy [2464] to achieve high-fidelity DNA repair following severe radiation damage.

B3. Replicator Information Centralization. How and where are the copies (if any) of the replication information stored in the replicator? Description-based reproduction with local memory will permit open-ended evolution with semantic closure, whereas template-based reproduction with distributed memory (i.e., copying via material self-inspection of its parts and identification of parts existing in the environment) allows only restricted evolution [2378-2382].

B4. Replication Information Abstractivity.

B5. Replication Information Type. Very simple procedural rules can sometimes produce seemingly complex results [336], in effect encoding vastly more information than would first appear. For example, a 100 kg human body contains ~10²⁸ atoms and hence would require ~10²⁹ bits of raw structural information to describe, assuming random atom placement and random atom type. Yet a 100 kg human body (excluding the data content of the brain) can be adequately described by a chromosome set of developmental information containing just under 10¹⁰ bits of encoded information, and estimates of the data storage capacity of the human brain have ranged from 10⁹ to 10¹⁸ bits (Section 5.10). The 11 to 19 orders-of-magnitude differential between the encoded structural information and the raw structural information is a measure of the degree of redundancy inherent in the human body structure, and suggests that the design is very robust against most minor structural changes.
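The 11 and 19 orders-of-magnitude endpoints quoted above can be reproduced with a short calculation. This is only a sketch: it assumes the two endpoints come from comparing the ~10²⁹-bit raw description against the ~10¹⁰-bit genome figure and against the 10¹⁸-bit upper brain estimate, respectively. The constant and function names are illustrative and do not come from the source.

```python
import math

# Order-of-magnitude figures quoted in the text.
RAW_BITS = 1e29          # raw atom-by-atom description of a 100 kg body
GENOME_BITS = 1e10       # upper bound on encoded developmental information
BRAIN_BITS_HIGH = 1e18   # highest quoted estimate of brain storage capacity


def orders_of_magnitude_gap(raw_bits, encoded_bits):
    """Return log10 of the ratio between raw and encoded description sizes."""
    return math.log10(raw_bits / encoded_bits)


# Assumption: the larger gap uses the genome figure, the smaller gap uses
# the highest brain-capacity estimate as the encoded size.
largest_gap = round(orders_of_magnitude_gap(RAW_BITS, GENOME_BITS))
smallest_gap = round(orders_of_magnitude_gap(RAW_BITS, BRAIN_BITS_HIGH))

print(smallest_gap, largest_gap)
```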

B6. Replicator Description Size. This is intended to be a specific quantitative measure, e.g., a bit count, as shown in Table 5.1 and elsewhere (Merkle [210]; Nanomedicine [228], Table 2.1). (See also dimension A6.) For example, Paul and Joyce [1372] observe that the maximum possible information content of a nucleic acid species of length n (assuming a 4-base alphabet) is log₂(4ⁿ), whereas that of a protein species (assuming a 20-residue alphabet) is log₂(20ⁿ).
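Because log₂(kⁿ) = n·log₂(k), the bound cited from Paul and Joyce works out to 2 bits per nucleotide and roughly 4.32 bits per amino acid residue. The following minimal sketch evaluates it; the function name and the example length of 100 are illustrative choices, not values from the source.

```python
import math


def max_information_bits(length, alphabet_size):
    """Upper bound on information content: log2(alphabet_size ** length).

    Computed as length * log2(alphabet_size) to avoid building huge integers.
    """
    return length * math.log2(alphabet_size)


# Illustrative comparison for a polymer of 100 units.
nucleic_acid_bits = max_information_bits(100, 4)   # 4-base alphabet
protein_bits = max_information_bits(100, 20)       # 20-residue alphabet

print(nucleic_acid_bits)          # 200.0
print(round(protein_bits, 1))     # 432.2
```

For equal length, a protein can therefore carry a little more than twice the information of a nucleic acid under this upper bound.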

Many people believe that self-replicating systems must be extremely complicated, but the data show that this is not necessarily so. The data comprising almost any known replicator could easily be stored on a modern PC or laptop computer, and the descriptions of many replicators are less than 1 megabit in size. Note that although Mycoplasma genitalium has the smallest known genome of any free-living organism, not all of its genes are needed [2465] and the minimum possible genome may be only about one-third of this size [1865]. (M. genitalium is a 0.3-µm obligate parasite that lacks many functions needed for independent saprophytic life.) Many electromechanical devices built over the years that demonstrate the ability to feed, metabolize, learn, respond to stimuli, recognize the self, and move about in physical space with goal-oriented behavior [178-181] are quite modest in complexity, possibly requiring as few as 30 bits [2466] to 90 bits [2467] for their physical description. (For example, machining the Penrose ratcheting 2-blocks [680] to 1% dimensional accuracy requires 7 bits per Cartesian vertex coordinate. With 2 coordinates (x, y) per vertex, and 11 and 15 distinct vertices on the two block types, respectively, the block pair can be described in 364 bits.) Joseph Jacobson at the MIT Media Lab suggests [1930] that the goal for the next decade of manufacturing technology development should be to “maximize complexity per unit volume – [we have] 6 orders of magnitude left to go in complexity of engineered systems.”
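The Penrose block estimate in the parenthetical above can be checked step by step. The sketch below assumes that "1% dimensional accuracy" means each coordinate must resolve 100 distinguishable levels, so the bit count per coordinate is ⌈log₂ 100⌉ = 7. The function names are illustrative.

```python
import math


def bits_per_coordinate(relative_accuracy):
    """Bits needed to resolve one coordinate to the given relative accuracy.

    Assumes 1 / relative_accuracy distinguishable levels per coordinate.
    """
    levels = round(1 / relative_accuracy)
    return math.ceil(math.log2(levels))


def description_bits(vertex_counts, coords_per_vertex, relative_accuracy):
    """Total bits to list every vertex coordinate of every block type."""
    total_vertices = sum(vertex_counts)
    return total_vertices * coords_per_vertex * bits_per_coordinate(relative_accuracy)


# Figures from the text: 11 and 15 vertices, (x, y) coordinates, 1% accuracy.
per_coordinate = bits_per_coordinate(0.01)
block_pair_bits = description_bits([11, 15], 2, 0.01)

print(per_coordinate, block_pair_bits)   # 7 364
```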

Interestingly, in 1961 engineer Marcel Golay [2468], evidently having von Neumann’s kinematic replicator freshly in mind, wrote: “Suppose we wanted to build a machine capable of reaching into bins for all its parts, and capable of assembling from these parts a second machine just like itself. What is the minimum amount of structure or information that should be built into the first machine? The answer comes out to be of the order of 1500 bits – 1500 choices between alternatives which the machine should be able to decide. This answer is very suggestive, because 1500 bits happens also to be of the order of magnitude of the amount of structure contained in the simplest large protein molecule which, immersed in a bath of nutrients, can induce the assembly of those nutrients into another large protein molecule like itself, and then separate itself from it.”

B7. Replication Information Sharing. If replication information is present in the replicator, is it always present or only temporarily resident?

B8. Replicator Information Lockout. Replication-related information, while present within the replicator, is encrypted or locked and thus not immediately accessible to the device. External signals may be sent to the device to unlock, decrypt, relock or re-encrypt all or part of the information, either in blocks or in total, or data accessibility may be enabled by other means – a crude analog of RNA interference in medicine [2498].

B9. Replicator Description Alphabet. In the case of biopolymer (e.g., DNA) synthesis, from both theory [2042] and experiment [2043] one expects higher fidelity of replication with smaller genetic alphabets than with larger ones. Multiple alphabets are also possible. For example, Rocha [2381] notes that if in addition to symbols standing for actions to be performed as in the DNA-based genetic code, the genetic system were to be allowed a second class of symbols standing for contextual or environmental measurements, “then a richer semiotics can be created which may have selective advantage in rapidly changing environments, or in complicated, context dependent, developmental processes.” Rocha [2499] observes that RNA editing [2500-2502] (wherein mRNA molecules contain information not coded in the DNA) may be seen as a mechanism for this type of contextual input that is already found in biology, at least for certain well-known living organisms like the African trypanosomes [2503-2507], and as a potentially important mechanism in the morphogenesis of highly evolved animals [2508].

B10. Replicator Triviality or Complexity. One simple definition of replicator triviality would be the ratio of replicator description bit count to the average parts description bit count, which compares parts complexity to the complexity of assembly operations. A low ratio implies that the embedding universe (including its physical laws of operation) or substrate contains most of the information required for replication to occur. A high ratio means that most of the information content of the replicator resides in its own organization, and not in its parts, or its surroundings, or in natural physical laws. (See also our discussion of the Fallacy of the Substrate, Section 5.5.) Sayama [2509] suggests this dimension may include two nearly orthogonal subdimensions: (1) the complexity of assembly tasks (e.g., complexity comparison between parts and final products) and (2) responsibility for assembly tasks (e.g., complexity comparison between how much the replicator does and how much the environment does, during replication).
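The ratio described at the start of B10 can be written out directly. The sketch below is purely illustrative: the function name is invented here, and the bit counts in the example are hypothetical numbers chosen only to show a high and a low ratio, not measurements of any real replicator.

```python
def triviality_ratio(replicator_bits, part_bits):
    """Ratio of replicator description size to mean part description size.

    replicator_bits: bit count of the whole replicator description.
    part_bits: list of bit counts, one per distinct part description.
    """
    if not part_bits:
        raise ValueError("at least one part description is required")
    mean_part_bits = sum(part_bits) / len(part_bits)
    return replicator_bits / mean_part_bits


# Hypothetical case 1: simple parts, elaborate organization (high ratio).
high_ratio = triviality_ratio(1500, [100, 200])

# Hypothetical case 2: complex prefabricated parts, little assembly
# information (low ratio); the substrate carries most of the information.
low_ratio = triviality_ratio(1500, [10_000, 20_000])

print(high_ratio, low_ratio)   # 10.0 0.1
```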

A similar “typography of self-reproducers” has been proposed by Luksha [2388], who distinguishes replicators based on the degree of complexity of self-reproducer S, quantified as c(S), as compared to the degree of complexity of the environment, quantified as c(E) (see Table 5.2).

Burks [4] addressed replicator triviality by proposing that “what is needed is a requirement that the self-reproducing automaton have some minimal complexity,” and suggested that a minimum bound on the complexity might be satisfied by the requirement that a “self-reproducing automaton also be a Turing machine.” Langton [359] rejects such a stringent requirement which he says not only “eliminates the trivial cases, but it also ... eliminates all naturally occurring self-reproducing systems as well” because none have been shown equivalent to a Turing machine. As noted by Adams and Lipson [272], “though claims to the contrary exist [2510], it does suggest that a view not reliant on a requirement for universal-computation could be worthwhile.”

Currently there are at least 30 different mathematical descriptions of “complexity” [496-504] and the concept is clearly of great importance in defining what is meant by “replication” [2387]. The Efstathiou group [2511] at the University of Oxford has developed theoretical models and simulations of the complexity of manufacturing systems. Future theoretical work should endeavor to distill all of these measures into a workable quantitative measure most applicable to kinematic replicators. Such a detailed distillation is a highly specialized topic worthy of investigation but unfortunately lies beyond the scope of this book.

B11. Replicator Pattern Instantiation Specificity. New et al. [2512] note a distinction between structure-replicators and function-replicators. That is, can the replicator exist only as a specific physical instantiation, or is the replicator a pattern that is imposed on physical matter and can replicate independently of the precise composition of that solid matter while preserving all replicative functions (e.g., see also KCA-based replicators; Section 3.8)?

B12. Replication Information Context Tolerance. In biological replicators based on DNA, the same codon-based nucleotide sequence can often be read with a “frameshift” of one or two nucleotide bases and still code for viable, indeed essential, proteins. This concept may be generalized to include the viability of a replication-related descriptive sequence when it is read or executed in different ways. Replication information may also be made more compact by making the translation, interpretation or execution of replication instructions dependent upon the previous instruction(s) that have been processed, their order of processing, or the manner in which they were processed. An example of this in biology is rule context sensitivity in embryogeny [2401]. See also E9 and F7.
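The frameshift idea can be illustrated by splitting one nucleotide string into codons at each of the three possible offsets. This is a minimal sketch: the function name and the nine-base example sequence are made up for illustration, and no claim is made that the shifted frames of this particular sequence encode anything viable.

```python
def codons_in_frame(sequence, frame):
    """Split a nucleotide string into complete codons starting at `frame`.

    frame is the offset of the first codon (0, 1 or 2); trailing bases
    that do not fill a whole codon are dropped.
    """
    if frame not in (0, 1, 2):
        raise ValueError("frame must be 0, 1 or 2")
    usable = sequence[frame:]
    whole = len(usable) - len(usable) % 3
    return [usable[i:i + 3] for i in range(0, whole, 3)]


# Hypothetical example sequence read in all three frames.
example = "ATGGCCTAA"
for offset in range(3):
    print(offset, codons_in_frame(example, offset))
# 0 ['ATG', 'GCC', 'TAA']
# 1 ['TGG', 'CCT']
# 2 ['GGC', 'CTA']
```

The same stored string yields three different codon sequences, which is the sense in which one description can be read or executed in different ways.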

Last updated on 1 August 2005