Kinematic Self-Replicating Machines

© 2004 Robert A. Freitas Jr. and Ralph C. Merkle. All Rights Reserved.

Robert A. Freitas Jr., Ralph C. Merkle, Kinematic Self-Replicating Machines, Landes Bioscience, Georgetown, TX, 2004.


4.4 Artificial Biological Replicators (1965-present)

Schneiker [1012] reports that as long ago as 1965, the synthesis of life was publicly proposed as a national goal by professor Charles Price, then-president of the American Chemical Society. In 1967, the first synthesis of infectious viral PhiX174 DNA was reported by Goulian et al [1858]. In 1968 Taylor [1859] cited predictions that bacteria would soon be programmed. In 1970, Jeon et al [1860] reassembled an Amoeba proteus from its major components – nucleus, cytoplasm, and cell membrane – taken from three different cells, demonstrating the physical possibility of manually assembling biological replicators from more primitive parts. Morowitz [1861] suggested cooling cells to cryogenic temperatures in order to analyze and determine their structure. Artificial cells might then be assembled at such temperatures and set in motion by thawing. Morowitz also reported that microsurgery experiments on amoebas “have been most dramatic. Cell fractions from four different animals can be injected into the eviscerated ghost of a fifth amoeba, and a living functioning organism results.” In 1972, Danielli [1862] described several possibilities for generating new lifeforms via genetic engineering and “life-synthesis.”

In 1988, Richard J. Feldmann, NIH computer scientist and founder of Integrated Genomics Inc., argued that building a biological organism was a reasonable goal for the scientific community [1863]: “With exponentially increasing computer power, it will take far less than 80 years to be able to design and implement a biological system. The issue seems to be simply one of deciding we want to do a project like this, not the technological complexity of the project per se.” Others [1723, 1864] have more recently echoed this goal. By 2004, more than 100 laboratories were studying processes involved in the creation of life [3107]. “The ability to make new forms of life from scratch – molecular living systems from chemicals we get from a chemical supply store – is going to have a profound impact on society, much of it positive, but some of it potentially negative. Aside from the vast scientific insights that will come, there will be vast commercial and economic benefits, so much so that it’s hard to contemplate in concrete detail what many of them will be,” recently observed Mark Bedau [3107], professor of philosophy and humanities at Reed College in Portland, Oregon, and editor-in-chief of the Artificial Life Journal.

Mushegian [1865, 1866] examined the genes present in the genomes of fully sequenced microbes to see which ones are always conserved in nature. He concluded that as few as 256 [1866] to 300 [1865] genes are all that may be required for life, constituting the minimum possible genome for a functional microbe. (A comparable analysis by Fraser et al [1867] found a similar minimal number, 254 distinct essential proteins.) A single-cell organism containing this minimal gene set would be able to perform the dozen or so functions required for life – manufacturing cellular biomolecules, generating energy, repairing damage, transporting salts and other molecules, responding to environmental chemical cues, and replicating. Thus the minimal artificial microbe – a basic cellular chassis – could be specified by an artificial genome only 150,000 to 320,000 (assuming 1.25 kb/gene [1867]) nucleotide bases in length. By early 2000, Glen Evans had already produced made-to-order DNA strands up to 10,000 nucleotide bases in length [1868] and was striving to increase this length by at least a factor of ten (see below). For years, custom DNA and peptide sequences have been available for purchase online [1869], and whole gene synthesis has been actively investigated at least since the mid-1970s [1870] and throughout the 1980s [1871-1873] and beyond [1874].
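The upper end of this size range can be checked with simple arithmetic. The following is a minimal sketch, assuming the 1.25 kb/gene average quoted above; the function name and the 10,000-base strand comparison are illustrative only and do not come from the cited sources.

```python
def genome_length_bases(gene_count, bases_per_gene=1250):
    """Estimated genome length in bases (illustrative helper, not from the source)."""
    return gene_count * bases_per_gene

# 256 genes is the lower gene count reported in [1866].
minimal_genome = genome_length_bases(256)
print("256 genes ->", minimal_genome, "bases")

# How many 10,000-base made-to-order strands would span that length?
strands_needed = -(-minimal_genome // 10_000)  # ceiling division
print("10,000-base strands needed:", strands_needed)
```

Under these assumptions, 256 genes correspond to 320,000 bases, or 32 strands of the length available in early 2000.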

An engineered full-genome DNA, once synthesized, could be placed inside an empty cell membrane – most likely a living cell from which the nuclear material had been removed. Used in medicine, these artificial biorobots could be designed to produce useful vitamins, hormones, enzymes or cytokines in which a patient’s body was deficient, or to selectively absorb and metabolize into harmless end products harmful substances such as poisons, toxins, or indigestible intracellular detritus, or even to perform useful mechanical tasks – although natural biological systems appear incapable of true universal construction [358-361]. One private company, engeneOS [1875], was formed in 2000 to pursue the construction of these artificial biological devices; in 2001 a nascent company, Robiobotics LLC [1876], tried to put forward a business plan to pursue “whole genome engineering” and to seek funding; and physicist Norman Packard has established another company, ProtoLife [3109], to capitalize on the new field of living technology: “The goal of the company is to realize the vision of producing living artificial cells, and also producing other forms of living chemistry, and then programming them to do useful chemical applications. The range of useful chemical functions we ultimately envision is vast,” says Packard [3107]. Luisi [1877] has discussed top-down and bottom-up approaches to the engineering of synthetic minimal living cells, and several other groups may be even further along in wetware [1878] engineering. The European Union’s Programmable Artificial Cell Evolution (PACE) project, recently established with a grant of about $9 million, was scheduled in March 2004 to open the first institution devoted exclusively to creating artificial life, called the European Center for Living Technology, in Venice, staffed by European and U.S. researchers, according to John McCaskill, professor of theoretical biochemistry at Friedrich-Schiller University in Jena, Germany, who is overseeing the European Union’s artificial life program [3107].

In November 2002, J. Craig Venter, of human genome-sequencing fame, and Hamilton O. Smith, a Nobel laureate, announced [1879] that their new company, Institute for Biological Energy Alternatives (IBEA), had received a $3 million, three-year grant from the Energy Department to create a minimalist organism, starting with the M. genitalium microorganism. Working with a research staff of 25 people, the scientists planned to remove all genetic material from the organism, then synthesize an artificial string of genetic material resembling a naturally occurring chromosome that they hoped would contain the minimum number of M. genitalium genes needed to sustain life [1879]. In this process, the artificial chromosome is inserted into the hollowed-out cell which is then tested for its ability to survive and reproduce. To ensure safety, Smith and Venter said the cell will be deliberately hobbled to render it incapable of infecting people; it also will be strictly confined and designed to die if it does manage somehow to escape into the environment [1879]. In late 2003, Venter reported the first fruits of this work – synthesis of the complete genome of a small (5386 base pairs) phi X bacteriophage virus in just 14 days [1880].

In early 2003, Glen Evans’ new company Egea Biosciences [1881] may have vaulted into the lead, having been granted “the first [patent] [1132] to include broad claims for the chemical synthesis of entire genes and networks of genes comprising a genome, the ‘operating system’ of living organisms.” According to the company, Egea’s proprietary GeneWriter™ and Protein Programming™ technology has been validated in extensive proof of concept studies and has: (1) produced libraries of more than 1,000,000 programmed proteins, (2) produced over 200 synthetic genes and proteins, (3) produced the largest gene ever chemically synthesized of over 16,000 bases, (4) engineered proteins for novel functions, (5) improved protein expression through codon optimization, and (6) developed custom genes for protein manufacturing in specific host cells. Egea’s software allows researchers to author new DNA sequences that the company’s hardware can then manufacture to specification with a base-placement error of only ~10^-4, which Evans calls “word processing for DNA” [1882].

According to Egea’s patent [1132], one “preferred embodiment of the invention” would include the synthesis of “a gene of 100,000 bp ... from one thousand 100-mers. The overlap between ‘pairs’ of plus and minus oligonucleotides is 75 bases, leaving a 25 base pair overhang. In this method, a combinatorial approach is used where corresponding pairs of partially complementary oligonucleotides are hybridized in the first step. A second round of hybridization then is undertaken with appropriately complementary pairs of products from the first round. This process is repeated a total of 10 times, each round of hybridization reducing the number of products by half. Ligation of the products then is performed.” The result would be a strand of DNA 100,000 base pairs in length, long enough to make a very simple bacterial genome [1882]. Evans says his prototype machine can link up 10,000 bases in two days, and that 100,000 bp strands might require “a matter of weeks” to synthesize using a future next-generation machine [1882]. “Pretty soon, we won’t have to store DNA in large refrigerators,” says Tom Knight. “We’ll just write it when we need it.” [1882]
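The round count in the quoted embodiment can be sanity-checked with a short sketch. It only counts products per round under the assumption that each round joins products two at a time and carries any unpaired leftover forward unchanged; it does not model sequences, overlaps, or ligation, and the function name is hypothetical.

```python
import math

def pairwise_rounds(n_fragments):
    """Return the number of products remaining after each pairwise round.

    Assumption (not stated in the patent text): an odd product out is
    carried into the next round unchanged, hence the ceiling.
    """
    counts = []
    while n_fragments > 1:
        n_fragments = math.ceil(n_fragments / 2)
        counts.append(n_fragments)
    return counts

rounds = pairwise_rounds(1000)
print(len(rounds), rounds)
```

Starting from one thousand oligonucleotides, this halving schedule reaches a single product after 10 rounds, which agrees with the “total of 10 times” in the patent text.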

The Synthetic Biology Lab at MIT [1883] is also investigating minimal organisms. One of their goals: “Dissect Mesoplasma florum and create a minimal organism. Mesoplasma florum is a Mycoplasma species with a genome size of 860,000 bp. It has been sequenced in collaboration with the Whitehead Institute and is being annotated. This organism was chosen because it is non-pathogenic, convenient to grow, and has a relatively small genome. The next step will be to build a minimal organism starting from this species by removing genes that are not needed and not well understood. The goal is to create a well-understood bare-bones organism to begin engineering in new desired functionality.” A few of the many useful metabolic pathways that might be added have been compiled at the BioCyc website [1884].

Mushegian’s results [1865] suggest that the minimal autonomous artificial biological replicator consists of at least 300 different (protein) “nanoparts”, multiple copies of each being assembled into a working biological robot. However, engineered replicators: (1) need not be designed for operation on diverse organic substrates, (2) need not be capable of self-assembly but may employ positional assembly techniques, (3) need not be capable of surviving mutation or evolution, and (4) need not contain (and by necessity maintain to high fidelity) their own description within their own onboard structure. Accordingly, Szostak et al [1885] have proposed a vastly simpler replicating protocell which would include principally an RNA replicase – an RNA molecule that can act both as a template for genetic information storage and transmission and as an RNA polymerase that can replicate its own sequence [1886, 2051]. A single molecule cannot be both template and polymerase at the same time, so replication needs two RNA molecules [1885]: “A replicase that acts as the polymerase, and another molecule, which could be either an unfolded replicase or an RNA complementary in sequence to the replicase, to act as a template.” The replicase (possibly 200-300 nucleotides in size) would replicate inside a replicating membrane vesicle, which would also contain a ribozyme that synthesizes amphipathic lipids (allowing membrane growth and maintenance). With these simplifications available to designers, it is likely that a positionally-assembled mechanical replicator can be at least as simple as Mushegian’s minimal microbe and can possibly be designed using far fewer different kinds of parts than are found in this biological proof-of-concept. 
Similarly, there has been discussion of “chemical reaction automata” as precursors for synthetic organisms [1887], synthetic lifeforms [1888] and “nanobiology” [1889], artificial cells [1890] and synthetic cell biology [1891], lymphocyte engineering [1892], microbial engineering [1893-1898], cell surface engineering [1899-1901] and cell metabolic engineering [1902-1907].

Engineered self-replicating bacteria are being pursued as gene delivery vectors by many research groups [1908-1917], and also as anti-tumor agents, as for example by Vion Pharmaceuticals in collaboration with Yale University [1918]. In their “Tumor Amplified Protein Expression Therapy” or TAPET program [1918], antibiotic-sensitive Salmonella typhimurium (food poisoning) bacteria were attenuated by removing the genes that produce purines vital to bacterial growth. The tamed strain could not survive very long in healthy tissue, but quickly multiplied 1000-fold inside tumors that are rich in purines. These engineered bacterial self-replicating machines were available in multiple serotypes to avoid potential immune response in the host, and Phase I human clinical trials were underway in 2002 using clinical dosages. The next step would be to add genes to the bacterium to produce anticancer proteins that can shrink tumors, or to modify the bacteria to deliver various enzymes, genes, or prodrugs for tumor cell growth regulation.

Such additions can include “unnatural” components. For instance, Schultz’s group has engineered new strains of E. coli that can incorporate into bacterial proteins: (1) externally-supplied unnatural amino acids [1919] including p-azido-L-phenylalanine [1920], the photocrosslinking amino acid p-benzoyl-L-phenylalanine [1921], O-methyl-L-tyrosine [1922], and L-3-(2-naphthyl)alanine [1923], (2) unnatural amino acid functional groups such as keto groups [1924], and even (3) non-amino acids such as alkenes [1925]. In 2003 Schultz’s team reported [1926] the first creation of an E. coli strain that could not only incorporate but also internally synthesize another unnatural amino acid, p-aminophenylalanine – making, in essence, a completely autonomous bacterium having a 21 amino acid genetic code, including an artificial tRNA-synthetase/tRNA pair [1927]. Later in 2003, Schultz’s team announced similar results for the first eukaryotic species, a type of yeast called Saccharomyces cerevisiae [1928, 1929].

Joseph Jacobson [1930], scientific advisory board member for engeneOS, observes that a key to successful replication of complex biological systems is error correction. Biology employs many techniques for error correction. For example, protein coding uses fault-tolerant translation codes – e.g., in the genetic code of almost all organisms on Earth, polar residues are encoded by the “degenerate” (see below) DNA codon NAN, and non-polar residues are encoded by the degenerate codon NTN (where A = adenine, T = thymine, and “N” denotes a mixture of DNA bases). Hecht [1931] reports that experiments with combinatorial libraries of de novo proteins show that sequences describing an alternating pattern of polar and non-polar amino acid residues (e.g., NANNTNNANNTN...) produce aggregative fibrillar structures resembling β-amyloid. This aggregation competes with (and may prevent) proper intramolecular folding, hence this pattern should be disfavored by evolutionary selection and in fact does occur significantly less often in the genome than other patterns with similar compositions [1931]. Another example is frameshift error tolerance, wherein two nonhomologous proteins may share the same DNA sequence – that is, reading the same sequence with a single base-pair frameshift may produce a second valid protein product [1693]. As yet another example, the entire genetic code is “degenerate” – that is, almost every amino acid is represented by several codons, with a slight tendency for more common amino acids to be represented by more codons [1693]. More importantly, this degeneracy minimizes the effects of mutations.
A mutation of CUC to CUG has no effect, since both codons represent leucine; similarly, a mutation of CUU to AUU replaces leucine with isoleucine, a closely related amino acid [1693].* The synthesis of proteins from amino acids by ribosomes (giant ribozymes) has an error rate of only one residue per 2,000 (error rate ~5 × 10^-4) largely due to mistaken tRNA recognition [1693], which means that the majority of synthesized proteins (which average fewer than 1000 residues per protein) may include not a single error. DNA copying relies on extensive error detection and correction (e.g., presynthetic and proofreading error-correcting polymerases [1932], thymine dimer bypass [1933], base excision repair [1934], base-pair mismatch correction systems [1935, 1936]) along with built-in redundancy (the molecule has two complementary strands) to achieve a net error rate of only one base per billion (~10^-9) [1693] when replicating itself.
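The claim that most proteins contain no errors can be illustrated with a back-of-the-envelope calculation. This sketch assumes that errors occur independently at a fixed per-unit rate, a simplification not stated in the text; the function name is illustrative, and the 320,000-base genome is the upper minimal-genome estimate given earlier in this section.

```python
def error_free_fraction(per_unit_error, length):
    """Probability that a copy of `length` units contains no errors,
    assuming independent errors at a fixed per-unit rate."""
    return (1.0 - per_unit_error) ** length

# Ribosomal translation: ~5e-4 per residue, for a 1000-residue protein
# (an upper bound on the average protein length quoted in the text).
p_protein = error_free_fraction(5e-4, 1000)

# DNA replication: ~1e-9 per base, for a 320,000-base minimal genome.
p_genome = error_free_fraction(1e-9, 320_000)

print(f"error-free proteins: {p_protein:.2f}")
print(f"error-free genome copies: {p_genome:.4f}")
```

Under these assumptions roughly 61% of 1000-residue proteins are error-free, and shorter proteins fare better, which is consistent with the “majority” stated above.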

* Each amino acid can be assigned a hydropathic index on the basis of hydrophobicity and charge characteristics [1937], as follows: isoleucine (+4.5), valine (+4.2), leucine (+3.8), phenylalanine (+2.8), cysteine/cystine (+2.5), methionine (+1.9), alanine (+1.8), glycine (-0.4), threonine (-0.7), serine (-0.8), tryptophan (-0.9), tyrosine (-1.3), proline (-1.6), histidine (-3.2), glutamate (-3.5), glutamine (-3.5), aspartate (-3.5), asparagine (-3.5), lysine (-3.9), and arginine (-4.5). Certain amino acids may be substituted by other amino acids having a similar (i.e., within 0.5-2 units) hydropathic index and still result in a protein with similar biological activity, i.e., still obtain a biologically functional-equivalent protein [1132].
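The mutation examples in the main text can be combined with the hydropathic index values in this footnote. The sketch below is illustrative only: the codon table covers just the few codons needed, the 2-unit threshold is taken as the loosest bound of the “0.5-2 units” range, and the function name and labels are hypothetical.

```python
# Hydropathic index values as listed in the footnote above [1937].
HYDROPATHY = {
    "Ile": 4.5, "Val": 4.2, "Leu": 3.8, "Phe": 2.8, "Cys": 2.5,
    "Met": 1.9, "Ala": 1.8, "Gly": -0.4, "Thr": -0.7, "Ser": -0.8,
    "Trp": -0.9, "Tyr": -1.3, "Pro": -1.6, "His": -3.2, "Glu": -3.5,
    "Gln": -3.5, "Asp": -3.5, "Asn": -3.5, "Lys": -3.9, "Arg": -4.5,
}

# Only the codons used in this example (standard genetic code).
CODONS = {"CUC": "Leu", "CUG": "Leu", "CUU": "Leu", "AUU": "Ile", "CCU": "Pro"}

def classify_mutation(before, after, threshold=2.0):
    """Label a codon change as silent, conservative, or non-conservative.

    `threshold` is an assumed cutoff on the hydropathic index difference.
    """
    aa_before, aa_after = CODONS[before], CODONS[after]
    if aa_before == aa_after:
        return "silent"
    delta = abs(HYDROPATHY[aa_before] - HYDROPATHY[aa_after])
    return "conservative" if delta <= threshold else "non-conservative"

print(classify_mutation("CUC", "CUG"))  # Leu -> Leu
print(classify_mutation("CUU", "AUU"))  # Leu -> Ile, difference 0.7
print(classify_mutation("CUU", "CCU"))  # Leu -> Pro, difference 5.4
```

The first two cases are the ones cited in the text; the leucine-to-proline change is added here only as a contrasting case that falls outside the threshold.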

“Error correction in biological systems has extremely high fidelity, but error is cumulative,” notes Jacobson [1930]. “Can we create an organism which does not have cumulative error?” His solution, echoing the consensus-copy correction approach described in 1986 by Drexler ([199], pp. 107-108), is to build living cells containing M duplicate strands of N bases per strand in a consensus voting system to create a self-excising error-correcting code. “For above [some] threshold M value [and] combining error correcting polymerase and error correcting codes, one can replicate a genome of arbitrary complexity,” he claims. Similarly, the radiation-resistant bacterium Deinococcus radiodurans may preserve its genome following severe radiation damage in part by employing a high copy number of DNA [1938, 1939].
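The idea of consensus voting across M duplicate strands can be sketched as a per-position majority vote. This is a generic illustration of majority voting, not Jacobson's or Drexler's actual scheme; the function name and the sample strands are invented for the example.

```python
from collections import Counter

def consensus(strands):
    """Majority vote at each position across M equal-length strands.

    Ties resolve to the base seen first at that position, an arbitrary
    choice made only to keep this sketch short.
    """
    if len({len(s) for s in strands}) != 1:
        raise ValueError("need at least one strand, all of equal length")
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*strands)
    )

# M = 5 copies of an N = 6 base strand, two of them carrying one error each.
copies = ["ACGTAC", "ACGTAC", "ACCTAC", "ACGTAC", "AGGTAC"]
recovered = consensus(copies)
print(recovered)
```

Because the two errors fall at different positions, each is outvoted by the four correct copies and the original sequence ACGTAC is recovered.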

During the 1990s, bioengineered self-replicating viruses of various types [1940-1942] and certain other vectors routinely were being used in experimental genetic therapies as “devices” to target and penetrate certain cell populations, with the objective of inserting therapeutic DNA sequences into the nuclei of human target cells in vivo. Inserting new sequences into viral genomes, or combining components from two different viruses to make a new hybrid or chimeric virus [1943, 1951], is now routine. Efforts at purely rational virus design are underway but have not yet borne much fruit. For example, Endy et al [1944] computationally simulated the growth rates of bacteriophage T7 mutants with altered genetic element orders and found one new genome permutation that was predicted to allow the phage to grow 31% faster than wild type; unfortunately, experiments failed to confirm the predicted speedup. Better models are needed [1891, 1945]. Nevertheless, combinatorial experiments on wild type T7 by others [1946-1948] have produced new but immunologically indistinguishable T7 variants which have 12% of their genome deleted and which replicate twice as fast as wild type [1948]. The Synthetic Biology Lab at MIT [1883] seeks to “build the next generation T7...a bacteriophage with a genome size of about 40,000 bp and 56 genes. With DNA synthesis becoming cheap, we wish to redesign and rebuild the entire genome, to create the next, and hopefully better, version of T7. Considerations in the redesign process include: adding or removing restriction sites to allow for easy manipulation of various parts, reclaiming codon usage, and eliminating parts of the genome that have no apparent function. 
Synthesizing a phage from scratch will allow us to better understand how Nature has designed the existing organism.” Notes Endy [3111]: “We’ve rebuilt T7 – not just resynthesized it but reengineered the genome and synthesized that.” The MIT scientists are separating overlapping genes, editing out redundancies, and so on, with about 11.5 kilobases completed so far and the remaining 30,000 base pairs expected to be completed by the end of 2004 [3111].

In a three-year project [1949] culminating in 2002, the 7500-base polio virus was rationally manufactured “from scratch” in the laboratory by synthesizing the known viral genetic sequence in DNA, enzymatically creating an RNA copy of the artificial DNA strand, then injecting the synthetic RNA into a cell-free broth containing a mixture of proteins taken from cells. The synthetic polio RNA then directed the synthesis of complete (and fully infectious) polio virion particles [1949], allowing the researchers to claim that the virus was made without the use of any living cells. The rational design and synthesis of completely artificial viral sequences, leading to the manufacture of completely synthetic viral replicators, should eventually be possible – but the rational design and synthesis of chimeric viral replicators is already possible today [1950-1952].

Attempts to create synthetic nonmicrobial biological replicators also are underway. For example, in 2000 Demin Zhou’s group [1746] reported that they were trying to create a synthetic viroid consisting of “a naked RNA virus capable of infecting a cell, replicating, and modulating cellular function. The RNA molecule in the library is viroid-like, capable of rolling circle replication, self-splicing and ligation. Replication is performed by an RNA-dependent RNA polymerase (RdRp) that can replicate RNA by internal initiation on an RNA template. The designed circular RNA includes a ribozyme that leads to self-splicing and ligation, a promoter for an RdRp, an antisense sequence, and an infectivity site. We are now testing replication of the viroid by an RNA dependent RNA polymerase which has well defined activity in vitro and in vivo.” It may also be possible to design artificial general-purpose bacteriophages that are capable of disabling or destroying all bacterial DNA, or are capable of replicating more bacteriophage particles to which only the targeted pathogenic bacterial cells are susceptible [228, 1757]. Bacterial artificial chromosomes (BACs) and yeast artificial chromosomes (YACs) – essentially, synthetic plasmids – have been in wide use in biotechnology for years, e.g., as cloning vectors [1953-1955]. Chromosome engineering [1956] is beginning, and both mammalian artificial chromosomes (MACs) [1957] and human artificial chromosomes [1958] are actively being investigated. Rasmussen et al [1959] describe an organic artificial protocell that represents a simple self-replicating physicochemical system.


Last updated on 1 August 2005