
Adam and the Genome and Doug Axe’s Research on the Evolution of New Protein Folds

Doug Axe

We’ve been reviewing, at admittedly epic length, the somewhat less-than-epic book Adam and the Genome, by biologist Dennis Venema. The title of this volume seems to promise an extended treatment of Adam, as in Adam and Eve, and the human genome. However, repeatedly Venema launches into other matters, irrelevant to whether Adam and Eve existed.

On yet another excursion, he criticizes the research of Douglas Axe, a protein scientist who has published work on the rarity of new protein folds, based on mutagenesis experiments with beta-lactamase enzymes. Publishing in the Journal of Molecular Biology, Axe found that only about 1 in 10^77 sequences produce the stably folded structure needed for beta-lactamase to work.

But Venema’s main target in this section isn’t really Doug Axe. It’s Stephen Meyer. Specifically, Venema goes after Meyer’s use of Axe’s research, claiming that Axe’s work only applies to the evolvability of beta-lactamase and not the evolvability of proteins generally. He claims Meyer over-extrapolates the implications of Axe’s work:

Now these results are not controversial for Axe’s engineered beta-lactamase. What is controversial, however, is Meyer’s claim that these results apply to the evolution of proteins in general. … The average layperson who reads Meyer’s works, however, may simply take him at his word that scientists have concluded that functional, folded proteins in general are exceedingly rare and thus agree with his assessment that they cannot be produced by natural mechanisms.

(Adam and the Genome, pp. 83-84)

Venema writes that “If scientists could observe such an event [the evolution of a new protein fold], then it would indicate that Axe’s math (and Meyer’s use of it) is not a reliable estimate for the prevalence of functional protein folds.” By trying to split “Axe’s math” from “Meyer’s use of it,” Venema seems to imply that Axe’s arguments are relatively sound (since Axe supposedly restricted his conclusions to the specific evolvability of beta-lactamase), while Meyer’s extrapolation to the general case is the real problem, constituting a misuse of Axe’s paper.

Here, Venema makes a critical error: Apparently he does not realize that Axe’s peer-reviewed paper explicitly argues for and justifies extending his results on beta-lactamase mutagenesis experiments to the rarity of new protein folds in general — not just to the origin of beta-lactamase. This is nothing surprising. Many papers on protein structure, function, and evolvability consider specific cases and then discuss the general implications. Meyer did not misuse Axe’s paper. Meyer simply restated the direct conclusions of a paper published in a prestigious mainstream biology journal.

The title of Axe’s paper, “Estimating the Prevalence of Protein Sequences Adopting Functional Enzyme Folds,” by itself suggests that it is aimed at discussing a general result. In the first words of the Abstract, Axe indicates that his results can apply to the “overall prevalence of sequences adopting functional folds”:

Proteins employ a wide variety of folds to perform their biological functions. How are these folds first acquired? An important step toward answering this is to obtain an estimate of the overall prevalence of sequences adopting functional folds.

(Douglas D. Axe, “Estimating the Prevalence of Protein Sequences Adopting Functional Enzyme Folds,” Journal of Molecular Biology, 341:1295-1315 (2004).)

Reading on in the paper shows that Axe deliberately chose to work on beta-lactamase because it has a fold of typical complexity, and thus can serve as “a model system for assessing the requirements for functional formation of a moderately complex fold.” Axe explains:

[T]he larger of the two domains forming beta-lactamases of the class A variety (henceforth, the large domain) is used as a model system for assessing the requirements for functional formation of a moderately complex fold. Although predominantly composed of alpha-helices, this domain contains small sheet regions and significant loop structure which, along with its size (just over 150 amino acid residues), make its complexity more representative of known domain folds. Another typical feature of domains, the ability to form specific associations with other domains, is ensured by the location of the beta-lactamase active-site cleft at the interface between the large and small domains.

Near the end of his paper, Axe extrapolates his results on beta-lactamase enzymes to the general case:

[I]t is not obvious that fold diversity is as easily explained as sequence diversity, if functionally folded sequences are as rare as this analysis indicates. A commonly accepted view is that new folds are pieced together from small parts of existing folds. But to the extent that a new fold is really new, its formation must require the joint solution of at least a considerable number of new local stabilization problems of the kind described above. How likely is it that sequences that carry the hydropathy signatures of other folds and provide joint solutions to the stabilization problems for those folds may be pieced together in such a way that they satisfy a new set of constraints, equally demanding but substantially different? The analysis provided here, bearing in mind the uncertainties, calls for careful examination of such piecing scenarios.

This is just a technical way of saying the results suggest that new folds are very difficult to produce by natural selection. His experiment set out to measure the sensitivity of those local stabilization problems to perturbation, and thus the sensitivity of the beta-lactamase fold to destabilization as a whole. When proteins are destabilized they lose function. This loss of function gives a measure of the rarity of stable functional folds. From the Abstract of the paper:

Combined with the estimated prevalence of plausible hydropathic patterns (for any fold) and of relevant folds for particular functions, this implies the overall prevalence of sequences performing a specific function by any domain-sized fold may be as low as 1 in 10^77, adding to the body of evidence that functional folds require highly extraordinary sequences.

What all this means is that Meyer never misused Axe’s paper. He merely restated Axe’s own peer-reviewed conclusions.

Turnabout Is Fair Play

Now turn the logic around. Venema, in his book, repeatedly uses specific examples of proteins that he thinks are evolvable in order to insist that, as a general matter, new complex protein structures are evolvable. He often argues from the specific to the general, tacitly conceding that Axe’s (and Meyer’s) form of argument is legitimate. (Note, however, as we saw earlier, that Venema has not shown that the specific proteins he discusses are evolvable, nor has he demonstrated the origin of any complex features. His specific examples fail to support his general argument that complex features can evolve.)

The point is that it is perfectly legitimate in science to use one case as a model for analysis of a general problem. This is done all the time. Axe’s generalization of results follows the tradition of many similar papers, which came to similar conclusions about the rarity of functional protein sequences, and applied their results broadly. For example:

  • Reidhaar-Olson and Sauer 1990 (published in the journal Proteins) mutated the λ-repressor in E. coli and found that only one in 10^63 sequences yields a functional repressor fold. They generalized the implications of their results for how we predict protein structure in other cases, writing: “The high level of degeneracy involved in protein folding suggests that the most fruitful approaches to structure prediction will concentrate on those residues that are informationally rich.”
  • Yockey 1977 (published in the Journal of Theoretical Biology) calculated that the likelihood of generating a functional cytochrome c sequence is one in 10^65. He generalized this result to conclude that many proteins are not evolvable, and even concluded that standard mechanisms of abiogenesis could not produce such features on a reasonable timescale. He wrote that “belief in currently accepted scenarios of spontaneous biogenesis is based on faith, contrary to conventional wisdom.”
  • Hayashi et al. 2006 (published in PLOS ONE) determined that 10^70 trials would be necessary to acquire the wild-type function of the g3p minor coat protein of the fd phage. They generalized their inferred fitness landscape results to other cases, and wrote: “The landscape structure has a number of implications for initial functional evolution of proteins and for molecular evolutionary engineering.” However, because reaching higher fitness levels required scaling much steeper fitness functions (i.e., functional sequences were very rare), they concluded, as a general matter: “In molecular evolutionary engineering, larger library size is generally favorable for reaching higher stationary fitness.”

So Axe’s paper generalizes its results much as prior papers have done. Indeed, he reports that one paper found the likelihood of a sequence generating chorismate mutase was much higher (10^-40) than his result of 10^-64 for beta-lactamase. However, he notes that “there is no reason to think the two estimates are inconsistent” since the fold studied by the chorismate mutase paper was a much simpler kind of fold. Axe writes that “it is important that enzyme folds of more typical complexity be examined,” which is how he designed his study.

Thus, Axe justifies the extrapolation of his results to other proteins by noting the similar level of complexity to what we typically see:

Although predominantly composed of alpha-helices, this domain contains small sheet regions and significant loop structure which, along with its size (just over 150 amino acid residues), make its complexity more representative of known domain folds. [Emphasis added.]

Forward and Reverse

Axe is aware of Venema’s sort of criticisms and addresses them in his paper. From the outset Axe observes that there are two types of mutagenesis experiments:

  1. “Forward” studies, which start with a library of random sequences and then seek to determine whether there are any that have the functions or properties of functional proteins.
  2. “Reverse” studies, which start with a natural functional protein and then mutate it to determine its tolerance to changes.

Axe points out that both types of experiments have advantages and drawbacks. Type (1) (“forward”) experiments usually fail to produce sequences that clearly resemble natural proteins. But Type (2) (“reverse”) experiments “may fail to take account of sequences having the relevant functional properties in a very rudimentary form.”

Venema apparently thinks that Axe is performing naïve Type (2) experiments and thus makes the same criticism of Type (2) experiments that Axe himself makes, as if Axe weren’t aware of the criticism and had not addressed it. Venema thus writes:

Axe’s experiment starts with one defined protein and modifies that. The issue here is that there are other beta-lactamases that are not similar to the one Axe chose as his test subject. These other proteins function as beta-lactamases with an amino acid sequence very different from the test enzyme. So there were surely sequences in the experiment that could function as a beta-lactamase, but not as part of the beta-lactamase structure Axe chose. In other words, his experiment would miss sequences even for the function under consideration.

(Adam and the Genome, p. 84)

But as we’ve already seen, Axe is well aware of these issues, for he discusses them and even designed his experiments to accommodate them. Axe writes in his paper:

[S]ince many different folds might be comparably suited to any given enzymatic function, it is important that we have some way to factor this in. In other words, if the prevalence of sequences performing a particular function enzymatically is our primary interest, then our analysis must not presume the necessity of any particular fold.

In fact, Axe’s mutagenesis experiments specifically accommodated the fact that other beta-lactamases might have very different sequences. He writes:

By making use of sequence information from numerous related beta-lactamases, it is possible to frame the analysis of this single fold in such a way that it illuminates the key aspects of the sequence-function relationship that must be explored in order to assess the overall prevalence of enzymatic function.

Axe adopted an approach similar to a prior mutagenesis study of chorismate mutase. After recounting some of the problems facing Type (2) experiments, Axe writes:

How might the other difficulties be avoided? A recent study of the requirements for chorismate mutase function in vivo demonstrates a promising approach. Chorismate mutase gene libraries prepared in that work were constrained to preserve all active-site residues and the sequential arrangement of hydrophobic and hydrophilic side-chains present in a natural version of the enzyme. Within these constraints, though, specific residue assignments were essentially random, resulting in numerous disruptive changes throughout the encoded proteins. This is an example of the reverse approach, in that it uses a natural sequence as a starting point but, because the produced variants carry extensive disruption throughout the structure rather than just local disruption, they provide reliable information on the stringency of functional requirements.

Axe’s mutagenesis experiments on beta-lactamase follow a similar approach:

As in the chorismate mutase study, disruptive substitutions throughout the large domain will provide a marginally adequate sequence context in which to assess the requirements for low-level function.

Thus, it is Venema’s criticisms of Axe’s experimental design that fall short, not Axe’s experiments.

Common and False Objections

But the most common (though false) objections that Axe routinely faces show up in Venema’s book as well. Venema writes:

First off, recall that Axe’s experiment used a mutated, barely functional protein because it is well known that natural proteins are able to tolerate many mutations without losing their function. It’s not controversial to state that had Axe used a normal, stable protein, he would have found far more “functional” proteins in his experiment.

(Adam and the Genome, pp. 83-84)

This is a misguided critique. Axe didn’t want to measure how many mutations a wild-type protein could tolerate. He wanted to determine how hard it is to go from a non-functional but signature-compliant fold to a functional one. In other words, he was asking about the boundary conditions for having a functional fold — what proportion of proteins with the right hydrophobic/hydrophilic profile were actually functional? This can only be measured at the border between the two conditions.

Second, Venema complains that Axe mutated his proteins in groups:

Also, recall that Axe substituted several amino acids at a time in his tests. There simply wouldn’t be time to try every possible single mutation, double mutation, triple mutation, and so on. Many of those small mutations would not be expected to remove the residual function of the engineered protein as well. Evolution, as we have seen, typically works via single mutations, not numerous simultaneous ones. Axe’s experiment needed to use multiple, simultaneous mutations in order to reduce the number of samples, but this means his setup is less relevant to the question of how evolution might build new protein structures over time. In fact, biologists expect that simultaneous mutation of multiple amino acids will diminish function — especially for a protein that is already mutated to the point of barely functioning in the first place. (p. 84)

Here, Venema appears to have totally missed the point of Axe’s experiment. (He’s not the only one.) Axe wasn’t trying to determine how to build a protein. He wasn’t measuring how evolution worked. He simply wanted to see what proportion of mutated proteins was stable enough to carry out a beta-lactamase reaction. To do this, he measured how many proteins could function after small associated groups of amino acids were changed in them. Since protein folds are stabilized by a network of side chain interactions, destabilizing one small part of that network should give an accurate read on how critical that region is to stability. The groups were randomized, while preserving key hydrophobic and hydrophilic residues, thus giving the best chance at preserving the overall structure. Once the results for each of four groups were determined, an estimate for the entire domain was calculated. In essence, the experiment determined the threshold for a stable functional fold capable of carrying out the beta-lactamase reaction, like crawling out of a non-functional sea onto the functional shore. If anything, the use of multiple mutations would give a boost to the transition.
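For readers who want a feel for how per-group results can be scaled up to a whole-domain figure, here is a rough sketch in Python. It is not Axe’s actual calculation. It assumes, for illustration only, that each position contributes independently, so that an average per-position "pass rate" can be taken from the groups and then applied across the domain. The four fractions and the group size of 10 are hypothetical placeholders, not data from the paper, and the domain size of 150 simply rounds the "just over 150 amino acid residues" mentioned above. The function name is likewise invented for this sketch.

```python
import math


def log10_domain_prevalence(group_fractions, group_size, domain_size):
    """Extrapolate per-group functional fractions to a whole domain.

    Illustrative assumption: positions contribute independently, so the
    log-fraction per position (averaged over all randomized positions)
    can be multiplied by the number of positions in the domain.
    Returns log10 of the estimated prevalence of functional sequences.
    """
    if not group_fractions:
        raise ValueError("need at least one group fraction")
    if any(f <= 0 or f > 1 for f in group_fractions):
        raise ValueError("fractions must be in the interval (0, 1]")

    randomized_positions = group_size * len(group_fractions)
    # Work in log space: the final numbers are far too small to be
    # handled comfortably as ordinary products.
    total_log10 = sum(math.log10(f) for f in group_fractions)
    log10_per_position = total_log10 / randomized_positions
    return log10_per_position * domain_size


# Hypothetical placeholder values, NOT measurements from Axe (2004).
fractions = [1e-4, 1e-5, 1e-3, 1e-4]
estimate = log10_domain_prevalence(fractions, group_size=10, domain_size=150)
print(f"roughly 1 in 10^{-estimate:.0f} under these made-up inputs")
```

The point of the sketch is only the shape of the reasoning: modest-looking fractions for small groups compound into a very small number once they are extended over a domain of typical size. Whether that extension is justified is exactly what Axe’s paper argues for and what Venema disputes.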

Lastly, it’s worth taking a look at Venema’s closing words in his section on Axe’s research:

If scientists could observe such an event [the mechanistic origin of a new protein fold], then it would indicate that Axe’s math (and Meyer’s use of it) is not a reliable estimate for the prevalence of functional protein folds. Interestingly, there are many known cases of exactly this, though Meyer does not seem to be aware of them, or of the implications they hold for his line of argument. Let’s examine a few in detail. (pp. 84-85)

The main example Venema then gives is the origin of nylonase, which, as we’ve already seen in this series, is NOT an example of the spontaneous evolution of a new protein fold. Venema misinterpreted the data.

Venema also mentions the origin of p24-2, even though there’s no evidence that p24-2 involved the origin of a new protein fold.

What’s really curious is that Venema frames his whole discussion as another attempt to impugn Meyer’s credibility. He writes, “Meyer does not seem to be aware” of the supposedly many examples that show new proteins evolving. Yet in his 2013 book Darwin’s Doubt, Meyer spends an entire chapter going over alleged examples of the evolution of new genes that are similar to nylonase and p24-2. In Chapter 11, “Assume a Gene,” Meyer examines the following:

  • The origin of an antifreeze protein in Antarctic fish.
  • The origin of RNASE1B, a digestive protein in colobine monkeys.
  • The origin of Cid, a gene encoding a histone protein in fruit flies.
  • The origin of FOXP2, a gene involved in regulating brain development in various mammals.

In so doing, Meyer evaluates virtually all of the standard mechanisms invoked for evolving new genes, including gene duplication, exon shuffling, retropositioning of mRNA, lateral gene transfer, mobile genetic elements, gene fission/fusion, and de novo gene origination. He does not ignore the literature pertaining to the evolution of new genes, and even discusses about as many examples of gene evolution in his book Darwin’s Doubt as Venema discusses in Adam and the Genome.

Venema doesn’t just disagree with Meyer’s arguments. He wants to paint Meyer as ignorant, incompetent, and disingenuous. But the relevant science is on Meyer’s side, not Venema’s. This, of course, leaves aside the question of what Steve Meyer is doing at all in a book about the existence of Adam and Eve, a topic he doesn’t even write about.

Photo: Doug Axe, in “The Problem with Theistic Evolution,” via Crossway Books.