PEDEL: Caveats

Things to be wary of when using PEDEL:

PEDEL uses a generic Poisson model of sequence mutations. There are several simplifications that you should be aware of:

All base substitutions are assumed equally likely. In reality, under error-prone conditions, the polymerase favours some substitutions over others, which reduces the expected number of distinct sequences relative to the PEDEL predictions. This is in fact not as big an issue as you might expect. Using the notation from the 'detailed statistics' page (see link on base PEDEL server page), bias matters little when the number of possible variants Vx is much greater than the sub-library size Lx (i.e. large x values), since there are then so many possible variants that there is little duplication within the sub-library even if the bias is strong. Conversely, if Lx is much greater than Vx (i.e. small x values) then, unless the bias is very strong, nearly all the possible variants will still be sampled. Note that it is now possible to produce unbiased libraries by using sequential PCR amplifications with two different polymerases that have opposite substitution biases.
Amplification bias is inherent to the PCR process used to produce epPCR libraries: any mutation introduced in an early PCR cycle will be present in a significant fraction of the final library. In practice, researchers use a variety of techniques to reduce amplification bias, e.g. reducing the number of epPCR cycles and combining a number of individual libraries. For example, one might start with 10^9 identical parent sequences; amplify them in an epPCR to 10^15 sequences; and, after ligation and transformation of E. coli, end up with a library of 10^7 sequences. Any mutation propagated by amplification bias would then have a maximum frequency of only about 1 in 10^9, i.e. an expected 10^7/10^9 = 0.01 copies among the 10^7 library members, so it would not show up in the final library.
During the PCR cycles, different parent sequences may be amplified a different number of times. However, empirically, the end result is a library with a Poisson distribution of mutations (e.g. Cadwell R.C., Joyce G.F., 1992, Randomization of genes by PCR mutagenesis, PCR Methods Appl., 2, 28-33). But see also this note.
Any biases in library construction will decrease the actual number of distinct variants represented in the library. In such cases, PEDEL provides the user with a useful upper bound on the diversity present in the library.
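
The argument about substitution bias above can be checked numerically with a small sketch. This is not PEDEL's own code: it is a toy model in which half of the possible variants in a sub-library are assumed to be `bias_ratio` times as likely as the other half (the function names, the two-class bias model, and the values of V_x, L_x and the 3-fold ratio are all illustrative assumptions). It compares the expected number of distinct variants with and without the bias.

```python
import math

def expected_distinct(probs, library_size):
    """Expected number of distinct variants among `library_size` independent
    draws, where variant i is drawn with probability probs[i]."""
    # Variant i is missed by every draw with probability (1 - p_i)^L, so it is
    # seen at least once with probability 1 - (1 - p_i)^L (computed stably).
    return sum(-math.expm1(library_size * math.log1p(-p)) for p in probs)

def two_class_probs(n_variants, bias_ratio):
    """Hypothetical bias model: the first half of the variants is `bias_ratio`
    times as likely as the second half. bias_ratio=1.0 gives the uniform case."""
    half = n_variants // 2
    weights = [bias_ratio] * half + [1.0] * (n_variants - half)
    total = sum(weights)
    return [w / total for w in weights]

def bias_penalty(n_variants, library_size, bias_ratio=3.0):
    """Ratio of expected distinct variants (biased / unbiased)."""
    uniform = expected_distinct(two_class_probs(n_variants, 1.0), library_size)
    biased = expected_distinct(two_class_probs(n_variants, bias_ratio), library_size)
    return biased / uniform

# Illustrative values only: V_x possible variants, three sub-library sizes L_x.
V_x = 10_000
for L_x in (100, 10_000, 1_000_000):
    print(f"L_x/V_x = {L_x / V_x:g}: biased/unbiased = {bias_penalty(V_x, L_x):.4f}")
```

Under these assumptions the ratio stays very close to 1 when L_x is much smaller or much larger than V_x, and the loss of diversity is largest when L_x and V_x are comparable, which is the regime where the PEDEL prediction is best read as an upper bound.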

Please refer to Patrick W.M., Firth A.E., Blackburn J.M., 2003, User-friendly algorithms for estimating completeness and diversity in randomized protein-encoding libraries, Protein Eng., 16, 451-457 for further discussion of PEDEL.

A good review of the sources of bias in epPCR (and other directed evolution protocols) can be found in Neylon C., 2004, Chemical and biochemical strategies for the randomization of protein encoding DNA sequences: library construction methods for directed evolution, Nucleic Acids Res., 32, 1448-1459.
