Things to be wary of when using PEDEL:
PEDEL uses a generic Poisson model of sequence mutations.
It makes several simplifications that you should be aware of:
- All base substitutions are assumed equally likely. In reality,
under error-prone conditions, the polymerase favours some
substitutions over others. This bias reduces the expected number of
distinct sequences relative to the PEDEL predictions, but the effect
is smaller than you might expect. Using the notation from the
'detailed statistics' page (see link on base PEDEL server page), bias
matters little when the number of possible variants V_x is much
greater than the sub-library size L_x (i.e. large x values), since
there are then so many possible variants that there is little
duplication within the sub-library even under strong bias.
Conversely, if L_x is much greater than V_x (i.e. small x values)
then, unless the bias is very strong, nearly all the possible
variants will still be sampled. Note that it is now
possible, by using sequential PCR amplifications with two different
polymerases that have opposite substitution biases, to produce
unbiased libraries.
- Amplification bias is inherent to the PCR process used to produce
epPCR libraries: any mutation introduced in an early PCR cycle will
be present in a significant fraction of the final library. In
practice, researchers use a variety of techniques to reduce
amplification bias, e.g. reducing the number of epPCR cycles and
combining a number of individual libraries. For example, one might
start with 10^9 identical parent sequences; amplify them in an epPCR
to 10^15 sequences; and, after ligation and transformation of
E. coli, end up with a library of 10^7 sequences. Any mutation
introduced in an early cycle would then have a maximum frequency of
only 1 in 10^9, so the bias would not show up in the final library.
- During the PCR cycles, different parent sequences may be
amplified different numbers of times. However, empirically, the
end result is a library with a Poisson distribution of mutations
(e.g. Cadwell R.C., Joyce G.F., 1992, Randomization of genes by PCR
mutagenesis, PCR Methods Appl., 2, 28-33). But see
also this note.
- Any biases in library construction will decrease the actual
number of distinct variants represented in the library. In such
cases, PEDEL provides the user with a useful upper bound on the
diversity present in the library.
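
The first and last points above can be illustrated numerically. The following is a minimal sketch, not PEDEL's own code: it relies on the standard result that, when L sequences are drawn independently, a variant drawn with probability p is present with probability 1 - (1 - p)^L. The function names and the 4:1 bias pattern are hypothetical and chosen only for illustration; V and L stand in for the number of possible variants and the sub-library size.

```python
import math


def prob_seen(p, library_size):
    """Probability that a variant drawn with probability p (0 < p < 1)
    appears at least once among library_size independent draws."""
    # 1 - (1 - p)^L, written with log1p/expm1 to stay accurate for tiny p
    return -math.expm1(library_size * math.log1p(-p))


def expected_distinct_uniform(n_variants, library_size):
    """Expected distinct variants when all n_variants are equally likely."""
    return n_variants * prob_seen(1.0 / n_variants, library_size)


def expected_distinct_biased(probs, library_size):
    """Expected distinct variants when variant i has probability probs[i]."""
    return sum(prob_seen(p, library_size) for p in probs)


# Hypothetical example: V = 1000 variants, half of them 4x more likely
# than the other half (an assumed bias, not a measured one).
V = 1000
weights = [4.0] * (V // 2) + [1.0] * (V // 2)
total = sum(weights)
biased_probs = [w / total for w in weights]

for L in (10, 1000, 100000):  # L << V, L ~ V, L >> V
    uniform = expected_distinct_uniform(V, L)
    biased = expected_distinct_biased(biased_probs, L)
    print(f"L={L:>6}: uniform {uniform:8.1f}  biased {biased:8.1f}")
```

Because 1 - (1 - p)^L is concave in p, the biased total can never exceed the equal-probability total, which is why the latter acts as an upper bound. In this example the two figures nearly coincide when L is much smaller or much larger than V, and differ most when L is comparable to V.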
Please refer to Patrick W.M., Firth A.E., Blackburn J.M., 2003,
User-friendly algorithms for estimating completeness and diversity in
randomized protein-encoding libraries, Protein Eng., 16,
451-457 for further discussion of PEDEL.
A good review of the sources of bias in epPCR (and other directed
evolution protocols) can be found in Neylon C., 2004, Chemical and
biochemical strategies for the randomization of protein encoding DNA
sequences: library construction methods for directed evolution,
Nucleic Acids Res., 32, 1448-1459.
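
As a rough numerical check of the amplification-bias example given in the list above, the sketch below assumes, as that example does, that a mutation arising in one molecule during an early cycle reaches a frequency of at most 1 in N0 in the amplified pool, where N0 is the number of parent sequences, and that the final library is a random sample of that pool. The function names are hypothetical, and the second scenario (1000 parent sequences) is an invented contrast rather than a value from this page.

```python
import math


def max_early_mutation_frequency(n_parents):
    """Upper bound on the pool frequency of one early-cycle mutation,
    assuming it arises in a single molecule among n_parents templates."""
    return 1.0 / n_parents


def expected_copies_in_library(n_parents, library_size):
    """Expected number of library members carrying that early mutation,
    assuming the library is a random sample of the amplified pool."""
    return library_size * max_early_mutation_frequency(n_parents)


def prob_in_library(n_parents, library_size):
    """Poisson approximation to the chance the mutation appears at all."""
    return -math.expm1(-expected_copies_in_library(n_parents, library_size))


library_size = 10**7
for n_parents in (10**9, 10**3):  # page's example, then an assumed contrast
    copies = expected_copies_in_library(n_parents, library_size)
    chance = prob_in_library(n_parents, library_size)
    print(f"N0={n_parents:.0e}: expected copies {copies:g}, P(present) {chance:.3g}")
```

With the page's numbers the expected count is about 0.01 copies, so such a mutation will usually be absent from the library altogether. Under the assumed contrast of only 1000 templates, the same bound allows around ten thousand copies in the library.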