**Poisson versus PCR distributions:**

In our paper (Firth & Patrick, 2005), we assumed a Poisson
distribution to determine the fraction of sequences in an epPCR
library that contain exactly 0, 1, 2, 3, ... mutations, given the mean
number of mutations, *m*, per sequence.
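
As a minimal sketch of the Poisson calculation described above (not the PEDEL source code; the function name `poisson_fractions` and the example value *m* = 4 are chosen here for illustration only), the fraction of sequences with exactly *k* mutations is exp(−*m*) *m*^*k* / *k*!:

```python
import math

def poisson_fractions(m, k_max):
    """Return [P(0), P(1), ..., P(k_max)] for a Poisson distribution with mean m."""
    return [math.exp(-m) * m**k / math.factorial(k) for k in range(k_max + 1)]

# Illustrative value only: a mean of 4 mutations per sequence.
fractions = poisson_fractions(4.0, 10)
for k, f in enumerate(fractions):
    print(f"{k} mutations: {f:.4f}")
```

The fractions sum to 1 only in the limit of large `k_max`; for small *m* the tail beyond ten or so mutations is negligible.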

Since publication of Firth & Patrick (2005), however, Drummond et
al. (2005) have revisited the pioneering work of Sun (1995) and
provided experimental evidence in support of his more accurate
equation describing the distribution of the number of mutations per
sequence. This 'PCR distribution' takes into account the number of PCR
thermal cycles, *ncycles*, and the PCR efficiency, *eff* (i.e. the
probability that any particular sequence is duplicated in a given PCR
cycle). We have therefore now included the PCR distribution as an
optional alternative to the Poisson distribution in PEDEL.

For large *m*, small *ncycles*, or low *eff*, the PCR
distribution is broader than the Poisson distribution. For small
*m*, large *ncycles* and high *eff*, the PCR
distribution approximates the Poisson distribution.
In a 'typical' epPCR (e.g. *ncycles* = 30, *eff* = 0.6,
*m* = 4), the estimated total number of distinct sequences in a
library usually agrees to within 5% for the two distributions, though
the sub-library statistics can show more variation.
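
The comparison above can be illustrated with a short sketch. It assumes the PCR distribution takes the mixture form we understand from Sun (1995) and Drummond et al. (2005): a sequence in the final library has been through *j* error-prone duplications, with *j* binomially distributed over *ncycles* cycles, and carries a Poisson-distributed number of mutations with mean *j*·*x*, where *x* = *m*(1 + *eff*) / (*ncycles* × *eff*). Check this form against those papers before relying on it. The function names are hypothetical and this is not the PEDEL implementation.

```python
import math

def poisson_pmf(k, mean):
    """Poisson probability of exactly k mutations for the given mean."""
    return math.exp(-mean) * mean**k / math.factorial(k)

def pcr_pmf(k, m, ncycles, eff):
    """Assumed PCR distribution: a binomial mixture of Poisson distributions."""
    x = m * (1 + eff) / (ncycles * eff)  # mean mutations per duplication
    total = 0.0
    for j in range(ncycles + 1):
        # Probability that a final sequence went through j duplications.
        weight = math.comb(ncycles, j) * eff**j / (1 + eff)**ncycles
        total += weight * poisson_pmf(k, j * x)
    return total

def mean_and_variance(pmf, k_max=60):
    """Mean and variance of a pmf, truncated at k_max mutations."""
    probs = [pmf(k) for k in range(k_max + 1)]
    mean = sum(k * p for k, p in enumerate(probs))
    var = sum((k - mean) ** 2 * p for k, p in enumerate(probs))
    return mean, var

# The 'typical' epPCR values quoted in the text.
m, ncycles, eff = 4.0, 30, 0.6
pois_mean, pois_var = mean_and_variance(lambda k: poisson_pmf(k, m))
pcr_mean, pcr_var = mean_and_variance(lambda k: pcr_pmf(k, m, ncycles, eff))
print(f"Poisson: mean = {pois_mean:.3f}, variance = {pois_var:.3f}")
print(f"PCR:     mean = {pcr_mean:.3f}, variance = {pcr_var:.3f}")
```

Under this assumed form, both distributions have mean *m*, while the PCR variance is *m* + *m*^2 / (*ncycles* × *eff*). This is consistent with the PCR distribution being broader for large *m*, small *ncycles* or low *eff*.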

If you know *ncycles* and *eff*, then we recommend that you
use the PCR distribution instead of the Poisson distribution.
To estimate *eff*, Drummond et al. (2005) use the formula *d* =
*ncycles* × *eff*, where *d* is the number of
doublings. For example, if you start with 10^9 identical parent
sequences and amplify them in an epPCR to 10^15 sequences, then you
have had about *d* = 20 doublings (10^9 × 2^20 ≈ 10^15),
and that formula gives *eff* = *d* ÷ *ncycles*.
However, the *d* = *ncycles* × *eff* formula is
incorrect. Each cycle multiplies the expected number of sequences by
(1 + *eff*), so the correct formula is
2^*d* = (1 + *eff*)^*ncycles*, and the
efficiency is given by *eff* = 2^(*d*/*ncycles*) − 1
(PCR efficiency calculator).
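
The efficiency calculation can be sketched as follows. The function name `pcr_efficiency` is hypothetical, and *ncycles* = 30 is an assumed value, since the worked example above does not state the number of cycles.

```python
import math

def pcr_efficiency(n_start, n_end, ncycles):
    """Solve 2^d = (1 + eff)^ncycles for eff, where d = log2(n_end / n_start)."""
    d = math.log2(n_end / n_start)  # number of doublings
    return 2 ** (d / ncycles) - 1

# 10^9 parent sequences amplified to 10^15; ncycles = 30 is an assumption.
n_start, n_end, ncycles = 1e9, 1e15, 30
d = math.log2(n_end / n_start)
eff = pcr_efficiency(n_start, n_end, ncycles)
print(f"doublings d = {d:.2f}")
print(f"eff = {eff:.3f}")
print(f"d / ncycles would give {d / ncycles:.3f}")
```

With these inputs, *eff* comes out at about 0.58, whereas *d* ÷ *ncycles* would give about 0.66.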

**References**

Drummond D.A., Iverson B.L., Georgiou G., Arnold F.H. (2005). Why
high-error-rate random mutagenesis libraries are enriched in
functional and improved proteins, *J. Mol. Biol.*,
**350**, 806-816.

Firth A.E., Patrick W.M. (2005). Statistics of protein library
construction, *Bioinformatics*, **21**, 3314-3315.

Sun F. (1995). The polymerase chain reaction and branching
processes, *J. Comput. Biol.*, **2**, 63-86.
