Poisson versus PCR distributions:

In our paper (Firth & Patrick, 2005), we assumed a Poisson distribution to determine the fraction of sequences in an epPCR library that contain exactly 0, 1, 2, 3, ... mutations, given the mean number of mutations, m, per sequence.
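As a minimal sketch of this Poisson assumption, the fraction of sequences with exactly k mutations can be computed as e^(-m) m^k / k!. The function name `poisson_fraction` and the value m = 4 are chosen here for illustration only and are not part of PEDEL.

```python
import math

def poisson_fraction(k: int, m: float) -> float:
    """Fraction of sequences with exactly k mutations, given mean m (Poisson)."""
    return math.exp(-m) * m**k / math.factorial(k)

# Illustrative value only: mean of 4 mutations per sequence.
m = 4.0
for k in range(6):
    print(k, round(poisson_fraction(k, m), 4))
```

Summing the fractions over all k gives 1, and the k = 0 term, e^(-m), is the fraction of the library that is unmutated parent sequence.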

Since publication of Firth & Patrick (2005), however, Drummond et al. (2005) have revisited the pioneering work of Sun (1995) and provided experimental evidence in support of his more accurate equation describing the distribution of the number of mutations per sequence. This 'PCR distribution' takes into account the number of PCR thermal cycles, ncycles, and the PCR efficiency, eff (i.e. the probability that any particular sequence is duplicated in a given PCR cycle). We have therefore now included the PCR distribution as an optional alternative to the Poisson distribution in PEDEL.

For large m, small ncycles, or low eff, the PCR distribution is broader than the Poisson distribution. For low m, large ncycles and large eff, the PCR distribution approximates the Poisson distribution. In a 'typical' epPCR (e.g. ncycles = 30, eff = 0.6, m = 4), the estimated total number of distinct sequences in a library typically agrees to within 5% for the two distributions, though the sub-library statistics can show more variation.
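The comparison above can be sketched numerically. The code below assumes the PCR distribution takes the form given in Drummond et al. (2005), namely a mixture of Poisson distributions weighted by the binomial number of cycles j in which a sequence's lineage was copied: P(k) = (1+eff)^(-ncycles) Σ_j C(ncycles, j) eff^j (jx)^k e^(-jx) / k!, with x = m (1+eff) / (ncycles × eff). This form is stated here from memory of that paper and should be checked against it; the function names are hypothetical and this is not the PEDEL source code.

```python
import math

def pcr_fraction(k: int, m: float, ncycles: int, eff: float) -> float:
    """Fraction of sequences with exactly k mutations under the assumed PCR distribution."""
    # x is the mean number of mutations added per replication of a lineage.
    x = m * (1.0 + eff) / (ncycles * eff)
    total = 0.0
    for j in range(ncycles + 1):
        # Probability that a final sequence was copied in exactly j of the cycles.
        weight = math.comb(ncycles, j) * eff**j / (1.0 + eff) ** ncycles
        lam = j * x  # Poisson mean for a lineage copied j times
        total += weight * math.exp(-lam) * lam**k / math.factorial(k)
    return total

def mean_and_variance(pmf, kmax: int = 80):
    """Mean and variance of a distribution over k = 0..kmax (tail assumed negligible)."""
    probs = [pmf(k) for k in range(kmax + 1)]
    mean = sum(k * p for k, p in enumerate(probs))
    var = sum((k - mean) ** 2 * p for k, p in enumerate(probs))
    return mean, var

# 'Typical' epPCR values quoted in the text above.
m, ncycles, eff = 4.0, 30, 0.6
mean, var = mean_and_variance(lambda k: pcr_fraction(k, m, ncycles, eff))
print(mean, var)  # a Poisson distribution with the same mean would have variance m
```

Under this assumed form the mean is m and the variance works out to m + m^2 / (ncycles × eff), which exceeds the Poisson variance m by an amount that grows with m and shrinks as ncycles or eff increase, in line with the behaviour described above.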

If you know ncycles and eff, then we recommend that you use the PCR distribution instead of the Poisson distribution. If you do not know eff directly, you can estimate it from the number of doublings, d. For example, if you start with 10^9 identical parent sequences and amplify them in an epPCR to 10^15 sequences, then you have had about d = 20 doublings (10^9 × 2^20 ~= 10^15). Drummond et al. (2005) use the formula d = ncycles × eff, which would give eff = d ÷ ncycles. However, that formula is not correct: on average each cycle multiplies the number of sequences by (1 + eff), so the correct relation is 2^d = (1 + eff)^ncycles, and the efficiency is given by eff = 2^(d/ncycles) - 1 (see the PCR efficiency calculator).
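The efficiency calculation can be sketched as follows, using the relation 2^d = (1 + eff)^ncycles. The function names are hypothetical, and ncycles = 30 is an assumed value for illustration (the example above does not specify the number of cycles).

```python
import math

def doublings(n_start: float, n_end: float) -> float:
    """Number of doublings d needed to go from n_start to n_end sequences."""
    return math.log2(n_end / n_start)

def pcr_efficiency(d: float, ncycles: int) -> float:
    """PCR efficiency from 2^d = (1 + eff)^ncycles."""
    return 2.0 ** (d / ncycles) - 1.0

d = doublings(1e9, 1e15)       # about 19.93 doublings
eff = pcr_efficiency(d, 30)    # assumed ncycles = 30
print(round(d, 2), round(eff, 3))  # 19.93 0.585
```

For comparison, the linear formula eff = d ÷ ncycles would give about 0.664 for the same inputs, so it overestimates the efficiency here.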


References

  • Drummond D.A., Iverson B.L., Georgiou G., Arnold F.H. (2005). Why high-error-rate random mutagenesis libraries are enriched in functional and improved proteins, J. Mol. Biol., 350, 806-816.
  • Firth A.E., Patrick W.M. (2005). Statistics of protein library construction, Bioinformatics, 21, 3314-3315.
  • Sun F. (1995). The polymerase chain reaction and branching processes, J. Comput. Biol., 2, 63-86.

