Note: This page is a work in progress; currently only the Poisson distribution option works.
Version 2008-01-20/06:37:22.
The programme is introduced in Wayne M. Patrick, Andrew E. Firth, Jonathan M. Blackburn, 2003, User-friendly algorithms for estimating completeness and diversity in randomized protein-encoding libraries, Protein Engineering, 16, 451-457, and Andrew E. Firth, Wayne M. Patrick, 2005, Statistics of protein library construction, Bioinformatics, 21, 3314-3315.
Problem: Given a library of L sequences, each a variant of a parent sequence of N nucleotides into which random point mutations have been introduced, we wish to calculate the expected number of distinct sequences in the library. Typical assumptions are L > 10, N > 5, and a mean number of mutations per sequence m < 0.1 × N.
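As a rough illustration of the problem above, the following Python sketch estimates the expected number of distinct sequences under assumptions that are not stated on this page: the number of mutations per sequence is Poisson distributed with mean m, each mutated position can change to any of the 3 other nucleotides, and all variants with the same number of mutations x are equally likely. There are then V_x = C(N, x) × 3^x possible variants with x mutations, about L_x = L × P(x) library members carry x mutations, and the expected number of distinct variants among them is approximated by V_x × (1 − exp(−L_x / V_x)). The function name `expected_distinct` and the cut-off `tol` are hypothetical, and the program itself may use a different or more refined calculation.

```python
import math


def expected_distinct(L, N, m, tol=1e-12):
    """Rough estimate of the expected number of distinct sequences.

    L: library size, N: sequence length in nucleotides,
    m: mean number of point mutations per sequence (Poisson, assumed).
    """
    if L <= 0 or N <= 0 or m <= 0:
        raise ValueError("L, N and m must all be positive")

    total = 0.0
    for x in range(N + 1):
        # Log of the Poisson probability of exactly x mutations.
        log_p = -m + x * math.log(m) - math.lgamma(x + 1)
        # Expected number of library members carrying x mutations.
        L_x = L * math.exp(log_p)
        # Past the Poisson mode the terms only shrink, so stop when negligible.
        if x > m and L_x < tol:
            break
        # Log of V_x = C(N, x) * 3**x, kept in log space to avoid overflow.
        log_V = (math.lgamma(N + 1) - math.lgamma(x + 1)
                 - math.lgamma(N - x + 1) + x * math.log(3.0))
        log_ratio = math.log(L) + log_p - log_V  # log(L_x / V_x)
        if log_ratio < -18.0:
            # Variant space is vast compared with L_x: essentially all distinct.
            total += L_x
        else:
            ratio = math.exp(min(log_ratio, 50.0))
            # V_x * (1 - exp(-L_x / V_x)), using expm1 for accuracy.
            total += math.exp(log_V) * -math.expm1(-ratio)
    return total
```

A call such as `expected_distinct(10**6, 1000, 5.0)` returns the estimate for a library of one million sequences of 1000 nucleotides with an average of 5 mutations each. The estimate never exceeds L, and for a very large library of short sequences it approaches the total number of possible sequences, 4^N.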
See also: