TY - JOUR
T1 - Efficient Algorithms for Counting and Reporting Segregating Sites in Genomic Sequences
AU - Christodoulakis, Manolis
AU - Golding, G. Brian
AU - Iliopoulos, Costas S.
AU - Ardila, Yoan José Pinzón
AU - Smyth, William F.
PY - 2007/9
Y1 - 2007/9
N2 - The number of segregating sites provides an indicator of the degree of DNA sequence variation that is present in a sample, and has been of great interest to the biological, pharmaceutical and medical professions. In this paper, we first provide linear- and expected-sublinear-time algorithms for finding all the segregating sites of a given set of DNA sequences. We also describe a data structure for tracking segregating sites in a set of sequences, such that every time the set is updated with the insertion of a new sequence or removal of an existing one, the segregating sites are updated accordingly without the need to re-scan the entire set of sequences.
AB - The number of segregating sites provides an indicator of the degree of DNA sequence variation that is present in a sample, and has been of great interest to the biological, pharmaceutical and medical professions. In this paper, we first provide linear- and expected-sublinear-time algorithms for finding all the segregating sites of a given set of DNA sequences. We also describe a data structure for tracking segregating sites in a set of sequences, such that every time the set is updated with the insertion of a new sequence or removal of an existing one, the segregating sites are updated accordingly without the need to re-scan the entire set of sequences.
KW - Segregating sites
KW - Single nucleotide polymorphisms (SNPs)
UR - https://www.scopus.com/pages/publications/34548770719
UR - https://journals.sagepub.com/doi/epdf/10.1089/cmb.2006.0136?src=getftr&utm_source=scopus&getft_integrator=scopus
U2 - 10.1089/cmb.2006.0136
DO - 10.1089/cmb.2006.0136
M3 - Article
C2 - 17803376
AN - SCOPUS:34548770719
SN - 1066-5277
VL - 14
SP - 1001
EP - 1010
JO - Journal of Computational Biology
JF - Journal of Computational Biology
IS - 7
ER -