Condensed Matter > Statistical Mechanics
Title: Optimal Detection of Sequence Similarity by Local Alignment
(Submitted on 6 Dec 1997 (v1), last revised 12 Feb 1998 (this version, v2))
Abstract: The statistical properties of local alignment algorithms with gaps are analyzed theoretically for uncorrelated and correlated DNA sequences. In the vicinity of the log-linear phase transition, the statistics of alignment with gaps is shown to be characteristically different from that of gapless alignment. The optimal scores obtained for uncorrelated sequences obey certain robust scaling laws. Deviation from these scaling laws signals sequence homology, and can be used to guide the empirical selection of scoring parameters for the optimal detection of sequence similarities. This can be accomplished in a computationally efficient way by using a novel approach focusing on the score landscape. Furthermore, by assuming a few gross features characterizing the statistics of underlying sequence-sequence correlations, quantitative criteria are obtained for the choice of optimal scoring parameters: Optimal similarity detection is most likely to occur in a region close to the log side of the log-linear phase transition.
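For readers unfamiliar with local alignment with gaps, the quantity whose statistics the abstract discusses can be illustrated with a minimal sketch of the standard Smith-Waterman recursion. The function name, the linear gap cost, and the default scoring values below are illustrative assumptions; they are not the scoring parameters or the score-landscape method studied in the paper.

```python
def local_alignment_score(a, b, match=1.0, mismatch=-1.0, gap=-2.0):
    """Return the best local alignment score of sequences a and b.

    Uses the Smith-Waterman recursion with a linear gap cost.
    The default scoring values are illustrative assumptions only.
    """
    # prev holds the previous row of the score matrix; row 0 is all zeros.
    prev = [0.0] * (len(b) + 1)
    best = 0.0
    for i in range(1, len(a) + 1):
        curr = [0.0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0.0,                 # start a new local alignment here
                prev[j - 1] + pair,  # align a[i-1] with b[j-1]
                prev[j] + gap,       # gap in b
                curr[j - 1] + gap,   # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(local_alignment_score("TTACGTTT", "GGACGGG"))
```

Varying `match`, `mismatch`, and `gap` changes how the optimal score grows with sequence length, which is the kind of parameter dependence the abstract refers to when it discusses the choice of scoring parameters.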
Submission history
From: Terence Hwa
[v1] Sat, 6 Dec 1997 05:22:38 GMT (229kb)
[v2] Thu, 12 Feb 1998 09:32:52 GMT (241kb)