Condensed Matter > Statistical Mechanics
Title: Statistical linguistic study of DNA sequences
(Submitted on 7 Aug 2003)
Abstract: A new family of compound Poisson distribution functions from statistical linguistics is used to study the n-tuple and nucleotide composition features of DNA sequences. Studies of the relative frequency distributions of 6-tuple and 7-tuple occurrences suggest that most of the DNA sequences follow the general shape of the compound Poisson distribution. The $\chi$-square test further indicates that some of the sequences follow this distribution with a reasonable goodness of fit. The results of the compositional segmentation study are also fitted quite well by this new family of distribution functions. Furthermore, the absolute values of the relative frequencies come out naturally from the linguistic model without ambiguity. These results suggest that DNA sequences are not random and could possibly have subsequence structures.
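To make the n-tuple occurrence statistic concrete, here is a minimal Python sketch of one plausible reading of it. The abstract does not specify the counting procedure, so the overlapping sliding window, the four-letter alphabet, and the function names `ntuple_counts` and `occurrence_distribution` are assumptions made for illustration, not the authors' method. The sketch only tabulates the empirical distribution; fitting the compound Poisson family and running the $\chi$-square test are not shown.

```python
from collections import Counter


def ntuple_counts(sequence, n):
    """Count overlapping n-tuples (substrings of length n) in a sequence.

    Assumption: a sliding window with step 1; the abstract does not say
    whether overlapping or non-overlapping windows were used.
    """
    return Counter(sequence[i:i + n] for i in range(len(sequence) - n + 1))


def occurrence_distribution(sequence, n, alphabet="ACGT"):
    """Fraction of all possible n-tuples that occur exactly k times.

    Assumption: every character of the sequence belongs to `alphabet`,
    so there are len(alphabet) ** n possible n-tuples in total.
    """
    counts = ntuple_counts(sequence, n)
    total_types = len(alphabet) ** n
    # Histogram: k -> number of distinct n-tuples seen exactly k times.
    hist = Counter(counts.values())
    # n-tuples that never appear in the sequence occur zero times.
    hist[0] = total_types - len(counts)
    return {k: hist[k] / total_types for k in sorted(hist)}


if __name__ == "__main__":
    # Toy input only; a real study would read a full genomic sequence.
    toy = "ACGTACGT"
    for k, freq in occurrence_distribution(toy, 2).items():
        print(k, freq)
```

The resulting mapping from occurrence count k to relative frequency is the kind of empirical distribution that a candidate model could then be compared against.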