Title: Analysis of DNA sequences through local distribution of nucleotides in strategic neighborhoods

Abstract: We construct a compact vector representation in $\mathbb{R}^{24}$ of a DNA sequence of arbitrary length. Each component of this vector is obtained from a representative sequence, the elements of which are the values realized by a function $\Gamma$. The function $\Gamma$, so defined, acts on neighborhoods of arbitrary radius that are located at strategic positions within the DNA sequence. $\Gamma$ carries complete information about the local multiplicity of the nucleotides as a consequence of the uniqueness of the prime factorisation of integers. The two parameters characterizing the radius and location of the neighborhoods are fixed by comparing the phylogenetic tree we find through our algorithm with standard results for the $\beta$-globin gene sequences of eleven different species. Remarkably, the time complexity for this similarity analysis turns out to be $\mathcal{O}(n)$. Using the values of the two fitting parameters so obtained, the method is further applied to analyze mitochondrial genome sequences.
Comments: 9 pages, 4 figures
Subjects: Data Structures and Algorithms (cs.DS)
Cite as: arXiv:2303.14994 [cs.DS]
  (or arXiv:2303.14994v1 [cs.DS] for this version)
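
The abstract says that $\Gamma$ retains the local multiplicity of nucleotides because integers factor uniquely into primes, but it does not give the definition of $\Gamma$. The sketch below only illustrates that general idea and should not be read as the authors' construction. It assumes each nucleotide is assigned a distinct prime and that a neighborhood is summarized by the product of those primes. The names `PRIMES`, `gamma_sketch` and `multiplicities`, and the particular prime assignment, are hypothetical.

```python
# Hypothetical assignment of a distinct prime to each nucleotide (an assumption,
# not taken from the paper).
PRIMES = {"A": 2, "C": 3, "G": 5, "T": 7}


def gamma_sketch(seq, center, radius):
    """Multiply the primes of all nucleotides within `radius` of `center`.

    The window is clipped at the start and end of the sequence.
    """
    start = max(0, center - radius)
    window = seq[start:center + radius + 1]
    value = 1
    for base in window:
        value *= PRIMES[base]
    return value


def multiplicities(value):
    """Recover the nucleotide counts from the product by repeated division.

    Unique prime factorisation guarantees the counts are determined by `value`.
    """
    counts = {}
    for base, p in PRIMES.items():
        k = 0
        while value % p == 0:
            value //= p
            k += 1
        counts[base] = k
    return counts


if __name__ == "__main__":
    v = gamma_sketch("ACGTTAGC", center=3, radius=2)  # window is "CGTTA"
    print(v, multiplicities(v))
```

Under these assumptions a single integer per neighborhood is enough to reconstruct how many times each nucleotide occurs in it, although the order of the nucleotides within the window is not retained.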

Submission history

From: Probir Mondal
[v1] Mon, 27 Mar 2023 08:39:14 GMT (244kb,D)
