We gratefully acknowledge support from
the Simons Foundation and member institutions.
Full-text links:


Current browse context:


Change to browse by:

References & Citations

DBLP - CS Bibliography


(what is this?)
CiteULike logo BibSonomy logo Mendeley logo del.icio.us logo Digg logo Reddit logo ScienceWISE logo

Computer Science > Machine Learning

Title: SARS-CoV-2 virus RNA sequence classification and geographical analysis with convolutional neural networks approach

Authors: Selcuk Yazar
Abstract: Covid-19 infection, which spread to the whole world in December 2019 and is still active, caused more than 250 thousand deaths in the world today. Researches on this subject have been focused on analyzing the genetic structure of the virus, developing vaccines, the course of the disease, and its source. In this study, RNA sequences belonging to the SARS-CoV-2 virus are transformed into gene motifs with two basic image processing algorithms and classified with the convolutional neural network (CNN) models. The CNN models achieved an average of 98% Area Under Curve(AUC) value was achieved in RNA sequences classified as Asia, Europe, America, and Oceania. The resulting artificial neural network model was used for phylogenetic analysis of the variant of the virus isolated in Turkey. The classification results reached were compared with gene alignment values in the GISAID database, where SARS-CoV-2 virus records are kept all over the world. Our experimental results have revealed that now the detection of the geographic distribution of the virus with the CNN models might serve as an efficient method.
Subjects: Machine Learning (cs.LG); Quantitative Methods (q-bio.QM)
Cite as: arXiv:2007.05055 [cs.LG]
  (or arXiv:2007.05055v1 [cs.LG] for this version)

Submission history

From: Selçuk Yazar [view email]
[v1] Thu, 9 Jul 2020 20:43:22 GMT (1148kb)

Link back to: arXiv, form interface, contact.