Title: IBM Functional Genomics Platform, A Cloud-Based Platform for Studying Microbial Life at Scale

Abstract: The rapid growth in biological sequence data is revolutionizing our understanding of genotypic diversity and challenging conventional approaches to informatics. With the increasing availability of genomic data, traditional bioinformatic tools require substantial computational time and the creation of ever-larger indices each time a researcher seeks to gain insight from the data. To address these challenges, we pre-computed important relationships between biological entities spanning the Central Dogma of Molecular Biology and captured this information in a relational database. The database can be queried across hundreds of millions of entities and returns results in a fraction of the time required by traditional methods. In this paper, we describe IBM Functional Genomics Platform (formerly known as OMXWare), a comprehensive database relating genotype to phenotype for bacterial life. Continually updated, IBM Functional Genomics Platform today contains data derived from 200,000 curated, self-consistently assembled genomes. The database stores functional data for over 68 million genes, 52 million proteins, and 239 million domains with associated biological activity annotations from Gene Ontology, KEGG, MetaCyc, and Reactome. IBM Functional Genomics Platform maps all of the many-to-many connections between each biological entity including the originating genome, gene, protein, and protein domain. Various microbial studies, from infectious disease to environmental health, can benefit from the rich data and connections. We describe the data selection, the pipeline to create and update the IBM Functional Genomics Platform, and the developer tools (Python SDK and REST APIs) which allow researchers to efficiently study microbial life at scale.
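The many-to-many mapping described in the abstract can be illustrated with a toy relational schema. The sketch below is only an assumption about what such a layout might look like: the table names, column names, and sample rows are hypothetical and are not taken from the platform's actual schema, Python SDK, or REST APIs. It uses Python's built-in sqlite3 module to show how junction tables let one join answer "which genomes carry a given protein?" without rescanning sequence data.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical genome/gene/protein schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE genome  (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE gene    (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE protein (id INTEGER PRIMARY KEY, name TEXT UNIQUE);

        -- Junction tables hold the precomputed many-to-many links.
        CREATE TABLE genome_gene (
            genome_id INTEGER REFERENCES genome(id),
            gene_id   INTEGER REFERENCES gene(id),
            PRIMARY KEY (genome_id, gene_id)
        );
        CREATE TABLE gene_protein (
            gene_id    INTEGER REFERENCES gene(id),
            protein_id INTEGER REFERENCES protein(id),
            PRIMARY KEY (gene_id, protein_id)
        );
        """
    )
    # Made-up sample rows, for illustration only.
    conn.executemany("INSERT INTO genome VALUES (?, ?)",
                     [(1, "genome_A"), (2, "genome_B")])
    conn.executemany("INSERT INTO gene VALUES (?, ?)",
                     [(1, "gene_1"), (2, "gene_2")])
    conn.executemany("INSERT INTO protein VALUES (?, ?)",
                     [(1, "protein_X"), (2, "protein_Y")])
    # gene_1 occurs in both genomes; gene_2 occurs only in genome_B.
    conn.executemany("INSERT INTO genome_gene VALUES (?, ?)",
                     [(1, 1), (2, 1), (2, 2)])
    # gene_1 encodes protein_X; gene_2 encodes protein_Y.
    conn.executemany("INSERT INTO gene_protein VALUES (?, ?)",
                     [(1, 1), (2, 2)])
    conn.commit()
    return conn


def genomes_with_protein(conn, protein_name):
    """Return the sorted names of genomes linked to a protein via any gene."""
    rows = conn.execute(
        """
        SELECT DISTINCT genome.name
        FROM protein
        JOIN gene_protein ON gene_protein.protein_id = protein.id
        JOIN genome_gene  ON genome_gene.gene_id = gene_protein.gene_id
        JOIN genome       ON genome.id = genome_gene.genome_id
        WHERE protein.name = ?
        ORDER BY genome.name
        """,
        (protein_name,),
    ).fetchall()
    return [name for (name,) in rows]


if __name__ == "__main__":
    demo = build_demo_db()
    print(genomes_with_protein(demo, "protein_X"))
    print(genomes_with_protein(demo, "protein_Y"))
```

Because the links are stored once in the junction tables, the lookup reduces to indexed joins over primary keys. A domain table and a protein-to-domain junction table could be added in the same way to cover the fourth entity type named in the abstract.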
Subjects: Quantitative Methods (q-bio.QM); Databases (cs.DB)
Cite as: arXiv:1911.02095 [q-bio.QM]
  (or arXiv:1911.02095v3 [q-bio.QM] for this version)

Submission history

From: Gowri Nayar
[v1] Tue, 5 Nov 2019 21:32:25 GMT (2980kb,D)
[v2] Sun, 15 Mar 2020 19:29:18 GMT (1398kb,D)
[v3] Mon, 30 Mar 2020 23:14:33 GMT (1400kb,D)
