Title: Learning Seasonal Phytoplankton Communities with Topic Models

Abstract: In this work we develop and demonstrate a probabilistic generative model for phytoplankton communities. The proposed model takes counts of a set of phytoplankton taxa in a time series as its training data, and models communities by learning sparse co-occurrence structure among the taxa. Communities are represented by probability distributions over the species, and each time step is represented by a probability distribution over the communities. The proposed approach uses a non-parametric, spatiotemporal topic model to encourage the communities to form an interpretable representation of the data, without making strong assumptions about the communities. We demonstrate the quality and interpretability of our method by showing that it improves the performance of a simple regression model. Specifically, simple linear regression is sufficient to predict the community distribution learned by our method, and therefore the taxon distributions, from a set of naively chosen environmental variables. In contrast, a similar regression model cannot predict the taxon distributions with the same level of accuracy, either directly or through PCA.
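
The two-level representation described in the abstract (communities as distributions over taxa, time steps as distributions over communities) can be illustrated with a minimal sketch. This shows only the generic mixture structure shared by topic models, not the authors' non-parametric spatiotemporal model. The array names `phi` and `theta`, the toy sizes, every probability value, and the helper `sample_counts` are assumptions made up for illustration.

```python
import numpy as np

# Hypothetical toy problem: 2 communities, 4 taxa, 3 time steps.
# phi[k] is community k's probability distribution over the taxa.
phi = np.array([
    [0.7, 0.3, 0.0, 0.0],
    [0.0, 0.0, 0.4, 0.6],
])

# theta[t] is time step t's probability distribution over the communities.
theta = np.array([
    [1.0, 0.0],
    [0.5, 0.5],
    [0.0, 1.0],
])

# The taxon distribution at each time step is a mixture of the community
# distributions, weighted by that time step's community distribution.
taxon_dist = theta @ phi


def sample_counts(rng, n_observations):
    """Draw one vector of taxon counts per time step (illustrative helper)."""
    return np.stack([rng.multinomial(n_observations, p) for p in taxon_dist])


rng = np.random.default_rng(0)
counts = sample_counts(rng, 1000)  # shape: (time steps, taxa)
```

Because `taxon_dist` is fully determined by `theta` once `phi` is fixed, predicting the low-dimensional community distribution from environmental variables also yields a prediction of the taxon distributions, which is the property the abstract relies on.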
Subjects: Applications (stat.AP); Computational Engineering, Finance, and Science (cs.CE)
Cite as: arXiv:1711.09013 [stat.AP]
  (or arXiv:1711.09013v2 [stat.AP] for this version)

Submission history

From: Arnold Kalmbach
[v1] Sun, 19 Nov 2017 21:20:26 GMT (5375kb,D)
[v2] Tue, 12 Dec 2017 15:35:30 GMT (5375kb,D)
