

Title: Nonlinear Models Using Dirichlet Process Mixtures

Abstract: We introduce a new nonlinear model for classification, in which we model the joint distribution of the response variable, y, and the covariates, x, nonparametrically using Dirichlet process mixtures. Within each component of the mixture, we keep the relationship between y and x linear; the overall relationship becomes nonlinear when the mixture contains more than one component. We use simulated data to compare the performance of this new approach to a simple multinomial logit (MNL) model, an MNL model with quadratic terms, and a decision tree model. We also evaluate our approach on a protein fold classification problem, and find that our model provides substantial improvement over previous methods, which were based on neural networks (NN) and support vector machines (SVM). Protein folding classes have a hierarchical structure, so we extend our method to classification problems where a class hierarchy is available. We find that using prior information about the hierarchical structure of protein folds can result in higher predictive accuracy.
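
Illustration: the claim that per-component linear relationships combine into a nonlinear overall relationship can be sketched with a small fixed mixture. This is not the paper's method. It uses two hand-picked components rather than a Dirichlet process prior, performs no posterior inference, and assumes a Gaussian distribution for a single covariate x within each component. All names and parameter values (COMPONENTS, responsibilities, predict_proba) are hypothetical.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical two-component mixture (assumed values, chosen for illustration).
# Each component has a mixing weight, a Gaussian over x (an assumption), and
# its own multinomial logit for y given x: score_j = alpha[j] + beta[j] * x.
COMPONENTS = [
    {"weight": 0.5, "mean": -2.0, "sd": 1.0, "alpha": [0.0, 0.0], "beta": [0.0, 2.0]},
    {"weight": 0.5, "mean": 2.0, "sd": 1.0, "alpha": [0.0, 0.0], "beta": [0.0, -2.0]},
]

def responsibilities(x, components):
    """P(component | x), proportional to weight * density of x in the component."""
    joint = [c["weight"] * normal_pdf(x, c["mean"], c["sd"]) for c in components]
    total = sum(joint)
    return [j / total for j in joint]

def predict_proba(x, components):
    """P(y | x) as a responsibility-weighted average of per-component logits."""
    resp = responsibilities(x, components)
    n_classes = len(components[0]["alpha"])
    probs = [0.0] * n_classes
    for r, c in zip(resp, components):
        scores = [a + b * x for a, b in zip(c["alpha"], c["beta"])]
        for j, p in enumerate(softmax(scores)):
            probs[j] += r * p
    return probs

if __name__ == "__main__":
    for x in (-2.0, 0.0, 2.0):
        print(x, predict_proba(x, COMPONENTS))
```

Under these assumed parameters, the probability of class 1 is low near x = -2, rises toward x = 0, and falls again near x = 2. A single logit model that is linear in x gives a monotone class probability and so cannot produce this shape, which is the sense in which more than one component makes the overall relationship nonlinear.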
Subjects: Statistics Theory (math.ST); Quantitative Methods (q-bio.QM)
MSC classes: 62H30
Cite as: arXiv:math/0703292 [math.ST]
  (or arXiv:math/0703292v1 [math.ST] for this version)

Submission history

From: Radford M. Neal
[v1] Sat, 10 Mar 2007 19:46:51 GMT (63kb)
