Sampling Method for Fast Training of Support Vector Data Description

Chaudhuri, Arin; Kakde, Deovrat; Jahja, Maria; Xiao, Wei; Jiang, Hansi; Kong, Seunghyun; Peredriy, Sergiy

(Submitted on 16 Jun 2016 (v1), revised 24 Jun 2016 (this version, v2), latest version 25 Sep 2016 (v3))

Abstract: Support Vector Data Description (SVDD) is a machine learning technique used for single-class classification and outlier detection. The SVDD model for normal data description builds a minimum-radius hypersphere around the training data. A flexible description can be obtained by using kernel functions. The data description is defined by the support vectors, which are obtained by solving a quadratic optimization problem that minimizes the volume enclosed by the hypersphere. The time required to solve the quadratic programming problem is directly related to the number of observations in the training data set, which leads to very high computing time for large training data sets. In this paper we propose a new iterative sampling-based method for SVDD training. At each iteration, the method incrementally learns the training data set description by computing SVDD on an independent random sample selected with replacement from the training data set. The experimental results indicate that the proposed method is extremely fast and provides a data description nearly identical to the one obtained by training on the entire data set in one iteration. The proposed method can be easily implemented as wrapper code around the core module for SVDD training computations, in either a single-machine or a multi-machine distributed environment.
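
The wrapper idea described in the abstract can be illustrated with a minimal, dependency-free Python sketch. Several details here are assumptions rather than statements from the abstract: the support vectors of each sample are merged with a running "master" set and SVDD is recomputed on that union; the loop stops once the master set is unchanged for a few iterations; and the core solver is a simple Frank-Wolfe approximation to the hard-margin Gaussian-kernel SVDD dual (no outlier fraction), standing in for whatever SVDD module is actually wrapped. The names `svdd_support_vectors` and `sampling_svdd` and all parameters are hypothetical.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth):
    """Gaussian (RBF) kernel between two points given as tuples."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def svdd_support_vectors(points, bandwidth=1.0, n_iter=200, tol=1e-3):
    """Stand-in core solver (assumption): Frank-Wolfe on the hard-margin
    SVDD dual  max_a  sum_i a_i K_ii - sum_ij a_i a_j K_ij,
    subject to a_i >= 0 and sum_i a_i = 1.
    Returns the points whose dual weight exceeds `tol`."""
    n = len(points)
    K = [[gaussian_kernel(p, q, bandwidth) for q in points] for p in points]
    alpha = [0.0] * n
    K_alpha = [0.0] * n  # running value of K @ alpha
    for t in range(n_iter):
        # Gradient component i is K_ii - 2 * (K @ alpha)_i; pick the largest.
        best = max(range(n), key=lambda i: K[i][i] - 2.0 * K_alpha[i])
        step = 2.0 / (t + 2.0)
        alpha = [(1.0 - step) * a for a in alpha]
        alpha[best] += step
        K_alpha = [(1.0 - step) * v + step * K[i][best]
                   for i, v in enumerate(K_alpha)]
    return [p for p, a in zip(points, alpha) if a > tol]


def sampling_svdd(data, sample_size, max_iter=50, patience=5,
                  bandwidth=1.0, seed=0):
    """Iteratively build a master set of support vectors from small samples.
    The union step and the stopping rule are assumptions of this sketch."""
    rng = random.Random(seed)

    def draw_sample():
        # Sampling with replacement; duplicates are dropped before solving.
        picked = [rng.choice(data) for _ in range(sample_size)]
        return sorted(set(picked))

    master = set(svdd_support_vectors(draw_sample(), bandwidth))
    unchanged = 0
    for _ in range(max_iter):
        sample_sv = svdd_support_vectors(draw_sample(), bandwidth)
        candidates = sorted(master | set(sample_sv))
        new_master = set(svdd_support_vectors(candidates, bandwidth))
        unchanged = unchanged + 1 if new_master == master else 0
        master = new_master
        if unchanged >= patience:
            break
    return sorted(master)


if __name__ == "__main__":
    demo_rng = random.Random(1)
    demo_data = [(demo_rng.gauss(0, 1), demo_rng.gauss(0, 1))
                 for _ in range(500)]
    support = sampling_svdd(demo_data, sample_size=10)
    print(len(support), "support vectors kept out of", len(demo_data))
```

Each solve in this sketch touches only a small sample plus the current master set, so no single optimization sees the full training data. That is the property the abstract attributes to the sampling approach.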

Subjects:	Machine Learning (cs.LG); Applications (stat.AP); Machine Learning (stat.ML)
Cite as:	arXiv:1606.05382 [cs.LG]
	(or arXiv:1606.05382v2 [cs.LG] for this version)

Submission history

From: Arin Chaudhuri
[v1] Thu, 16 Jun 2016 23:18:23 GMT (1950kb,D)
[v2] Fri, 24 Jun 2016 17:30:57 GMT (1950kb,D)
[v3] Sun, 25 Sep 2016 22:15:38 GMT (1950kb,D)
