Title: A generalized flow for multi-class and binary classification tasks: An Azure ML approach

Abstract: The constant growth of present-day real-world databases poses computational challenges for a single computer. Cloud-based platforms, on the other hand, are capable of handling large volumes of information manipulation tasks, making them necessary for computations on large real-world data sets. This work focuses on creating a novel Generalized Flow within the cloud-based computing platform Microsoft Azure Machine Learning Studio (MAMLS) that accepts multi-class and binary classification data sets alike and processes them to maximize the overall classification accuracy. First, each data set is split into training and testing data sets. Then, linear and nonlinear classification model parameters are estimated using the training data set. Data dimensionality reduction is then performed to maximize classification accuracy. For multi-class data sets, data-centric information is used to further improve overall classification accuracy by reducing the multi-class classification to a series of hierarchical binary classification tasks. Finally, the performance of the optimized classification model thus obtained is evaluated and scored on the testing data set. The classification characteristics of the proposed flow are comparatively evaluated on 3 public data sets and a local data set with respect to existing state-of-the-art methods. On the 3 public data sets, the proposed flow achieves 78-97.5% classification accuracy. On the local data set, created using information on the presence of Diabetic Retinopathy lesions in fundus images, the flow achieves 85.3-95.7% average classification accuracy, which is higher than that of the existing methods. Thus, the proposed generalized flow can be useful for a wide range of application-oriented "big data sets".
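The abstract's idea of reducing a multi-class task to a series of hierarchical binary tasks, trained on one split and scored on a held-out split, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than the paper's method: the data are synthetic, a nearest-centroid rule stands in for the linear and nonlinear models, the class order is fixed by hand instead of being derived from data-centric information, and no Azure ML modules or dimensionality reduction are involved. The function names (`train_cascade`, `predict`, `accuracy`, `make_split`) are hypothetical.

```python
import random


def centroid(rows):
    """Component-wise mean of a non-empty list of feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_cascade(X, y, class_order):
    """Fit one binary stage per class (except the last): class vs. the rest."""
    stages = []
    remaining = list(zip(X, y))
    for label in class_order[:-1]:
        pos = [x for x, t in remaining if t == label]
        neg = [x for x, t in remaining if t != label]
        stages.append((label, centroid(pos), centroid(neg)))
        # Later stages only see the classes that have not been peeled off yet.
        remaining = [(x, t) for x, t in remaining if t != label]
    return stages, class_order[-1]


def predict(model, x):
    """Walk the cascade; the first stage that claims x decides its label."""
    stages, fallback = model
    for label, pos_c, neg_c in stages:
        if sq_dist(x, pos_c) <= sq_dist(x, neg_c):
            return label
    return fallback


def accuracy(model, X, y):
    """Fraction of samples whose predicted label matches the true label."""
    hits = sum(predict(model, x) == t for x, t in zip(X, y))
    return hits / len(y)


def make_split(n_per_class=40, train_frac=0.7, seed=0):
    """Synthetic 3-class, 2-feature data, split per class into train/test."""
    rng = random.Random(seed)
    centers = {"a": (0.0, 0.0), "b": (5.0, 0.0), "c": (10.0, 0.0)}
    n_train = round(n_per_class * train_frac)
    train, test = [], []
    for label, (cx, cy) in centers.items():
        pts = [[rng.gauss(cx, 0.3), rng.gauss(cy, 0.3)]
               for _ in range(n_per_class)]
        train += [(p, label) for p in pts[:n_train]]
        test += [(p, label) for p in pts[n_train:]]
    return train, test


train, test = make_split()
X_train, y_train = zip(*train)
X_test, y_test = zip(*test)
model = train_cascade(X_train, y_train, class_order=["a", "b", "c"])
print(f"held-out accuracy: {accuracy(model, X_test, y_test):.3f}")
```

Each stage only has to separate one class from the classes that remain, so later stages solve progressively smaller problems. In a real flow the class order and the per-stage classifier would be chosen from the data.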
Comments: 10 pages, 7 figures, Conference
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Journal reference: Big Data (Big Data), 2015 IEEE International Conference on, Santa Clara, CA, 2015, pp. 1728-1737
DOI: 10.1109/BigData.2015.7363944
Cite as: arXiv:1603.08070 [cs.CV]
  (or arXiv:1603.08070v1 [cs.CV] for this version)

Submission history

From: Sohini Roychowdhury
[v1] Sat, 26 Mar 2016 03:55:53 GMT (428kb)
