Title: Efficient Identification of Broad Absorption Line Quasars using Dimensionality Reduction and Machine Learning

Abstract: Broad Absorption Line Quasars (BALQSOs) display distinct blue-shifted broad absorption lines. They are valuable probes of the structure and evolution of quasars and of the influence that supermassive black holes exert on galaxy formation. Large-scale spectroscopic surveys such as LAMOST, SDSS, and DESI have greatly expanded the number of quasar spectra available. In this study, we present an approach that streamlines the identification of BALQSOs using dimensionality reduction and machine learning algorithms. Our dataset is drawn from SDSS DR16, combining quasar spectra with classification labels from the DR16Q quasar catalog. We apply several dimensionality reduction techniques, including Principal Component Analysis (PCA), t-Distributed Stochastic Neighbor Embedding (t-SNE), Locally Linear Embedding (LLE), and Isometric Mapping (ISOMAP), to the original spectral data. The resulting low-dimensional representations serve as inputs to machine learning classifiers, including XGBoost and Random Forest models. Through experimentation, we find PCA to be the most effective dimensionality reduction method, balancing dimensionality reduction against the preservation of vital spectral information. Notably, the combination of PCA with the XGBoost classifier performs best in BALQSO classification, reaching accuracies of 97.60% under 10-fold cross-validation and 96.92% on the outer test sample. This study introduces a novel machine learning-based approach to quasar classification and offers insights transferable to many other spectral classification problems in astronomy.
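
The workflow summarized in the abstract (reduce each spectrum with PCA, classify the reduced representation, then report 10-fold cross-validation accuracy and accuracy on a held-out sample) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the spectra are synthetic toy data rather than SDSS DR16, the helper names (`make_synthetic_spectra`, `build_model`, `evaluate`), the number of components, and the 80/20 split are all hypothetical choices, and scikit-learn's Random Forest stands in for the classifier stage.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline


def make_synthetic_spectra(n_per_class=200, n_pixels=300, seed=0):
    """Toy stand-in for quasar spectra (NOT real SDSS data).

    Class 0: flat continuum plus Gaussian noise.
    Class 1: the same, with one broad Gaussian absorption trough.
    """
    rng = np.random.default_rng(seed)
    pixels = np.arange(n_pixels)
    flux = 1.0 + 0.05 * rng.standard_normal((2 * n_per_class, n_pixels))
    labels = np.repeat([0, 1], n_per_class)
    # Random trough position and depth for each "BAL-like" spectrum.
    centers = rng.uniform(0.3 * n_pixels, 0.7 * n_pixels, size=n_per_class)
    depths = rng.uniform(0.3, 0.6, size=n_per_class)
    width = 0.05 * n_pixels
    trough = depths[:, None] * np.exp(
        -0.5 * ((pixels[None, :] - centers[:, None]) / width) ** 2
    )
    flux[labels == 1] -= trough
    return flux, labels


def build_model(n_components=20, seed=0):
    """PCA followed by a Random Forest classifier (hypothetical settings)."""
    return make_pipeline(
        PCA(n_components=n_components, random_state=seed),
        RandomForestClassifier(n_estimators=100, random_state=seed),
    )


def evaluate(X, y, n_components=20, seed=0):
    """Return (10-fold CV accuracies on the training split, held-out accuracy)."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed
    )
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
    # PCA is refit inside each fold because it is part of the pipeline,
    # so no information leaks from the validation fold.
    cv_scores = cross_val_score(
        build_model(n_components, seed), X_train, y_train, cv=cv
    )
    model = build_model(n_components, seed).fit(X_train, y_train)
    return cv_scores, model.score(X_test, y_test)


if __name__ == "__main__":
    X, y = make_synthetic_spectra()
    cv_scores, test_acc = evaluate(X, y)
    print(f"10-fold CV accuracy: {cv_scores.mean():.4f}")
    print(f"Held-out accuracy:   {test_acc:.4f}")
```

Placing PCA inside the pipeline keeps the projection fitted only on each fold's training portion. If the `xgboost` package is installed, its `XGBClassifier` could replace the Random Forest step to mirror the PCA plus XGBoost combination the abstract reports as most accurate. Accuracies on this toy data say nothing about the figures quoted above.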
Comments: 17 pages, 6 figures, accepted for publication in PASJ
Subjects: Astrophysics of Galaxies (astro-ph.GA)
Cite as: arXiv:2404.12270 [astro-ph.GA]
  (or arXiv:2404.12270v1 [astro-ph.GA] for this version)

Submission history

From: Wei-Bo Kao
[v1] Thu, 18 Apr 2024 15:42:52 GMT
