We gratefully acknowledge support from
the Simons Foundation and member institutions.
Full-text links:

Download:

Current browse context:

cs.CL

Change to browse by:

References & Citations

DBLP - CS Bibliography

Bookmark

(what is this?)
CiteULike logo BibSonomy logo Mendeley logo del.icio.us logo Digg logo Reddit logo

Computer Science > Computation and Language

Title: Age Group Classification with Speech and Metadata Multimodality Fusion

Abstract: Children comprise a significant proportion of TV viewers and it is worthwhile to customize the experience for them. However, identifying who is a child in the audience can be a challenging task. Identifying gender and age from audio commands is a well-studied problem but is still very challenging to get good accuracy when the utterances are typically only a couple of seconds long. We present initial studies of a novel method which combines utterances with user metadata. In particular, we develop an ensemble of different machine learning techniques on different subsets of data to improve child detection. Our initial results show a 9.2\% absolute improvement over the baseline, leading to a state-of-the-art performance.
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
Journal reference: Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers, 2017
Cite as: arXiv:1803.00721 [cs.CL]
  (or arXiv:1803.00721v1 [cs.CL] for this version)

Submission history

From: Denys Katerenchuk [view email]
[v1] Fri, 2 Mar 2018 05:07:33 GMT (26kb)

Link back to: arXiv, form interface, contact.