Title: Cross-modal Music Emotion Recognition Using Composite Loss-based Embeddings

Abstract: Most music emotion recognition approaches use one-way classification or regression that estimates a general emotion from a distribution of music samples, without considering emotional variations (e.g., happiness can be further categorised into high, moderate or low happiness). We propose a cross-modal music emotion recognition approach that associates music samples with emotions in a common space by considering both their general and specific characteristics. Since the association of music samples with emotions is uncertain due to subjective human perception, we compute composite loss-based embeddings that maximise two statistical characteristics: the correlation between music samples and emotions based on canonical correlation analysis, and a probabilistic similarity between a music sample and an emotion measured with KL-divergence. Experiments on two benchmark datasets demonstrate the superiority of our approach over one-way baselines. In addition, detailed analyses show that our approach achieves robust cross-modal music emotion recognition that not only identifies music samples matching a specific emotion but also detects the emotions expressed in a given music sample.
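The abstract names two ingredients of the composite loss without giving formulas, so the following is only a minimal NumPy sketch of how such a loss could be assembled, not the authors' implementation. It assumes that each music sample and each emotion is embedded as a diagonal Gaussian (a mean and a variance vector), that the correlation term is the sum of canonical correlations between the paired mean embeddings, and that the two terms are combined with a hypothetical weight `lam`. All function and parameter names are illustrative.

```python
import numpy as np


def cca_correlation(X, Y, eps=1e-6):
    """Sum of canonical correlations between two views (rows are paired samples)."""
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Covariance estimates, lightly regularised so they stay invertible.
    Sxx = Xc.T @ Xc / (n - 1) + eps * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + eps * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)

    def inv_sqrt(S):
        # Inverse matrix square root via symmetric eigendecomposition.
        w, V = np.linalg.eigh(S)
        return V @ np.diag(w ** -0.5) @ V.T

    # Singular values of the whitened cross-covariance are the canonical correlations.
    T = inv_sqrt(Sxx) @ Sxy @ inv_sqrt(Syy)
    return np.linalg.svd(T, compute_uv=False).sum()


def gaussian_kl(mu_p, var_p, mu_q, var_q):
    """KL(p || q) for diagonal Gaussians, computed row-wise."""
    return 0.5 * np.sum(
        np.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0,
        axis=1,
    )


def composite_loss(mu_m, var_m, mu_e, var_e, lam=1.0):
    """Assumed combination: reward correlation, penalise divergence.

    mu_m, var_m: music embeddings; mu_e, var_e: emotion embeddings.
    lam is a hypothetical trade-off weight, not taken from the paper.
    """
    corr = cca_correlation(mu_m, mu_e)
    kl = gaussian_kl(mu_m, var_m, mu_e, var_e).mean()
    return -corr + lam * kl
```

Minimising this quantity raises the correlation between the two modalities while pulling each music embedding towards the distribution of its paired emotion, which is one way to read "maximise two statistical characteristics". The direction of the KL term and the Gaussian parameterisation are choices made here for illustration.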
Comments: 12 pages, 5 figures
Subjects: Sound (cs.SD); Human-Computer Interaction (cs.HC); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
ACM classes: H.5.5; H.5.1; H.3.1
Cite as: arXiv:2112.07192 [cs.SD]
  (or arXiv:2112.07192v1 [cs.SD] for this version)

Submission history

From: Naoki Takashima
[v1] Tue, 14 Dec 2021 06:54:08 GMT (1942kb,D)
[v2] Fri, 29 Jul 2022 16:33:38 GMT (2703kb,D)
[v3] Mon, 5 Sep 2022 06:42:52 GMT (6450kb,D)
[v4] Wed, 8 Feb 2023 15:43:56 GMT (4930kb,D)
[v5] Sat, 8 Apr 2023 06:26:50 GMT (4995kb,D)