Sound

Authors and titles for cs.SD in Jan 2023, skipping first 75

[ total of 104 entries; showing entries 76-100 ]

[76] arXiv:2301.11757 (cross-list from cs.CL) [pdf, other]: Title: Moûsai: Text-to-Music Generation with Long-Context Latent Diffusion

Authors: Flavio Schneider, Ojasv Kamal, Zhijing Jin, Bernhard Schölkopf

Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[77] arXiv:2301.11975 (cross-list from cs.LG) [pdf, other]: Title: Byte Pair Encoding for Symbolic Music

Authors: Nathan Fradet, Nicolas Gutowski, Fabien Chhel, Jean-Pierre Briot

Comments: EMNLP 2023, source code: this https URL

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[78] arXiv:2301.12331 (cross-list from cs.CL) [pdf, other]: Title: Time out of Mind: Generating Rate of Speech conditioned on emotion and speaker

Authors: Navjot Kaur, Paige Tuttosi

Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[79] arXiv:2301.12686 (cross-list from cs.LG) [pdf, other]: Title: GibbsDDRM: A Partially Collapsed Gibbs Sampler for Solving Blind Inverse Problems with Denoising Diffusion Restoration

Authors: Naoki Murata, Koichi Saito, Chieh-Hsin Lai, Yuhta Takida, Toshimitsu Uesaka, Yuki Mitsufuji, Stefano Ermon

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[80] arXiv:2301.13003 (cross-list from cs.CL) [pdf, other]: Title: Knowledge Transfer from Pre-trained Language Models to Cif-based Speech Recognizers via Hierarchical Distillation

Authors: Minglun Han, Feilong Chen, Jing Shi, Shuang Xu, Bo Xu

Comments: Accepted by INTERSPEECH 2023

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[81] arXiv:2301.13507 (cross-list from cs.IR) [pdf, ps, other]: Title: An Analysis of Classification Approaches for Hit Song Prediction using Engineered Metadata Features with Lyrics and Audio Features

Authors: Mengyisong Zhao, Morgan Harvey, David Cameron, Frank Hopfgartner, Valerie J. Gillet

Subjects: Information Retrieval (cs.IR); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[82] arXiv:2301.00448 (cross-list from eess.AS) [pdf, other]: Title: Unsupervised Acoustic Scene Mapping Based on Acoustic Features and Dimensionality Reduction

Authors: Idan Cohen, Ofir Lindenbaum, Sharon Gannot

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[83] arXiv:2301.00646 (cross-list from eess.AS) [pdf, other]: Title: Addressing the Selection Bias in Voice Assistance: Training Voice Assistance Model in Python with Equal Data Selection

Authors: Kashav Piya, Srijal Shrestha, Cameran Frank, Estephanos Jebessa, Tauheed Khan Mohd

Subjects: Audio and Speech Processing (eess.AS); Multiagent Systems (cs.MA); Robotics (cs.RO); Sound (cs.SD)
[84] arXiv:2301.00833 (cross-list from eess.AS) [pdf, other]: Title: Hyperuniform disordered parametric loudspeaker array

Authors: Kun Tang, Yuqi Wang, Shaobo Wang, Da Gao, Haojie Li, Xindong Liang, Patrick Sebbah, Yibin Li, Jin Zhang, Junhui Shi

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Applied Physics (physics.app-ph)
[85] arXiv:2301.01361 (cross-list from eess.AS) [pdf, other]: Title: Modeling the Rhythm from Lyrics for Melody Generation of Pop Song

Authors: Daiyu Zhang, Ju-Chiang Wang, Katerina Kosta, Jordan B. L. Smith, Shicen Zhou

Comments: Published in ISMIR 2022

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[86] arXiv:2301.01595 (cross-list from quant-ph) [pdf, other]: Title: Quantum Representations of Sound: from mechanical waves to quantum circuits

Authors: Paulo V. Itaborai, Eduardo R. Miranda

Comments: 29 pages,26 figures. Accompanying Python package is available: this https URL

Subjects: Quantum Physics (quant-ph); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[87] arXiv:2301.02214 (cross-list from eess.AS) [pdf, other]: Title: Automatic Sound Event Detection and Classification of Great Ape Calls Using Neural Networks

Authors: Zifan Jiang, Adrian Soldati, Isaac Schamberg, Adriano R. Lameira, Steven Moran

Comments: This paper is published as: Jiang, Zifan, Adrian Soldati, Isaac Schamberg, Adriano R. Lameira and Steven Moran. Automatic Sound Event Detection and Classification of Great Ape Calls Using Neural Networks. In Proceedings of the 20th International Congress of Phonetic Sciences (ICPhS 2023), 3100-3104, Prague, Czech Republic

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[88] arXiv:2301.02262 (cross-list from eess.AS) [pdf, other]: Title: Singing voice synthesis based on frame-level sequence-to-sequence models considering vocal timing deviation

Authors: Miku Nishihara, Yukiya Hono, Kei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda

Comments: 5 pages, 4 figures

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[89] arXiv:2301.02736 (cross-list from eess.AS) [pdf, other]: Title: Using External Off-Policy Speech-To-Text Mappings in Contextual End-To-End Automated Speech Recognition

Authors: David M. Chan, Shalini Ghosh, Ariya Rastrow, Björn Hoffmeister

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[90] arXiv:2301.04606 (cross-list from eess.AS) [pdf, other]: Title: Modelling low-resource accents without accent-specific TTS frontend

Authors: Georgi Tinchev, Marta Czarnowska, Kamil Deja, Kayoko Yanagisawa, Marius Cotescu

Comments: The first two authors contributed equally to this work. In Review. Samples available on this https URL

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[91] arXiv:2301.05025 (cross-list from math.HO) [pdf, other]: Title: Topological data analysis hearing the shapes of drums and bells

Authors: Guo-Wei Wei

Comments: 4 pages, 2 figures

Subjects: History and Overview (math.HO); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[92] arXiv:2301.05295 (cross-list from eess.AS) [pdf, other]: Title: Rock Guitar Tablature Generation via Natural Language Processing

Authors: Josue Casco-Rodriguez

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[93] arXiv:2301.05868 (cross-list from eess.AS) [pdf, other]: Title: Modulation spectral features for speech emotion recognition using deep neural networks

Authors: Premjeet Singh, Md Sahidullah, Goutam Saha

Comments: Accepted for publication in Elsevier's Speech Communication Journal

Journal-ref: Volume 146, January 2023, Pages 53-69

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[94] arXiv:2301.06458 (cross-list from eess.AS) [pdf, other]: Title: Multi-resolution location-based training for multi-channel continuous speech separation

Authors: Hassan Taherian, DeLiang Wang

Comments: Submitted to ICASSP 23

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[95] arXiv:2301.07173 (cross-list from eess.AS) [pdf, other]: Title: Towards Voice Reconstruction from EEG during Imagined Speech

Authors: Young-Eun Lee, Seo-Hyun Lee, Sang-Ho Kim, Seong-Whan Lee

Comments: 9 pages, 4 figures, accepted paper of AAAI 2023 in main track

Subjects: Audio and Speech Processing (eess.AS); Human-Computer Interaction (cs.HC); Sound (cs.SD); Signal Processing (eess.SP)
[96] arXiv:2301.08925 (cross-list from eess.AS) [pdf, other]: Title: New Challenges for Content Privacy in Speech and Audio

Authors: Jennifer Williams, Karla Pizzi, Shuvayanti Das, Paul-Gauthier Noe

Comments: Accepted for publication in ISCA SPSC Symposium 2022

Subjects: Audio and Speech Processing (eess.AS); Cryptography and Security (cs.CR); Sound (cs.SD)
[97] arXiv:2301.09198 (cross-list from eess.AS) [pdf, other]: Title: Estimation of Source and Receiver Positions, Room Geometry and Reflection Coefficients From a Single Room Impulse Response

Authors: Wangyang Yu, W. Bastiaan Kleijn

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[98] arXiv:2301.10210 (cross-list from eess.AS) [pdf, ps, other]: Title: Perceptual evaluation of listener envelopment using spatial granular synthesis

Authors: Stefan Riedel, Matthias Frank, Franz Zotter

Comments: Submitted to the Journal of the Audio Engineering Society (JAES)

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[99] arXiv:2301.11176 (cross-list from eess.AS) [pdf, ps, other]: Title: A simple model for pink noise from amplitude modulations

Authors: Masahiro Morikawa, Akika Nakamichi

Comments: 12 pages, 9 figures

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Classical Physics (physics.class-ph)
[100] arXiv:2301.11276 (cross-list from eess.AS) [pdf, other]: Title: BayesSpeech: A Bayesian Transformer Network for Automatic Speech Recognition

Authors: Will Rieger

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
