Sound

Authors and titles for cs.SD in Dec 2017

[1] arXiv:1712.00166 [pdf, other]: Title: Audio Cover Song Identification using Convolutional Neural Network

Authors: Sungkyun Chang, Juheon Lee, Sang Keun Choe, Kyogu Lee

Comments: NIPS 2017 Workshop on Machine Learning for Audio (ML4A), Long Beach, CA, USA

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[2] arXiv:1712.00171 [pdf, other]: Title: Speaker identification from the sound of the human breath

Authors: Wenbo Zhao, Yang Gao, Rita Singh

Comments: 5 pages, 3 figures

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[3] arXiv:1712.00254 [pdf, other]: Title: Utilizing Domain Knowledge in End-to-End Audio Processing

Authors: Tycho Max Sylvester Tax, Jose Luis Diez Antich, Hendrik Purwins, Lars Maaløe

Comments: Accepted at the ML4Audio workshop at the NIPS 2017

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[4] arXiv:1712.00866 [pdf, other]: Title: Raw Waveform-based Audio Classification Using Sample-level CNN Architectures

Authors: Jongpil Lee, Taejun Kim, Jiyoung Park, Juhan Nam

Comments: NIPS, Machine Learning for Audio Signal Processing Workshop (ML4Audio), 2017

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[5] arXiv:1712.00917 [pdf, ps, other]: Title: A text-independent speaker verification model: A comparative analysis

Authors: Rishi Charan, Manisha.A, Karthik.R, Rajesh Kumar M

Comments: presented and accepted by 2017 International Conference on Intelligent Computing and Control (I2C2)

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[6] arXiv:1712.01011 [pdf, ps, other]: Title: Chord Generation from Symbolic Melody Using BLSTM Networks

Authors: Hyungui Lim, Seungyeon Rhyu, Kyogu Lee

Comments: 18th International Society for Music Information Retrieval Conference (ISMIR 2017)

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[7] arXiv:1712.02116 [pdf, ps, other]: Title: Enabling Early Audio Event Detection with Neural Networks

Authors: Huy Phan, Philipp Koch, Ian McLoughlin, Alfred Mertins

Comments: Published version available at this https URL

Journal-ref: Published in Proceedings of 43rd IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 141-145, 2018

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[8] arXiv:1712.02898 [pdf, ps, other]: Title: Representations of Sound in Deep Learning of Audio Features from Music

Authors: Sergey Shuvaev, Hamza Giaffar, Alexei A. Koulakov

Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV); Audio and Speech Processing (eess.AS); Neurons and Cognition (q-bio.NC)
[9] arXiv:1712.03228 [pdf, other]: Title: Music Transcription by Deep Learning with Data and "Artificial Semantic" Augmentation

Authors: Vladyslav Sarnatskyi, Vadym Ovcharenko, Mariia Tkachenko, Sergii Stirenko, Yuri Gordienko, Anis Rojbi

Comments: 4 pages, 3 figures

Journal-ref: International Journal of Systems Applications Engineering and Development, 11, 212-215 (2017)

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[10] arXiv:1712.03439 [pdf, other]: Title: Efficient Implementation of the Room Simulator for Training Deep Neural Network Acoustic Models

Authors: Chanwoo Kim, Ehsan Variani, Arun Narayanan, Michiel Bacchiani

Comments: Published at INTERSPEECH 2018

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS); Signal Processing (eess.SP)
[11] arXiv:1712.03569 [pdf, ps, other]: Title: The organization of a three-manual keyboard for 53-tone tempered and other tempered systems

Authors: Vladimir P. Burskii

Comments: 16 pages, in Russian, 10 tables

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[12] arXiv:1712.03579 [pdf, ps, other]: Title: Prodorshok I: A Bengali Isolated Speech Dataset for Voice-Based Assistive Technologies - A comparative analysis of the effects of data augmentation on HMM-GMM and DNN classifiers

Authors: Mohi Reza, Warida Rashid, Moin Mostakim

Comments: 4 pages, accepted for oral presentation at the 5th IEEE R10 HTC 2017

Subjects: Sound (cs.SD); Human-Computer Interaction (cs.HC); Audio and Speech Processing (eess.AS)
[13] arXiv:1712.03603 [pdf, other]: Title: A Cascade Architecture for Keyword Spotting on Mobile Devices

Authors: Alexander Gruenstein, Raziel Alvarez, Chris Thornton, Mohammadali Ghodrat

Comments: 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[14] arXiv:1712.04276 [pdf, other]: Title: Multi-Speaker Localization Using Convolutional Neural Network Trained with Noise

Authors: Soumitro Chakrabarty, Emanuël A. P. Habets

Comments: Presented at Machine Learning for Audio Processing (ML4Audio) Workshop at NIPS 2017

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[15] arXiv:1712.04371 [pdf, other]: Title: Music Generation by Deep Learning - Challenges and Directions

Authors: Jean-Pierre Briot, François Pachet

Comments: 17 pages. arXiv admin note: substantial text overlap with arXiv:1709.01620. Accepted for publication in Special Issue on Deep learning for music and audio, Neural Computing & Applications, Springer Nature, 2018

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[16] arXiv:1712.04382 [pdf, other]: Title: auDeep: Unsupervised Learning of Representations from Audio with Deep Recurrent Neural Networks

Authors: Michael Freitag, Shahin Amiriparian, Sergey Pugachevskiy, Nicholas Cummins, Björn Schuller

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[17] arXiv:1712.05119 [pdf, other]: Title: DLR : Toward a deep learned rhythmic representation for music content analysis

Authors: Yeonwoo Jeong, Keunwoo Choi, Hosan Jeong

Subjects: Sound (cs.SD)
[18] arXiv:1712.05274 [pdf, other]: Title: A Hierarchical Recurrent Neural Network for Symbolic Melody Generation

Authors: Jian Wu, Changran Hu, Yulong Wang, Xiaolin Hu, Jun Zhu

Comments: 9 pages

Subjects: Sound (cs.SD); Multimedia (cs.MM)
[19] arXiv:1712.06340 [pdf, other]: Title: Language and Noise Transfer in Speech Enhancement Generative Adversarial Network

Authors: Santiago Pascual, Maruchan Park, Joan Serrà, Antonio Bonafonte, Kang-Hun Ahn

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[20] arXiv:1712.07065 [pdf, other]: Title: Joint model-based recognition and localization of overlapped acoustic events using a set of distributed small microphone arrays

Authors: Rupayan Chakraborty, Climent Nadeu

Comments: Computational acoustic scene analysis, microphone array signal processing, acoustic event detection

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[21] arXiv:1712.07799 [pdf, ps, other]: Title: Towards a Deep Improviser: a prototype deep learning post-tonal free music generator

Authors: Roger T. Dean, Jamie Forth

Comments: 13 pages, 1 Figure, 3 Tables

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[22] arXiv:1712.07814 [pdf, ps, other]: Title: Indoor Sound Source Localization with Probabilistic Neural Network

Authors: Yingxiang Sun, Jiajia Chen, Chau Yuen, Susanto Rahardja

Comments: 10 pages, accepted by IEEE Transactions on Industrial Electronics

Journal-ref: IEEE Transactions on Industrial Electronics, vol. 65, no. 8, pp. 6403-6413, Aug. 2018

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[23] arXiv:1712.07941 [pdf, ps, other]: Title: Rate-Distributed Spatial Filtering Based Noise Reduction in Wireless Acoustic Sensor Networks

Authors: Jie Zhang, Richard Heusdens, Richard C. Hendriks

Comments: submitted to IEEE Transactions on Audio, Speech and Language Processing

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[24] arXiv:1712.08363 [pdf, other]: Title: On Using Backpropagation for Speech Texture Generation and Voice Conversion

Authors: Jan Chorowski, Ron J. Weiss, Rif A. Saurous, Samy Bengio

Comments: Accepted to ICASSP 2018

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[25] arXiv:1712.08370 [pdf, other]: Title: Music Genre Classification with Paralleling Recurrent Convolutional Neural Network

Authors: Lin Feng, Shenlan Liu, Jianing Yao

Subjects: Sound (cs.SD); Information Retrieval (cs.IR); Audio and Speech Processing (eess.AS)
[26] arXiv:1712.08708 [pdf, other]: Title: Variational Autoencoders for Learning Latent Representations of Speech Emotion: A Preliminary Study

Authors: Siddique Latif, Rajib Rana, Junaid Qadir, Julien Epps

Comments: Proc. Interspeech 2018

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[27] arXiv:1712.09668 [pdf, other]: Title: Eventness: Object Detection on Spectrograms for Temporal Localization of Audio Events

Authors: Phuong Pham, Juncheng Li, Joseph Szurley, Samarjit Das

Comments: 5 pages, 3 figures, accepted to ICASSP 2018

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[28] arXiv:1712.09673 [pdf, other]: Title: Multiple Instance Deep Learning for Weakly Supervised Small-Footprint Audio Event Detection

Authors: Shao-Yen Tseng, Juncheng Li, Yun Wang, Joseph Szurley, Florian Metze, Samarjit Das

Comments: 5 pages, 3 figures

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[29] arXiv:1712.09680 [pdf, other]: Title: A Light-Weight Multimodal Framework for Improved Environmental Audio Tagging

Authors: Juncheng Li, Yun Wang, Joseph Szurley, Florian Metze, Samarjit Das

Comments: 5 pages, 3 figures, Accepted and to appear at ICASSP 2018

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[30] arXiv:1712.01456 (cross-list from cs.LG) [pdf, other]: Title: Learning to Fuse Music Genres with Generative Adversarial Dual Learning

Authors: Zhiqian Chen, Chih-Wei Wu, Yen-Cheng Lu, Alexander Lerch, Chang-Tien Lu

Comments: International Conference on Data Mining - New Orleans, 2017

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[31] arXiv:1712.01769 (cross-list from cs.CL) [pdf, other]: Title: State-of-the-art Speech Recognition With Sequence-to-Sequence Models

Authors: Chung-Cheng Chiu, Tara N. Sainath, Yonghui Wu, Rohit Prabhavalkar, Patrick Nguyen, Zhifeng Chen, Anjuli Kannan, Ron J. Weiss, Kanishka Rao, Ekaterina Gonina, Navdeep Jaitly, Bo Li, Jan Chorowski, Michiel Bacchiani

Comments: ICASSP camera-ready version

Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[32] arXiv:1712.01864 (cross-list from cs.CL) [pdf, other]: Title: No Need for a Lexicon? Evaluating the Value of the Pronunciation Lexica in End-to-End Models

Authors: Tara N. Sainath, Rohit Prabhavalkar, Shankar Kumar, Seungji Lee, Anjuli Kannan, David Rybach, Vlad Schogol, Patrick Nguyen, Bo Li, Yonghui Wu, Zhifeng Chen, Chung-Cheng Chiu

Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[33] arXiv:1712.05197 (cross-list from cs.IR) [pdf, other]: Title: Towards Deep Modeling of Music Semantics using EEG Regularizers

Authors: Francisco Raposo, David Martins de Matos, Ricardo Ribeiro, Suhua Tang, Yi Yu

Comments: 5 pages, 2 figures

Subjects: Information Retrieval (cs.IR); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Neurons and Cognition (q-bio.NC)
[34] arXiv:1712.05608 (cross-list from cs.CL) [pdf, other]: Title: A Novel Approach for Effective Learning in Low Resourced Scenarios

Authors: Sri Harsha Dumpala, Rupayan Chakraborty, Sunil Kumar Kopparapu

Comments: Presented at NIPS 2017 Machine Learning for Audio Signal Processing (ML4Audio) Workshop, Dec. 2017

Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[35] arXiv:1712.05901 (cross-list from cs.LG) [pdf, other]: Title: Automatic Music Highlight Extraction using Convolutional Recurrent Attention Networks

Authors: Jung-Woo Ha, Adrian Kim, Chanju Kim, Jangyeon Park, Sunghun Kim

Subjects: Machine Learning (cs.LG); Multimedia (cs.MM); Sound (cs.SD); Machine Learning (stat.ML)
[36] arXiv:1712.06086 (cross-list from cs.CL) [pdf, other]: Title: Deep Learning for Distant Speech Recognition

Authors: Mirco Ravanelli

Comments: PhD Thesis Unitn, 2017

Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[37] arXiv:1712.06651 (cross-list from cs.CV) [pdf, other]: Title: Objects that Sound

Authors: Relja Arandjelović, Andrew Zisserman

Comments: Appears in: European Conference on Computer Vision (ECCV) 2018

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[38] arXiv:1712.07101 (cross-list from cs.CL) [pdf, other]: Title: Improving End-to-End Speech Recognition with Policy Learning

Authors: Yingbo Zhou, Caiming Xiong, Richard Socher

Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[39] arXiv:1712.07108 (cross-list from cs.CL) [pdf, other]: Title: Improved Regularization Techniques for End-to-End Speech Recognition

Authors: Yingbo Zhou, Caiming Xiong, Richard Socher

Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[40] arXiv:1712.08992 (cross-list from cs.CL) [pdf, other]: Title: Leveraging Native Language Speech for Accent Identification using Deep Siamese Networks

Authors: Aditya Siddhant, Preethi Jyothi, Sriram Ganapathy

Comments: Published in ASRU 2017

Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[41] arXiv:1712.01120 (cross-list from eess.AS) [pdf, other]: Title: Wavenet based low rate speech coding

Authors: W. Bastiaan Kleijn, Felicia S. C. Lim, Alejandro Luebs, Jan Skoglund, Florian Stimberg, Quan Wang, Thomas C. Walters

Comments: 5 pages, 2 figures

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Signal Processing (eess.SP)
[42] arXiv:1712.01340 (cross-list from eess.AS) [pdf, other]: Title: Precision Scaling of Neural Networks for Efficient Audio Processing

Authors: Jong Hwan Ko, Josh Fromm, Matthai Philipose, Ivan Tashev, Shuayb Zarar

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[43] arXiv:1712.01541 (cross-list from eess.AS) [pdf, other]: Title: Multi-Dialect Speech Recognition With A Single Sequence-To-Sequence Model

Authors: Bo Li, Tara N. Sainath, Khe Chai Sim, Michiel Bacchiani, Eugene Weinstein, Patrick Nguyen, Zhifeng Chen, Yonghui Wu, Kanishka Rao

Comments: submitted to ICASSP 2018

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[44] arXiv:1712.01742 (cross-list from eess.AS) [pdf, ps, other]: Title: Multi-speaker Recognition in Cocktail Party Problem

Authors: Yiqian Wang, Wensheng Sun

Comments: the 6th International Conference on Communications, Signal Processing and Systems (CSPS)

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[45] arXiv:1712.01996 (cross-list from eess.AS) [pdf, other]: Title: An analysis of incorporating an external language model into a sequence-to-sequence model

Authors: Anjuli Kannan, Yonghui Wu, Patrick Nguyen, Tara N. Sainath, Zhifeng Chen, Rohit Prabhavalkar

Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Sound (cs.SD)
[46] arXiv:1712.02567 (cross-list from eess.AS) [pdf, other]: Title: On Musical Onset Detection via the S-Transform

Authors: Nishal Silva, Chathuranga Weeraddana, Carlo Fischione

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[47] arXiv:1712.04555 (cross-list from eess.AS) [pdf, other]: Title: Classification vs. Regression in Supervised Learning for Single Channel Speaker Count Estimation

Authors: Fabian-Robert Stöter, Soumitro Chakrabarty, Bernd Edler, Emanuël A. P. Habets

Comments: Accepted in ICASSP 2018

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[48] arXiv:1712.04753 (cross-list from eess.AS) [pdf, other]: Title: Learning Spontaneity to Improve Emotion Recognition In Speech

Authors: Karttikeya Mangalam, Tanaya Guha

Comments: Accepted at Interspeech 2018

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Human-Computer Interaction (cs.HC); Sound (cs.SD)
[49] arXiv:1712.08034 (cross-list from eess.AS) [pdf, other]: Title: On the Use of a Spectral Glottal Model for the Source-filter Separation of Speech

Authors: Olivier Perrotin, Ian Vince McLoughlin

Comments: 8 pages, 4 figures

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[50] arXiv:1712.08336 (cross-list from q-bio.NC) [pdf, ps, other]: Title: Music of Brain and Music on Brain: A Novel EEG Sonification approach

Authors: Sayan Nag, Shankha Sanyal, Archi Banerjee, Ranjan Sengupta, Dipak Ghosh

Comments: 6 pages, 4 figures; Presented in the International Symposium on Frontiers of Research in speech and Music (FRSM)-2017, held at NIT, Rourkela in 15-16 December 2017

Subjects: Neurons and Cognition (q-bio.NC); Sound (cs.SD); Audio and Speech Processing (eess.AS); Data Analysis, Statistics and Probability (physics.data-an)
[51] arXiv:1712.09117 (cross-list from eess.AS) [pdf, other]: Title: Overcomplete Frame Thresholding for Acoustic Scene Analysis

Authors: Romain Cosentino, Randall Balestriero, Richard Baraniuk, Ankit Patel

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Machine Learning (stat.ML)
[52] arXiv:1712.09382 (cross-list from eess.AS) [pdf, other]: Title: Audio to Body Dynamics

Authors: Eli Shlizerman, Lucio M. Dery, Hayden Schoen, Ira Kemelmacher-Shlizerman

Comments: Link with videos this https URL

Journal-ref: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018

Subjects: Audio and Speech Processing (eess.AS); Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD)
[53] arXiv:1712.10252 (cross-list from eess.AS) [pdf, ps, other]: Title: Spectral analysis for nonstationary audio

Authors: Adrien Meynard (I2M), Bruno Torrésani (I2M)

Comments: IEEE/ACM Transactions on Audio, Speech and Language Processing, Institute of Electrical and Electronics Engineers, In press

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Statistics Theory (math.ST)
