Electrical Engineering and Systems Science

Authors and titles for recent submissions, skipping first 319

[ total of 455 entries: 1-25 | ... | 245-269 | 270-294 | 295-319 | 320-344 | 345-369 | 370-394 | 395-419 | ... | 445-455 ]

Thu, 6 Jun 2024 (continued, showing 25 of 83 entries)

[320]  arXiv:2406.02652 [pdf, other]
Title: RepCNN: Micro-sized, Mighty Models for Wakeword Detection
Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[321]  arXiv:2406.02649 [pdf, other]
Title: Keyword-Guided Adaptation of Automatic Speech Recognition
Comments: Accepted to InterSpeech 2024
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[322]  arXiv:2406.02640 [pdf, other]
Title: Ghost imaging-based Non-contact Heart Rate Detection
Comments: 4 pages, 6 figures
Subjects: Image and Video Processing (eess.IV); Medical Physics (physics.med-ph); Optics (physics.optics)
[323]  arXiv:2406.02626 [pdf, ps, other]
Title: A Brief Overview of Optimization-Based Algorithms for MRI Reconstruction Using Deep Learning
Authors: Wanyu Bian
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Optimization and Control (math.OC)
[324]  arXiv:2406.02608 [pdf, other]
Title: PPINtonus: Early Detection of Parkinson's Disease Using Deep-Learning Tonal Analysis
Authors: Varun Reddy
Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[325]  arXiv:2406.02572 [pdf, other]
Title: Selfsupervised learning for pathological speech detection
Comments: in Intersection of Book Chapter in Machine Learning and Computational Social Sciences CRC (in progress) 2024
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[326]  arXiv:2406.02569 [pdf, other]
Title: Cluster-to-Predict Affect Contours from Speech
Comments: 8 pages, 3 figures
Subjects: Audio and Speech Processing (eess.AS); Human-Computer Interaction (cs.HC)
[327]  arXiv:2406.02566 [pdf, other]
Title: Combining X-Vectors and Bayesian Batch Active Learning: Two-Stage Active Learning Pipeline for Speech Recognition
Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[328]  arXiv:2406.02563 [pdf, other]
Title: A cost minimization approach to fix the vocabulary size in a tokenizer for an End-to-End ASR system
Comments: 5 pages, 4 figures
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[329]  arXiv:2406.02562 [pdf, other]
Title: Gated Low-rank Adaptation for personalized Code-Switching Automatic Speech Recognition on the low-spec devices
Comments: Table 2 is revised
Journal-ref: ICASSP 2024 Workshop(HSCMA 2024) paper
Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[330]  arXiv:2406.02561 [pdf, ps, other]
Title: Breaking Walls: Pioneering Automatic Speech Recognition for Central Kurdish: End-to-End Transformer Paradigm
Subjects: Audio and Speech Processing (eess.AS)
[331]  arXiv:2406.02560 [pdf, other]
Title: Less Peaky and More Accurate CTC Forced Alignment by Label Priors
Comments: Accepted by ICASSP 2024. Github repo: this https URL
Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[332]  arXiv:2406.02557 [pdf, other]
Title: EVAN: Evolutional Video Streaming Adaptation via Neural Representation
Comments: accepted by ICME (conference)
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[333]  arXiv:2406.02555 [pdf, ps, other]
Title: PhoWhisper: Automatic Speech Recognition for Vietnamese
Comments: Accepted to ICLR 2024 Tiny Papers Track
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL)
[334]  arXiv:2406.02554 [pdf, other]
Title: Hear Me, See Me, Understand Me: Audio-Visual Autism Behavior Recognition
Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM)
[335]  arXiv:2406.03472 (cross-list from cs.LG) [pdf, other]
Title: Solving Differential Equations using Physics-Informed Deep Equilibrium Models
Comments: Accepted at CASE 2024
Subjects: Machine Learning (cs.LG); Systems and Control (eess.SY)
[336]  arXiv:2406.03461 (cross-list from cs.CV) [pdf, other]
Title: Polarization Wavefront Lidar: Learning Large Scene Reconstruction from Polarized Wavefronts
Comments: Accepted at CVPR 2024; Project Website: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[337]  arXiv:2406.03438 (cross-list from cs.IT) [pdf, other]
Title: CSI-GPT: Integrating Generative Pre-Trained Transformer with Federated-Tuning to Acquire Downlink Massive MIMO Channels
Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)
[338]  arXiv:2406.03407 (cross-list from cs.LG) [pdf, other]
Title: Physics and geometry informed neural operator network with application to acoustic scattering
Comments: 20 pages of main text, 9 figures
Subjects: Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Computational Physics (physics.comp-ph)
[339]  arXiv:2406.03405 (cross-list from cs.LG) [pdf, ps, other]
Title: Amalgam: A Framework for Obfuscated Neural Network Training on the Cloud
Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR); Systems and Control (eess.SY)
[340]  arXiv:2406.03344 (cross-list from cs.SD) [pdf, other]
Title: Audio Mamba: Bidirectional State Space Model for Audio Representation Learning
Comments: Code is available at this https URL
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
[341]  arXiv:2406.03270 (cross-list from math.OC) [pdf, other]
Title: A Successive Gap Constraint Linearization Method for Optimal Control Problems with Equilibrium Constraints
Comments: Forthcoming, (Accepted to the 2024 IFAC Conference on Nonlinear Model Predictive Control (NMPC))
Subjects: Optimization and Control (math.OC); Systems and Control (eess.SY)
[342]  arXiv:2406.03251 (cross-list from cs.SD) [pdf, other]
Title: ASoBO: Attentive Beamformer Selection for Distant Speaker Diarization in Meetings
Comments: 5 pages, 2 figures, 2 tables, accepted at Interspeech 2024
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
[343]  arXiv:2406.03247 (cross-list from cs.SD) [pdf, other]
Title: Genuine-Focused Learning using Mask AutoEncoder for Generalized Fake Audio Detection
Comments: Accepted by INTERSPEECH 2024
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[344]  arXiv:2406.03240 (cross-list from cs.SD) [pdf, other]
Title: Generalized Source Tracing: Detecting Novel Audio Deepfake Algorithm with Real Emphasis and Fake Dispersion Strategy
Comments: Accepted by INTERSPEECH 2024
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)