Speech recognition for medical conversations

Chiu, Chung-Cheng; Tripathi, Anshuman; Chou, Katherine; Co, Chris; Jaitly, Navdeep; Jaunzeikare, Diana; Kannan, Anjuli; Nguyen, Patrick; Sak, Hasim; Sankar, Ananth; Tansuwan, Justin; Wan, Nathan; Wu, Yonghui; Zhang, Xuedong

(Submitted on 20 Nov 2017 (v1), last revised 20 Jun 2018 (this version, v2))

Abstract: In this work we explored building automatic speech recognition models for transcribing doctor-patient conversations. We collected a large-scale dataset of clinical conversations (14,000 hr), designed the task to represent the real-world scenario, and explored several alignment approaches to iteratively improve data quality. We explored both connectionist temporal classification (CTC) and Listen, Attend and Spell (LAS) systems for building speech recognition models. LAS was more resilient to noisy data, while CTC required more data cleanup. A detailed analysis is provided for understanding performance on clinical tasks. Our analysis showed that the speech recognition models performed well on important medical utterances, while errors occurred in casual conversations. Overall, we believe the resulting models can provide reasonable quality in practice.

Comments:	Interspeech 2018 camera ready
Subjects:	Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
Cite as:	arXiv:1711.07274 [cs.CL]
	(or arXiv:1711.07274v2 [cs.CL] for this version)

Submission history

From: Chung-Cheng Chiu
[v1] Mon, 20 Nov 2017 12:07:22 GMT (20kb)
[v2] Wed, 20 Jun 2018 17:54:30 GMT (31kb)
