New Approach Enhances Elderly Speech Recognition Accuracy

Global AI Watch · arXiv cs.CL (NLP/LLMs)

Recent research addresses automatic speech recognition (ASR) for elderly speakers, where accuracy is often hampered by limited training data and distinctive acoustic properties. The study introduces a data augmentation pipeline that uses large language models (LLMs) to generate paraphrased transcripts in elderly-relevant contexts, which are then synthesized into speech with text-to-speech (TTS) technology. Fine-tuning the Whisper ASR model on the augmented dataset yields notable performance improvements on English and Korean elderly speech datasets.
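
To make the pipeline concrete, here is a minimal Python sketch of the augmentation step, written under stated assumptions rather than taken from the study. The names `Sample`, `augment`, `toy_paraphrase`, and `toy_tts` are hypothetical: the two toy functions stand in for a real LLM paraphraser and a real TTS engine, and the resulting samples would then be added to the fine-tuning set for Whisper.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Sample:
    """One synthetic training pair: transcript text plus synthesized audio."""
    text: str
    audio: bytes
    synthetic: bool


def augment(
    transcripts: Iterable[str],
    paraphrase: Callable[[str, int], List[str]],
    synthesize: Callable[[str], bytes],
    n_paraphrases: int = 2,
) -> List[Sample]:
    """Paraphrase each transcript, drop duplicates, and synthesize audio.

    `paraphrase` and `synthesize` are injected so any LLM or TTS backend
    can be plugged in; neither is the study's actual component.
    """
    originals = list(transcripts)
    # Track normalized text so paraphrases identical to an original
    # (or to each other) are not synthesized twice.
    seen = {t.strip().lower() for t in originals}
    samples: List[Sample] = []
    for transcript in originals:
        for candidate in paraphrase(transcript, n_paraphrases):
            text = candidate.strip()
            key = text.lower()
            if not key or key in seen:
                continue
            seen.add(key)
            samples.append(Sample(text=text, audio=synthesize(text), synthetic=True))
    return samples


def toy_paraphrase(text: str, n: int) -> List[str]:
    """Placeholder for an LLM call: fills fixed templates with the transcript."""
    templates = ["Could you tell me, {}", "I was wondering, {}", "{}"]
    return [tpl.format(text) for tpl in templates[:n]]


def toy_tts(text: str) -> bytes:
    """Placeholder for a TTS engine: returns the UTF-8 bytes of the text."""
    return text.encode("utf-8")


if __name__ == "__main__":
    for s in augment(["when is my doctor appointment"], toy_paraphrase, toy_tts, 3):
        print(s.text, len(s.audio))
```

Passing the paraphraser and synthesizer as arguments keeps the sketch independent of any particular model or vendor, and the deduplication step avoids spending TTS effort on paraphrases that merely repeat an existing transcript.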