Why Aren't We NER Yet? Artifacts of ASR Errors in Named Entity Recognition in Spontaneous Speech Transcripts
Piotr Szymanski and Lukasz Augustyniak and Mikolaj Morzy and Adrian Szymczak and Krzysztof Surdyk and Piotr Zelasko
ACL
Ever wondered why AI still messes up names in conversations? This paper reveals the spectacular failure of name detection in spontaneous speech, showing it's not just speech recognition errors - spoken language is inherently messy and breaks traditional AI models! Bonus: they prove everyone's been measuring success wrong this whole time.
Transcripts of spontaneous human speech present a significant obstacle for traditional NER models. The lack of grammatical structure of spoken utterances and word errors introduced by the ASR make downstream NLP tasks challenging. In this paper, we examine in detail the complex relationship between ASR and NER errors which limit the ability of NER models to recover entity mentions from spontaneous speech transcripts. Using publicly available benchmark datasets (SWNE, Earnings-21, OntoNotes), we present the full taxonomy of ASR-NER errors and measure their true impact on entity recognition. We find that NER models fail spectacularly even if no word errors are introduced by the ASR. We also show why the F1 score is inadequate to evaluate NER models on conversational transcripts.
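The abstract does not spell out why F1 falls short, so the following is only a minimal sketch of one well-known property of exact-match, entity-level F1 and not necessarily the paper's own argument. It assumes entities are compared as (text, label) pairs; the function name `entity_f1` and the example entities are hypothetical. A single ASR word error inside an entity span makes the prediction count as both a false positive and a false negative, even though the NER model found the right span and label.

```python
def entity_f1(gold, pred):
    """Exact-match entity-level F1 over (text, label) pairs (illustrative helper)."""
    gold_set, pred_set = set(gold), set(pred)
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical reference annotations on a clean transcript.
gold = [("acme corporation", "ORG"), ("john smith", "PERSON")]

# Hypothetical NER output on an ASR transcript where "acme" was
# misrecognized as "acne": right span, right label, wrong surface form.
pred = [("acne corporation", "ORG"), ("john smith", "PERSON")]

score = entity_f1(gold, pred)
print(f"entity-level F1: {score:.2f}")
```

Under these assumptions the score cannot distinguish an ASR substitution inside a correctly located entity from an entity the NER model missed entirely, which is the kind of distinction an error taxonomy makes explicit.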