Understandably, performance lags behind that of text applications. Attempts to improve speech NER have included transcript normalization [2], incorporating speech recognition confidence features [3, 4], and tagging LVCSR word lattices [5]. A difficult unaddressed problem comes from out-of-vocabulary (OOV) terms: words that are missing from the LVCSR vocabulary. Since many OOVs are proper names (66% of the OOVs in our corpus are named entities), OOV recognition errors are particularly damaging for NER.
