Publication

Dialect Topic Modeling for Improved Consumer Medical Search...

by Steven Crain, Shuang-hong Yang, Hongyuan Zha, Yu Jiao
Publication Type
Conference Paper
Publication Date
Conference Name
AMIA 2010 Annual Symposium
Conference Location
Washington DC, District of Columbia, United States of America
Conference Date

Access to health information by consumers is hampered by a fundamental language gap. Current attempts to close the gap leverage consumer-oriented health information, which does not, however, have good coverage of slang medical terminology. In this paper, we present a Bayesian model to automatically align documents with different dialects (slang, common and technical) while extracting their semantic topics. The proposed diaTM model enables effective information retrieval, even when the query contains slang words, by explicitly modeling the mixtures of dialects in documents and the joint influence of dialects and topics on word selection. Simulations using consumer questions to retrieve medical information from a corpus of medical documents show that diaTM achieves a 25% improvement in information retrieval relevance, as measured by nDCG@5, over an LDA baseline.
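
For readers unfamiliar with the evaluation metric, the following is a minimal sketch of how nDCG@5 can be computed for a single query. It assumes the common linear-gain form with a log2 rank discount; the abstract does not state which nDCG variant the authors used, and the function names and relevance grades below are hypothetical illustrations, not data from the paper.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k results.

    Assumes linear gain (the raw relevance grade) and a log2(rank + 1)
    discount, where rank is 1-based. Other variants use 2**rel - 1 as gain.
    """
    return sum(
        rel / math.log2(position + 2)  # position is 0-based, so rank + 1 == position + 2
        for position, rel in enumerate(relevances[:k])
    )


def ndcg_at_k(relevances, k=5):
    """DCG@k divided by the DCG@k of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Hypothetical relevance grades (0 = irrelevant, 2 = highly relevant) for the
# top five documents returned by two rankers for the same consumer question.
baseline_ranking = [0, 1, 0, 2, 1]
improved_ranking = [2, 1, 1, 0, 0]

baseline_score = ndcg_at_k(baseline_ranking)
improved_score = ndcg_at_k(improved_ranking)
relative_gain = (improved_score - baseline_score) / baseline_score

print(f"baseline nDCG@5: {baseline_score:.3f}")
print(f"improved nDCG@5: {improved_score:.3f}")
print(f"relative improvement: {relative_gain:.1%}")
```

A reported improvement such as the 25% figure above would typically be the relative change in nDCG@5 averaged over all test queries, rather than the value for any single query as in this toy example.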