Recognition of Patient-related Named Entities in Noisy Tele-health Texts

Full Text: 2651444.pdf

We explore methods for effectively extracting information from clinical narratives captured by a public health consulting phone service called HealthLink. Our research investigates the application of state-of-the-art natural language processing and machine learning to these narratives to extract information of interest. The currently available data consists of dialogues constructed by nurses while consulting patients by phone. Since the data are interviews transcribed by nurses during phone conversations, they include a significant volume and variety of noise. To extract patient-related information from the noisy data, we have to remove or correct at least two kinds of noise. The first is explicit noise, which includes spelling errors, unfinished sentences, omission of sentence delimiters, variants of terms, etc. The second is implicit noise, which includes information that does not concern the patient and patient information that is untrustworthy. To filter explicit noise, we propose our own biomedical term detection/normalization method, which resolves misspellings, term variations, and arbitrary abbreviations of terms by nurses. To detect temporal terms, temperature, and other types of named entities (which show patients’ personal information such as age and sex), we propose a bootstrapping-based pattern learning process that detects a variety of arbitrary variations of named entities. To address implicit noise, we propose a dependency path-based filtering method. The result of our de-noising is the extraction of normalized patient information, and we visualize the named entities by constructing a graph that shows the relations between them. The objective of this knowledge discovery task is to identify associations between biomedical terms and to clearly expose trends in patients’ symptoms and concerns. The experimental results show that we achieve reasonable performance with our noise reduction methods.
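
As a rough illustration of what filtering explicit noise can involve, the sketch below maps misspelled tokens to the closest entry in a term list using string similarity from Python's standard library. It is not the term detection/normalization method proposed in the paper, which the abstract does not specify. The lexicon, the function names `normalize_term` and `normalize_note`, and the 0.8 similarity cutoff are all assumptions made for this example.

```python
import difflib

# Hypothetical mini-lexicon for illustration only; the paper's actual
# biomedical vocabulary is not described in the abstract.
LEXICON = ["fever", "headache", "vomiting", "diarrhea", "cough"]


def normalize_term(token, lexicon=LEXICON, cutoff=0.8):
    """Return the closest lexicon term for a possibly misspelled token.

    Returns None when no lexicon term is similar enough. The cutoff is an
    assumed value, not one taken from the paper.
    """
    matches = difflib.get_close_matches(token.lower(), lexicon, n=1, cutoff=cutoff)
    return matches[0] if matches else None


def normalize_note(text):
    """Return (token, normalized term) pairs for tokens that match the lexicon."""
    found = []
    for raw in text.split():
        token = raw.strip(".,;:()")  # drop punctuation attached to the token
        term = normalize_term(token)
        if term is not None:
            found.append((token, term))
    return found


if __name__ == "__main__":
    print(normalize_note("Pt has feaver and vomitting."))
```

A plain similarity cutoff like this will not handle the arbitrary abbreviations the abstract mentions; it only shows the general shape of mapping noisy tokens onto canonical terms.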


M. Kim, Y. Xu, O. Zaiane, R. Goebel. "Recognition of Patient-related Named Entities in Noisy Tele-health Texts". ACM Transactions on Intelligent Systems and Technology, 6(4), pp 1-59, August 2015.

Keywords: Text analysis, Design, Algorithms, Performance, Tele-health mining, named entity recognition, biomedical text mining, effective information retrieval
Category: In Journal
Web Links: DOI
  ACM Digital Library


  author = {Mi-Young Kim and Ying Xu and Osmar R. Zaiane and Randy Goebel},
  title = {Recognition of Patient-related Named Entities in Noisy Tele-health Texts},
  volume = {6},
  number = {4},
  pages = {1-59},
  journal = {ACM Transactions on Intelligent Systems and Technology},
  year = {2015},

Last Updated: February 13, 2020
Submitted by Sabina P
