Cancer prevalence and deaths in China stand at 21.8% and 27.1%, respectively, consistent with worldwide figures. Breast cancer ranks first among cancers in women, which calls for rapid progress in precision medicine. Unstructured text in the clinical notes of electronic health record (EHR) systems contains the richest phenotypic and genotypic test information, with potential for precision medicine. Natural language processing (NLP) methods are therefore indispensable for information extraction (IE) from EHRs. Although China has large-scale EHR data from a representative population, the development of NLP-based tools and their application to precision medicine is still in its infancy. This study takes the initiative to build NLP-based IE systems for breast cancer. Experimental results demonstrate the feasibility of using clinical NLP tools to automatically extract rich genetic and phenotypic information from Chinese clinical narratives.
Learning Objectives: Learning objective one: A concept model of phenotypic and genotypic test information for breast cancer
Learning objective two: Performance of deep learning methods for breast cancer information extraction
Learning objective three: Differences between medical sublanguages of Chinese and English
Xiaohui Zhang (Presenter), Peking Union Medical College Hospital
Yaoyun Zhang, Digital China Health Technologies Co. Ltd.
Qin Zhang, Digital China Health Technologies Co. Ltd.
Yuankai Ren, Digital China Health Technologies Co. Ltd.
Jianhui Ma, Peking Union Medical College Hospital
Qiang Sun, Peking Union Medical College Hospital