Familial risk criteria for common cancers rely on the age of cancer onset, which is documented inconsistently using structured and unstructured formats in electronic health records (EHRs). We operationalized a natural language processing (NLP) system to extract disease onset information from free-text EHR fields with recall and precision ranging from 98% to 100%.

Learning Objectives: Understand the age-of-onset data type of FHIR's FamilyMemberHistory resource.
Learn how, and how well, rule-based NLP can be used to extract the age of cancer onset of family members from the free-text comments field.
Learn how NLP-generated information can be mapped into FHIR data types for information exchange or downstream data analyses.
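
To make the second and third objectives concrete, here is a minimal Python sketch of rule-based extraction followed by mapping into FHIR. It is not the system described in the abstract: the regular expressions, the function name onset_from_comment, and the sample comments are all hypothetical, and a real rule set would need to handle negation, family-member attribution, and many more phrasings. The only parts taken from the FHIR specification are the onset[x] choices on FamilyMemberHistory.condition (here onsetAge and onsetRange) and the UCUM code "a" for years.

```python
import re

UCUM = "http://unitsofmeasure.org"

# Hypothetical patterns for illustration; the abstract does not describe the real rules.
_RANGE = re.compile(r"\b(\d{1,3})\s*(?:-|to)\s*(\d{1,3})\b")                  # "40-45", "40 to 45"
_DECADE = re.compile(r"\b(?:in\s+(?:her|his|their)\s+)?(\d)0'?s\b", re.I)     # "in her 50s"
_AGE = re.compile(r"\b(?:age|aged|at)\s+(\d{1,3})\b", re.I)                   # "at 45", "age 62"


def _years(value):
    """Build a FHIR quantity expressed in years (UCUM code 'a')."""
    return {"value": value, "unit": "years", "system": UCUM, "code": "a"}


def onset_from_comment(text):
    """Return an onset[x] fragment for FamilyMemberHistory.condition, or {} if none found."""
    m = _RANGE.search(text)
    if m:
        low, high = int(m.group(1)), int(m.group(2))
        if low < high:
            return {"onsetRange": {"low": _years(low), "high": _years(high)}}
    m = _DECADE.search(text)
    if m:
        low = int(m.group(1)) * 10
        # A decade mention is imprecise, so it is kept as a range rather than a single age.
        return {"onsetRange": {"low": _years(low), "high": _years(low + 9)}}
    m = _AGE.search(text)
    if m:
        return {"onsetAge": _years(int(m.group(1)))}
    return {}


if __name__ == "__main__":
    for comment in ("Breast cancer dx at 45", "colon ca in her 50s", "no details known"):
        print(comment, "->", onset_from_comment(comment))
```

Keeping imprecise mentions such as decades as onsetRange, rather than forcing a single onsetAge, preserves the uncertainty in the original note for downstream risk criteria that depend on age thresholds.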


Jianlin Shi (Presenter)
University of Utah

Kensaku Kawamoto, University of Utah
Wendy Kohlmann, University of Utah
Danielle Mowery, University of Pennsylvania
Richard Bradshaw, University of Utah
Subhadeep Deep, University of Utah
Wendy Chapman, University of Utah
Guilherme Del Fiol, University of Utah
