The electronic health record contains a wealth of clinical information stored as unstructured free text. This information has the potential to enhance clinical decision support systems, define at-risk populations, or provide insight for secondary research. Unfortunately, extracting this data is difficult and time-consuming. At Children’s Hospital of Philadelphia (CHOP), we developed a fault-tolerant and scalable natural language processing pipeline capable of processing hundreds of millions of clinical notes in parallel.

Learning Objectives:
1. Define the benefits of natural language processing for clinical text.
2. Identify the challenges associated with running natural language processing algorithms at scale.
3. Understand practical solutions for capturing structured data from a large volume of clinical notes.


Jeritt Thayer (Presenter)
Children's Hospital of Philadelphia

Jeffrey Miller, Children's Hospital of Philadelphia
Jeffrey Pennington, Children's Hospital of Philadelphia
