This workshop will provide attendees with the information necessary to implement NLP workflows using cloud-native technologies through practical introductions to UIMA-AS, Docker, Kubernetes, and Argo. It will start with the basics of composing NLP system "ensembles" designed to optimize performance in a particular domain, then proceed through an introduction to cloud technologies, including core concepts and technical terms, and an explanation of several alternatives to the Argo/Kubernetes/Docker workflow. We will discuss when, where, and why to use each technology, along with some of the practical challenges of using each in a high-security (PHI) environment. Workshop participants will then install Docker (a container runtime), Kubernetes (a container orchestration system), minikube (a tool for running Kubernetes locally), and Argo (a workflow manager for Kubernetes) on their own computers and run a test NLP workflow on a collection of exemplar clinical notes from the MTSamples corpus. As time permits, we will then discuss common architectures for UIMA pipelines, as well as pipelines built with non-UIMA tools and technologies common in other informatics domains.
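As a preview of the hands-on portion, a minimal Argo Workflow that runs a single containerized step looks roughly like the following sketch. The image, names, and command are illustrative placeholders, not the workshop's actual NLP pipeline:

```yaml
# Minimal Argo Workflow manifest (sketch); submitted with, e.g.:
#   argo submit nlp-demo.yaml --watch
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: nlp-demo-        # Argo appends a random suffix to the name
spec:
  entrypoint: annotate
  templates:
  - name: annotate
    container:
      image: alpine:3.18         # stand-in for a containerized NLP component
      command: [echo]
      args: ["processing clinical notes..."]
```

Real pipelines extend this pattern by defining multiple templates and wiring them together as a DAG of steps, which is how a multi-component NLP ensemble would be orchestrated.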

Learning Objective: After attending this session, participants should understand how to apply the basic concepts of cloud and container technologies to Natural Language Processing ensemble methods.


Raymond Finzel (Presenter)
University of Minnesota

Greg Silverman (Presenter)
University of Minnesota

Sijia Liu (Presenter)
Mayo Clinic, Rochester

Hongfang Liu (Presenter)
Mayo Clinic, Rochester

Xiaoqian Jiang (Presenter)

Serguei Pakhomov (Presenter)
University of Minnesota

Presentation Materials: