Description

We consider the task of producing a useful clustering of healthcare providers from their clinical action signature: their drug, procedure, and billing codes. Because high-dimensional sparse count vectors are challenging to cluster, we develop a novel autoencoder framework to address this task. Our solution creates a low-dimensional embedded representation of the high-dimensional space that preserves angular relationships, and it assigns examples to clusters while optimizing the quality of the clustering. Our method finds a better clustering than two-step alternatives, e.g., projected k-means or k-medoids, in which a representation is learned first and clustering is then applied to that representation. We demonstrate our method's characteristics through quantitative and qualitative analysis of real and simulated data, including several real-world healthcare case studies. Finally, we develop a tool to enhance exploratory analysis of providers based on their clinical behaviors.

Learning Objectives: After reading this article, you will be able to:
- discern among clustering approaches that model high-dimensional sparse data
- explain why angular representations are appropriate for such data
- identify clusters of providers based on their procedures and prescriptions in Medicare claims data
- characterize these clusters to assess alignment of specialty to clinical activities and identify surprising findings for further investigation
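
As a small illustration of the second objective, the sketch below compares cosine similarity with Euclidean distance on made-up code-count vectors. The provider names and counts are hypothetical and are not drawn from the Medicare data or from the authors' method; the point is only that an angular measure ignores overall volume, while Euclidean distance is dominated by it.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two nonzero count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def euclidean_distance(u, v):
    """Straight-line distance between two count vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Hypothetical counts over five billing codes (illustrative only).
low_volume = [2, 1, 0, 0, 0]       # small practice
high_volume = [20, 10, 0, 0, 0]    # same mix of codes, ten times the volume
other_practice = [0, 0, 3, 1, 0]   # different mix of codes, similar volume

# Angular view: the two providers with the same mix look alike.
print(cosine_similarity(low_volume, high_volume))
print(cosine_similarity(low_volume, other_practice))

# Euclidean view: the low-volume provider sits closer to the unrelated one.
print(euclidean_distance(low_volume, high_volume))
print(euclidean_distance(low_volume, other_practice))
```

Under the angular measure, the low- and high-volume providers with the same mix of codes are treated as near-identical, whereas Euclidean distance places the low-volume provider nearer to the provider with an unrelated practice pattern.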

Authors:

Nathanael Fillmore, Veterans Affairs
Sergey Goryachev, Veterans Affairs
Jeremy Weiss (Presenter), Carnegie Mellon University
