
For approximately 130 science, physics, and chemistry journals in the National Library of Medicine’s MEDLINE database, articles are indexed only if a manual review determines that they are relevant to biomedicine and the life sciences. We recently developed a machine learning system that predicts which articles require indexing. Here, experts manually evaluate its high-confidence predictions. We observe that the system selects and rejects these articles for indexing with nearly 100% accuracy.

Learning Objectives: After participating in this session, the learner should be better able to:

Understand the current issues in selectively indexing scientific journals.

Discuss the importance of developing automated approaches to determining whether an article is out of scope for MEDLINE.

Understand the necessity of manual evaluation of the out-of-scope predictions from a system designed to maximize recall.


Max Savery (Presenter)
National Library of Medicine

Melanie Huston, National Library of Medicine
James Mork, National Library of Medicine
Olga Printseva, National Library of Medicine
Susan Schmidt, National Library of Medicine
Alastair Rae, National Library of Medicine
Dina Demner-Fushman, National Library of Medicine

Presentation Materials:
