Genetic mutations are linked to diseases such as cancer. Text mining methods can identify mentions of mutations in text, but each mention must be linked to a gene or protein to be interpreted in context. We propose linking mutations to the genes or proteins mentioned in scientific articles using machine learning based information extraction methods. Our results show that the approach achieves an F1 of 84.28, which is comparable to human performance. We also report the performance of the approach on related relation extraction tasks in the biomedical domain.
Learning Objective: We show that a mutation mention extracted from the scientific literature needs to be linked to its gene or protein. Furthermore, we show that a machine learning based information extraction method can be trained to extract this relation automatically from the scientific literature, with performance comparable to that of humans.
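For readers unfamiliar with the F1 measure quoted above, the following minimal Python sketch shows how precision, recall, and F1 are commonly computed over extracted (mutation, gene or protein) pairs. It assumes the standard definition of F1 as the harmonic mean of precision and recall; the abstract does not state the exact evaluation protocol, and the function name `precision_recall_f1` and the toy pairs are hypothetical illustrations, not data or code from this work.

```python
from typing import Set, Tuple

# A relation is a (mutation mention, gene or protein mention) pair.
Pair = Tuple[str, str]


def precision_recall_f1(gold: Set[Pair], predicted: Set[Pair]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for predicted pairs against gold pairs."""
    true_positives = len(gold & predicted)
    # Guard against division by zero when either set is empty.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical toy example: three annotated pairs and three predicted pairs,
# two of which match the annotation.
gold = {("V600E", "BRAF"), ("G12D", "KRAS"), ("R175H", "TP53")}
predicted = {("V600E", "BRAF"), ("G12D", "KRAS"), ("G12D", "TP53")}

precision, recall, f1 = precision_recall_f1(gold, predicted)
print(f"precision={precision:.4f} recall={recall:.4f} F1={f1:.4f}")
```

Scores such as the 84.28 reported here are conventionally this F1 value expressed on a 0 to 100 scale.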
Elaheh Shafieibavani, IBM Research Australia
Antonio Jimeno Yepes (Presenter), IBM Research Australia