Detecting genetic alterations in BRAF and NTRK as oncogenic drivers in digital pathology images: towards model generalization within and across multiple thyroid cohorts.

MICCAI Workshop COMPAY 2021 · Johannes Höhne, Jacob de Zoete, Arndt A Schmitz, Tricia Bal, Emmanuelle di Tomaso, Matthias Lenga

In this paper, we describe the machine learning problem of identifying different types of tumors based on digital pathology images. Given a set of Hematoxylin and Eosin (H&E) stained images of thyroid tumors, we train deep learning models to detect two known molecular oncogenic drivers: BRAF mutations and NTRK gene fusions. We implement an attention-based multiple instance learning (MIL) classifier and assess its generalization within and across three independent cohorts. We find that the MIL approach can detect both oncogenic drivers; however, the problem remains challenging: our exhaustive evaluation scenarios expose unknown data drifts and batch effects in digital pathology, as model performance decreases when processing images from an unseen cohort. These findings highlight the necessity of rich and diverse datasets for training and evaluation, as well as of methods for domain-agnostic learning.
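
To make the idea of attention-based MIL concrete, the following is a minimal sketch of attention pooling over a bag of tile embeddings, with no claim that it matches the authors' implementation. The abstract does not specify the architecture, so the common tanh-attention formulation is assumed here: each tile embedding h_k gets a score w^T tanh(V h_k), the scores are normalized with a softmax over the bag, and the slide-level embedding is the attention-weighted sum of the tile embeddings. All names (`attention_mil_pool`, `bag_probability`, `V`, `w`) and the toy numbers are hypothetical.

```python
import math


def attention_mil_pool(instances, V, w):
    """Pool a bag of instance embeddings into a single bag embedding.

    instances: list of K embeddings, each a list of D floats (e.g. tile features)
    V: projection matrix given as a list of L rows with D floats each (assumed)
    w: attention vector with L floats (assumed)
    Returns (bag_embedding, attention_weights).
    """
    scores = []
    for h in instances:
        # hidden = tanh(V h), one value per row of V
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        # unnormalized attention score w^T tanh(V h)
        scores.append(sum(a * u for a, u in zip(w, hidden)))

    # numerically stable softmax over the instances of the bag
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    attention = [e / total for e in exps]

    # attention-weighted sum of the instance embeddings
    dim = len(instances[0])
    bag = [sum(a * h[d] for a, h in zip(attention, instances)) for d in range(dim)]
    return bag, attention


def bag_probability(bag, weights, bias):
    """Logistic classifier on the bag embedding (e.g. driver present vs. absent)."""
    logit = sum(c * z for c, z in zip(weights, bag)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


# Toy bag of three 2-dimensional tile embeddings (made-up numbers).
tiles = [[0.1, 0.2], [2.0, -1.0], [0.0, 0.3]]
V = [[1.0, 0.0], [0.0, 1.0]]
w = [1.0, -1.0]
bag, attention = attention_mil_pool(tiles, V, w)
print(attention, bag_probability(bag, [0.5, -0.5], 0.0))
```

In this setup only the slide-level label is needed for training, and the attention weights indicate which tiles contributed most to the prediction. In a real pipeline the parameters would be learned jointly with a feature extractor rather than fixed by hand as above.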
