publications
2024
- Automatic dataset shift identification to support root cause analysis of AI performance drift. Melanie Roschewitz, Raghav Mehta, Charles Jones, and Ben Glocker. Nov 2024.
Shifts in data distribution can substantially harm the performance of clinical AI models. Hence, various methods have been developed to detect the presence of such shifts at deployment time. However, root causes of dataset shifts are varied, and the choice of shift mitigation strategies is highly dependent on the precise type of shift encountered at test time. As such, detecting test-time dataset shift is not sufficient: precisely identifying which type of shift has occurred is critical. In this work, we propose the first unsupervised dataset shift identification framework, effectively distinguishing between prevalence shift (caused by a change in the label distribution), covariate shift (caused by a change in input characteristics) and mixed shifts (simultaneous prevalence and covariate shifts). We discuss the importance of self-supervised encoders for detecting subtle covariate shifts and propose a novel shift detector leveraging both self-supervised encoders and task model outputs for improved shift detection. We report promising results for the proposed shift identification framework across three different imaging modalities (chest radiography, digital mammography, and retinal fundus images) on five types of real-world dataset shifts, using four large publicly available datasets.
2023
- Uncertainty for Safe Utilization of Machine Learning in Medical Imaging: 5th International Workshop, UNSURE 2023, Held in Conjunction with MICCAI 2023, Vancouver, BC, Canada, October 12, 2023, Proceedings. Carole H Sudre, Christian F Baumgartner, Adrian Dalca, Raghav Mehta, Chen Qin, and William M Wells. Nov 2023.
- Mitigating calibration bias without fixed attribute grouping for improved fairness in medical imaging analysis. Changjian Shui, Justin Szeto, Raghav Mehta, Douglas L Arnold, and Tal Arbel. In International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), Oct 2023. Early Acceptance.
Trustworthy deployment of deep learning medical imaging models into real-world clinical practice requires that they be calibrated. However, models that are well calibrated overall can still be poorly calibrated for a sub-population, potentially resulting in a clinician unwittingly making poor decisions for this group based on the recommendations of the model. Although methods have been shown to successfully mitigate biases across subgroups in terms of model accuracy, this work focuses on the open problem of mitigating calibration biases in the context of medical image analysis. Our method does not require subgroup attributes during training, permitting the flexibility to mitigate biases for different choices of sensitive attributes without re-training. To this end, we propose a novel two-stage method: Cluster-Focal to first identify poorly calibrated samples, cluster them into groups, and then introduce group-wise focal loss to improve calibration bias. We evaluate our method on skin lesion classification with the public HAM10000 dataset, and on predicting future lesional activity for multiple sclerosis (MS) patients. In addition to considering traditional sensitive attributes (e.g. age, sex) with demographic subgroups, we also consider biases among groups with different image-derived attributes, such as lesion load, which are required in medical image analysis. Our results demonstrate that our method effectively controls calibration error in the worst-performing subgroups while preserving prediction performance, and outperforming recent baselines.
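The quantity this abstract targets, calibration error in the worst-performing subgroup, can be illustrated with a short sketch. It uses the common expected calibration error (ECE) with equal-width confidence bins. The bin count, function names, and toy data are assumptions made for illustration; this is not the Cluster-Focal method itself.

```python
from collections import defaultdict


def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: bin-size-weighted mean of |accuracy - mean confidence| over equal-width bins."""
    bins = defaultdict(list)
    for conf, ok in zip(confidences, correct):
        # Clamp the index so that a confidence of exactly 1.0 lands in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))
    total = len(confidences)
    ece = 0.0
    for items in bins.values():
        avg_conf = sum(c for c, _ in items) / len(items)
        accuracy = sum(1 for _, ok in items if ok) / len(items)
        ece += (len(items) / total) * abs(accuracy - avg_conf)
    return ece


def worst_group_ece(confidences, correct, groups, n_bins=10):
    """Compute ECE separately per subgroup and return the worst group with all scores."""
    by_group = defaultdict(lambda: ([], []))
    for conf, ok, group in zip(confidences, correct, groups):
        by_group[group][0].append(conf)
        by_group[group][1].append(ok)
    per_group = {
        g: expected_calibration_error(confs, oks, n_bins)
        for g, (confs, oks) in by_group.items()
    }
    worst = max(per_group, key=per_group.get)
    return worst, per_group


# Toy data (hypothetical): every prediction is made with confidence 0.9,
# but group "A" is right 3 times out of 4 and group "B" only once out of 4.
confidences = [0.9] * 8
correct = [True, True, True, False, True, False, False, False]
groups = ["A"] * 4 + ["B"] * 4
worst, per_group = worst_group_ece(confidences, correct, groups)
print(worst, per_group)
```

In this toy case the overall ECE (about 0.4) hides the fact that group "B" is much worse calibrated (about 0.65) than group "A" (about 0.15), which is the kind of gap the abstract describes.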
- Improving Image-Based Precision Medicine with Uncertainty-Aware Causal Models. Joshua Durso-Finley, Jean-Pierre Falet, Raghav Mehta, Douglas L Arnold, Nick Pawlowski, and Tal Arbel. In International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), Oct 2023. Student Travel Award (Top 10 paper).
Image-based precision medicine aims to personalize treatment decisions based on an individual’s unique imaging features so as to improve their clinical outcome. Machine learning frameworks that integrate uncertainty estimation as part of their treatment recommendations would be safer and more reliable. However, little work has been done in adapting uncertainty estimation techniques and validation metrics for precision medicine. In this paper, we use Bayesian deep learning for estimating the posterior distribution over factual and counterfactual outcomes on several treatments. This allows for estimating the uncertainty for each treatment option and for the individual treatment effects (ITE) between any two treatments. We train and evaluate this model to predict future new and enlarging T2 lesion counts on a large, multi-center dataset of MR brain images of patients with multiple sclerosis, exposed to several treatments during randomized controlled trials. We evaluate the correlation of the uncertainty estimate with the factual error, and, given the lack of ground truth counterfactual outcomes, demonstrate how uncertainty for the ITE prediction relates to bounds on the ITE error. Lastly, we demonstrate how knowledge of uncertainty could modify clinical decision-making to improve individual patient and clinical trial outcomes.
- Evaluating the Fairness of Deep Learning Uncertainty Estimates in Medical Image Analysis. Raghav Mehta, Changjian Shui, and Tal Arbel. In Medical Imaging with Deep Learning (MIDL) conference, Jul 2023.
Although deep learning (DL) models have shown great success in many medical image analysis tasks, deployment of the resulting models into real clinical contexts requires: (1) that they exhibit robustness and fairness across different sub-populations, and (2) that the confidence in DL model predictions be accurately expressed in the form of uncertainties. Unfortunately, recent studies have indeed shown significant biases in DL models across demographic subgroups (e.g., race, sex, age) in the context of medical image analysis, indicating a lack of fairness in the models. Although several methods have been proposed in the ML literature to mitigate a lack of fairness in DL models, they focus entirely on the absolute performance between groups without considering their effect on uncertainty estimation. In this work, we present the first exploration of the effect of popular fairness models on overcoming biases across subgroups in medical image analysis in terms of bottom-line performance, and their effects on uncertainty quantification. We perform extensive experiments on three different clinically relevant tasks: (i) skin lesion classification, (ii) brain tumour segmentation, and (iii) Alzheimer’s disease clinical score regression. Our results indicate that popular ML methods, such as data-balancing and distributionally robust optimization, succeed in mitigating fairness issues in terms of the model performances for some of the tasks. However, this can come at the cost of poor uncertainty estimates associated with the model predictions. This tradeoff must be mitigated if fairness models are to be adopted in medical image analysis.
- Debiasing Counterfactuals in the Presence of Spurious Correlations. Amar Kumar, Nima Fathi, Raghav Mehta, Brennan Nichyporuk, Jean-Pierre R Falet, Sotirios Tsaftaris, and Tal Arbel. In MICCAI Workshop on Fairness of AI in Medical Imaging (FAIMI), Oct 2023. Best Oral Presentation Award. Oral Presentation.
Deep learning models can perform well in complex medical imaging classification tasks, even when basing their conclusions on spurious correlations (i.e. confounders), should they be prevalent in the training dataset, rather than on the causal image markers of interest. This would thereby limit their ability to generalize across the population. Explainability based on counterfactual image generation can be used to expose the confounders but does not provide a strategy to mitigate the bias. In this work, we introduce the first end-to-end training framework that integrates both (i) popular debiasing classifiers (e.g. distributionally robust optimization (DRO)) to avoid latching onto the spurious correlations and (ii) counterfactual image generation to unveil generalizable imaging markers of relevance to the task. Additionally, we propose a novel metric, Spurious Correlation Latching Score (SCLS), to quantify the extent of the classifier reliance on the spurious correlation as exposed by the counterfactual images. Through comprehensive experiments on two public datasets (with the simulated and real visual artifacts), we demonstrate that the debiasing method: (i) learns generalizable markers across the population, and (ii) successfully ignores spurious correlations and focuses on the underlying disease pathology.
- Confusing Large Models by Confusing Small Models. Vítor Albiero, Raghav Mehta, Ivan Evtimov, Samuel Bell, Levent Sagun, and Aram Markosyan. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops, Oct 2023. Oral Presentation.
Despite a steady growth in average accuracy, computer vision models continue to fail on many robustness benchmarks. In this paper, we take a step back from standard benchmarks and focus on how models perceive data, and which aspects of the data they find confusing. Using an ensemble-based confusion score built on top of simple calibrations we examine how the training and test samples appear simple or confusing to a given model. Based on these heuristics, we demonstrate an application of the confusion score in identifying images that appear confusing to the trained model, and show that these images are highly likely to be misclassified by the model. We further demonstrate how confusion carries over to models of various sizes and architectures, which gives rise to the possibility of identifying challenging images via ensembles of small networks to produce a custom benchmark of challenging data, that remains appropriate for large models where ensembling is costly to implement. Finally, we demonstrate how training via upsampling on confusing images can improve accuracy on the hard subset.
- Integrating Bayesian Deep Learning Uncertainties in Medical Image Analysis. Raghav Mehta. Dec 2023.
Although Deep Learning (DL) models have been shown to perform very well on various medical imaging tasks, inference in the presence of pathology presents several challenges to common models. These challenges impede the integration of DL models into real clinical workflows. Deployment of these models into real clinical contexts requires: (1) that the confidence in DL model predictions be accurately expressed in the form of uncertainties and (2) that they exhibit robustness and fairness across different sub-populations. Quantifying the reliability of DL model predictions in the form of uncertainties could enable clinical review of the most uncertain regions, thereby building trust and paving the way toward clinical translation. Similarly, by embedding uncertainty estimates across cascaded inference tasks, prevalent in medical image analysis, performance on the downstream inference tasks should also be improved. In this thesis, we develop an uncertainty quantification score for the task of Brain Tumour Segmentation. We evaluate the score’s usefulness during the two consecutive Brain Tumour Segmentation (BraTS) challenges, BraTS 2019 and BraTS 2020. Overall, our findings confirm the importance and complementary value that uncertainty estimates provide to segmentation algorithms, highlighting the need for uncertainty quantification in medical image analyses. We further show the importance of uncertainty estimates in medical image analysis by propagating uncertainty generated by upstream tasks into the downstream task of interest. Our results on three different clinically relevant tasks indicate that uncertainty propagation helps improve the performance of the downstream task of interest. Additionally, we combine the aspect of uncertainty estimates with fairness across demographic subgroups into the picture. 
By performing extensive experiments on multiple tasks, we show that popular ML methods for achieving fairness across different subgroups, such as data-balancing and distributionally robust optimization, succeed in terms of the model performances for some of the tasks. However, this can come at the cost of poor uncertainty estimates associated with the model predictions. This tradeoff must be mitigated if fairness models are to be adopted in medical image analysis. In the last part of the thesis, we look at Active Learning (AL) for reduced manual labeling of a dataset. Specifically, we present an information-theoretic active learning framework that guides the optimal selection of images for labeling. Results indicate that the proposed framework outperforms several existing AL methods, and by careful design choices, it can be integrated into existing deep learning classifiers with minimal computational overhead.
2022
- Rethinking Generalization: The Impact of Annotation Style on Medical Image Segmentation. Brennan Nichyporuk, Jillian Cardinell, Justin Szeto, Raghav Mehta, Jean-Pierre Falet, Douglas L. Arnold, Sotirios A. Tsaftaris, and Tal Arbel. Machine Learning for Biomedical Imaging (MELBA) Journal, Dec 2022.
Generalization is an important attribute of machine learning models, particularly for those that are to be deployed in a medical context, where unreliable predictions can have real-world consequences. While the failure of models to generalize across datasets is typically attributed to a mismatch in the data distributions, performance gaps are often a consequence of biases in the "ground-truth" label annotations. This is particularly important in the context of medical image segmentation of pathological structures (e.g. lesions), where the annotation process is much more subjective, and affected by a number of underlying factors, including the annotation protocol, rater education/experience, and clinical aims, among others. In this paper, we show that modeling annotation biases, rather than ignoring them, poses a promising way of accounting for differences in annotation style across datasets. To this end, we propose a generalized conditioning framework to (1) learn and account for different annotation styles across multiple datasets using a single model, (2) identify similar annotation styles across different datasets in order to permit their effective aggregation, and (3) fine-tune a fully trained model to a new annotation style with just a few samples. Next, we present an image-conditioning approach to model annotation styles that correlate with specific image features, potentially enabling detection biases to be more easily identified.
- QU-BraTS: MICCAI BraTS 2020 Challenge on Quantifying Uncertainty in Brain Tumor Segmentation – Analysis of Ranking Scores and Benchmarking Results. Raghav Mehta, Angelos Filos, Ujjwal Baid, Chiharu Sako, Richard McKinley, Michael Rebsamen, Katrin Dätwyler, Raphael Meier, Piotr Radojewski, Gowtham Krishnan Murugesan, and 82 more authors. Machine Learning for Biomedical Imaging (MELBA) Journal, Aug 2022.
Deep learning (DL) models have provided state-of-the-art performance in a wide variety of medical imaging benchmarking challenges, including the Brain Tumor Segmentation (BraTS) challenges. However, the task of focal pathology multi-compartment segmentation (e.g., tumor and lesion sub-regions) is particularly challenging, and potential errors hinder the translation of DL models into clinical workflows. Quantifying the reliability of DL model predictions in the form of uncertainties could enable clinical review of the most uncertain regions, thereby building trust and paving the way towards clinical translation. Recently, a number of uncertainty estimation methods have been introduced for DL medical image segmentation tasks. Developing scores to evaluate and compare the performance of uncertainty measures will assist the end-user in making more informed decisions. In this study, we explore and evaluate a score developed during the BraTS 2019-2020 task on uncertainty quantification (QU-BraTS), and designed to assess and rank uncertainty estimates for brain tumor multi-compartment segmentation. This score (1) rewards uncertainty estimates that produce high confidence in correct assertions, and those that assign low confidence levels to incorrect assertions, and (2) penalizes uncertainty measures that lead to higher percentages of under-confident correct assertions. We further benchmark the segmentation uncertainties generated by 14 independent participating teams of QU-BraTS 2020, all of which also participated in the main BraTS segmentation task. Overall, our findings confirm the importance and complementary value that uncertainty estimates provide to segmentation algorithms, and hence highlight the need for uncertainty quantification in medical image analyses.
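The two properties of the score described in this abstract can be illustrated with a toy sketch. It assumes one plausible operationalization: voxels whose uncertainty exceeds a threshold are filtered out, Dice is recomputed on the retained voxels, and the fraction of true positives that were filtered away is tracked as the penalty term. The function names, the single fixed threshold, and the data are all hypothetical; this is not the challenge's official scoring code.

```python
def dice(pred, truth):
    """Dice overlap between two binary label lists (1.0 when both are empty)."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    denom = sum(pred) + sum(truth)
    return 1.0 if denom == 0 else 2.0 * tp / denom


def filter_by_uncertainty(pred, truth, uncertainty, threshold):
    """Keep only voxels with uncertainty <= threshold.

    Returns (Dice on retained voxels, fraction of true positives filtered out).
    A useful uncertainty measure raises the first value while keeping the second low.
    """
    kept = [(p, t) for p, t, u in zip(pred, truth, uncertainty) if u <= threshold]
    kept_pred = [p for p, _ in kept]
    kept_truth = [t for _, t in kept]
    tp_all = sum(1 for p, t in zip(pred, truth) if p and t)
    tp_kept = sum(1 for p, t in kept if p and t)
    filtered_tp_ratio = 0.0 if tp_all == 0 else (tp_all - tp_kept) / tp_all
    return dice(kept_pred, kept_truth), filtered_tp_ratio


# Toy example (hypothetical): six voxels, two of them wrong (indices 2 and 4).
pred = [1, 1, 1, 0, 0, 1]
truth = [1, 1, 0, 0, 1, 1]
uncertainty = [0.1, 0.2, 0.9, 0.1, 0.8, 0.7]

print(filter_by_uncertainty(pred, truth, uncertainty, threshold=1.0))  # nothing filtered
print(filter_by_uncertainty(pred, truth, uncertainty, threshold=0.5))  # uncertain voxels removed
```

With the threshold at 0.5 both errors are removed, so Dice on the retained voxels rises from 0.75 to 1.0, but one correct tumour voxel (index 5, uncertainty 0.7) is discarded as well, so a third of the true positives are lost. That second effect is the under-confident correct assertion the abstract says should be penalized.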
- You only need a good embeddings extractor to fix spurious correlations. Raghav Mehta, Vítor Albiero, Li Chen, Ivan Evtimov, Tamar Glaser, Zhiheng Li, and Tal Hassner. In Proceedings of the IEEE/CVF European Conference on Computer Vision (ECCV) Workshops, Oct 2022. Oral Presentation.
Spurious correlations in training data often lead to robustness issues since models learn to use them as shortcuts. For example, when predicting whether an object is a cow, a model might learn to rely on its green background, so it would do poorly on a cow on a sandy background. A standard benchmark for methods that mitigate this problem is Waterbirds. The best method (Group Distributionally Robust Optimization, GroupDRO) currently achieves 89% worst-group accuracy, whereas standard training from scratch on raw images only reaches 72%. GroupDRO requires training a model in an end-to-end manner with subgroup labels. In this paper, we show that we can achieve up to 90% accuracy without using any sub-group information in the training set by simply using embeddings from a large pre-trained vision model extractor and training a linear classifier on top of it. With experiments on a wide range of pre-trained models and pre-training datasets, we show that the capacity of the pre-training model and the size of the pre-training dataset matter. Our experiments reveal that high capacity vision transformers perform better compared to high capacity convolutional neural networks, and that a larger pre-training dataset leads to better worst-group accuracy on the spurious correlation dataset.
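The worst-group accuracy reported in this abstract is the minimum accuracy over subgroups, which can differ sharply from the average. A minimal sketch of the metric follows; the group names and toy predictions are hypothetical, and the sketch does not reproduce the embedding extractor or linear classifier from the paper.

```python
from collections import defaultdict


def worst_group_accuracy(predictions, labels, groups):
    """Return (worst group, its accuracy, accuracy of every group)."""
    hits = defaultdict(int)
    counts = defaultdict(int)
    for pred, label, group in zip(predictions, labels, groups):
        counts[group] += 1
        hits[group] += int(pred == label)
    per_group = {g: hits[g] / counts[g] for g in counts}
    worst = min(per_group, key=per_group.get)
    return worst, per_group[worst], per_group


# Toy example (hypothetical): each group pairs a bird label with a background.
predictions = ["water", "water", "land", "land", "water", "water", "land"]
labels = ["water", "water", "land", "land", "land", "water", "land"]
groups = [
    "water_on_water", "water_on_water",
    "land_on_land", "land_on_land",
    "land_on_water", "water_on_land", "land_on_water",
]

worst, worst_acc, per_group = worst_group_accuracy(predictions, labels, groups)
print(worst, worst_acc)  # land_on_water 0.5
```

Here six of seven predictions are correct overall, yet the group where the background disagrees with the label is only 50% accurate, which is the failure mode a spurious background shortcut produces.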
- Information gain sampling for active learning in medical image classification. Raghav Mehta, Changjian Shui, Brennan Nichyporuk, and Tal Arbel. In International Workshop on Uncertainty for Safe Utilization of Machine Learning in Medical Imaging (UNSURE), Oct 2022.
Large, annotated datasets are not widely available in medical image analysis due to the prohibitive time, costs, and challenges associated with labelling large datasets. Unlabelled datasets are easier to obtain, and in many contexts, it would be feasible for an expert to provide labels for a small subset of images. This work presents an information-theoretic active learning framework that guides the optimal selection of images from the unlabelled pool to be labeled based on maximizing the expected information gain (EIG) on an evaluation dataset. Experiments are performed on two different medical image classification datasets: multi-class diabetic retinopathy disease scale classification and multi-class skin lesion classification. Results indicate that by adapting EIG to account for class-imbalances, our proposed Adapted Expected Information Gain (AEIG) outperforms several popular baselines including the diversity based CoreSet and uncertainty based maximum entropy sampling. Specifically, AEIG achieves 95% of overall performance with only 19% of the training data, while other active learning approaches require around 25%. We show that, by careful design choices, our model can be integrated into existing deep learning classifiers.
2021
- Propagating uncertainty across cascaded medical imaging tasks for improved deep learning inference. Raghav Mehta, Thomas Christinck, Tanya Nair, Aurélie Bussy, Swapna Premasiri, Manuela Costantino, M Mallar Chakravarthy, Douglas L Arnold, Yarin Gal, and Tal Arbel. IEEE Transactions on Medical Imaging (TMI), Oct 2021.
Although deep networks have been shown to perform very well on a variety of medical imaging tasks, inference in the presence of pathology presents several challenges to common models. These challenges impede the integration of deep learning models into real clinical workflows, where the customary process of cascading deterministic outputs from a sequence of image-based inference steps (e.g. registration, segmentation) generally leads to an accumulation of errors that impacts the accuracy of downstream inference tasks. In this paper, we propose that by embedding uncertainty estimates across cascaded inference tasks, performance on the downstream inference tasks should be improved. We demonstrate the effectiveness of the proposed approach in three different clinical contexts: (i) We demonstrate that by propagating T2 weighted lesion segmentation results and their associated uncertainties, subsequent T2 lesion detection performance is improved when evaluated on a proprietary large-scale, multi-site, clinical trial dataset acquired from patients with Multiple Sclerosis. (ii) We show an improvement in brain tumour segmentation performance when the uncertainty map associated with a synthesised missing MR volume is provided as an additional input to a follow-up brain tumour segmentation network, when evaluated on the publicly available BraTS-2018 dataset. (iii) We show that by propagating uncertainties from a voxel-level hippocampus segmentation task, the subsequent regression of the Alzheimer’s disease clinical score is improved.
- Had-net: A hierarchical adversarial knowledge distillation network for improved enhanced tumour segmentation without post-contrast images. Saverio Vadacchino, Raghav Mehta, Nazanin Mohammadi Sepahvand, Brennan Nichyporuk, James J Clark, and Tal Arbel. In Medical Imaging with Deep Learning (MIDL) conference, Jul 2021.
Segmentation of enhancing tumours or lesions from MRI is important for detecting new disease activity in many clinical contexts. However, accurate segmentation requires the inclusion of medical images (e.g., T1 post-contrast MRI) acquired after injecting patients with a contrast agent (e.g., Gadolinium), a process no longer thought to be safe. Although a number of modality-agnostic segmentation networks have been developed over the past few years, they have been met with limited success in the context of enhancing pathology segmentation. In this work, we present HAD-Net, a novel offline adversarial knowledge distillation (KD) technique, whereby a pre-trained teacher segmentation network, with access to all MRI sequences, teaches a student network, via hierarchical adversarial training, to better overcome the large domain shift presented when crucial images are absent during inference. In particular, we apply HAD-Net to the challenging task of enhancing tumour segmentation when access to post-contrast imaging is not available. The proposed network is trained and tested on the BraTS 2019 brain tumour segmentation challenge dataset, where it achieves performance improvements in the ranges of 16% - 26% over (a) recent modality-agnostic segmentation methods (U-HeMIS, U-HVED), (b) KD-Net adapted to this problem, (c) the pre-trained student network and (d) a non-hierarchical version of the network (AD-Net), in terms of Dice scores for enhancing tumour (ET). The network also shows improvements in tumour core (TC) Dice scores. Finally, the network outperforms both the baseline student network and AD-Net in terms of uncertainty quantification for enhancing tumour segmentation based on the BraTS 2019 uncertainty challenge metrics.
- Cohort bias adaptation in aggregated datasets for lesion segmentation. Brennan Nichyporuk, Jillian Cardinell, Justin Szeto, Raghav Mehta, Sotirios Tsaftaris, Douglas L Arnold, and Tal Arbel. In International Workshop on Domain Adaptation and Representation Transfer (DART), Oct 2021. Best Paper Award. Oral Presentation.
Many automatic machine learning models developed for focal pathology (e.g. lesions, tumours) detection and segmentation perform well, but do not generalize as well to new patient cohorts, impeding their widespread adoption into real clinical contexts. One strategy to create a more diverse, generalizable training set is to naively pool datasets from different cohorts. Surprisingly, training on this big data does not necessarily increase, and may even reduce, overall performance and model generalizability, due to the existence of cohort biases that affect label distributions. In this paper, we propose a generalized affine conditioning framework to learn and account for cohort biases across multi-source datasets, which we call Source-Conditioned Instance Normalization (SCIN). Through extensive experimentation on three different, large scale, multi-scanner, multi-centre Multiple Sclerosis (MS) clinical trial MRI datasets, we show that our cohort bias adaptation method (1) improves performance of the network on pooled datasets relative to naively pooling datasets and (2) can quickly adapt to a new cohort by fine-tuning the instance normalization parameters, thus learning the new cohort bias with only 10 labelled samples.
- Sub-cortical structure segmentation database for young population. Jayanthi Sivaswamy, Alphin J Thottupattu, Raghav Mehta, R Sheelakumari, Chandrasekharan Kesavadas, and others. Nov 2021.
Segmentation of sub-cortical structures from MRI scans is of interest in many neurological diagnoses. Since this is a laborious task, machine learning and specifically deep learning (DL) methods have been explored. The structural complexity of the brain demands a large, high quality segmentation dataset to develop good DL-based solutions for sub-cortical structure segmentation. Towards this, we are releasing a set of 114, 1.5 Tesla, T1 MRI scans with manual delineations for 14 sub-cortical structures. The scans in the dataset were acquired from healthy young (21-30 years) subjects (58 male and 56 female) and all the structures are manually delineated by experienced radiology experts. Segmentation experiments have been conducted with this dataset and results demonstrate that accurate results can be obtained with deep-learning methods. Our sub-cortical structure segmentation dataset, Indian Brain Segmentation Dataset (IBSD), is made openly available.
2020
- Uncertainty evaluation metric for brain tumour segmentation. Medical Imaging with Deep Learning (MIDL) Short Papers, May 2020.
In this paper, we develop a metric designed to assess and rank uncertainty measures for the task of brain tumour sub-tissue segmentation in the BraTS 2019 sub-challenge on uncertainty quantification. The metric is designed to: (1) reward uncertainty measures where high confidence is assigned to correct assertions, and where incorrect assertions are assigned low confidence and (2) penalize measures that have higher percentages of under-confident correct assertions. Here, the workings of the components of the metric are explored based on a number of popular uncertainty measures evaluated on the BraTS 2019 dataset.
2019
- Construction of Indian human brain atlas. Jayanthi Sivaswamy, Alphin J Thottupattu, Raghav Mehta, R Sheelakumari, Chandrasekharan Kesavadas, and others. Neurology India (NI) Journal, Jan 2019.
A brain MRI atlas plays an important role in many neuroimage analysis tasks as it provides a standard co-ordinate system which is needed for spatial normalization of a brain MRI. Ideally, this atlas should be as near to the average brain of the population being studied as possible. Hence, correction for age and gender is typically done by selecting age- and gender-appropriate atlases. The MNI152 is used as a standard atlas in many studies. MNI152 is constructed using T1 brain MRI scans of 152 Caucasian subjects. Similarly, the LPBA40 atlas derived from 40 ethnically diverse subjects is popular in segmentation as it provides structure probability maps for 56 cortical structures of 40 brain volumes. However, there is emerging evidence for morphological differences across populations, especially in terms of global brain features like height, width and length, which suggests that population-specific atlases may also be needed for accurate analysis. We report on the construction of a brain atlas of subjects from India. In the first part of this paper, we construct and validate the Indian brain MRI atlas of a young Indian population and the corresponding structure probability maps. Next, we report our findings based on comparison of the Indian brain atlas with other population-specific atlases. The findings confirm that there is significant morphological difference between Indian, Chinese and Caucasian populations.
- Propagating uncertainty across cascaded medical imaging tasks for improved deep learning inference. Raghav Mehta, Thomas Christinck, Tanya Nair, Paul Lemaitre, Douglas Arnold, and Tal Arbel. In International Workshop on Uncertainty for Safe Utilization of Machine Learning in Medical Imaging (UNSURE), Oct 2019. Best Paper Award. Oral Presentation.
Although deep networks have been shown to perform very well on a variety of tasks, inference in the presence of pathology in medical images presents challenges to traditional networks. Given that medical image analysis typically requires a sequence of inference tasks to be performed (e.g. registration, segmentation), this results in an accumulation of errors over the sequence of deterministic outputs. In this paper, we explore the premise that, by embedding uncertainty estimates across cascaded inference tasks, the final prediction results should improve over simply cascading the deterministic classification results or performing inference in a single stage. Specifically, we develop a deep learning framework that propagates voxel-based uncertainty measures (e.g. Monte Carlo (MC) dropout sample variance) across inference tasks in order to improve the detection and segmentation of focal pathologies (e.g. lesions, tumours) in brain MR images. We apply the framework to two different contexts. First, we demonstrate that propagating multiple sclerosis T2 lesion segmentation results along with their associated uncertainty measures improves subsequent T2 lesion detection accuracy when evaluated on a proprietary large-scale, multi-site, clinical trial dataset. Second, we show how by propagating uncertainties associated with a regressed 3D MRI volume as an additional input to a follow-on brain tumour segmentation task, one can improve segmentation results on the publicly available BraTS-2018 dataset.
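The Monte Carlo (MC) dropout sample variance mentioned in this abstract can be illustrated on a toy model. The sketch below runs several stochastic forward passes of a single linear unit with dropout left on, then reports the mean prediction and the sample variance across passes. The toy unit, its weights, and the function names are assumptions for illustration; the paper applies the idea per voxel in deep networks, and this sketch does not reproduce that architecture.

```python
import random
import statistics


def dropout_forward(inputs, weights, drop_prob, rng):
    """One stochastic pass of a toy linear unit with inverted dropout on its inputs."""
    keep = 1.0 - drop_prob
    total = 0.0
    for x, w in zip(inputs, weights):
        if rng.random() < keep:
            # Rescale kept inputs so the expected output matches the no-dropout output.
            total += (x / keep) * w
    return total


def mc_dropout_predict(inputs, weights, drop_prob=0.5, n_samples=50, seed=0):
    """Return (mean prediction, sample variance) over n_samples stochastic passes."""
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    rng = random.Random(seed)
    samples = [dropout_forward(inputs, weights, drop_prob, rng) for _ in range(n_samples)]
    return statistics.mean(samples), statistics.variance(samples)


# Hypothetical inputs and weights for a single output value (e.g. one voxel).
inputs = [1.0, 2.0, 3.0]
weights = [0.5, -1.0, 2.0]

mean, variance = mc_dropout_predict(inputs, weights)
# In a cascade, both values would be handed to the next task rather than the mean alone.
downstream_input = (mean, variance)
print(downstream_input)
```

Passing the variance alongside the prediction is what distinguishes this from cascading deterministic outputs: the downstream task can learn to discount inputs that the upstream model was unsure about.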
- Improving pathological structure segmentation via transfer learning across diseases. Barleen Kaur, Paul Lemaître, Raghav Mehta, Nazanin Mohammadi Sepahvand, Doina Precup, Douglas Arnold, and Tal Arbel. In International Workshop on Domain Adaptation and Representation Transfer (DART), Oct 2019.
One of the biggest challenges in developing robust machine learning techniques for medical image analysis is the lack of access to large-scale annotated image datasets needed for supervised learning. When the task is to segment pathological structures (e.g. lesions, tumors) from patient images, training on a dataset with few samples is very challenging due to the large class imbalance and inter-subject variability. In this paper, we explore how to best leverage a segmentation model that has been pre-trained on a large dataset of patient images with one disease in order to successfully train a deep learning pathology segmentation model for a different disease, for which only a relatively small patient dataset is available. Specifically, we train a UNet model on a large-scale, proprietary, multi-center, multi-scanner Multiple Sclerosis (MS) clinical trial dataset containing over 3500 multi-modal MRI samples with expert-derived lesion labels. We explore several transfer learning approaches to leverage the learned MS model for the task of multi-class brain tumor segmentation on the BraTS 2018 dataset. Our results indicate that adapting and fine-tuning the encoder and decoder of the network trained on the larger MS dataset leads to improvement in brain tumor segmentation when few instances are available. This type of transfer learning outperforms training and testing the network on the BraTS dataset from scratch as well as several other transfer learning approaches, particularly when only a small subset of the dataset is available.
2018
- RS-Net: Regression-segmentation 3D CNN for synthesis of full resolution missing brain MRI in the presence of tumoursRaghav Mehta, and Tal ArbelIn Third International Workshop on Simulation and Synthesis in Medical Imaging (SASHIMI), Oct 2018Oral Presentation
Accurate synthesis of a full 3D MR image containing tumours from available MRI (e.g. to replace an image that is currently unavailable or corrupted) would provide a clinician as well as downstream inference methods with important complementary information for disease analysis. In this paper, we present an end-to-end 3D convolutional neural network that takes a set of acquired MR image sequences (e.g. T1, T2, T1ce) as input and concurrently performs (1) regression of the missing full resolution 3D MRI (e.g. FLAIR) and (2) segmentation of the tumour into subtypes (e.g. enhancement, core). The hypothesis is that this would focus the network to perform accurate synthesis in the area of the tumour. Experiments on the BraTS 2015 and 2017 datasets show that (1) the proposed method gives better performance than state-of-the-art methods in terms of established global evaluation metrics (e.g. PSNR), and (2) replacing real MR volumes with the synthesized MRI does not lead to significant degradation in tumour and sub-structure segmentation accuracy. The system further provides uncertainty estimates based on Monte Carlo (MC) dropout for the synthesized volume at each voxel, permitting quantification of the system’s confidence in the output at each location.
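For readers unfamiliar with PSNR, the global metric cited above, the following minimal sketch uses the standard definition, 10 log10(MAX² / MSE). The function name, the flat-list voxel layout, and the caller-supplied `max_value` are assumptions for illustration; the paper's exact evaluation protocol (masking, intensity normalisation) is not reproduced here.

```python
import math


def psnr(reference, synthesized, max_value):
    """Peak signal-to-noise ratio in dB between two equally sized volumes.

    Both volumes are flat lists of voxel intensities (hypothetical layout);
    `max_value` is the largest intensity the images can take.
    """
    if not reference or len(reference) != len(synthesized):
        raise ValueError("volumes must be non-empty and of equal size")
    # Mean squared error over all voxels.
    mse = sum((r - s) ** 2 for r, s in zip(reference, synthesized)) / len(reference)
    if mse == 0:
        return math.inf  # identical volumes
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher values indicate a synthesized volume that is closer to the real one; identical volumes give an infinite PSNR.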
- To learn or not to learn features for deformable registration?Aabhas Majumdar, Raghav Mehta, and Jayanthi SivaswamyIn First International Workshop on Deep Learning Fails (DLF), Oct 2018Oral Presentation
Feature-based registration has been popular with a variety of features ranging from voxel intensity to Self-Similarity Context (SSC). In this paper, we examine the question of how features learnt using various Deep Learning (DL) frameworks can be used for deformable registration and whether this feature learning is necessary or not. We investigate the use of features learned by different DL methods in the current state-of-the-art discrete registration framework and analyze its performance on 2 publicly available datasets. We draw insights about the type of DL framework useful for feature learning. We consider the impact, if any, of the complexity of different DL models and brain parcellation methods on the performance of discrete registration. Our results indicate that the registration performance with DL features and SSC is comparable and stable across datasets, whereas this does not hold for low-level features. This shows that when handcrafted features are designed based on good insights into the problem at hand, they perform better than or comparably to features learnt using a deep learning framework.
- 3D U-Net for brain tumour segmentationRaghav Mehta, and Tal ArbelIn 4th International Workshop on Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries (BrainLes), Oct 2018
In this work, we present a 3D Convolutional Neural Network (CNN) for brain tumour segmentation from multimodal brain MR volumes. The network is a modified version of the popular 3D U-net architecture, which takes as input multi-modal brain MR volumes, processes them at multiple scales, and generates a full resolution multi-class tumour segmentation as output. The network is modified such that there is a better gradient flow in the network, which in turn should allow the network to learn better segmentation. The network is trained end-to-end on the BraTS [1,2,3,4,5] 2018 Training dataset using a weighted Categorical Cross Entropy (CCE) loss function. A curriculum on class weights is employed to address the class imbalance issue. We achieve competitive segmentation results on the BraTS 2018 Testing dataset with Dice scores of 0.706, 0.871, and 0.771 for enhancing tumour, whole tumour, and tumour core, respectively.
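The Dice scores reported above measure overlap between a predicted region and the reference region, 2|A∩B| / (|A| + |B|). The sketch below is a minimal illustration under stated assumptions: the function name, the flat-list label maps, the `labels` argument (a region defined as a set of label values), and the convention that two empty regions score 1.0 are choices made for this example, not the challenge's official evaluation code.

```python
def dice_score(prediction, ground_truth, labels):
    """Dice overlap for the region formed by the given label values.

    `prediction` and `ground_truth` are flat lists of integer voxel labels
    (hypothetical layout); `labels` lists the label values that together
    make up the region of interest.
    """
    if len(prediction) != len(ground_truth):
        raise ValueError("label maps must have the same number of voxels")
    labels = set(labels)
    pred = [p in labels for p in prediction]
    truth = [t in labels for t in ground_truth]
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(pred) + sum(truth)
    if total == 0:
        return 1.0  # assumption: two empty regions count as perfect agreement
    return 2.0 * intersection / total
```

Passing several label values at once scores a composite region, which is how a region built from more than one tumour label could be evaluated.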
- Identifying the best machine learning algorithms for brain tumor segmentation, progression assessment, and overall survival prediction in the BRATS challengeSpyridon Bakas, Mauricio Reyes, Andras Jakab, Stefan Bauer, Markus Rempfler, Alessandro Crimi, Russell Takeshi Shinohara, Christoph Berger, Sung Min Ha, Martin Rozycki, and 417 more authorsNov 2018
Gliomas are the most common primary brain malignancies, with different degrees of aggressiveness, variable prognosis and various heterogeneous histologic sub-regions, i.e., peritumoral edematous/invaded tissue, necrotic core, active and non-enhancing core. This intrinsic heterogeneity is also portrayed in their radio-phenotype, as their sub-regions are depicted by varying intensity profiles disseminated across multi-parametric magnetic resonance imaging (mpMRI) scans, reflecting varying biological properties. Their heterogeneous shape, extent, and location are some of the factors that make these tumors difficult to resect, and in some cases inoperable. The amount of resected tumor is a factor also considered in longitudinal scans, when evaluating the apparent tumor for potential diagnosis of progression. Furthermore, there is mounting evidence that accurate segmentation of the various tumor sub-regions can offer the basis for quantitative image analysis towards prediction of patient overall survival. This study assesses the state-of-the-art machine learning (ML) methods used for brain tumor image analysis in mpMRI scans, during the last seven instances of the International Brain Tumor Segmentation (BraTS) challenge, i.e., 2012-2018. Specifically, we focus on i) evaluating segmentations of the various glioma sub-regions in pre-operative mpMRI scans, ii) assessing potential tumor progression by virtue of longitudinal growth of tumor sub-regions, beyond use of the RECIST/RANO criteria, and iii) predicting the overall survival from pre-operative mpMRI scans of patients that underwent gross total resection. Finally, we investigate the challenge of identifying the best ML algorithms for each of these tasks, considering that apart from being diverse on each instance of the challenge, the multi-institutional mpMRI BraTS dataset has also been a continuously evolving/growing dataset.
2017
- BrainSegNet: a convolutional neural network architecture for automated segmentation of human brain structuresRaghav Mehta, Aabhas Majumdar, and Jayanthi SivaswamyJournal of Medical Imaging (JMI), Apr 2017
Automated segmentation of cortical and noncortical human brain structures has been hitherto approached using nonrigid registration followed by label fusion. We propose an alternative approach for this using a convolutional neural network (CNN) which classifies a voxel into one of many structures. Four different kinds of two-dimensional and three-dimensional intensity patches are extracted for each voxel, providing local and global (context) information to the CNN. The proposed approach is evaluated on five different publicly available datasets which differ in the number of labels per volume. The obtained mean Dice coefficient varied according to the number of labels; for example, it is 0.844±0.031 and 0.743±0.019 for datasets with the least (32) and the most (134) number of labels, respectively. These figures are marginally better or on par with those obtained with the current state-of-the-art methods on nearly all datasets, at a reduced computational time. The consistently good performance of the proposed method across datasets and no requirement for registration make it attractive for many applications where reduced computational time is necessary.
- M-net: A convolutional neural network for deep brain structure segmentationRaghav Mehta, and Jayanthi SivaswamyIn 2017 IEEE 14th International Symposium on Biomedical Imaging (ISBI), Apr 2017Oral Presentation
In this paper, we propose an end-to-end trainable Convolutional Neural Network (CNN) architecture called the M-net, for segmenting deep (human) brain structures from Magnetic Resonance Images (MRI). A novel scheme is used to learn to combine and represent 3D context information of a given slice in a 2D slice. Consequently, the M-net utilizes only 2D convolution though it operates on 3D data, which makes M-net memory efficient. The segmentation method is evaluated on two publicly available datasets and is compared against publicly available model-based segmentation algorithms as well as other classification-based algorithms such as Random Forest and 2D CNN-based approaches. Experimental results show that the M-net outperforms all these methods in terms of Dice coefficient and is at least 3 times faster than other methods in segmenting a new volume, which is attractive for clinical use.
- Population specific template construction and brain structure segmentation using deep learning methodsRaghav MehtaJul 2017
A brain template, such as MNI152, is a digital (magnetic resonance image or MRI) representation of the brain in a reference coordinate system for neuroscience research. Structural atlases, such as AAL and DKA, delineate the brain into cortical and subcortical structures which are used in Voxel Based Morphometry (VBM) and fMRI analysis. Many population-specific templates (e.g. Chinese, Korean) have been constructed recently. It was observed that there are morphological differences between the average brain of the eastern and the western population. In this thesis, we report on the development of a population-specific brain template for the young Indian population. This is derived from a multi-centric MRI dataset of 100 Indian adults (21 - 30 years old). Measurements made with this template indicated that the Indian brain, on average, is smaller in height and width compared to the Caucasian and the Chinese brain. A second problem this thesis examines is automated segmentation of cortical and non-cortical human brain structures, using multiple structural atlases. This has been hitherto approached using computationally expensive non-rigid registration followed by label fusion. We propose an alternative approach for this using a Convolutional Neural Network (CNN) which classifies a voxel into one of many structures. Evaluation of the proposed method on various datasets showed that the mean Dice coefficient varied from 0.844±0.031 to 0.743±0.019 for datasets with the least (32) and the most (134) number of labels, respectively. These figures are marginally better or on par with those obtained with the current state-of-the-art methods on nearly all datasets, at a reduced computational time. We also propose an end-to-end trainable Fully Convolutional Neural Network (FCNN) architecture called the M-net, for segmenting deep (human) brain structures. A novel scheme is used to learn to combine and represent 3D context information of a given slice in a 2D slice. Consequently, the M-net utilizes only 2D convolution though it operates on 3D data. Experimental results show that the M-net outperforms other state-of-the-art model-based segmentation methods in terms of Dice coefficient and is at least 3 times faster than them.
2016
- A hybrid approach to tissue-based intensity standardization of brain MRI imagesRaghav Mehta, and Jayanthi SivaswamyIn 2016 IEEE 13th International Symposium on Biomedical Imaging (ISBI), Apr 2016
The variations in the intensity scale in Magnetic Resonance Images pose a problem for many tasks, and Intensity Standardization (IS) aims to solve this problem. Existing methods generally use landmark values of the image histogram and match them to a standard scale. The landmarks are often chosen to be percentiles from different segmented tissues. We propose a method for IS that uses landmark propagation, so that tissue information (via segmentation) is needed during training but not during testing. A KL divergence-based technique is employed for identifying volumes from the training set which are similar to a given non-standardized testing volume. The landmarks from the similar volumes are then propagated to the given test volume. Evaluation of the proposed method on 24 MRI volumes from 3 different scanners shows that the IS results are better than L4 and on par with a method which uses prior segmentation to obtain percentile-based landmarks. The proposed method helps speed up IS and expands its scope to volumes with no tissue information.
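One way to picture the KL divergence-based selection of similar training volumes is sketched below. This is an assumption-laden illustration rather than the paper's method: the helper names, the fixed-range histogram with additive smoothing `eps`, the direction of the divergence (test relative to training), and ranking by smallest divergence are all choices made for this example.

```python
import math


def intensity_histogram(values, bins, lo, hi, eps=1e-6):
    """Normalised intensity histogram with additive smoothing.

    Smoothing by `eps` (an assumption) keeps every bin non-zero so that
    the KL divergence below stays finite.
    """
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        # Clamp out-of-range intensities into the first or last bin.
        idx = max(0, min(int((v - lo) / width), bins - 1))
        counts[idx] += 1
    total = len(values) + eps * bins
    return [(c + eps) / total for c in counts]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions with the same bins."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def most_similar_volumes(test_hist, train_hists, k=1):
    """Indices of the k training histograms closest to the test histogram."""
    ranked = sorted(
        range(len(train_hists)),
        key=lambda i: kl_divergence(test_hist, train_hists[i]),
    )
    return ranked[:k]
```

The landmarks stored for the selected training volumes would then be carried over to the test volume; how they are combined is not specified in the abstract and is left out here.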