Deep learning-based quantification of temporalis muscle has prognostic value in patients with glioblastoma

In many cancer patient groups, the amount of muscle predicts survival. However, measuring this by hand is time-consuming. We trained AI software to measure it automatically and showed that the amount of muscle helps predict how long patients with a common, aggressive brain tumour live.

Glioblastoma (GBM) is the most common type of brain cancer. Although it is rare, it has the highest number of years of life lost compared to other cancers, as well as very low survival rates: fewer than 5% of patients survive for 5 or more years. Sarcopenia, the loss of muscle mass and/or function, has been linked with poorer outcomes in cancer patients in general, as well as with lower survival in GBM patients. Most of this work has been based on imaging of the body, but patients with GBM tend to have MRI scans of the head rather than of the body. In this work, we wanted to see whether an automated approach could quickly and accurately measure the area of muscle around the head in brain tumour patients, and whether this area predicts disease progression or survival.

 

How it started

This work started as my final year project for my undergraduate degree at Imperial College London. My project in the lab was to build and train software (a neural network) to draw the outline of a muscle on head MRIs in a dataset of local patients. We chose one particular muscle, the temporalis, as there was previous evidence that it could be used as a proxy for skeletal muscle mass and was linked with survival. Zakaria et al. showed that its width is predictive of 30-day and 90-day survival, and Furtner et al. showed that temporal muscle thickness is predictive of overall and progression-free survival in glioblastoma patients. We therefore wanted to see whether the muscle could be assessed automatically from brain scans.

For this task, we first manually produced segmentations of the temporalis muscle using a touch screen. These act as the ground truth: some are used for training, and some are used to evaluate the accuracy of the automated segmentations. We took an existing implementation of a U-Net, a type of neural network commonly used in medical image segmentation, and tried to make it work with our local dataset. For some time the network returned grey images instead of the desired segmentation, which is an image where pixels belonging to the object are assigned one value and pixels belonging to the background are assigned another. We then explored a different loss function (the quantity the network is trained to minimise), which improved the model's performance. At the end of my placement, we had a model that performed moderately well, with a similarity of 90% between manually and automatically produced segmentations.

 

Extending initial exploration

Although our initial exploration of using deep learning for temporalis segmentation was successful, we knew there was room for improvement. One of the main drawbacks was that the model was trained on a relatively small number of images, all of which came from the same local dataset with a limited number of patients. To address this, we searched for external datasets that could be used. In our current work we used local and publicly available datasets to re-train the model. We then performed survival analysis on the two datasets that had survival data: the local dataset and TCGA-GBM. The re-trained neural network achieved high accuracy, with a mean Dice score of 0.893 ± 0.045, a Jaccard index of 0.809 ± 0.072 and a Hausdorff distance of 1.889 ± 0.354 mm. Overall survival (OS) was significantly longer in patients with a higher temporalis muscle cross-sectional area (CSA) in both the local dataset (median OS 22.4 vs 14.5 months) and TCGA-GBM (15.4 vs 12.9 months). Progression-free survival (PFS) was also longer in the local dataset for the patient group with higher CSA (median PFS 14.3 vs 6.4 months). We were thus able to validate our initial work using additional external datasets, and we added to the evidence that temporalis muscle is predictive of outcomes in brain tumour patients.
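For readers unfamiliar with the overlap metrics quoted above, here is a minimal sketch of how a Dice score and a Jaccard index can be computed for a pair of binary masks. It is an illustration only, not the evaluation code used in the study: the function names and the tiny example masks are made up for this post, and masks are represented as plain nested lists to keep the example self-contained.

```python
from typing import Sequence, Tuple

# A 2D binary mask: 1 = temporalis muscle, 0 = background (illustrative type).
Mask = Sequence[Sequence[int]]


def overlap_counts(manual: Mask, automatic: Mask) -> Tuple[int, int, int]:
    """Return (intersection, manual pixel count, automatic pixel count)."""
    same_shape = len(manual) == len(automatic) and all(
        len(row_m) == len(row_a) for row_m, row_a in zip(manual, automatic)
    )
    if not same_shape:
        raise ValueError("masks must have the same shape")
    intersection = size_manual = size_automatic = 0
    for row_m, row_a in zip(manual, automatic):
        for m, a in zip(row_m, row_a):
            m, a = bool(m), bool(a)
            intersection += int(m and a)
            size_manual += int(m)
            size_automatic += int(a)
    return intersection, size_manual, size_automatic


def dice_score(manual: Mask, automatic: Mask) -> float:
    """Dice = 2 * |A and B| / (|A| + |B|); defined as 1.0 if both masks are empty."""
    intersection, size_manual, size_automatic = overlap_counts(manual, automatic)
    total = size_manual + size_automatic
    return 1.0 if total == 0 else 2.0 * intersection / total


def jaccard_index(manual: Mask, automatic: Mask) -> float:
    """Jaccard = |A and B| / |A or B|; defined as 1.0 if both masks are empty."""
    intersection, size_manual, size_automatic = overlap_counts(manual, automatic)
    union = size_manual + size_automatic - intersection
    return 1.0 if union == 0 else intersection / union


# Toy example (hypothetical masks): 4 manual pixels, 3 automatic, 3 shared.
manual_mask = [[0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]]
automatic_mask = [[0, 1, 1, 0], [0, 1, 0, 0], [0, 0, 0, 0]]

dice = dice_score(manual_mask, automatic_mask)        # 2*3 / (4+3) = 6/7
jaccard = jaccard_index(manual_mask, automatic_mask)  # 3 / 4 = 0.75
```

Both metrics range from 0 (no overlap) to 1 (perfect overlap). For a single pair of masks they are related by Jaccard = Dice / (2 − Dice), which is why the Jaccard index is always the lower of the two.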

 

Figure 1. Example of automatically produced temporalis muscle segmentations.

 

What’s next

Since the end of the initial study, we have explored this topic further and created a fully automated pipeline that further reduces the user input and time needed to obtain results. We used the previously trained model to build a pipeline that takes 3D images as input (instead of 2D) and uses a slice-based approach. Using the eyeball as a reference, the pipeline automatically selects one slice per patient at the orbital level, where the muscle is widest. That slice is then segmented, and the cross-sectional area for each scan is returned as output. We have presented our work at an international conference and the paper from it can be accessed here. Code can be accessed on GitLab.
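As a rough illustration of the last step, the cross-sectional area of a segmented slice can be obtained by counting the muscle pixels and multiplying by the physical area of one pixel. The sketch below is not taken from our pipeline: the function name is hypothetical, and it assumes the in-plane pixel spacing (in mm) is available, for example from the image header.

```python
from typing import Sequence, Tuple


def cross_sectional_area_mm2(
    mask: Sequence[Sequence[int]],
    pixel_spacing_mm: Tuple[float, float],
) -> float:
    """Area of a binary 2D mask in mm^2.

    mask: rows of 0/1 values, where 1 marks a muscle pixel.
    pixel_spacing_mm: (row spacing, column spacing) of one pixel in mm
        (assumed to be known for the scan).
    """
    row_mm, col_mm = pixel_spacing_mm
    if row_mm <= 0 or col_mm <= 0:
        raise ValueError("pixel spacing must be positive")
    muscle_pixels = sum(1 for row in mask for value in row if value)
    return muscle_pixels * row_mm * col_mm


# Toy example (hypothetical values): 4 muscle pixels at 0.5 mm x 0.5 mm each.
example_mask = [[0, 1, 1, 0], [0, 1, 1, 0]]
area = cross_sectional_area_mm2(example_mask, (0.5, 0.5))  # 4 * 0.25 = 1.0 mm^2
```

Scaling by pixel spacing matters because scans acquired at different resolutions would otherwise give pixel counts that are not comparable between patients.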

Our work has clinical implications, as we have shown that it is possible to extract additional value from routinely performed imaging data without adding too much workload. With further validation, our temporalis muscle quantification tool could aid in improving prognostic methods, patient stratification and optimization of individual treatments.