November 10, 2020
Deep Learning in Radiation Oncology: A Guest Post by Brian Anderson
My goal with this guest blog on deep learning is to provide a brief introduction on what deep learning is and how it applies to radiation oncology. In doing so, I will explain how deep learning differs from machine learning, two terms that have often been used interchangeably in our field. I’ll argue why I think radiation oncology is ripe for applications in deep learning, and then I’ll share a few tools that I hope will be useful for those of you who want to take the dive into learning more.

Deep Learning vs Machine Learning

Since COVID-19 has become such a large part of our daily lives, it seems fitting to use it as the running example for explaining the basics of machine learning and deep learning.

Machine Learning

Let us assume that we are trying to create a machine learning algorithm to identify if a patient has COVID-19 based on their CT images:

Figure 1: Coronal slice of COVID-19 patient. Left, tissue window/level. Right, lung window/level.

A hypothetical machine learning protocol might involve the following sequence of events:

1. Changing the image window/level to focus on the values within the lung region
2. Identifying the lung regions of interest with seed points in the low HU regions
3. Extracting image features within the lungs

With the lungs roughly identified, we would then define certain features that distinguish COVID-19 positive lungs from negative lungs. For machine learning, this means being clever and trying to find or create features that separate a positive result from a control. One potential approach is to find features that capture the ground-glass appearance seen in the images. Alternatively, you could take a broader approach and extract a large number of common features.
Finally, with your features, you can set up a decision tree or an ensemble model. The final deliverable of such a model is a set of rules or weights over the features that the model uses to predict whether a given patient is positive or negative. The entire workflow can be seen in Figure 2.

Figure 2: Overall workflow of machine learning, pre-processing leads to feature extraction and modeling.
In machine learning you need to manually craft features (or take features that others have created) in order to train the model. Predictions made from machine learning models are often easier to interpret; the model can ‘explain’ why it made a certain decision based on predefined, collected features.
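
To make the handcrafted workflow a little more concrete, here is a minimal sketch of the three steps listed earlier, run on a synthetic slice rather than real patient data. Everything in it is an assumption for illustration: the function names apply_window and lung_features are hypothetical, a simple HU threshold stands in for seed-point region growing, and the window and threshold values are illustrative rather than clinically validated.

```python
import numpy as np


def apply_window(hu, level, width):
    """Clip HU values to the window [level - width/2, level + width/2] and rescale to [0, 1]."""
    low = level - width / 2.0
    high = level + width / 2.0
    return (np.clip(hu, low, high) - low) / (high - low)


def lung_features(hu, lung_mask):
    """Compute a few handcrafted first-order features inside the lung mask."""
    values = hu[lung_mask]
    return {
        "mean_hu": float(values.mean()),
        "std_hu": float(values.std()),
        # Fraction of lung voxels denser than -700 HU (illustrative cut-off only)
        "fraction_above_minus_700": float((values > -700).mean()),
    }


# Synthetic "coronal slice": soft tissue background with two air-filled lung regions
slice_hu = np.full((64, 64), 40.0)
slice_hu[16:48, 8:28] = -850.0
slice_hu[16:48, 36:56] = -850.0
slice_hu[30:40, 40:50] = -500.0  # denser patch loosely mimicking a ground-glass region

# Step 1: window/level the image to focus on lung values
lung_window = apply_window(slice_hu, level=-600, width=1500)

# Step 2: crude lung identification (a threshold, in place of seed points)
lung_mask = slice_hu < -400

# Step 3: extract features within the lungs
features = lung_features(slice_hu, lung_mask)
print(features)
```

The resulting feature dictionary is what would be handed to a classifier such as a decision tree; the classifier itself is left out to keep the sketch free of extra dependencies.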

Deep Learning

Let’s look at this problem from the perspective of deep learning now. Let’s imagine using a convolutional neural network (don’t worry, I’ll explain this more shortly) to tackle this same problem. Some of the early steps are much the same: pre-processing images with some form of window/leveling is probably a good idea, but the major changes occur in the feature extraction and training steps. For feature extraction, there is no need to define what features you want to extract. Instead, in deep learning, we create an architecture that lets the model decide for itself.

What do I mean by this?

When you train a convolutional network, you are essentially telling the computer: Here is the image, and here is the outcome. Figure it out. Now, this doesn’t mean that there isn’t still a large amount of work to be done, but the decision effort of feature extraction has been replaced with identifying the best way to present the data to the architecture and tuning hyper-parameters (determining architecture, learning rates, etc.).
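
As a toy illustration of "here is the image, here is the outcome, figure it out", the sketch below learns a single 3x3 convolution kernel by plain gradient descent, using only NumPy. It is not a full convolutional network and not how a real model would be trained in practice; the data are random images, the "outcome" is the response of a Sobel edge filter that the learner is never shown, and the helper names conv2d_valid and kernel_gradient are hypothetical.

```python
import numpy as np


def conv2d_valid(images, kernel):
    """Valid-mode 2D cross-correlation (what deep learning libraries call convolution)."""
    kh, kw = kernel.shape
    out_h = images.shape[-2] - kh + 1
    out_w = images.shape[-1] - kw + 1
    out = np.zeros(images.shape[:-2] + (out_h, out_w))
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * images[..., i:i + out_h, j:j + out_w]
    return out


def kernel_gradient(images, error, kernel_shape):
    """Gradient of 0.5 * mean(error**2) with respect to each kernel weight."""
    kh, kw = kernel_shape
    out_h, out_w = error.shape[-2:]
    grad = np.zeros(kernel_shape)
    for i in range(kh):
        for j in range(kw):
            grad[i, j] = np.mean(error * images[..., i:i + out_h, j:j + out_w])
    return grad


rng = np.random.default_rng(0)
images = rng.normal(size=(32, 8, 8))  # stand-in "images"

# The "outcome": responses of a filter that is hidden from the learner
true_kernel = np.array([[1.0, 0.0, -1.0],
                        [2.0, 0.0, -2.0],
                        [1.0, 0.0, -1.0]])
targets = conv2d_valid(images, true_kernel)

# Start from an all-zero kernel and let gradient descent find the feature
kernel = np.zeros((3, 3))
learning_rate = 0.5
for _ in range(200):
    error = conv2d_valid(images, kernel) - targets
    kernel -= learning_rate * kernel_gradient(images, error, kernel.shape)

print(np.round(kernel, 3))
```

Nobody told the model to look for edges; the kernel weights were recovered from image and outcome pairs alone. A real convolutional network stacks many such learned kernels with non-linearities in between.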

Why Our Field is Ripe for Deep Learning

Medical physics sits in a unique position with respect to deep learning, and while our field might not train us to be on the cutting edge of this particular technology, we often have some form of experience in coding (via Matlab, Python) and are intimately familiar with the necessities of the clinic. I do not believe that Medical Physicists will necessarily create new forms of deep learning to forward the field, but instead believe that we can take current deep learning techniques and apply them to unique problems to benefit our patients. Deep learning is becoming increasingly friendly to newcomers, and with a few tools to help along the way, anyone should be able to get started.

Getting your hands dirty

Jumping into deep learning might seem intimidating, but I hope that the following guides will help alleviate some of the frustrating first steps.

NOTE: If you have no experience in coding, I would highly recommend the book Learn Python the Hard Way. This is the book I learned from, and I found the author to be engaging and entertaining.

If you’re just beginning deep learning, I would also recommend reading Deep Learning with Python by Francois Chollet. The first edition follows old Tensorflow 1.x syntax, but reads easily and provides intuitive examples. The book is relatively short, and with many of us spending more time at home during COVID-19, there should be plenty of opportunity to read.

Data Preparation

Data preparation can be a nightmare. No matter where the data comes from, there is likely to be some form of issue that needs addressing. I would argue that I spend around 70-80% of my time just getting the data in the right format for training a model. With that in mind, what are some of the steps that can make this process easier?

Lots of time is spent on semantic segmentation (segmenting the liver, etc.), and so many of the tools that I use and write are with this task in mind.

Let us assume that you want to create a fully convolutional neural network to segment the liver. You have received a folder of 100 patients, all of whom have manual contours of the liver. The overall goal now is to turn those DICOM images and RT structures into images and ground truth masks to feed into a network (probably some variant of a UNet).

Figure 3: Overview of deep learning workflow: take folders of patients, convert images and mask, train the model.
For this first part, I’ve made a small tool which helps me to curate the data. The base information for it is located on my Github account. As the name suggests, this code was created for converting dicom and RT structures into images and masks.
If you do not want to go through Github, you can install my package the same way you would install any other Python package, with pip install DicomRTTool.

What are my ROI names?

One thing that I always, always run into is variable ROI names. If a contour is supposed to be called ‘Liver’, I’d bet there are half a dozen variants like ‘Liver_Best’, ‘Liver_Clinical’, ‘Liver_BMA_10_20_2020’, etc. I’d highly recommend running some relatively simple checks first: list out all of the ROIs in your folders, and maybe do some basic plotting of ROI volume for each case to see if there are any outliers (your liver shouldn’t be 10cc).
To acquire a quick list of the ROIs present within a folder of patients, you can run the following sequence:

# Import the module
from DicomRTTool import DicomReaderWriter

# Only read the RT structures: get_images_mask=False skips loading the images,
# so the reader just reports which ROIs are present
Dicom_reader = DicomReaderWriter(get_images_mask=False)
# Provide a path to the images
path = 'C:\\users\\brianmanderson\\Patients\\'
# Tell the reader to iterate down the folders
Dicom_reader.down_folder(path)
# Print the ROIs present
for roi in Dicom_reader.all_rois:
    print(roi)
This loop prints each unique ROI name. You can then make an associations dictionary like so:
associations = {'Liver_BMA': 'Liver'}
This will tell the reader that if an ROI named ‘Liver_BMA’ shows up, it should be treated as if it were named ‘Liver’.
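
With half a dozen variants per structure, writing the dictionary by hand gets tedious. Here is a minimal sketch of building it from the printed ROI list instead. The helper build_associations is hypothetical (it is not part of DicomRTTool), and the prefix rule it uses is only an assumption about how the variants are named.

```python
def build_associations(roi_names, canonical):
    """Map every ROI name that starts with `canonical` (case-insensitive) to `canonical`."""
    prefix = canonical.lower()
    associations = {}
    for name in roi_names:
        # Skip the canonical name itself; it needs no association
        if name != canonical and name.lower().startswith(prefix):
            associations[name] = canonical
    return associations


# Example ROI names, as they might be printed by the loop above
observed = ["Liver", "Liver_Best", "liver_clinical",
            "Liver_BMA_10_20_2020", "GTV", "Kidney_L"]

associations = build_associations(observed, "Liver")
for variant, target in associations.items():
    print(f"{variant!r} -> {target!r}")
```

A prefix match is deliberately blunt: it would also capture something like a tumor contour whose name happens to start with ‘Liver’, so the generated dictionary should be reviewed by eye before it is passed to the reader.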
Now, we can run through and pull out an example image and mask, and write all of them as .nii.gz files in an anonymized fashion with an excel sheet ‘key’ for later.
Dicom_reader = DicomReaderWriter(get_images_mask=True, Contour_Names=['Liver'], associations=associations)
path = 'C:\\users\\brianmanderson\\Patients\\Patient_1\\CT_1\\'
Dicom_reader.Make_Contour_From_directory(path)  # Make a contour from one path
# Now that it is loaded, you can view the image and mask
image = Dicom_reader.ArrayDicom
mask = Dicom_reader.mask
# Likewise, if you prefer working in .nii format, you can view the image and mask handles with
Image_handle = Dicom_reader.dicom_handle
Mask_handle = Dicom_reader.annotation_handle
# A parallel method of writing out .nii.gz files for images and masks exists; simply call
# (path_to_export and path_to_file are placeholders for your own output folder and Excel file)
Dicom_reader.write_parallel(out_path=path_to_export, excel_file=path_to_file)
Individual .nii.gz files for each image and mask are written to path_to_export, and excel_file is the spreadsheet ‘key’ that lists the MRN behind each anonymized patient.
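
Once masks are available as arrays, the ROI volume check suggested earlier (your liver shouldn’t be 10cc) takes only a few lines. This is a minimal sketch under stated assumptions: the helpers roi_volume_cc and flag_outliers are hypothetical, the mask is synthetic, the voxel spacing would normally come from the image header, and the 500 to 3500 cc bounds are illustrative rather than clinical limits.

```python
import numpy as np


def roi_volume_cc(mask, spacing_mm):
    """Volume of a binary mask in cubic centimetres, given voxel spacing in mm."""
    voxel_volume_mm3 = float(np.prod(spacing_mm))
    return float(np.count_nonzero(mask)) * voxel_volume_mm3 / 1000.0


def flag_outliers(volumes_cc, low_cc, high_cc):
    """Return the patient IDs whose ROI volume falls outside [low_cc, high_cc]."""
    return sorted(pid for pid, volume in volumes_cc.items()
                  if not low_cc <= volume <= high_cc)


# Synthetic liver-sized mask, ordered (slices, rows, columns)
mask = np.zeros((64, 200, 200), dtype=np.uint8)
mask[10:50, 40:160, 50:175] = 1
spacing_mm = (2.5, 1.0, 1.0)  # slice thickness, row spacing, column spacing

volumes_cc = {
    "Patient_1": roi_volume_cc(mask, spacing_mm),
    "Patient_2": 9.5,  # made-up value standing in for a suspicious contour
}
print(volumes_cc)
print(flag_outliers(volumes_cc, low_cc=500.0, high_cc=3500.0))
```

Anything that gets flagged is worth opening in a viewer: it is usually a partial contour, a mislabeled ROI, or an association that matched the wrong structure.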

Going Forward

Now that you have well-processed data, you can load them into the deep learning architecture of your choice. For Tensorflow, you could benefit by performing any pre-processing steps and converting them into Tensorflow .tfrecord files. Please see the repo for Make_TFRecord_Class here and for loading the .tfrecords with TFREcord_to_Dataset_Generator here.
I hope you enjoyed reading this, and I apologize for anything that I might have overlooked or issues that I don’t see, but that I am sure exist. Cheers!

written by Brian Anderson

Brian Anderson is a PhD Candidate at MD Anderson Cancer Center with plans to graduate in early 2021 and pursue a career as a therapy physicist. His work largely focuses on improving treatments in interventional radiology using deep learning, and he has been invited to give several talks and workshops on getting started in AI. Brian is a hobbyist rock climber and distance biker, when the Houston heat allows.

1 Comment

  1. Stuart Swerdloff

    Great read! I see you called out how important it is to address curation of the data. I’d encourage people to take a look at #OnkoDICOM (https://github.com/didymo/OnkoDICOM.git). Curating data and exporting radiomics based on the curated contours is its “raison d’etre”. Free, multi-platform (including built executables), and a great bunch of students doing the heavy lifting along with contributors from the clinical and “industry but volunteering” community.
    P.S. Full Disclosure, I’ve been contributing some of my time to OnkoDICOM while on Sabbatical…

    There is some overlap and some complementary activity between the work here (which includes semi-automating renaming to standard ROI names/labels) and what’s available in OnkoDICOM.
