AI For Medicine Study Notes

Part 1 of 3 : AI For Diagnosis

Che-Jui Huang
5 min read · Nov 6, 2022

Background

One of my interests in AI development is the healthcare industry. Therefore, to prepare myself for future opportunities, I decided to take the AI for Medicine Specialization offered by DeepLearning.AI. This specialization consists of three courses: “AI for Medical Diagnosis”, “AI for Medical Prognosis”, and “AI for Medical Treatment”.

Although I had experience implementing a few medical image analysis projects, I had not fully considered the entire scope of building an AI model for healthcare. I cared only about the implementation of the model, not the preparation or evaluation steps. This specialization should help enrich my knowledge of the development cycle of healthcare AI.

This series of articles does not cover the full course contents. Instead, I plan to share my study notes. These notes cover AI/ML concepts that I argue any AI practitioner should consider in their projects. Of course, a few concepts are strictly related to the field of healthcare.


Data Processing: Dealing with Class Imbalance

In week 1, the instructor discusses the application of medical image classification. Building an image classifier is straightforward, yet the instructor reinforces the importance of dealing with imbalanced labels within the dataset. The number of patients with a disease will never exceed the number of healthy individuals; a rare disease is rare precisely because few people experience it. Therefore, with medical datasets, data scientists need to think carefully about how models should be trained.

The simplest method to combat the issue is over/under-sampling. In other words, either duplicating (or augmenting) records of the minority class or removing records of the majority class in the training set. Nonetheless, if you are performing data augmentation, you have to make sure the transformations make sense. Transformations that produce unrealistic images, such as heavy distortion or drastic color changes, should be avoided.
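As a minimal sketch of random oversampling with scikit-learn's `resample` (the DataFrame `train_df` and its `label` column are hypothetical placeholders, not names from the course):

```python
import pandas as pd
from sklearn.utils import resample

def oversample_minority(train_df: pd.DataFrame, label_col: str = "label") -> pd.DataFrame:
    """Randomly duplicate minority-class rows until both classes are balanced."""
    majority = train_df[train_df[label_col] == 0]
    minority = train_df[train_df[label_col] == 1]

    # Sample the minority class with replacement up to the majority count.
    minority_upsampled = resample(
        minority,
        replace=True,
        n_samples=len(majority),
        random_state=42,
    )
    # Shuffle so the duplicated rows are not grouped together.
    return pd.concat([majority, minority_upsampled]).sample(frac=1, random_state=42)
```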

The next approach is to keep all of the data as-is but tell the model to give more weight to the under-represented class while computing the loss. This is called reweighting (a weighted loss). I believe that reweighting is the most feasible method because the model can learn from the entire dataset without the risk of artificially manipulating it.
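Here is a sketch of a weighted binary cross-entropy in NumPy to illustrate the idea; the weights are chosen so that the rarer class counts more, in line with the label-contribution note in the classifier overview below. Variable names are my own, not the course's.

```python
import numpy as np

def weighted_bce(y_true: np.ndarray, y_pred: np.ndarray, epsilon: float = 1e-7) -> float:
    """Binary cross-entropy where each class is weighted by the other class's frequency."""
    freq_pos = y_true.mean()           # fraction of positive labels
    freq_neg = 1.0 - freq_pos          # fraction of negative labels
    w_pos, w_neg = freq_neg, freq_pos  # the rarer class gets the larger weight

    y_pred = np.clip(y_pred, epsilon, 1.0 - epsilon)  # avoid log(0)
    loss = -(w_pos * y_true * np.log(y_pred) + w_neg * (1.0 - y_true) * np.log(1.0 - y_pred))
    return loss.mean()
```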

Building Models:
Components of building an Image Classifier and an Image Segmentation model

High-level overview of building an Image Classifier

- Load the dataset and the corresponding labels.
- Split the dataset into train, validation, and test sets.
- Sanity check for potential data leakage: make sure that each data split contains only unique patients (see the sketch after this list).
- Define a proper data augmentation pipeline and make sure that the transformations are reasonable.
- Capture the training dataset's image statistics (mean and standard deviation) as the basis for image normalization.
- Compute label frequencies as class imbalance ratios and recalculate label contributions. For instance, positive label contribution = frequency of positive * frequency of negative.
- Build the model and create a customized loss function.
- Train the model and evaluate it using a confusion matrix and Grad-CAM.
(Figure: example of a Grad-CAM evaluation)
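To illustrate the data leakage check, a patient-level split keeps all images from one patient inside a single split. Below is a minimal sketch using scikit-learn's `GroupShuffleSplit`; the DataFrame `df` and its `patient_id` column are hypothetical examples.

```python
import pandas as pd
from sklearn.model_selection import GroupShuffleSplit

def patient_level_split(df: pd.DataFrame, group_col: str = "patient_id", test_size: float = 0.2):
    """Split rows so that no patient appears in both the train and test sets."""
    splitter = GroupShuffleSplit(n_splits=1, test_size=test_size, random_state=42)
    train_idx, test_idx = next(splitter.split(df, groups=df[group_col]))
    train_df, test_df = df.iloc[train_idx], df.iloc[test_idx]

    # Sanity check: the two splits must share no patients.
    assert set(train_df[group_col]).isdisjoint(set(test_df[group_col]))
    return train_df, test_df
```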

High-level overview of building an Image Segmentation model

MRI Scanning Directions
— Coronal Plane ==> FRONT ~ BACK
— Sagittal Plane ==> LEFT ~ RIGHT
— Axial Plane ==> TOP ~ BOTTOM

Input as 2D Approach
— Break up 3D volumes into 2D slices (see the sketch below)
— Each slice is passed to the model, and the predictions are combined afterwards
— One disadvantage is losing the depth information (the 3D structure across slices is lost)
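A minimal NumPy sketch of the 2D approach, assuming an MRI volume stored as a (height, width, depth) array; the axis convention is my own assumption.

```python
import numpy as np

def volume_to_slices(volume: np.ndarray) -> list[np.ndarray]:
    """Break a 3D volume of shape (H, W, D) into D separate 2D slices."""
    return [volume[:, :, k] for k in range(volume.shape[-1])]

def combine_slice_predictions(slice_preds: list[np.ndarray]) -> np.ndarray:
    """Stack per-slice 2D predictions back into a 3D volume along the depth axis."""
    return np.stack(slice_preds, axis=-1)
```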

Input as 3D Approach
— Break up 3D volumes into smaller sub-blocks (see the sketch below)
— The full volume can't be passed at once because of limited memory
— Sub-blocks are fed into the model, and the predictions are combined afterwards
— One disadvantage is losing the spatial context between neighboring sub-blocks
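And a sketch of the 3D approach: cutting a volume into fixed-size sub-blocks that fit in memory. The block size is an illustrative choice, not a course requirement, and edges that do not fill a whole block are simply dropped in this simple version.

```python
import numpy as np

def extract_subblocks(volume: np.ndarray, block=(80, 80, 16)) -> list[np.ndarray]:
    """Cut a 3D volume of shape (H, W, D) into non-overlapping sub-blocks."""
    bh, bw, bd = block
    H, W, D = volume.shape
    blocks = []
    for i in range(0, H - bh + 1, bh):
        for j in range(0, W - bw + 1, bw):
            for k in range(0, D - bd + 1, bd):
                blocks.append(volume[i:i + bh, j:j + bw, k:k + bd])
    return blocks
```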

- Load the dataset and the corresponding labels.
- Decide on an input structure for the MRI scans (2D slices or 3D sub-blocks).
- Standardize the 2D or 3D inputs.
- Define a segmentation model (U-Net) and create a customized Dice Loss (see the sketch after this list).
Soft Dice Loss = 1 - (2 × Σ(p × q) + ε) / (Σ p² + Σ q² + ε), where p is the predicted probability, q is the actual label, and ε prevents a division-by-zero error.
- Train the model, evaluate it using confusion matrix statistics, and create an animated segmentation visualization.
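A minimal NumPy sketch of the soft Dice loss written above; the course implements it with Keras/TensorFlow tensors, so treat this as an illustration of the formula rather than the exact course code.

```python
import numpy as np

def soft_dice_loss(y_pred: np.ndarray, y_true: np.ndarray, epsilon: float = 1e-7) -> float:
    """Soft Dice loss: 1 - (2*sum(p*q) + eps) / (sum(p^2) + sum(q^2) + eps)."""
    numerator = 2.0 * np.sum(y_pred * y_true) + epsilon
    denominator = np.sum(y_pred ** 2) + np.sum(y_true ** 2) + epsilon
    return 1.0 - numerator / denominator
```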

Post Model Evaluation
Measuring statistics of your model using a Confusion Matrix

A confusion matrix is probably the most direct way to evaluate an AI model. However, what exactly does each of the terms derived from the confusion matrix mean?

Accuracy
Among all cases, what percentage did my model predict correctly, counting both the positive and the negative cases?

Recall = Sensitivity
Among all actually positive cases, what percentage did my model retrieve as positive?

Specificity = True Negative Rate
Among all actually negative cases, what percentage did my model retrieve as negative?

Precision = Positive Predictive Value (PPV)
Among all cases predicted as positive, what percentage are actually positive?

Negative Predictive Value (NPV)
Among all cases predicted as negative, what percentage are actually negative?

I argue that understanding PPV and NPV is vital because each expresses a conditional probability given your model's prediction. For instance:
PPV = Probability(Disease | Predict Positive)
NPV = Probability(Normal | Predict Negative)
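To make these definitions concrete, here is a small sketch that derives all five metrics from raw counts of true/false positives and negatives; the function name and the example counts are my own.

```python
def confusion_matrix_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute accuracy, sensitivity, specificity, PPV, and NPV from confusion matrix counts."""
    return {
        "accuracy":    (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),   # recall: actual positives correctly retrieved
        "specificity": tn / (tn + fp),   # actual negatives correctly retrieved
        "ppv":         tp / (tp + fp),   # P(disease | predicted positive)
        "npv":         tn / (tn + fn),   # P(normal  | predicted negative)
    }

# Example with made-up counts: 30 TP, 10 FP, 50 TN, 10 FN
print(confusion_matrix_metrics(tp=30, fp=10, tn=50, fn=10))
```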

Conclusion

That's it for today. I am going to continue with the second and third courses in the specialization. I hope my study notes remind AI practitioners of the importance of data preparation and model evaluation.
