AI For Medicine Study Notes

Part 2 of 3 : AI For Medical Prognosis

Nov 15, 2022

Background

This series of articles does not cover the full course contents. The articles reflect my personal opinion on what was important while taking the courses. However, the study notes do contain crucial concepts that I argue any AI practitioner should consider in their projects. In this article, you will see a few concepts related to predicting the risk of events.

These notes follow the AI for Medicine Specialization offered by DeepLearning.AI. The specialization consists of three courses:
“AI for Medical Diagnosis”
(Part 1 of 3 : AI For Diagnosis link to the previous study notes)
“AI for Medical Prognosis”
“AI for Medical Treatment”.

Measuring a Risk Estimator

How should we quantify risk?
How can data become the input of an ML model that helps identify patients with higher or lower risk?

These two questions are relevant to any business situation and should not be limited to healthcare scenarios. To answer the questions, we should first investigate what features could become a quantitative input for models. Next, we should find a metric that evaluates the performance of our models.

Quantitative Features & Normal Distribution
I define risk as the probability of an entity facing a disadvantage or loss. In healthcare, risk can be the probability that a patient develops a disease in the next five years. In business, risk can be the probability that profit changes as a result of a corporate decision. From the lectures and the examples, I realized that risk assessment is challenging but full of opportunity. Being able to measure risk correctly often makes decisions more rational and confident. So what are some good features to model? Here are some questions to ask.

Is there an existing framework?
Does the existing framework have features that you can easily acquire?
If there is no ready-to-use framework, can you look back at historical data and possibly come up with relevant features?

These are the questions that help me think about modeling risk. I argue that inventing something new is usually not the best solution. Instead, I highly encourage you to research similar models that have proved to be reliable. However, there are times when we need our own customized features. In such cases, I suggest you come up with features that have some correlation with the risk you are modeling. For instance, aging correlates with a higher risk of infection, yet growing taller may have no relation. At the end of the day, quantitative features should represent the goal of your measurement.

One last thing that I would like to mention is the importance of standardized, roughly normally distributed features. Reshaping a skewed feature toward a bell-shaped curve and then standardizing it to zero mean and unit variance reduces the risk of training a biased model. If features are left on very different scales, some algorithms tend to favor the features with the largest scale.
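As a minimal sketch of the scaling point, the snippet below log-transforms a right-skewed feature and then standardizes it to zero mean and unit variance. The feature values and the helper name `standardize` are hypothetical, and a log transform is only one common way to pull a skewed feature toward a bell shape.

```python
import math
import statistics


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


# Hypothetical right-skewed feature (made-up numbers, e.g. a lab measurement).
raw = [1.0, 2.0, 2.5, 3.0, 4.0, 50.0]

# The log transform compresses the long right tail before standardizing.
logged = [math.log(v) for v in raw]
scaled = standardize(logged)

print([round(v, 2) for v in scaled])
```

In a real project the mean and standard deviation would normally be computed on the training set only and then reused for the test set, so that no information leaks from test to train.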

Evaluation Metric: The C-Index
How should we measure the performance?
Here are some possible solutions. One is to use traditional metrics such as accuracy, precision, and recall, but notice that these metrics work at the record level: they tell you how well your model predicts each individual record. The next is to use ML algorithms that produce probabilistic predictions. The last is to use the C-Index.

The C-Index answers the question of how well your model can identify whether patient A is riskier than patient B, or vice versa. It therefore takes two inputs: the ground-truth outcomes and the predicted risk scores. A pair of patients is permissible when the two ground-truth outcomes differ. A permissible pair is concordant when the patient with the positive label also has the higher risk score. The C-Index is the share of permissible pairs that are concordant. See the formula below.

The C-Index tells us the probability that score A > score B given outcome A > outcome B:

P( risk score(A) > risk score(B) | Label-A > Label-B )

Random model score = 0.5
Perfect model score = 1.0
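The definition above can be sketched in a few lines of Python. This is an illustrative implementation, not the course's code: the function name `c_index` and the sample data are made up, and giving half credit to pairs with tied risk scores is an assumption (a common convention) that the notes above do not spell out.

```python
from itertools import combinations


def c_index(labels, scores):
    """Share of permissible pairs that are concordant (risk ties count half)."""
    permissible = 0
    credit = 0.0
    for (y_a, s_a), (y_b, s_b) in combinations(zip(labels, scores), 2):
        if y_a == y_b:
            continue  # same outcome: not a permissible pair
        permissible += 1
        # Orient the pair so s_pos belongs to the patient with the positive label.
        s_pos, s_neg = (s_a, s_b) if y_a > y_b else (s_b, s_a)
        if s_pos > s_neg:
            credit += 1.0  # concordant pair
        elif s_pos == s_neg:
            credit += 0.5  # assumption: a risk tie earns half credit
    return credit / permissible


# Hypothetical outcomes (1 = event) and predicted risk scores.
labels = [1, 0, 1, 0]
scores = [0.9, 0.3, 0.4, 0.6]
print(c_index(labels, scores))  # 3 of the 4 permissible pairs are concordant
```

The function raises a ZeroDivisionError when every patient has the same outcome, because no permissible pair exists in that case.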

Survival Model with Kaplan Meier Estimate

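Unlike a risk estimator, which gives the probability of a binary event, a survival model gives the probability that a person continues to live past a given time. For instance, how likely is a patient with stage-three cancer to still be alive after 5 years? The survival model answers the question of WHEN. To estimate it, we can use the Kaplan Meier Estimator, which is built on conditional probability: at each observed event time t_i it takes the fraction of patients still at risk who survive that time, and multiplies those fractions together.

S(t) = product over event times t_i <= t of ( 1 - d_i / n_i )

Here d_i is the number of events at time t_i and n_i is the number of patients still at risk just before t_i. Patients who are censored (their follow-up ends before an event is observed) stay in n_i up to their last observed time.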
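A small worked example may make the Kaplan Meier estimate more concrete. The sketch below multiplies, at each event time, the fraction of at-risk patients who survive it. The function name `kaplan_meier` and the follow-up data are hypothetical, and it assumes that a patient censored at time t is still counted as at risk at t.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time, in increasing order."""
    survival = 1.0
    curve = []
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in event_times:
        # Patients whose follow-up reaches time t are still at risk.
        at_risk = sum(1 for ti in times if ti >= t)
        died = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        # Conditional probability of surviving t, given survival up to t.
        survival *= 1 - died / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical follow-up in months; event 1 = observed, 0 = censored.
times = [5, 8, 8, 12, 15, 20]
events = [1, 1, 0, 1, 0, 1]
for t, s in kaplan_meier(times, events):
    print(t, round(s, 3))
```

With this data the estimate drops to 5/6 at month 5, 4/6 at month 8, 4/9 at month 12, and 0 at month 20, when the last remaining patient has an event.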

Harrell’s C-Index

Harrell’s C-Index is a modified version of the C-Index for survival data, where some patients are censored (their follow-up ends before an event is observed). It preserves all the components of the C-Index but uses a different definition of permissible pairs, listed below.

Permissible Pairs
If both patients have an event
If one patient has an event and the other patient is censored at a later time T
If one patient has an event and the other patient is censored at the same time T
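The three rules above can be sketched as follows. This is a hedged illustration rather than the course's implementation: the function name `harrell_c_index` and the data are hypothetical, and two choices are assumptions of this sketch, namely that tied risk scores earn half credit and that a pair with two events at the same time and different scores earns none.

```python
from itertools import combinations


def harrell_c_index(times, events, scores):
    """Harrell's C-Index; events[i] is 1 if observed, 0 if censored."""
    permissible = 0
    credit = 0.0
    patients = list(zip(times, events, scores))
    for a, b in combinations(patients, 2):
        (t_a, e_a, s_a), (t_b, e_b, s_b) = a, b
        if e_a == 1 and e_b == 1:
            # Rule 1: both have an event; the earlier event is the worse outcome.
            if t_a == t_b:
                worse, better = None, None  # same time: no ordering
            else:
                worse, better = (a, b) if t_a < t_b else (b, a)
        elif e_a != e_b:
            # Rules 2 and 3: censored time must be >= the event time.
            event_p, cens_p = (a, b) if e_a == 1 else (b, a)
            if cens_p[0] < event_p[0]:
                continue  # censored too early to compare
            worse, better = event_p, cens_p
        else:
            continue  # both censored: never permissible
        permissible += 1
        if s_a == s_b:
            credit += 0.5  # assumption: a risk tie earns half credit
        elif worse is not None and worse[2] > better[2]:
            credit += 1.0  # worse outcome received the higher risk score
    return credit / permissible


# Hypothetical data: follow-up time, event flag, predicted risk score.
times = [5, 8, 10, 12]
events = [1, 0, 1, 1]
scores = [0.9, 0.4, 0.3, 0.6]
print(harrell_c_index(times, events, scores))
```

In this example the patient censored at time 8 can only be paired with the patient who had an event at time 5. The pairs with events at times 10 and 12 are skipped because the censoring happened before those events.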

Nelson Aalen Estimator (Cumulative Hazard Function)

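The Nelson Aalen Estimator is another form of risk estimator. Unlike the Kaplan Meier Estimate, which tracks survival probability, the Nelson Aalen Estimate uses the cumulative hazard function to answer the question “What is a patient’s accumulated risk up to time T?” It sums, over each event time t_i up to T, the number of events d_i divided by the number of patients n_i still at risk:

H(T) = sum over event times t_i <= T of ( d_i / n_i )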
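As a rough sketch of the cumulative hazard, the snippet below adds up events divided by patients at risk at each event time. The function name `nelson_aalen` and the follow-up data are hypothetical, and a patient censored at time t is assumed to still be at risk at t.

```python
def nelson_aalen(times, events):
    """Return [(t, H(t))] at each distinct event time, in increasing order."""
    hazard = 0.0
    curve = []
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in event_times:
        # Patients whose follow-up reaches time t are still at risk.
        at_risk = sum(1 for ti in times if ti >= t)
        died = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        # Accumulate the hazard increment for this event time.
        hazard += died / at_risk
        curve.append((t, hazard))
    return curve


# Hypothetical follow-up in months; event 1 = observed, 0 = censored.
times = [5, 8, 8, 12, 15, 20]
events = [1, 1, 0, 1, 0, 1]
for t, h in nelson_aalen(times, events):
    print(t, round(h, 3))
```

Unlike a survival probability, the cumulative hazard is not bounded by 1. Here it reaches 1/6 + 1/5 + 1/3 + 1 = 1.7 at month 20.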

Conclusion

This article is structured differently from most of my other articles. The series serves as study notes on what I found important while taking the AI for Medicine specialization. In the second course, AI for Medical Prognosis, I was introduced to models and evaluation metrics that aim to answer the following questions:

Survival Function and C-Index
What is the probability of a patient’s survival past the given time T?
How good is my model at ranking two patients so that their predicted probabilities agree with their outcomes?
Hazard Function and Cox Proportional Hazard
What is a patient’s immediate risk of an event at a given time T?
Which patient is at more risk, and how does each feature contribute to the hazard function?
Nelson Aalen Estimator and Harrell’s C-Index
What is the cumulative risk of a patient up to the time T?
How good is my model at ranking two patients so that their risk scores agree with their times to an event?