The basics of ML model interpretability

Alon Lev
June 13, 2022

A quote often attributed to Albert Einstein goes, “If you can’t explain it simply, you don’t understand it well enough.” In machine learning, however, a model’s complexity can make it very difficult (but, crucially, not impossible) for people to understand how it arrived at a prediction. Such models are known as “black box” models: they produce valuable outputs, but humans may not be able to understand how those outputs were reached.

Machine learning can be interpretable, though. This means that it’s possible to build models that people, including laypeople, can understand and trust. As ML systems continue to be developed at a rapid pace and transform all industries, including those that are heavily regulated such as healthcare and finance, we are beginning to rely on ML model interpretability more and more so that we can create transparency and achieve a better understanding of a model’s results. 

What is ML model interpretability?

Let’s begin with the definition of interpretability in the context of machine learning. In the simplest of terms, ML model interpretability is a measure of how easily a human being can understand how the model arrived at its decision or prediction. In other words, a person should be able to tell what caused a specific decision.

With ML model interpretability, a human should be able to understand:

  • How a model works
  • The factors that go into its predictions
  • The process that leads to these predictions

The clearer a human’s understanding of this, the more interpretable a machine learning model is. 

A closely related term is explainability. The goal is to create a model that a regular person can understand and, as a result, explain. If you can understand and explain in simple terms how and why a model works, then that model is interpretable.

What can make models difficult to interpret? 

You may think it sounds strange to say that machine learning models can be difficult to interpret, especially since it’s a technology that has been developed by humans. However, the logic that underpins ML isn’t always understandable by default, and as we use the technology for increasingly complex applications and large datasets, our ability to understand and explain results decreases even further.

When ML teams build their models, they essentially arrive at an algorithm through many small iterations, until the algorithm captures the desired pattern. This method of development can easily lead to a black-box model, where ML teams provide inputs and let the model perform complex calculations to arrive at a decision. This means that we won’t know crucial things, such as which features and inputs the model deems important or how it arrives at its decisions.

Similarly, an ML model might be trained on data that contains biases, such as prejudice, stereotypes, and other societal biases, that are baked into the dataset without our knowing about them. Put these factors together, and the result is a machine learning model that, while accurate, operates in a way that we don’t understand.

How to make models more interpretable and transparent

Fortunately, there are ways to make models more interpretable and transparent. Improving a model’s interpretability makes it easier to find and fix errors, which can improve results and accuracy and, by building trust among the target user base, lead to wider adoption.

One way to make ML models more transparent is by using explainable AI (XAI). This is a framework that enables you to interpret how your models work and understand the results, and it’s backed by tools that make it easy to dive into a model’s behavior. By doing so, you can debug it, improve its performance, and explain predictions and outputs.

Explainable AI tools can shed light on how much individual variables contribute to a model’s prediction and expose features in the algorithm that are given more weight in arriving at a decision. 

Examples of these XAI tools include:

  • Google’s What-If Tool, which allows users to visualize how different data points affect predictions of trained TensorFlow models.

  • Microsoft’s InterpretML, which is a toolkit that also helps users to visualize and explain predictions.
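
To make the idea of a variable’s contribution more concrete, here is a minimal sketch of permutation importance, one model-agnostic way to estimate it: shuffle a single feature’s values and measure how much the model’s accuracy drops. This is a generic technique written with the Python standard library only, not a reproduction of what either tool above does. The `model` function, the toy data, and the feature names are all hypothetical.

```python
import random

# Hypothetical "black box": approves (1) when income is above 50.
# It ignores the second feature entirely.
def model(row):
    income, shoe_size = row
    return 1 if income > 50 else 0

def accuracy(predict, rows, labels):
    correct = sum(1 for r, y in zip(rows, labels) if predict(r) == y)
    return correct / len(rows)

def permutation_importance(predict, rows, labels, feature_idx, n_repeats=20, seed=0):
    """Average drop in accuracy when one feature column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    drops = []
    for _ in range(n_repeats):
        # Shuffle only the chosen column, leaving the other features in place.
        column = [r[feature_idx] for r in rows]
        rng.shuffle(column)
        shuffled = [
            r[:feature_idx] + (v,) + r[feature_idx + 1:]
            for r, v in zip(rows, column)
        ]
        drops.append(baseline - accuracy(predict, shuffled, labels))
    return sum(drops) / n_repeats

# Toy data (made up): (income in thousands, shoe size)
rows = [(20, 38), (35, 44), (48, 41), (55, 39), (70, 45), (90, 42), (30, 40), (65, 43)]
labels = [model(r) for r in rows]  # labels match the model by construction

importance_income = permutation_importance(model, rows, labels, feature_idx=0)
importance_shoe = permutation_importance(model, rows, labels, feature_idx=1)
print(f"income: {importance_income:.3f}, shoe size: {importance_shoe:.3f}")
```

Shuffling a feature the model ignores leaves its accuracy unchanged, so that feature’s importance comes out as zero, while shuffling a feature the model relies on makes accuracy fall.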

Let’s say, for instance, that you’ve developed a machine learning model that can assess the creditworthiness of a loan applicant. An XAI report can be used to tell you how much weight their credit score was given in comparison to other factors such as credit card percent utilization or debt-to-income ratio.
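
For a simple linear scoring model, a report like this can be little more than a list of each feature’s contribution (weight times value) to the final score. The sketch below assumes such a model purely for illustration: the weights, the bias, the threshold, and the applicant’s values are invented and don’t come from any real credit model.

```python
# Hypothetical weights for a linear credit-scoring model (illustrative only).
WEIGHTS = {
    "credit_score": 0.004,      # a higher score raises the result
    "card_utilization": -1.5,   # fraction of the credit limit in use
    "debt_to_income": -2.0,     # monthly debt divided by monthly income
}
BIAS = -1.0
THRESHOLD = 0.0  # approve when the score is above this value

def explain(applicant):
    """Return the score and each feature's additive contribution to it."""
    contributions = {name: WEIGHTS[name] * applicant[name] for name in WEIGHTS}
    score = BIAS + sum(contributions.values())
    return score, contributions

applicant = {"credit_score": 640, "card_utilization": 0.8, "debt_to_income": 0.5}
score, contributions = explain(applicant)
decision = "approved" if score > THRESHOLD else "denied"

print(f"decision: {decision} (score {score:.2f})")
# List the contributions from most negative to most positive.
for name, value in sorted(contributions.items(), key=lambda kv: kv[1]):
    print(f"  {name:>18}: {value:+.2f}")
```

Because the score is a plain sum, each line of the report shows exactly how much a factor pushed the decision up or down. More complex models need dedicated attribution methods to produce a comparable breakdown.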

If this model is interpretable, you will have no problem explaining to the applicant why their application was denied. As we alluded to earlier, interpretability is extremely important in heavily regulated industries such as finance or healthcare, or in applications where a user’s personal data is being used. In the same vein, if the applicant should have been approved but wasn’t, you will be able to isolate the part of the model that caused the rejection and adjust it accordingly.

The benefits of interpretable ML 

It’s increasingly the case that knowing what was predicted is not enough; there needs to be transparency into how and why the model made a prediction. The value of interpretability grows with the impact that predictions have on the end user. It also depends on the data the model uses to make those predictions: personal user information, for example, usually comes with a greater need for interpretability because bias can be introduced unknowingly.

Though interpretability might not be so important for a system that predicts customer churn, it’s a must-have for models that are responsible for making critical decisions. In healthcare, for example, doctors and nurses must be able to not only rely on the predictions made by the algorithm but also understand them well enough to explain to the patient why certain decisions are being made. If they can’t, this can lead to distrust in the system.

Some of the main benefits of interpretability aside from trust include:

  • Fairness: If we can check that a model’s predictions are unbiased, we can help prevent discrimination against underrepresented groups.

  • Robustness: We must be confident that a model works in every setting, and that small changes in input don’t cause unexpected changes in output.

  • Privacy: If we can understand the information that a model uses, then we can stop it from accessing sensitive information where necessary.

  • Causality: We need to be sure that the model only considers causal relationships and doesn’t pick up false correlations that skew results.
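
The robustness point lends itself to a quick check: nudge each input slightly and see how far the prediction moves. The sketch below is one simple way to do that, assuming a model that takes a list of numbers and returns a number. Both example models and the `max_output_change` helper are hypothetical and exist only for this illustration.

```python
def max_output_change(predict, x, epsilon=0.01):
    """Largest change in the prediction when any single feature moves by +/- epsilon."""
    base = predict(x)
    worst = 0.0
    for i in range(len(x)):
        for delta in (-epsilon, epsilon):
            nudged = list(x)
            nudged[i] += delta
            worst = max(worst, abs(predict(nudged) - base))
    return worst

# Hypothetical models (invented for illustration).
def smooth_model(x):
    # Output changes gradually with the inputs.
    return 0.3 * x[0] + 0.7 * x[1]

def brittle_model(x):
    # Hard threshold: output jumps from 0 to 1 when x[0] reaches 0.5.
    return 1.0 if x[0] >= 0.5 else 0.0

point = [0.5, 0.2]
smooth_change = max_output_change(smooth_model, point)
brittle_change = max_output_change(brittle_model, point)
print(f"smooth: {smooth_change:.3f}, brittle: {brittle_change:.3f}")
```

At this input the smooth model barely moves, while the brittle model flips its output completely. A check like this only probes one point and one feature at a time, so it is a starting point rather than a guarantee of robustness.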

Some algorithms are more interpretable than others

Not all algorithms are created equal; some are more interpretable than others. 

Examples of algorithms that can be considered inherently more interpretable include regression and decision trees. At the other end of the spectrum, we have algorithms such as random forests and neural networks which can be considered less interpretable. Having said that, there are lots of factors that can impact a model’s interpretability, so it can be difficult to generalize this. 
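
To see why a decision tree counts as inherently interpretable, note that every prediction comes with the exact chain of tests that produced it. The sketch below uses a small hand-written tree for a loan decision. The features, thresholds, and labels are hypothetical, and the tree is written out by hand rather than learned from data.

```python
# A hypothetical, hand-written decision tree for a loan decision.
# Each internal node tests one feature against a threshold.
TREE = {
    "feature": "debt_to_income", "threshold": 0.4,
    "left": {   # debt_to_income <= 0.4
        "feature": "credit_score", "threshold": 600,
        "left": {"label": "deny"},       # credit_score <= 600
        "right": {"label": "approve"},   # credit_score > 600
    },
    "right": {"label": "deny"},  # debt_to_income > 0.4
}

def predict_with_path(node, sample):
    """Walk the tree and record every test that led to the final label."""
    path = []
    while "label" not in node:
        value = sample[node["feature"]]
        if value <= node["threshold"]:
            path.append(f'{node["feature"]} = {value} <= {node["threshold"]}')
            node = node["left"]
        else:
            path.append(f'{node["feature"]} = {value} > {node["threshold"]}')
            node = node["right"]
    return node["label"], path

label, path = predict_with_path(TREE, {"debt_to_income": 0.3, "credit_score": 710})
print(label)
for step in path:
    print(" ", step)
```

The recorded path is the explanation: anyone can read it and follow the decision step by step. A random forest averages many such trees and a neural network has no comparable path, which is part of why they are harder to interpret.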

In addition, complex algorithms trained on very large datasets often make more accurate predictions, so there can be a trade-off between interpretability and accuracy. Although a linear regression algorithm may be more interpretable, its predictions may be less accurate and reliable, which can make it less useful in a model designed for use in finance or healthcare.

Interpretability vs explainability of ML models

The terms “interpretability” and “explainability” are often used interchangeably by the machine learning community, and there’s no real official definition for each term. 

That said, we can think of explainability as requiring a lower threshold than interpretability. A machine learning model is interpretable if we can fundamentally understand how it arrives at its decisions. Meanwhile, a model can be explainable if we can understand how a specific part of a complex model influences the output. If every part of a model is explainable and we can keep track of each explanation simultaneously, then the model is interpretable.

Let’s think about this in the context of an autonomous vehicle. We can probably explain some of the features that make up the car’s decisions, such as how object detection can recognize objects. This means that the model is at least partially explainable because we understand some of its inner workings. This doesn’t mean that it’s interpretable, though; this requires a much deeper level of understanding. 

Interpretable vs explainable ML is quite a nuanced difference, but it’s an important one to understand.

Do you want to build interpretable ML?

If you’re looking for a solution that can help you build better, more powerful, and interpretable models, look no further than the Qwak platform! 

Qwak is the full-service machine learning platform that enables teams to take their models and transform them into well-engineered products. Our cloud-based platform removes the friction from ML development and deployment while enabling fast iterations, limitless scaling, and customizable infrastructure.

Want to find out more about how Qwak could help you deploy your ML models effectively? Get in touch for your free demo!
