Understanding Explainable AI And Interpretable AI


As a result of recent technological advances in machine learning (ML), ML models are now being used in a variety of fields to improve performance and reduce the need for manual effort. These applications range from assisting authors and poets in refining their writing style to predicting protein structures. Furthermore, as ML models gain popularity in a number of critical industries, such as medical diagnostics and credit card fraud detection, there is very little tolerance for error. It therefore becomes necessary for humans to understand these algorithms and how they work at a deeper level. After all, a deeper understanding of how ML models make predictions is crucial if researchers are to design even more robust models and repair the flaws of current ones, such as bias.

This is where Interpretable AI (IAI) and Explainable AI (XAI) techniques come into play, and the need to understand their differences becomes more apparent. The distinction between the two is not always clear, even to academics, and the terms interpretability and explainability are sometimes used synonymously when referring to ML approaches. Given their growing popularity in the ML field, it is crucial to distinguish between IAI and XAI models in order to help organizations select the best strategy for their use case.

To put it briefly, interpretable AI models can be understood by humans just by looking at their model summaries and parameters, without the aid of any additional tools or methods. In other words, an IAI model provides its own explanation. Explainable AI models, on the other hand, are typically complex models, such as deep neural networks, whose inner workings are too intricate for humans to understand without additional techniques. This is why explainable AI methods can give a clear idea of why a decision was made, but not of how the model arrived at that decision. In the rest of the article, we take a deeper dive into the concepts of interpretability and explainability and understand them with the help of examples.

1. Interpretable Machine Learning

We argue that anything can be interpretable if it is possible to discern its meaning, i.e., if its cause and effect can be clearly determined. For instance, if someone eats too much chocolate straight after dinner, they always have trouble sleeping. Situations of this nature can be interpreted. In the domain of ML, a model is said to be interpretable if people can understand it on their own based on its parameters. With interpretable AI models, humans can easily understand how the model arrived at a particular solution, though not whether the criteria used to arrive at that result are sensible. Decision trees and linear regression are a couple of examples of interpretable models. Let’s illustrate interpretability better with the help of an example:

Consider a bank that uses a trained decision-tree model to determine whether to approve a loan application. The applicant’s age, monthly income, whether they have any pending loans, and other variables are taken into consideration while making a decision. To understand why a particular decision was made, we can simply traverse down the nodes of the tree and, based on the decision criteria, see why the end result was what it was. For instance, a decision criterion can specify that a loan application won’t be approved if a non-student applicant has a monthly income of less than $3000. However, these models cannot tell us the rationale behind choosing the decision criteria; for instance, the model fails to explain why a $3000 minimum income requirement is enforced for a non-student applicant in this scenario. The sketch below makes this concrete.
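Here is a minimal sketch of the loan example in Python with scikit-learn. The feature names, thresholds, and training data are hypothetical and chosen purely for illustration; the point is that the fitted tree can be printed as a set of human-readable rules.

# A minimal sketch of the loan-approval example, using scikit-learn.
# The toy dataset below is hypothetical: columns are
# [age, monthly_income, is_student, has_pending_loan].
from sklearn.tree import DecisionTreeClassifier, export_text

X = [
    [25, 2500, 1, 0],
    [40, 5000, 0, 0],
    [30, 2800, 0, 1],
    [22, 1500, 1, 0],
    [35, 4200, 0, 0],
    [28, 2900, 0, 0],
]
y = [0, 1, 0, 0, 1, 0]  # 1 = loan approved, 0 = rejected

model = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# The learned rules are directly readable, which is what makes the model interpretable.
feature_names = ["age", "monthly_income", "is_student", "has_pending_loan"]
print(export_text(model, feature_names=feature_names))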

Organizations that wish to better understand why and how their models generate predictions need to interpret the different factors, such as weights and features, that produce the supplied output. But this is possible only when the models are fairly simple. Both the linear regression model and the decision tree have a small number of parameters. As models become more complicated, we can no longer understand them this way.
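For example, in a linear regression the fitted coefficients themselves are the explanation. The sketch below uses synthetic data; the feature names and weights are made up for illustration.

# A minimal sketch showing that a simple model's parameters are directly readable.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))  # three synthetic features
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

model = LinearRegression().fit(X, y)

# Each coefficient states how much the prediction changes per unit change in a feature.
for name, coef in zip(["feature_0", "feature_1", "feature_2"], model.coef_):
    print(f"{name}: {coef:+.2f}")
print(f"intercept: {model.intercept_:+.2f}")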

2. Explainable Machine Learning

Explainable AI models are ones whose internal workings are too complex for humans to comprehend how they affect the final prediction. Such ML algorithms are also known as black-box models: the model’s features go in as input and the predictions come out as output, but the mechanics in between remain opaque. Humans require additional methods to look into these “black-box” systems in order to comprehend how they operate. An example of such a model would be a Random Forest Classifier consisting of many Decision Trees; in this model, every tree’s predictions are considered when determining the final prediction. The complexity only increases when neural network-based models such as LogoNet are taken into consideration. As the complexity of such models grows, it becomes simply impossible for humans to understand the model by just looking at its weights.
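As a rough illustration of why such a model is treated as a black box, even a modest random forest accumulates thousands of decision nodes, so reading it “by hand” the way we read a single decision tree is no longer practical. The dataset and hyperparameters below are arbitrary and used only for illustration.

# A rough illustration of random-forest complexity on synthetic data.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Count every decision node across all trees in the ensemble.
total_nodes = sum(tree.tree_.node_count for tree in forest.estimators_)
print(f"{len(forest.estimators_)} trees, {total_nodes} decision nodes in total")
# The final prediction averages over all of these trees, so no single path
# through the model explains the output.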

As mentioned earlier, humans need extra methods to comprehend how sophisticated algorithms generate predictions. Researchers use different techniques to find connections between the input data and the model-generated predictions, which can be useful in understanding how the ML model behaves. Such model-agnostic methods (methods that are independent of the kind of model) include partial dependence plots, SHapley Additive exPlanations (SHAP) dependence plots, and surrogate models. Several approaches that estimate feature importance are also employed; these determine how useful each attribute is for predicting the target variable. A higher score means that the feature is more crucial to the model and has a significant impact on the prediction.
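As one concrete example of a model-agnostic technique, the sketch below computes permutation feature importance with scikit-learn; SHAP values or partial dependence plots would be used in a similar spirit with their respective libraries. The dataset here is synthetic.

# A minimal sketch of permutation feature importance, a model-agnostic method.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=8, n_informative=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Shuffle each feature in turn and measure how much the test score drops:
# a larger drop means the model relies more heavily on that feature.
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
for i, importance in enumerate(result.importances_mean):
    print(f"feature_{i}: {importance:.3f}")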

However, the question that still remains is why there is a need to distinguish between the interpretability and explainability of ML models. It is clear from the arguments above that some models are easier to interpret than others. In simple terms, one model is more interpretable than another if it is easier for a human to grasp how it makes its predictions. It is also generally the case that less complicated models are more interpretable but often have lower accuracy than more complex models such as neural networks. Thus, high interpretability typically comes at the cost of lower accuracy. For instance, employing logistic regression to perform image recognition would yield subpar results. On the other hand, model explainability starts to play a bigger role if a company wants to attain high performance while still understanding the behavior of the model.
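The toy comparison below illustrates this trade-off on synthetic data with non-linear structure; the exact numbers will vary, but the interpretable linear model typically scores lower than the black-box ensemble.

# A toy illustration of the interpretability-vs-accuracy trade-off on synthetic data.
from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_moons(n_samples=1000, noise=0.3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

interpretable_model = LogisticRegression().fit(X_train, y_train)               # readable coefficients
black_box_model = RandomForestClassifier(random_state=0).fit(X_train, y_train)  # opaque ensemble

print("logistic regression accuracy:", interpretable_model.score(X_test, y_test))
print("random forest accuracy:      ", black_box_model.score(X_test, y_test))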

Thus, businesses must consider whether interpretability is required before starting a new ML project. When datasets are large and the data is in the form of images or text, neural networks can meet the customer’s objective with high performance. In such cases, when complex methods are needed to maximize performance, data scientists put more emphasis on model explainability than on interpretability. Because of this, it’s crucial to comprehend the distinctions between model explainability and interpretability and to know when to favor one over the other.



Khushboo Gupta is a consulting intern at MarktechPost. She is currently pursuing her B.Tech from the Indian Institute of Technology (IIT), Goa. She is passionate about the fields of Machine Learning, Natural Language Processing, and Web Development. She enjoys learning more about the technical field by participating in several challenges.