Staff Working Paper No. 816
By Philippe Bracke, Anupam Datta, Carsten Jung and Shayak Sen
We propose a framework for addressing the ‘black box’ problem present in some Machine Learning (ML) applications. We implement our approach using the Quantitative Input Influence (QII) method of Datta et al. (2016) in a real-world example: an ML model to predict mortgage defaults. This method investigates the inputs and outputs of the model, but not its inner workings. It measures feature influences by intervening on inputs and estimating their Shapley values, which represent the features’ average marginal contributions over all possible feature combinations. The method identifies key drivers of mortgage defaults, such as the loan-to-value ratio and the current interest rate, in line with the findings of the economics and finance literature. However, given the non-linearity of the ML model, explanations vary significantly across different groups of loans. We use clustering methods to arrive at groups of explanations for different areas of the input space. Finally, we conduct simulations on data that the model has not been trained or tested on. Our main contribution is to develop a systematic analytical framework that could be used for approaching explainability questions in real-world financial applications. We conclude, however, that notable model uncertainties remain, of which stakeholders ought to be aware.
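For reference, the ‘average marginal contribution over all possible feature combinations’ is the standard Shapley value; the notation below (feature set N, coalition S, quantity of interest v) is chosen here for illustration rather than taken from the paper. The influence of feature i is

    \phi_i = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(|N|-|S|-1)!}{|N|!} \bigl( v(S \cup \{i\}) - v(S) \bigr)

where, in the QII setting, v(S) is the model’s quantity of interest (for instance, the predicted default probability) with the features in S held at their observed values and the remaining inputs intervened upon, i.e. randomised. Since the exact sum is exponential in the number of features, influences of this kind are typically estimated by sampling. The sketch below is a minimal, hypothetical Monte Carlo estimator of such values, not the paper’s implementation; the names model_predict, X and row are assumptions introduced here.

    import numpy as np

    def shapley_influences(model_predict, X, row, n_samples=200, rng=None):
        """Monte Carlo estimate of Shapley-style input influences for one row.

        model_predict: function mapping an (n, d) array to predicted probabilities.
        X: background data used to intervene on (randomise) features.
        row: the individual input (1-d array of length d) being explained.
        Returns an array with one estimated influence per feature.
        """
        rng = np.random.default_rng(rng)
        d = row.shape[0]
        phi = np.zeros(d)
        for _ in range(n_samples):
            perm = rng.permutation(d)             # random feature ordering
            background = X[rng.integers(len(X))]  # random row used for interventions
            x = background.copy()                 # start with every feature intervened
            prev = model_predict(x[None, :])[0]
            for j in perm:
                x[j] = row[j]                     # reveal feature j's actual value
                cur = model_predict(x[None, :])[0]
                phi[j] += cur - prev              # marginal contribution of feature j
                prev = cur
        return phi / n_samples

Each random permutation reveals the features of row one at a time on top of a randomly drawn background row; the change in the prediction at each step is that feature’s marginal contribution, and averaging over permutations approximates the Shapley values under the chosen intervention distribution.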