Staff Working Paper No. 816
By Philippe Bracke, Anupam Datta, Carsten Jung and Shayak Sen
We propose a framework for addressing the ‘black box’ problem present in some Machine Learning (ML) applications. We implement our approach using the Quantitative Input Influence (QII) method of Datta et al. (2016) in a real-world example: an ML model to predict mortgage defaults. This method investigates the inputs and outputs of the model, but not its inner workings. It measures feature influences by intervening on inputs and estimating their Shapley values, which represent the features’ average marginal contributions over all possible feature combinations. The method identifies key drivers of mortgage defaults, such as the loan-to-value ratio and the current interest rate, in line with the findings of the economics and finance literature. However, given the non-linearity of the ML model, explanations vary significantly across different groups of loans. We use clustering methods to arrive at groups of explanations for different areas of the input space. Finally, we conduct simulations on data that the model has not been trained or tested on. Our main contribution is to develop a systematic analytical framework that could be used for approaching explainability questions in real-world financial applications. We conclude, however, that notable model uncertainties remain, of which stakeholders ought to be aware.
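For reference, the ‘average marginal contribution over all possible feature combinations’ is the standard Shapley value; the notation below (feature set N, coalition S, quantity of interest v) is chosen here for illustration rather than taken from the paper. The influence of feature i is

    \phi_i = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(|N|-|S|-1)!}{|N|!} \bigl( v(S \cup \{i\}) - v(S) \bigr)

where, in the QII setting, v(S) is the model’s quantity of interest (for instance, the predicted default probability) with the features in S held at their observed values and the remaining inputs intervened upon, i.e. randomised. Since the exact sum is exponential in the number of features, influences of this kind are typically estimated by sampling. The sketch below is a minimal, hypothetical Monte Carlo estimator of such values, not the paper’s implementation; the names model_predict, X and row are assumptions introduced here.

    import numpy as np

    def shapley_influences(model_predict, X, row, n_samples=200, rng=None):
        """Monte Carlo estimate of Shapley-style input influences for one row.

        model_predict: function mapping an (n, d) array to predicted probabilities.
        X: background data used to intervene on (randomise) features.
        row: the individual input (1-d array of length d) being explained.
        Returns an array with one estimated influence per feature.
        """
        rng = np.random.default_rng(rng)
        d = row.shape[0]
        phi = np.zeros(d)
        for _ in range(n_samples):
            perm = rng.permutation(d)             # random feature ordering
            background = X[rng.integers(len(X))]  # random row used for interventions
            x = background.copy()                 # start with every feature intervened
            prev = model_predict(x[None, :])[0]
            for j in perm:
                x[j] = row[j]                     # reveal feature j's actual value
                cur = model_predict(x[None, :])[0]
                phi[j] += cur - prev              # marginal contribution of feature j
                prev = cur
        return phi / n_samples

Each random permutation reveals the features of row one at a time on top of a randomly drawn background row; the change in the prediction at each step is that feature’s marginal contribution, and averaging over permutations approximates the Shapley values under the chosen intervention distribution.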