Understanding and trusting models and their results is a hallmark of good science. Scientists, engineers, physicians, researchers, and people in general need to understand and trust the models and modeling results that affect their work and their lives.
However, the forces of innovation and competition are now driving analysts and data scientists to try ever more complex predictive modeling and machine learning algorithms. Such algorithms include gradient-boosted ensembles (GBMs), artificial neural networks (ANNs), and random forests, among many others.
Many machine learning algorithms have been labeled "black box" models because of their inscrutable inner workings. These algorithms are typically more accurate at predicting nonlinear, faint, or rare phenomena, but what makes them accurate is also what makes their predictions difficult to understand: they are very complex. This is a fundamental trade-off.
Unfortunately, more accuracy almost always comes at the expense of interpretability, and interpretability is crucial for business adoption, model documentation, regulatory oversight, and human acceptance and trust.
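To make the trade-off concrete, here is a minimal sketch using scikit-learn on a synthetic nonlinear dataset; the dataset, model choices, and parameters are illustrative assumptions, not drawn from any specific study mentioned above.

```python
# A minimal sketch of the accuracy/interpretability trade-off
# (illustrative only: dataset and hyperparameters are assumptions).
from sklearn.datasets import make_moons
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# A nonlinear, noisy decision boundary -- the kind of signal where
# flexible learners tend to outperform linear models.
X, y = make_moons(n_samples=2000, noise=0.3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Interpretable model: two coefficients and an intercept fully
# describe its decision boundary.
linear = LogisticRegression().fit(X_train, y_train)

# "Black box" model: hundreds of trees with no single equation to inspect.
gbm = GradientBoostingClassifier(n_estimators=300, random_state=0)
gbm.fit(X_train, y_train)

print("logistic regression accuracy:", linear.score(X_test, y_test))
print("gradient boosting accuracy:  ", gbm.score(X_test, y_test))
print("logistic regression coefficients:", linear.coef_)  # directly readable
# The GBM offers no comparably simple summary of how it maps inputs to outputs.
```

On data like this, the boosted ensemble will usually post the higher test accuracy, while the logistic regression remains the only one of the two whose behavior can be read directly from its parameters.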