Machine learning (ML) has advanced considerably in recent years, driven by several industry trends that have changed how these techniques are used in risk management and beyond.
In an article in PRMIA’s October edition of Intelligent Risk, authors Danny Haydon and Moody Hadi cover the key transformational drivers causing these high adoption rates, some of the techniques, and how to assess their utility within credit risk.
Firstly, data has expanded along several dimensions: size, velocity and variety. At the same time, the ability to record, store, combine and process large datasets from many disparate sources has improved across the board. This applies not only to traditional sources but also to alternative data, which has fueled the need to extract informational value from these sources. A side effect of this expansion, however, is an elevated level of data pollution that must be contended with: datasets that are noisy, conflicting and difficult to link.
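To make "conflicting and difficult to link" concrete, the following is a minimal sketch and not the authors' method. It assumes two hypothetical data vendors that spell company names differently and sometimes disagree on a rating. The names `normalize_name` and `find_conflicts`, the suffix list, and all records are made up for illustration.

```python
import re
from collections import defaultdict

# Assumption: a small, illustrative set of legal suffixes to ignore when linking.
LEGAL_SUFFIXES = {"inc", "ltd", "llc", "corp", "plc"}


def normalize_name(name):
    """Build a crude linking key: lowercase, strip punctuation, drop legal suffixes."""
    cleaned = re.sub(r"[^a-z0-9 ]", "", name.lower())
    tokens = [t for t in cleaned.split() if t not in LEGAL_SUFFIXES]
    return " ".join(tokens)


def find_conflicts(sources, field):
    """Group records by linking key and return keys whose sources disagree on `field`."""
    values = defaultdict(dict)
    for source_name, rows in sources.items():
        for row in rows:
            values[normalize_name(row["name"])][source_name] = row[field]
    return {
        key: by_source
        for key, by_source in values.items()
        if len(set(by_source.values())) > 1
    }


# Hypothetical records from two hypothetical vendors.
sources = {
    "vendor_a": [
        {"name": "Acme Corp.", "rating": "BB"},
        {"name": "Globex, Inc.", "rating": "A"},
    ],
    "vendor_b": [
        {"name": "ACME corp", "rating": "B"},
        {"name": "Globex Inc", "rating": "A"},
    ],
}

conflicts = find_conflicts(sources, "rating")
print(conflicts)
```

Real entity linking usually relies on stable identifiers or fuzzy matching rather than a simple normalized name, so this only shows the shape of the problem.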
Secondly, enhanced computational efficiency has become easy to access, both through hardware that can run specialized operations at scale and through programming language enhancements that have moved towards functional programming. Together these have transformed the integration of machine learning techniques. Languages such as R have become hubs for numerical computing using functional programming, building on a long history of providing numerical interfaces to computing libraries. Supervised and unsupervised algorithms allow data scientists to turn these datasets into actionable insights with relative ease, using code that runs on inexpensive hardware.
Thirdly, reproducible research and analysis have been widely adopted by the data science community. Reproducibility is a set of principles for conducting quantitative, data science-driven analysis such that the data and code leading to a decision or conclusion can be replicated efficiently and clearly.
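One way to picture these principles is the sketch below, which is illustrative only. It records a hash of the input data and the random seed alongside the result, so that a rerun can be checked against the original. The function names, the toy bootstrap analysis, and the records are all assumptions made for this example.

```python
import hashlib
import json
import random


def fingerprint(records):
    """Return a SHA-256 hash of the input data in a canonical JSON form."""
    canonical = json.dumps(records, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def run_analysis(records, seed):
    """Toy analysis: bootstrap estimate of the mean default rate."""
    rng = random.Random(seed)  # local generator, so global random state is untouched
    n = len(records)
    estimates = []
    for _ in range(200):
        sample = [records[rng.randrange(n)]["defaulted"] for _ in range(n)]
        estimates.append(sum(sample) / n)
    # Store the provenance (data hash and seed) next to the result.
    return {
        "data_sha256": fingerprint(records),
        "seed": seed,
        "bootstrap_mean": sum(estimates) / len(estimates),
    }


# Made-up records for illustration only.
records = [{"id": i, "defaulted": int(i % 5 == 0)} for i in range(50)]

first = run_analysis(records, seed=42)
second = run_analysis(records, seed=42)
print(first == second)  # identical data, code and seed give an identical result
```

In practice the same idea extends to pinning library versions and keeping code under version control, which this sketch does not cover.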
Finally, the pervasiveness of open source libraries, packages and toolkits has opened doors for the community to contribute through teams of specialists who share code bases and package them into easy-to-use, modular functions.
Assessing which ML techniques to use, and when, is an important step that must be done thoughtfully with the target context in mind. No prescriptive method is tied purely to a particular class of algorithms; the risk context must always be considered in order to assess the tradeoffs.