A week ago Monday (Dec. 8th) we wrote about the OFR / Federal Reserve project to collect data on the securities financing markets. Today we take a look at what that could actually mean. What will they do with the data?
Here is a link to our post “Office of Financial Research takes a look at repo and sec lending data collection”.
The first thing we expect to be done with this data is to finally come up with a number for the size of the US repo market. Is it $3 trillion? $4 trillion? Will the OFR and the Fed ask for some back data to establish historical trends? That would be a "nice to have" but may not be practical.
We wonder whether the data will be able to show collateral velocity numbers. The work of IMF economist Manmohan Singh has been crucial in showing the importance of collateral velocity, but the calculation methodology used seemed a little indirect. We would like to see the new data used to measure velocity directly. Securities finance has been called both the grease in the wheels of securities trading and the transmission mechanism of systemic risk and mother of interconnectedness. Having a better understanding of collateral velocity (as well as a way to monitor it) would go a long way toward systemic risk management.
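To make the idea concrete, here is a minimal sketch of the kind of ratio Singh's work points to: re-usable collateral received by large dealers divided by collateral sourced from primary providers. The numbers below are invented for illustration and are not from the OFR, the Fed, or Singh's papers; the point is that comprehensive trade-level data would let both the numerator and the denominator be measured directly rather than inferred.

```python
# Illustrative only: a minimal sketch of a Singh-style collateral velocity
# estimate. The figures are made up for demonstration.

# Collateral received by large dealers that is eligible for re-use
# (re-pledgeable), in USD trillions.
collateral_received_by_dealers = 6.0

# "Primary source" collateral: assets pledged by hedge funds plus securities
# lent by institutional investors, in USD trillions.
primary_source_collateral = 2.5

# Velocity: how many times, on average, a unit of primary source collateral
# is re-used (re-pledged) through the dealer system.
velocity = collateral_received_by_dealers / primary_source_collateral
print(f"Estimated collateral velocity: {velocity:.1f}x")
```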
Can the data show when the market is over-leveraged relative to asset volatility? Imagine if regulators were able to see whether hedge funds are all crowded into the same type of trade. Wouldn't it be useful to know if mREIT leverage is concentrated in certain dealers? How about if one market player is squeezing a corporate issue? Can the data show all this? It is probably wishful thinking. There is too much leakage in the markets. Even if all the LEIs are in place and the data is consistently collected across borders and accurately processed to connect all the dots, there will always be holes in the information.
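As one hypothetical illustration of the kind of question the data could answer, a simple concentration measure, such as a Herfindahl-Hirschman style index of mREIT repo funding across dealers, would flag when that leverage sits with only a few counterparties. The dealer names and balances below are invented; this is a sketch of the idea, not how regulators would necessarily do it.

```python
# Illustrative only: checking whether mREIT repo funding is concentrated in
# a few dealers, using a Herfindahl-Hirschman style index on made-up data.

mreit_repo_by_dealer = {
    "Dealer A": 120.0,  # USD billions of repo financing extended to mREITs
    "Dealer B": 95.0,
    "Dealer C": 30.0,
    "Dealer D": 15.0,
}

total = sum(mreit_repo_by_dealer.values())
shares = [amount / total for amount in mreit_repo_by_dealer.values()]

# HHI: sum of squared market shares; values closer to 1.0 mean the funding
# is concentrated in fewer dealers.
hhi = sum(share ** 2 for share in shares)
print(f"mREIT repo funding HHI across dealers: {hhi:.2f}")
```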
There is a risk of what is known in the statistics world as "N = all". Simply put, this is the misapprehension that all the data has been collected and that there is no chance your conclusions might be off. Needless to say, it doesn't really work that way. There was a good article on the topic in the March 28, 2014 FT, "Big data: are we making a big mistake?" (sorry if it is behind the paywall). The FT mentioned the (in)famous Google Flu Trends story, where Google claimed to be able to track the spread of the flu across the country without the benefit of any medical records. They did this by looking at where and how many people were searching for terms that might be correlated with getting the flu. Think "flu medicine" or "influenza symptoms". Only it turned out it did not work consistently. Researchers writing in Science ("The Parable of Google Flu: Traps in Big Data Analysis") concluded that Google, as one example, overstated the 2011-12 flu season by 50% and missed the 2009 H1N1 outbreak altogether. Statisticians call the practice of reading too much into the data "overfitting".
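For readers who want a concrete picture of what statisticians mean, here is a toy example of overfitting, unrelated to the Google or repo data: a high-degree polynomial fits the sample points almost perfectly but does much worse on fresh data than a simple straight line.

```python
# Illustrative only: a toy overfitting example. The flexible model "learns"
# the noise in the sample and then performs poorly out of sample.
import numpy as np


def true_signal(x):
    return 2.0 * x + 1.0


rng = np.random.default_rng(0)
x_train = np.linspace(0, 1, 12)
x_test = np.linspace(0, 1, 100)
y_train = true_signal(x_train) + rng.normal(0, 0.2, x_train.size)
y_test = true_signal(x_test) + rng.normal(0, 0.2, x_test.size)

for degree in (1, 9):
    coeffs = np.polyfit(x_train, y_train, degree)
    in_sample_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    out_sample_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree}: in-sample MSE {in_sample_mse:.3f}, "
          f"out-of-sample MSE {out_sample_mse:.3f}")
```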
What does this have to do with repo? The repo data is always going to be incomplete. Mistaking the data collected, and the correlations it suggests, for gospel should be avoided, especially when we know the collection will be prone to leakage. Having better estimates for the repo market, and in particular for the size and other terms of the bilateral market, will be useful. But it is still just a sample.