An emerging class of technologies is aimed squarely at the need for quality training data as models increasingly become commodities. An O’Reilly survey found that a lack of high-quality training data remains the main bottleneck in most machine learning projects: it was cited by 26% of respondents at a mature stage of adoption and by 24% of those at the evaluation stage. This has led multiple financial services firms and the census bureaus of various countries to deploy automatic data cleaning solutions, with HoloClean mentioned as a vendor.