An astonishing amount of digital activity occurs at any given moment. It stems from the aggregate actions of 4.5 billion internet users, a number projected to rise further in the coming years. Since the outbreak of COVID-19, technology has played an even greater role in our daily lives, increasing the production of data: they are constantly generated by our clicks, reactions and shares on social media, video streaming, digital transactions, digital recordings of our personal and work activities, the digital circulation of scientific texts, and more.
These data can give us a better understanding of the state of the economy at both the micro and macro level, provided that we have access to them. The use of data is actually one of the world’s biggest businesses, and data ownership is therefore extremely important, said Daniele Franco, senior deputy governor of the Bank of Italy, in a recent speech.
“We see this in the market capitalization of some companies, like Alphabet (the owner of Google) and Facebook. Even once the value of their cash, physical and intangible assets, and accumulated R&D has been stripped out, their capitalization remains huge. These developments are causing increasing concerns among policymakers, from both a political and economic point of view,” he said.
One of the most important developments is the wave of recent improvements in artificial intelligence, especially machine learning, which represents a paradigm shift from the first wave of computerization. Historically, most computer programs were created by accurately codifying human knowledge, mapping inputs to outputs exactly as prescribed by the programmers. In contrast, machine-learning systems use general classes of algorithms (e.g. neural networks, random forests and gradient boosting) to infer the mapping themselves, and are typically driven by large volumes of data.
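The contrast can be illustrated with a toy example (a hypothetical sketch, not taken from the speech): a hand-coded program encodes the input-to-output rule explicitly, while a generic learning algorithm recovers the same mapping purely from example data.

```python
import numpy as np

# First wave of computerization: the programmer codifies the mapping.
def fahrenheit_rule(celsius):
    return celsius * 9 / 5 + 32

# Machine learning: the mapping is inferred from example data alone.
celsius = np.array([0.0, 10.0, 20.0, 30.0, 40.0])
fahrenheit = np.array([32.0, 50.0, 68.0, 86.0, 104.0])

# Fit y = w*x + b by least squares: the algorithm is generic;
# the specific rule it recovers comes entirely from the data.
X = np.column_stack([celsius, np.ones_like(celsius)])
w, b = np.linalg.lstsq(X, fahrenheit, rcond=None)[0]

print(round(w, 3), round(b, 3))   # learned coefficients, close to 1.8 and 32
print(round(w * 25 + b, 1))       # learned prediction for 25 °C
print(fahrenheit_rule(25))        # hand-coded rule for comparison
```

On clean data the learned coefficients match the codified rule; the practical difference appears when no explicit rule is known and only examples are available.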
By employing huge data sets and big data processing resources, machines have made impressive progress: “The exponential growth of digital data, such as images, videos and speeches, from numerous sources (e.g. social media, Internet-of-Things, etc.) is driving the search for tools that allow us to extract relevant information from disparate kinds of data,” added Franco.
Rules on data management differ across jurisdictions and data domains; ideally, there would be global standards. Furthermore, the adoption of machine learning and artificial intelligence requires as input huge amounts of granular and unstructured data, which should be collected through close collaboration between public institutions and private companies.
“We also need specific platforms or data lakes to store and analyze such data so as to preserve privacy and security. This can be done through a collectively trusted cloud where all actors, both public and private, can operate to gain insights for their core business,” he said.
The availability of privacy-preserving algorithms, or the use of federated learning, recently proposed by researchers at Google, could provide the enabling technology. Federated learning allows machine-learning models to be trained on data that remains distributed across different servers: only model updates are exchanged, never the raw data.
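The federated idea can be sketched in a few lines (a hypothetical illustration in plain NumPy, not Google's actual framework): each server computes an update on its own private data, and only those updates leave the machines to be averaged into a shared model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Each "server" holds its own private data; raw records never leave it.
def make_local_data(n):
    x = rng.normal(size=(n, 2))
    y = x @ np.array([2.0, -1.0]) + 0.01 * rng.normal(size=n)
    return x, y

servers = [make_local_data(50) for _ in range(3)]

w = np.zeros(2)               # global model, shared with every server
for _ in range(200):          # federated-averaging rounds
    updates = []
    for x, y in servers:
        grad = 2 * x.T @ (x @ w - y) / len(y)   # gradient computed locally
        updates.append(w - 0.1 * grad)          # one local descent step
    w = np.mean(updates, axis=0)                # only updates are exchanged

print(np.round(w, 2))         # close to the true weights [2.0, -1.0]
```

The central server sees only parameter vectors, never the local records, which is what makes the approach attractive for privacy-sensitive collaboration between public and private actors.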