Social media is increasingly used to spread fake news. The same problem can be found in capital markets – criminals spread fake news about companies in order to manipulate share prices. Researchers at the Universities of Göttingen and Frankfurt and the Jožef Stefan Institute in Ljubljana have developed an approach that can recognize such fake news, even when its content is repeatedly adapted. The results of the study were published in the Journal of the Association for Information Systems.
In order to detect false information – often fictitious data that presents a company in a positive light – the scientists used machine learning methods and created classification models that can be applied to identify suspicious messages based on their content and certain linguistic characteristics. “Here we look at other aspects of the text that makes up the message, such as the comprehensibility of the language and the mood that the text conveys,” said Jan Muntermann from the University of Göttingen, in a statement.
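As a rough illustration of what such linguistic characteristics might look like, the following Python sketch computes two comprehensibility proxies and a crude mood score for a message. It is not the researchers' feature set: the function name `linguistic_features`, the choice of proxies, and both word lists are assumptions made up for this example.

```python
import re

# Hypothetical mini-lexicons, invented for illustration only.
POSITIVE_WORDS = {"profit", "growth", "surge", "breakthrough", "guaranteed", "soar"}
NEGATIVE_WORDS = {"loss", "decline", "risk", "lawsuit", "warning"}


def linguistic_features(message: str) -> dict:
    """Return simple comprehensibility and mood features for one message."""
    # Split into sentences on ., ! or ? and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", message) if s.strip()]
    words = re.findall(r"[a-z']+", message.lower())
    if not words or not sentences:
        return {"avg_sentence_length": 0.0, "avg_word_length": 0.0, "mood": 0.0}

    positive = sum(w in POSITIVE_WORDS for w in words)
    negative = sum(w in NEGATIVE_WORDS for w in words)
    return {
        # Longer sentences and longer words are treated as harder to read.
        "avg_sentence_length": len(words) / len(sentences),
        "avg_word_length": sum(len(w) for w in words) / len(words),
        # Mood ranges from -1 (all negative words) to +1 (all positive words).
        "mood": (positive - negative) / len(words),
    }


features = linguistic_features("Profit will soar. Buy now!")
```

A real system would feed features of this kind, alongside the message content, into a trained classification model rather than inspect them by hand.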
The approach is already known in principle from its use in spam filters, for example. However, the key problem with current methods is that, to avoid being recognized, fraudsters continuously adapt the content and avoid certain words that are used to identify the fake news. This is where the researchers’ new approach comes in: to identify fake news despite such evasion strategies, they combine models they recently developed in such a way that high detection rates and robustness come together.
So even if “suspicious” words disappear from the text, the fake news is still recognized by its linguistic features. “This puts scammers into a dilemma. They can only avoid detection if they change the mood of the text so that it is negative, for instance,” said Michael Siering, an author of the study from Goethe University Frankfurt, in a statement. “But then they would miss their target of inducing investors to buy certain stocks.”
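The dilemma can be made concrete with a toy sketch in Python. It combines a content-based score with a mood-based score by simple averaging. The averaging rule, the threshold, the scaling factor, all word lists, and every function name here are assumptions for illustration; the study's actual models and ensemble method are not reproduced.

```python
import re

# Hypothetical word lists, invented for illustration only.
SUSPICIOUS_WORDS = {"guaranteed", "insider", "explode", "skyrocket"}
POSITIVE_WORDS = {"profit", "gain", "growth", "strong", "buy", "winner", "rise"}
NEGATIVE_WORDS = {"loss", "risk", "decline", "sell", "weak"}


def tokenize(message):
    return re.findall(r"[a-z']+", message.lower())


def content_score(message):
    """Content-based model: share of known suspicious words, scaled into [0, 1]."""
    words = tokenize(message)
    if not words:
        return 0.0
    hits = sum(w in SUSPICIOUS_WORDS for w in words)
    return min(1.0, 5 * hits / len(words))


def mood_score(message):
    """Linguistic model: how strongly positive the mood is, scaled into [0, 1]."""
    words = tokenize(message)
    if not words:
        return 0.0
    balance = sum(w in POSITIVE_WORDS for w in words) - sum(
        w in NEGATIVE_WORDS for w in words
    )
    return max(0.0, min(1.0, 5 * balance / len(words)))


def ensemble_flags(message, threshold=0.4):
    """Flag a message when the average of both model scores reaches the threshold."""
    return (content_score(message) + mood_score(message)) / 2 >= threshold


original = "Guaranteed profit! Insider tip: this winner will skyrocket, buy now for strong gain."
adapted = "Solid profit ahead! This winner will rise, buy now for strong gain."
negative = "Loss and decline ahead, weak quarter, sell now."

flag_original = ensemble_flags(original)   # suspicious words and positive mood
flag_adapted = ensemble_flags(adapted)     # suspicious words removed, mood unchanged
flag_negative = ensemble_flags(negative)   # evades detection, but no longer promotes buying
```

In this sketch the adapted message slips past the content score alone yet is still flagged by the combination, while the only version that escapes is the one whose mood has turned negative.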
The new approach can be used, for example, in market surveillance to temporarily suspend the trading of affected stocks. In addition, it offers investors valuable information to avoid falling for such fraud schemes. It is also possible that it could be used for criminal prosecutions in the future.
Abstract:
Researchers address the challenge of building an automated fraud detection system with robust classifiers that mitigate countermeasures from fraudsters in the field of information-based securities fraud. The work involves developing design principles for robust fraud detection systems and presenting corresponding design features.
They adopt an instrumentalist perspective that relies on theory-based linguistic features and ensemble learning concepts as justificatory knowledge for building robust classifiers. Researchers perform a naive evaluation that assesses the classifiers’ performance in identifying suspicious stock recommendations, and a robustness evaluation with a simulation that demonstrates a response to fraudster countermeasures.
The results indicate that the use of theory-based linguistic features and ensemble learning can significantly increase the robustness of classifiers and contribute to the effectiveness of robust fraud detection. Implications for supervisory authorities, industry, and individual users are discussed.
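The two kinds of evaluation mentioned in the abstract can be sketched in Python as follows. The idea, under assumptions of our own, is to measure the detection rate on unmodified fraudulent messages (the naive case) and then again after a simulated fraudster avoids more and more flagged words. The helper names, the keyword list, the sample messages, and the deliberately fragile keyword classifier are all hypothetical; the study's simulation design is not reproduced here.

```python
import re


def strip_words(message, banned):
    """Simulated countermeasure: delete every banned word from the message."""
    kept = [
        w for w in message.split()
        if re.sub(r"\W", "", w).lower() not in banned
    ]
    return " ".join(kept)


def detection_rate(classify, messages):
    """Share of known-fraudulent messages that the classifier flags."""
    return sum(1 for m in messages if classify(m)) / len(messages)


def robustness_curve(classify, fraud_messages, avoidable_words):
    """Detection rate after the fraudster avoids the first k words, for k = 0..n."""
    rates = []
    for k in range(len(avoidable_words) + 1):
        banned = set(avoidable_words[:k])
        adapted = [strip_words(m, banned) for m in fraud_messages]
        rates.append(detection_rate(classify, adapted))
    return rates


# Hypothetical keyword list and sample messages, invented for illustration.
KEYWORDS = ["guaranteed", "insider", "skyrocket"]
FRAUD = [
    "Guaranteed gains, buy today",
    "Insider tip: shares will skyrocket",
    "This stock will skyrocket soon",
]


def keyword_classifier(message):
    """A purely word-based detector, used here as a fragile baseline."""
    return any(k in message.lower() for k in KEYWORDS)


curve = robustness_curve(keyword_classifier, FRAUD, KEYWORDS)
```

The first entry of `curve` corresponds to the naive evaluation, and the remaining entries show how quickly a word-based detector degrades once the words it relies on are avoided. Any other classifier, for example one that also uses linguistic features, can be passed to `robustness_curve` in place of `keyword_classifier` for comparison.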