Every person should be able to use automated fact-checking as easily as we use search engines
Iryna Gurevych is a Professor in the Department of Computer Science and the founding Director of the Ubiquitous Knowledge Processing (UKP) Lab at Technische Universität Darmstadt in Germany. Prof. Gurevych has a broad range of research interests in natural language processing (NLP), with a focus on computational argumentation, computational lexical semantics, question answering, summarization, and discourse processing. She was one of the speakers at the conference “Fake News and other AI Challenges for the News Media in the 21st Century”, which took place in Vienna on 29 and 30 November 2018.
For this interview, Iryna Gurevych answered questions regarding fake news, artificial intelligence (AI), automated fact-checking and NLP.
Q: In your view, what does fake news actually mean?
IG: I would define fake news as a piece of misinformation purposely disseminated to the public. In my opinion, the intention to misinform is what distinguishes fake news from vague or simply unverifiable information, which inevitably exists.
Q: Do you believe that fake news has always been part of disinformation, or is it a new and more powerful tool that arose with the emergence of new media?
IG: Fake news has existed as a part of disinformation since ancient times. It is just the scale of this phenomenon that has changed radically, to a mass extent, since the emergence of new media and social media in particular.
Due to this unprecedented scale, fake news has emerged as a threat to democratic society: it provides an easy means of manipulating a huge number of people, provoking instability and making democratic values brittle.
Q: When compared to classical (manual) fact-checking, what are the promises and pitfalls of automated fact-checking?
IG: The promise of automated fact-checking is to become an effective means of preventing the dissemination of intentional misinformation. The main feature that distinguishes automated fact-checking from manual fact-checking is scale. Every person should be able to use automated fact-checking as easily as we use search engines to navigate through a huge amount of information every day. Similar to the information about the ingredients of food and beverages, the machine should screen and characterize any piece of information on the web regarding its contents before the end user consumes it. The main strength of the machine is the ability to tirelessly process a vast amount of content.
At the same time, the machine may not be so good at judging the content and aggregating multiple pieces of evidence that may, for example, contradict each other. This is a potential pitfall of automated fact-checking, since machines are not yet good enough at reasoning on top of the information they find. This calls for a human-in-the-loop approach in which the abilities of machines and humans are combined in a complementary way.
Q: In your opinion, should automated fact-checking be employed on a broader scale, and if so, how would the articles to be fact-checked be chosen?
IG: As said above, the broad scale is the main strength of automated fact-checking. Therefore, it is necessary and important to employ it at a large scale. In my opinion, any piece of information at the level of statements (and not necessarily whole articles) should be automatically screened. Potentially wrong information should be flagged, so that the user is alerted and can manually check it.
Q: Do you think that NLP techniques for automated fact-checking should be accessible to anyone?
IG: The user should not care so much about NLP techniques. Instead, they ought to be mainly equipped with tools for automated fact-checking. Those should be part of the end applications, such as search engines or text processors. Using automated fact-checking should be as easy and self-explanatory as using Google.
Q: Since AI can itself be used to create fake news, how can it be used to detect it?
IG: Many AI problems can be run in reverse. For example, we can use AI to detect plagiarism, and we can also use the same technology to generate plagiarism itself.
Regarding fake news, we should detect multiple pieces of evidence which are spread across a variety of sources, but are relevant to the statement we want to fact-check. Afterwards, we should reason upon the detected evidence, in order to come up with a judgment of the statement regarding its truthfulness.
Another way to stop fake news from spreading lies in the mechanics of its dissemination. We can learn the patterns of behavior in generating and spreading fake news in order to detect and flag such cases early, when they emerge.
Q: What are the biggest technical challenges in detecting fake news from an NLP perspective?
IG: The biggest challenges are the vague definition of truth and the incompleteness of the world and domain knowledge that machines, as well as humans, need in order to make informed judgments when evaluating information. Nobody knows the complete truth. So, in many cases there is just not enough information to come up with a sound judgment.
Besides, building training data for fake news detection and sophisticated models capable of making semantic inference remain significant technical challenges to be solved.
Q: Do you think that social media providers should be more active in promoting truthful news through automated fact-checking, which will examine the content of the news?
IG: Social media providers should be more active in detecting and flagging potential fake news. I do not think that any special measures are necessary to promote truthful content. Provided that fake news is filtered out, the wisdom of crowds in social media will solve the rest.
Q: What are the possible threats of this kind of decision-making regarding “approved” news through social media providers?
IG: One main threat is the biases of various kinds that could be introduced by this kind of censorship. Social media providers would be put in a position of power, where they could influence the behavior of their audience in a specific way. This might be undesirable under various circumstances.
Q: In your opinion, what could the future bring for the interaction of NLP techniques and automated fact-checking?
IG: Automated fact-checking is a very difficult problem due to its strong context dependence and vague nature. Therefore, it is a wonderful benchmark for creating new natural language processing techniques that go beyond the state of the art.
I would like to point out two exemplary research questions: the first concerns utilizing user feedback to improve NLP models, and the second concerns joint modeling of NLP tasks.
First, due to the potentially very large scale of automated fact-checking and its large number of users, valuable interaction data could be collected to provide an additional supervision signal from users for training NLP models. Thus, content-based models could be incrementally improved through user feedback.
Second, jointly training models for multiple subtasks of automated fact-checking, such as evidence identification and evidence evaluation, will lead to more accurate automated fact-checking, along with methodological advances in basic NLP research.