Fighting Disinformation at the Source: Tanbih News Aggregator

Utilizing AI and deep neural networks, a QCRI-built platform hopes to detect fake news even before it is written

Entity: Qatar Computing Research Institute

Nearly four decades ago, spam made its debut in the virtual world as one of the most pervasive nuisances known to online users. Today, societies and countries around the globe are confronted with a much more serious threat -- the rapid proliferation of fake news online. Though the motives are often the same, the impacts, QCRI experts explain, are many.

“It is first important to understand that the term ‘fake news’ is, in essence, an oxymoron,” says Preslav Nakov, Principal Scientist at QCRI. “We automatically associate news with the truth, so the definition -- although seemingly self-explanatory -- is quite misleading.”

It is generally understood that the spread of fake news has become the most sophisticated political weapon we know today. What is less known to us, however, is how it snowballs. “Before the advent of technology, we had tabloids, and even before that, some of the world’s most ancient civilizations were found to perpetuate fake claims as a propagandist tool through inscriptions and basic communication methods. What is making this even more possible now is the rise of technology, and the implications can be devastating -- mass mobilizations, personalized attacks, and grave health consequences are but a few to name,” said Nakov.

Media Bias Detection

QCRI’s Arabic Language Technologies (ALT) and Social Computing (SC) groups have been working with the MIT Computer Science and Artificial Intelligence Laboratory (MIT CSAIL) to develop Tanbih -- an online portal designed to analyze and uncover fake news at its very origin. The technology uses artificial neural networks to represent text as points in a multi-dimensional space. Through a series of computations on these representations, it identifies disinformative content and pinpoints its bias.
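
To give a rough sense of what representing text in a multi-dimensional space means, the sketch below maps a sentence to a fixed-length vector and compares two sentences by the angle between their vectors. It is only an illustration: a simple word-hashing scheme stands in for the neural networks, which the article does not describe, and the names `embed`, `cosine`, and `DIM` are hypothetical, not part of Tanbih.

```python
import math
import re
import zlib

DIM = 64  # illustrative number of dimensions (an assumption, not Tanbih's)


def embed(text, dim=DIM):
    """Map text to a unit-length vector by hashing each word to a coordinate."""
    vec = [0.0] * dim
    for token in re.findall(r"\w+", text.lower()):
        # crc32 is deterministic across runs, unlike Python's built-in hash()
        vec[zlib.crc32(token.encode("utf-8")) % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def cosine(a, b):
    """Cosine similarity of two unit-length vectors (their dot product)."""
    return sum(x * y for x, y in zip(a, b))


if __name__ == "__main__":
    first = embed("Fake news spreads faster than real news")
    second = embed("Real news spreads more slowly than fake news")
    print(round(cosine(first, second), 3))
```

A real system would learn its vectors from data, so that texts with similar meaning or slant end up close together; here closeness reflects only shared words.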

“The platform is unique on several fronts: it puts technology in an actual product that users can easily operate, but it was also built to be region-specific, so it pays considerable attention to what is needed in this part of the world and it offers ways to address those needs,” explains Nakov.

Importantly, Nakov says, the platform is cognizant of pre-identified and highly predictable propagandist techniques. “The platform is intelligent enough to be able to rationalize media biases; whether it is emotional manipulation or appeal to authority, the technology is capable of understanding the underlying reasons for the spread of fake news. This functionality is quite innovative and will drive global solutions in the future as it offers explainability and thus it can gradually train users to recognize the disinformation by themselves.”

Nakov frequently likens fake news to ‘spam on steroids’, that is, viral and weaponized, and recent studies support the comparison. “A large-scale study by MIT Media Lab has shown that fake news spreads six times faster than real news. Another study has suggested that 50 percent of fake news spreads within the first 10 minutes of introduction. Speed is of utmost importance when trying to identify misleading content online.”

Who Fact-checks the Fact-checkers?

As a news aggregator, Tanbih relies on a sophisticated series of data-driven analytics to create media profiles. It attempts to uncover important information on each outlet’s stance, centrality, hyper-partisanship, and leanings on widely debated topics such as climate change, capital punishment, or Qatar’s blockade.
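
One way to picture a media profile is as a roll-up of article-level judgments into a per-outlet summary. The sketch below averages hypothetical stance scores (from -1 for strongly against to +1 for strongly in favor) by outlet and topic. The function name `build_profiles`, the score scale, and the sample outlets are assumptions made for illustration; the article does not say how Tanbih computes its profiles.

```python
from collections import defaultdict
from statistics import mean


def build_profiles(articles):
    """Average stance per outlet and topic.

    `articles` is an iterable of (outlet, topic, stance) tuples, where
    stance is a hypothetical score between -1.0 and 1.0.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for outlet, topic, stance in articles:
        grouped[outlet][topic].append(stance)
    # Convert the nested score lists into plain dicts of averages
    return {
        outlet: {topic: mean(scores) for topic, scores in topics.items()}
        for outlet, topics in grouped.items()
    }


if __name__ == "__main__":
    sample = [
        ("Outlet A", "climate change", 1.0),
        ("Outlet A", "climate change", 0.5),
        ("Outlet B", "capital punishment", -1.0),
    ]
    print(build_profiles(sample))
```

An average is the simplest possible summary; it hides how much an outlet’s coverage varies, which a fuller profile would also need to capture.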

“The best algorithms show no slants, which sometimes leaves us at the intersection of ethical questions -- biases have and will always continue to be there, but the key is to always strive for balance and neutrality,” notes Nakov.

Central to this question, Nakov believes, is obtaining the best data possible. “Aggregators rely on a pool of data compiled from several sources -- social media interactions, journalist resources, and Alexa rankings are prime examples. The quality of the data in these sources is fundamental to overall outcome -- the poorer the data, the more deficient the outcome.”

Adaptability to Arabic

One key element of this technology is its capability to decipher and analyze Arabic media content. Presently at an advanced stage of implementation, the platform is quickly showing promise.

“The platform is supportive of Arabic content in that it understands nuances in media bias definitions. Categorizing media as left or right-leaning in the Arab world does not make much sense; what makes sense, however, is a liberal-conservative classification,” said Nakov.

One hurdle to overcome is the reliability of the data from which the platform infers media biases. “The growth of Arabic content online has been relatively slow and is further hindered by the lack of trustworthy analysis.”

Fake News: A Perpetual Problem?

Despite initially bleak prospects, QCRI experts believe that what began as an omnipresent threat will eventually fizzle out. “Fundamental to this is general public awareness,” says Nakov. “Propaganda becomes ineffective once people become more aware, and plenty is being done to spread knowledge on this issue.”

Supporting these efforts is the strict enforcement of legislation that prohibits slander, defamation, and hate speech worldwide. “These tactics can be very effective when fake news is being used to impersonate people or inflict personalized attacks.”

More critically, the use of social media as a force for public good can be effective on its own. “Facebook and Twitter have already waged war against fake news with the auto-removal of bots and fake accounts. Many social media platforms now integrate smart solutions that position fake news at the very bottom of our news feeds. Social media platforms are not very vocal about these efforts because they do not want to be seen as regulatory bodies,” explains Nakov.

The research group works alongside Al Jazeera, RTÉ (Ireland), the Associated Press, Tech Mahindra, V-Nova and Metaliquid as part of a project to explore ways to intelligently automate the identification of false or biased on-air content using AI. The project won the IBC2019 Catalyst Pioneer Award at this year’s IBC, the largest broadcast media conference, attended by more than 50,000 participants.

“The collaboration with Al Jazeera gives us a unique opportunity to tailor our research to the real needs of professionals, journalists and IT specialists, and therefore to make a real impact with our work,” says Dr. Giovanni da San Martino of QCRI.