Researchers from University of Melbourne using machine learning to find fake news on Twitter
(Content in this article is based on a press release from University of Melbourne and an article by Professor Stephan Winer and Marie Truelove published on Pursuit, an open-access website which is published by the University. Any reproduced content is under the Creative Commons Attribution-No Derivatives 3.0 Australia (CC BY-ND 3.0 AU). The paper can be accessed here.)
Researchers at the University of Melbourne are using machine learning to distinguish false information from the truth on the social media platform Twitter.
Increasing numbers of people rely on their social media feeds for news. But algorithms on social media platforms prioritise engagement over accuracy, and unscrupulous content creators can easily create and post misleading or even outright false information, motivated by financial, political or other reasons.
Professor Stephan Winer and Marie Truelove from the Melbourne School of Engineering have developed a framework to assess whether a tweet is a witness account from a first-hand experience or not, relying on the principle that witness accounts are more trustworthy than hearsay.
The framework analyses details of a tweet to determine whether it is a witness account. It starts by checking the georeferenced or location information in the metadata of tweets, but only a small fraction of users turn on that option.
To identify more sources of evidence, the researchers turned to the content of the tweet itself, that is, the text and the pictures.
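As an illustration of the location check described above, here is a minimal Python sketch. It assumes a tweet's geotag is available as a latitude/longitude pair and that the event can be approximated by a centre point and a radius. The function names, the radius and the data layout are hypothetical and do not come from the researchers' framework.

```python
import math
from typing import Optional, Tuple

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def geotag_near_event(
    geotag: Optional[Tuple[float, float]],
    event_centre: Tuple[float, float],
    radius_km: float = 5.0,  # assumed threshold, chosen only for illustration
) -> bool:
    """Return True only if the tweet has a geotag inside the event area.

    Most tweets carry no geotag, so a missing value yields False
    (no evidence), not counter-evidence.
    """
    if geotag is None:
        return False
    return haversine_km(*geotag, *event_centre) <= radius_km
```

Because a missing geotag only means there is no location evidence, a check like this would have to be combined with other evidence sources, which is what the researchers turn to next.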
The text of the tweet could contain statements like observations of the event (smoke in the sky for a bushfire is the example given in the article written by the researchers). This information combined with images (of the smoke rising above the house or a live shot from a football match) and location information (geotags from the relevant town or suburb) can provide evidence for the Twitter user being a credible witness.
The researchers also look for counter-evidence which might show that a tweeter is not a witness on the ground. For example, if they describe themselves as being in some other place or post an image of the event on a TV screen, the case for their being an eyewitness is weakened by the contradictory evidence.
This evidence can be extracted automatically using machine learning, and it is used to assign a tweet a credibility measure, from low to high.
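To make the idea of combining evidence and counter-evidence concrete, here is a small Python sketch. The scoring rule, the penalty for counter-evidence and the intermediate "medium" level are assumptions made for illustration; the article only says that the measure runs from low to high and does not describe how the researchers compute it.

```python
from dataclasses import dataclass


@dataclass
class TweetEvidence:
    """Hypothetical evidence flags extracted from one tweet."""
    geotag_near_event: bool = False  # location metadata matches the event area
    text_observation: bool = False   # text describes something observed first-hand
    image_from_scene: bool = False   # attached image appears to be taken at the event
    counter_evidence: bool = False   # e.g. tweeter says they are elsewhere, or photo of a TV screen


def credibility(ev: TweetEvidence) -> str:
    """Map evidence flags to a coarse credibility label (assumed rule)."""
    supporting = sum([ev.geotag_near_event, ev.text_observation, ev.image_from_scene])
    if ev.counter_evidence:
        supporting -= 2  # assumed penalty: contradictory evidence weakens the case
    if supporting >= 2:
        return "high"
    if supporting == 1:
        return "medium"
    return "low"


# Example: first-hand text plus a photo from the scene, no counter-evidence.
label = credibility(TweetEvidence(text_observation=True, image_from_scene=True))
```

In this sketch a tweet with all three kinds of supporting evidence is still downgraded if counter-evidence is present, mirroring the point that contradictory evidence weakens the eyewitness case.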
But Ms. Truelove said there were still challenges to be addressed. For instance, the Twitter user could have learnt about the event by watching it on TV. Attached pictures may be unattributed copies from other sources, or feature historic events at the same place.
Tweeters can post their excited anticipation of attending an event later in the day but not go, or alternatively delay posting their witness accounts until they are on the way home after the event has ended.
The posting behaviour of witnesses can also vary, depending on the type of event. For example, tweets reporting that an event has not occurred will only appear if the event had been predicted in the first place, such as when predicted flooding and power outages associated with a cyclone do not happen.
The researchers are overcoming these challenges primarily by investigating different evidence sources within tweets. A series of processes is applied to remove tweets that cannot support inferences that the tweeter is at the event, for example, retweets. Then supervised machine learning techniques are used to apply classification models to the remaining tweets, extracting evidence that supports inferences that the tweeter is at the event.
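The two stages described above, filtering and then supervised classification, can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: the "RT @" retweet heuristic, the tiny naive Bayes classifier and the four invented training tweets are all stand-ins, and the article does not say which models or features the researchers actually use.

```python
import math
from collections import Counter, defaultdict


def is_retweet(text: str) -> bool:
    # Assumed heuristic: classic retweets begin with "RT @".
    return text.strip().upper().startswith("RT @")


def tokenize(text: str) -> list:
    return text.lower().split()


class NaiveBayes:
    """Minimal multinomial naive Bayes with add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text: str) -> str:
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            score = math.log(n / total)  # class prior
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for tok in tokenize(text):
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented training examples, for illustration only.
train_texts = [
    "I can see smoke rising behind my house",
    "I can hear sirens on my street right now",
    "News reports say the fire is spreading",
    "Watching the fire coverage on TV",
]
train_labels = ["witness", "witness", "other", "other"]
model = NaiveBayes().fit(train_texts, train_labels)

incoming = ["RT @someone: smoke everywhere", "I can see smoke from my street"]
candidates = [t for t in incoming if not is_retweet(t)]  # stage 1: filter
predictions = {t: model.predict(t) for t in candidates}  # stage 2: classify
```

A real system would need far more labelled data and richer features, including images, but the structure is the same: discard tweets that cannot be witness accounts, then let a trained model look for evidence in the rest.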
The framework is in the early phases of development, but it could be useful to journalists and news organisations around the world for checking the veracity of social media sources.