The University of Sheffield at CheckThat! 2020: Claim Identification and Verification on Twitter

Thomas M. McDonald, ZiQing Dong, Yingji Zhang, Rebekah Hampson, James Young, Qianyu Cao, Jochen L. Leidner, Mark Stevenson

September 2020

Abstract

The spread of misinformation online has been gathering pace in recent years, which has led to research into automatic methods for claim verification. The COVID-19 pandemic presents a unique challenge due to the large amount of inaccurate information being shared on social media platforms. This paper describes the University of Sheffield’s entry to the CLEF 2020 CheckThat! Lab, which focuses on determining the check-worthiness of claims found in tweets and on verifying those claims, including claims related to COVID-19. For the Tweet Check-Worthiness Task (Task 1), we found that a Random Forest model using TF-IDF term weightings outperformed more complex approaches employing Word2Vec embeddings and recurrent neural networks. For the Claim Retrieval Task (Task 2), we found that BM25 similarity score weightings based on TF-IDF term weightings, combined with a Support Vector Machine classifier as the scoring model, outperformed other methods that used cosine and Euclidean similarity metrics and regression-based scoring models.

Type

Conference paper

Publication

In Conference and Labs of the Evaluation Forum 2020

Thomas M. McDonald

Senior Machine Learning Technologist

I am a Senior Machine Learning Technologist at Ofcom, working in the Online Safety Group on research relating to recommender systems and responsible AI.