Revue de l'Information Scientifique et Technique
Volume 27, Numéro 2, Pages 13-17
2023-11-08
Authors: Said Bachir, Barmati Mohammed Elsadiq
Natural Language Processing has recently become one of the most active research areas in Artificial Intelligence, especially for social media-related tasks. This paper describes our participation in the "Hate Speech Detection on Arabic Twitter" task of the CERIST NLP-Challenge 2022 competition. The proposed solution classifies the tweets collected in the Arabic ARACOVID19-MFH multi-label, multi-dialect dataset into "Hateful" and "Not Hateful" categories. Our solution is based on GigaBERT-v4, a pre-trained transformer model, and outperformed the most common transformer models that support Arabic. In our experiments on this dataset, GigaBERT-v4 was more effective than the other models, obtaining 99.46% accuracy and a 98.68% macro F1-score.
Keywords: Arabic Twitter; hate speech detection; multilingual transformers; GigaBERT-v4; XLM-T; AraBERT; mBERT; COVID-19
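The abstract reports both accuracy and macro F1-score for the binary "Hateful" / "Not Hateful" task. As a minimal sketch of how these two metrics are commonly computed, the following Python example uses a small invented set of labels (toy data, not taken from ARACOVID19-MFH) and hypothetical helper names; it is not the authors' evaluation code.

```python
# Toy illustration of accuracy vs. macro F1 for a two-class task.
# The label names mirror the abstract; the data and function names are
# assumptions made for this sketch only.

LABELS = ("Hateful", "Not Hateful")


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_for_label(y_true, y_pred, label):
    """F1-score of one class, treating `label` as the positive class."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of the per-class F1-scores."""
    return sum(f1_for_label(y_true, y_pred, lab) for lab in labels) / len(labels)


# Imbalanced toy sample: 8 "Not Hateful" tweets, 2 "Hateful" tweets,
# with one "Hateful" tweet misclassified as "Not Hateful".
y_true = ["Not Hateful"] * 8 + ["Hateful"] * 2
y_pred = ["Not Hateful"] * 9 + ["Hateful"] * 1

acc = accuracy(y_true, y_pred)
mf1 = macro_f1(y_true, y_pred)
print(f"accuracy = {acc:.4f}, macro F1 = {mf1:.4f}")
```

On this toy sample the accuracy is 0.9 while the macro F1-score is about 0.80, because macro averaging gives the minority "Hateful" class the same weight as the majority class. This is why the two figures are typically reported together when classes are imbalanced.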