Revue de l'Information Scientifique et Technique

Information Processing at the Digital Age


RIST is a peer-reviewed journal that publishes research papers on any topic related to information processing in the digital age, including but not limited to: digital libraries, digital humanities and heritage, artificial intelligence, natural language processing, semantic web technologies, linked data, databases, big data, machine learning, deep learning, computer vision, and other issues related to information access, ethics, and privacy. The journal welcomes papers in English, French, and Arabic from native Arabic speakers.


The journal "المعلومات العلمية و التقنية" (Scientific and Technical Information) is a peer-reviewed scientific journal with an international review committee. It aims to publish research papers on any topic related to information processing in the digital age, including but not limited to: digital libraries, digital humanities and heritage, artificial intelligence, natural language processing, semantic web technologies, linked data, databases, big data, machine learning, deep learning, computer vision, and other issues related to information access, such as ethics and privacy. The journal welcomes research papers in English, French, and Arabic.







Arabic sentiment analysis within Covid-19.

عرباوي سليمان, بلفدهل علاء الدين

Abstract: In this paper we present a brief study that analyses Arabic tweets posted during the Covid-19 period and classifies them as positive, negative, or neutral. We worked on a dataset of 4800 tuples, to which we applied three different approaches: Naive Bayes, a neural network, and stochastic gradient descent (SGD). The last of these gave the best result, with an accuracy of 91%.

Keywords: Covid-19 ; Arabic sentiment ; Classification ; Text analysis ; Naive Bayes ; Stochastic gradient descent (SGD) ; Neural network
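
To make the three-class setup concrete, here is a minimal sketch of one of the approaches named in the abstract, Naive Bayes, written in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the multinomial variant, the add-one smoothing, the function names `train_nb` and `predict_nb`, and the toy English tokens are all hypothetical and do not come from the paper or its dataset.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs, labels):
    """Collect the counts needed by a multinomial Naive Bayes classifier."""
    class_counts = Counter(labels)        # number of documents per class
    word_counts = defaultdict(Counter)    # token counts per class
    vocab = set()
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab

def predict_nb(model, tokens):
    """Return the class with the highest log-posterior (add-one smoothing)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokens:
            if tok in vocab:  # tokens never seen in training are ignored
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy, made-up examples (NOT taken from the paper's 4800-tuple dataset).
train_docs = [
    ["good", "news"], ["bad", "news"], ["cases", "reported"],
    ["good", "recovery"], ["bad", "outbreak"], ["cases", "today"],
]
train_labels = ["positive", "negative", "neutral",
                "positive", "negative", "neutral"]

model = train_nb(train_docs, train_labels)
print(predict_nb(model, ["good", "recovery", "today"]))
```

A real pipeline would also need Arabic-specific tokenisation and normalisation, which the abstract does not describe and which this sketch leaves out.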

Hate speech detection model based on BERT for the Arabic dialects

شيكر نورالهدى

Abstract: Hate speech spread through social media can cause personal harm and suffering as well as social tension. Social media platforms, however, cannot moderate all of the content that users post, so there is a demand for automatic hate speech detection. This demand is greater when posts are written in complex languages such as Arabic. This study contributes to hate speech and offensive language detection for Arabic dialects. We propose an approach based on deep learning and a pre-trained BERT model, built by adding GRU and LSTM layers to the BERT outputs. To deal with class imbalance in the dataset, we propose two methods: the first is data augmentation, oversampling the minority class using translation and back-translation; the second uses focal loss for training. The best results reached with focal loss training are 88.51% accuracy and 97.46% F1-score; with data augmentation, they are 89.61% accuracy and 97.78% F1-score.

Keywords: hate speech ; offensive language ; detection ; Arabic dialects ; deep learning ; BERT model ; class imbalance ; oversampling ; focal loss
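
As a rough illustration of why focal loss can help with class imbalance, the sketch below compares it with ordinary cross-entropy for a single example. It assumes the commonly used form FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t), where p_t is the probability the model assigns to the true class. The values gamma = 2 and alpha = 1 are assumptions chosen for illustration; the abstract does not report the settings the author used.

```python
import math

def cross_entropy(p_true):
    """Standard cross-entropy for one example, given the true-class probability."""
    return -math.log(p_true)

def focal_loss(p_true, gamma=2.0, alpha=1.0):
    """Focal loss for one example.

    gamma and alpha are illustrative defaults (assumptions), not values
    taken from the paper. With gamma = 0 and alpha = 1 this reduces to
    cross-entropy.
    """
    return -alpha * (1.0 - p_true) ** gamma * math.log(p_true)

# An easy example (true class predicted with p = 0.9) is down-weighted
# heavily, while a hard example (p = 0.1) keeps most of its loss.
for p in (0.9, 0.1):
    print(p, cross_entropy(p), focal_loss(p))
```

With these assumed settings, the easy example's loss is scaled by (1 - 0.9)^2 = 0.01 and the hard example's by (1 - 0.1)^2 = 0.81, so training focuses on examples the model still gets wrong, which are often those from the minority class.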

Transformers and Ensemble methods: A solution for Hate Speech Detection in Arabic language

Magnossão De Paula Angel Felipe, Bensalem Imene, Rosso Paolo, Zaghouani Wajdi

Abstract: This paper describes our participation in the shared task of hate speech detection, which is one of the subtasks of the CERIST NLP Challenge 2022. Our experiments evaluate the performance of six transformer models and their combination using two ensemble approaches. The best results on the training set, in a five-fold cross-validation scenario, were obtained with the ensemble approach based on majority voting. Evaluating this approach on the test set resulted in an F1-score of 0.60 and an accuracy of 0.86.

Keywords: Hate speech detection ; Transformers ; Ensemble Methods ; Arabic
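
The majority-vote ensemble mentioned in the abstract can be sketched as follows. This is a generic illustration, not the authors' code: the function name `majority_vote`, the three hypothetical models, and their predictions are made up, and the tie-breaking rule (the label seen first wins) is an assumption, since the abstract does not say how ties among six models were resolved.

```python
from collections import Counter

def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label per example.

    predictions_per_model[i][j] is the label that model i assigns to
    example j. Ties go to the label encountered first among the votes
    (an assumption made for this sketch).
    """
    combined = []
    for votes in zip(*predictions_per_model):
        # most_common(1) returns the label with the highest vote count
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined

# Hypothetical predictions from three models on four examples
# (1 = hate speech, 0 = not hate speech).
model_predictions = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 1, 1, 0],
]
print(majority_vote(model_predictions))  # prints [1, 1, 1, 0]
```

With an even number of models and binary labels, ties are possible, so a real system would need an explicit tie-breaking policy, for example deferring to the strongest single model.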