A Novel Approach to Arabic Fake News Detection: Leveraging Ensemble Deep Learning and Supervised FastText Embeddings
The proliferation of fake news on social media platforms poses a significant threat to society, particularly in the Arabic-speaking world. This article details a comprehensive research effort aimed at developing and evaluating advanced fake news detection models specifically designed for Arabic text. The research explores a range of machine learning (ML) and deep learning (DL) models, including traditional ML algorithms, boosting methods, transformer-based models, and novel ensemble DL architectures. The key innovation lies in the combination of ensemble deep learning, specifically a hybrid Bi-directional Long Short-Term Memory (Bi-LSTM) and Bi-directional Gated Recurrent Unit (Bi-GRU) model, with supervised FastText word embeddings.
The models were evaluated on two Arabic datasets: the AFND Dataset and the ARABICFAKETWEETS Dataset. Performance was measured using four standard metrics: accuracy, precision, recall, and F1-score. The experiments ran on a Core i7 machine with 16 GB of RAM running Windows 10, using Python with the TensorFlow and Keras libraries. Two variants of FastText embeddings were employed: unsupervised and supervised. The supervised FastText embeddings yielded consistently better performance across all tested models.
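The four metrics named above can all be derived from the confusion counts of a binary classifier. The following is a minimal sketch, assuming labels are encoded as 1 for fake and 0 for real (an assumed encoding; the article does not state one). The names `BinaryMetrics` and `binary_metrics` are illustrative and do not come from the study's code.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for binary labels.

    `positive` is the label treated as the positive class
    (assumed here to mean "fake"; this encoding is an assumption).
    """
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1  # fake predicted as fake
        elif pred == positive:
            fp += 1  # real predicted as fake
        elif truth == positive:
            fn += 1  # fake predicted as real
        else:
            tn += 1  # real predicted as real

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return BinaryMetrics(accuracy, precision, recall, f1)


if __name__ == "__main__":
    # Toy labels for illustration only; not data from the study.
    truth = [1, 1, 1, 0, 0, 0, 1, 0]
    preds = [1, 1, 0, 0, 0, 1, 1, 0]
    print(binary_metrics(truth, preds))
```

Reporting F1 alongside accuracy matters when the two classes are imbalanced, since accuracy alone can look high for a model that rarely predicts the minority class.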
The proposed Bi-LSTM + Bi-GRU ensemble model achieved the best results. On the AFND dataset, it reached an accuracy and F1-score of 0.98 with supervised FastText embeddings. On the ARABICFAKETWEETS dataset, it reached an accuracy and F1-score of 0.99. These scores exceeded those of the other models, including traditional ML algorithms like Naïve Bayes and Logistic Regression, boosting methods such as Gradient Boosting and XGBoost, and transformer-based models like XLNet. The Bi-LSTM + Bi-GRU model also surpassed other hybrid DL models, such as RNN-CNN and CNN-LSTM, indicating that it captures the nuances of Arabic text relevant to fake news detection more effectively.
An error analysis provided insights into the strengths and weaknesses of the different models. While simpler models misclassified a large number of instances, the proposed Bi-LSTM + Bi-GRU model reduced these errors substantially. For example, on the AFND dataset with unsupervised FastText, the Bi-LSTM + Bi-GRU model had only 1,271 misclassified instances compared to 5,401 for the RNN-CNN model, a reduction of roughly 76%. This analysis highlighted the robustness of the ensemble model, particularly when combined with supervised FastText embeddings. The analysis also indicated that the remaining misclassifications often stem from inherent ambiguity in some tweets, from informal language, slang, and sarcasm, and from possible labeling errors in the datasets.
To test whether the observed differences were statistically significant, a paired t-test was conducted. The analysis focused on the results obtained with supervised FastText embeddings, which gave the best performance. The t-test showed that the differences in accuracy between the Bi-LSTM + Bi-GRU model and the other hybrid DL models were statistically significant (p-value < 0.0001) for both datasets. This statistical validation supports the conclusion that the proposed model outperforms the other hybrid DL models.
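The mechanics of a paired t-test can be sketched in a few lines. The article does not say which observations were paired, so this example assumes matched accuracy scores from the same cross-validation folds or repeated runs. The function name `paired_t_statistic` is hypothetical, and the scores in the usage section are placeholder values, not results from the study.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees_of_freedom) for a paired t-test.

    scores_a and scores_b must hold matched observations, e.g. the
    accuracy of two models on the same folds (an assumed pairing).
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two paired observations")
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("differences have zero variance; t is undefined")
    # t = mean difference divided by its standard error
    t = mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1


if __name__ == "__main__":
    # Placeholder per-fold accuracies for illustration only.
    proposed = [0.98, 0.97, 0.99, 0.98, 0.98]
    baseline = [0.95, 0.95, 0.96, 0.94, 0.95]
    t, df = paired_t_statistic(proposed, baseline)
    print(f"t = {t:.3f}, df = {df}")
```

The p-value is then read from the Student t distribution with `df` degrees of freedom. In practice a library routine such as `scipy.stats.ttest_rel` returns the statistic and the p-value together.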
The research also benchmarked the proposed model against existing state-of-the-art Arabic fake news detection models. On the AFND dataset, the Bi-LSTM + Bi-GRU model outperformed both the CAPSNET and CNN-LSTM models. On the ARABICFAKETWEETS dataset, it achieved better accuracy and F1-score than the ARBERT model, although ARBERT held a slight edge in precision. These comparisons position the Bi-LSTM + Bi-GRU model as a leading approach for Arabic fake news detection.
Finally, the computational complexity of the Bi-LSTM + Bi-GRU model was analyzed. While its time and space complexity are comparable to other hybrid DL models, the non-linear relationship between computational time and the hidden state size emphasizes the importance of careful parameter tuning. Despite this, the model’s superior performance justifies its computational requirements, particularly considering the critical nature of fake news detection.
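One plausible source of the non-linear dependence on hidden state size is the recurrent weight matrix, whose size grows with the square of the number of hidden units. The sketch below counts trainable parameters for the textbook LSTM and GRU formulations. The embedding dimension of 300 and the hidden sizes are assumed values, not settings reported in the article, and some frameworks add extra bias terms, so their counts can differ slightly.

```python
def lstm_params(input_dim, hidden):
    """Parameters of one LSTM layer: 4 gates, each with input weights,
    recurrent weights, and a bias vector."""
    return 4 * (hidden * (input_dim + hidden) + hidden)


def gru_params(input_dim, hidden):
    """Parameters of one GRU layer (textbook formulation): 3 gates,
    each with input weights, recurrent weights, and a bias vector."""
    return 3 * (hidden * (input_dim + hidden) + hidden)


def bidirectional(params_one_direction):
    """A bidirectional wrapper runs two independent copies of the layer."""
    return 2 * params_one_direction


if __name__ == "__main__":
    embedding_dim = 300  # assumed embedding size, not from the article
    for hidden in (64, 128, 256):  # assumed hidden sizes
        bi_lstm = bidirectional(lstm_params(embedding_dim, hidden))
        bi_gru = bidirectional(gru_params(embedding_dim, hidden))
        print(f"hidden={hidden}: Bi-LSTM={bi_lstm}, Bi-GRU={bi_gru}")
```

Because of the `hidden * hidden` term, doubling the hidden size more than doubles the parameter count and the per-timestep multiply-accumulate work, which is consistent with the article's point about careful parameter tuning.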
In conclusion, this research demonstrates the efficacy of combining ensemble deep learning with supervised FastText embeddings for Arabic fake news detection. The proposed Bi-LSTM + Bi-GRU model achieved state-of-the-art results on two benchmark datasets, surpassing existing models in accuracy, F1-score, and robustness. The findings suggest a promising direction for future research in combating the spread of misinformation in Arabic online communities. The research emphasizes the importance of exploring advanced language models and embedding techniques tailored to the specific challenges posed by the Arabic language.