Abstract:
In today’s digital age, understanding user interests on social media platforms like
X is crucial for enhancing user engagement and personalizing content. This study
addresses the challenge of classifying X users based on their interests using various
machine learning and deep learning algorithms. A comprehensive dataset
encompassing diverse interest categories like Politics, Sport, and Health was
compiled and preprocessed to ensure data quality and consistency. The study
implemented multiple models, including traditional machine learning algorithms
(Random Forest, Logistic Regression, Naive Bayes) and deep learning
architectures (Convolutional Neural Network, RNN combined with CNN, and
Bidirectional LSTM).
The results demonstrate that while traditional models like Random Forest
achieved high accuracy (94.13%) at low computational cost, deep learning
models, particularly the Bidirectional LSTM, excelled in capturing complex
patterns and contextual information within the data. The Bidirectional LSTM
achieved the highest accuracy among the deep learning models (92.54%), albeit with higher computational costs
and longer training times. Precision, recall, and F1-score metrics consistently
highlighted the strengths of each model, with Random Forest and deep learning
models showing robust performance across various evaluation criteria.
The study also addressed significant challenges such as data imbalance and
overfitting through techniques like data augmentation, regularization, and
hyperparameter tuning.
Execution time analysis revealed that traditional models, especially Naive Bayes,
are suitable for real-time applications due to their speed, while deep learning models
benefit from GPU acceleration to handle larger datasets efficiently.
Overall, this comparative analysis underscores the importance of selecting
appropriate models based on specific task requirements. The findings suggest that
a hybrid approach, leveraging the speed of traditional machine learning models
and the advanced pattern recognition capabilities of deep learning models, offers
an effective solution for user interest classification on X. This research lays a
foundation for developing sophisticated social media analytics tools, contributing
to a deeper understanding of user behavior in the digital age.