The Impact of Feature Selection on the Performance of Machine Learning Algorithms in Intrusion Detection Using the NSL-KDD and UNSW-NB15 Datasets
Abstract
This paper analyses the impact of feature selection on the performance of machine learning-based Intrusion Detection Systems (IDS) using two benchmark datasets: NSL-KDD and UNSW-NB15. Three widely used machine learning algorithms were evaluated: Decision Tree (DT), Random Forest (RF), and K-Nearest Neighbors (KNN). Performance was measured with Accuracy, Precision, Recall, and F1-score, with Training Time and Inference Time as efficiency indicators. The study adopted a feature selection methodology based on the Chi-Square test to eliminate redundant features and reduce data dimensionality.
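To illustrate how Chi-Square-based feature ranking works in principle, the following is a minimal sketch in plain Python, not the authors' implementation. It scores each non-negative feature by comparing its per-class sums with the sums expected if the feature were independent of the class label, then keeps the top k features. The function names `chi2_scores` and `select_top_k`, the toy data, and the choice of k are all assumptions made for illustration; the abstract does not state how many features were retained. In practice, scikit-learn's `SelectKBest` combined with `chi2` provides comparable functionality.

```python
from collections import defaultdict


def chi2_scores(X, y):
    """Return one chi-square score per feature (features must be non-negative).

    Observed: sum of feature j over the samples of class c.
    Expected: (fraction of samples in class c) * (total of feature j).
    """
    n_samples = len(X)
    n_features = len(X[0])
    class_counts = defaultdict(int)
    observed = defaultdict(lambda: [0.0] * n_features)
    feature_totals = [0.0] * n_features

    for row, label in zip(X, y):
        class_counts[label] += 1
        for j, value in enumerate(row):
            if value < 0:
                raise ValueError("chi-square scoring needs non-negative features")
            observed[label][j] += value
            feature_totals[j] += value

    scores = []
    for j in range(n_features):
        score = 0.0
        for label, count in class_counts.items():
            expected = count / n_samples * feature_totals[j]
            if expected > 0:
                score += (observed[label][j] - expected) ** 2 / expected
        scores.append(score)
    return scores


def select_top_k(X, y, k):
    """Keep the k highest-scoring features; return their indices and the reduced data."""
    scores = chi2_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    keep = sorted(ranked[:k])
    reduced = [[row[j] for j in keep] for row in X]
    return keep, reduced


# Hypothetical toy data: feature 0 separates the classes, features 1 and 2 do not.
X_toy = [
    [9, 1, 3],
    [8, 0, 3],
    [7, 1, 3],
    [1, 1, 3],
    [0, 0, 3],
    [2, 1, 3],
]
y_toy = ["attack", "attack", "attack", "normal", "normal", "normal"]

kept, X_reduced = select_top_k(X_toy, y_toy, k=1)
print("kept feature indices:", kept)
```

A feature whose total is spread across classes in proportion to class size scores zero, so it is the first to be dropped. Real NSL-KDD and UNSW-NB15 records would also need categorical encoding and non-negative scaling before this kind of scoring applies.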
Experimental results showed that feature selection maintained high model stability and accuracy while improving computational efficiency. Random Forest achieved an accuracy of 99.78% on the NSL-KDD dataset and 97.63% on the UNSW-NB15 dataset. For KNN, feature reduction cut inference time substantially, from 34.77 seconds to 7.84 seconds. For Random Forest on the UNSW-NB15 dataset, training time decreased from 24 seconds to 5.6 seconds, which improves the model's suitability for real-time intrusion detection systems. The results indicate that feature selection preserves high classification accuracy while reducing computational cost, making IDS more efficient and scalable for large-scale network environments.
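The timing figures reported above can be expressed as relative reductions and speed-up factors. The short sketch below does that arithmetic; the helper name `relative_reduction` is hypothetical, and the only inputs are the four timings quoted in the abstract.

```python
def relative_reduction(before, after):
    """Return (percentage reduction, speed-up factor) for a pair of timings in seconds."""
    if before <= 0 or after <= 0:
        raise ValueError("timings must be positive")
    return (before - after) / before * 100.0, before / after


# Timings quoted in the abstract (seconds).
knn_reduction, knn_speedup = relative_reduction(34.77, 7.84)  # KNN inference time
rf_reduction, rf_speedup = relative_reduction(24.0, 5.6)      # RF training time, UNSW-NB15

print(f"KNN inference: {knn_reduction:.1f}% less time, {knn_speedup:.2f}x speed-up")
print(f"RF training:   {rf_reduction:.1f}% less time, {rf_speedup:.2f}x speed-up")
```

Both cases come out at a reduction of roughly 77%, or a speed-up of a little over four times.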
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.