Article contents
Advanced Cybercrime Detection: A Comprehensive Study on Supervised and Unsupervised Machine Learning Approaches Using Real-world Datasets
Abstract
In the ever-evolving field of cybersecurity, sophisticated methods—which combine supervised and unsupervised approaches—are used to tackle cybercrime. Strong supervised tools include Support Vector Machines (SVM) and K-Nearest Neighbors (KNN), while well-known unsupervised methods include the K-means clustering model. These techniques are used on the publicly available StatLine dataset from CBS, which is a large dataset that includes the individual attributes of one thousand crime victims. Performance analysis shows the remarkable 91% accuracy of SVM in supervised classification by examining the differences between training and testing data. K-Nearest Neighbors (KNN) models are quite good in the unsupervised arena; their accuracy in detecting criminal activity is impressive, at 79.56%. Strong assessment metrics, such as False Positive (FP), True Negative (TN), False Negative (FN), False Positive (TP), and False Alarm Rate (FAR), Detection Rate (DR), Accuracy (ACC), Recall, Precision, Specificity, Sensitivity, and Fowlkes–Mallow's scores, provide a comprehensive assessment.