By Sergey Soldatov
Head of Security Operation Center, Kaspersky
Striving to minimize the risk of missing cyberattacks, we are forced to deal with a huge number of false positives in our detection logic. According to the MDR analytical report for 2023, the Kaspersky SOC team processed 431,512 security alerts, of which only 32,294 were linked to the 14,160 incidents reported to customers. In these situations, there is considerable room for automation, including the use of machine learning (ML), deep learning, and artificial intelligence (AI).
Notably, the AI-based Autoanalyst used in MDR processed about 30% of false positives on average in 2023, which reduced the load on the SOC team by approximately 25%.
How does AI/ML help to detect incidents?
The most common application of machine learning in cybersecurity is attack detection, where both supervised and unsupervised ML can be employed. In supervised machine learning, the model is trained on data related to attackers’ activity and aims to identify similar malicious behavior. In contrast, unsupervised machine learning involves profiling the legitimate behavior of systems and services to detect anomalies, deviations, and outliers. Despite their effectiveness, both approaches are prone to errors, meaning that false positives are still a challenge in automated detection systems.
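To make the unsupervised approach concrete, here is a minimal sketch of profiling legitimate behavior and flagging deviations. All names and the example data are hypothetical, and the statistical profile (mean plus standard deviation of an activity metric) is just one simple way to model "normal"; production systems use far richer models.

```python
from statistics import mean, stdev

def build_profile(baseline_counts):
    """Profile legitimate behavior as the mean and standard deviation of an
    activity metric (e.g., hourly logon counts during normal operation)."""
    return mean(baseline_counts), stdev(baseline_counts)

def is_anomaly(observed, profile, threshold=3.0):
    """Flag an observation more than `threshold` standard deviations from
    the baseline mean as an outlier."""
    mu, sigma = profile
    if sigma == 0:
        return observed != mu
    return abs(observed - mu) / sigma > threshold

# Hypothetical hourly logon counts from a service account during a quiet week
baseline = [12, 15, 11, 14, 13, 12, 16, 14, 13, 15]
profile = build_profile(baseline)

print(is_anomaly(14, profile))   # typical activity -> False
print(is_anomaly(90, profile))   # sudden burst of logons -> True
```

The burst of 90 logons is flagged, but note that an unusual-yet-legitimate spike (say, a patch rollout) would be flagged too, which is exactly how such detectors generate false positives.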
Well aware of what modern attacks look like, and realizing that detecting them often produces a large volume of alerts, SOCs are exploring how ML can reduce the workload on analysts: improving triage efficiency, filtering false positives out of the resulting stream of alerts, and shifting automation from merely detecting attacks to filtering out legitimate activities that are not threats.
The solution to the problem of filtering false positives is the AI-based Autoanalyst. This supervised machine learning model learns from alerts processed by the SOC team, and then attempts to replicate their behavior independently.
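The article does not disclose the Autoanalyst's internals, so the following is only an illustrative sketch of the general idea: a supervised classifier trained on analyst verdicts that then reproduces them on new alerts. The feature names, verdicts, and the naive Bayes model are all assumptions for illustration.

```python
import math
from collections import defaultdict

# Hypothetical training data: alert features with the verdict a SOC analyst
# assigned ("fp" = false positive, "tp" = true positive).
history = [
    (("rule:psexec", "host:admin-ws", "user:it-admin"), "fp"),
    (("rule:psexec", "host:admin-ws", "user:it-admin"), "fp"),
    (("rule:psexec", "host:hr-laptop", "user:intern"), "tp"),
    (("rule:mimikatz", "host:dc01", "user:unknown"), "tp"),
]

def train(labeled_alerts):
    """Naive Bayes with add-one smoothing over categorical alert features."""
    class_counts = defaultdict(int)
    feature_counts = defaultdict(lambda: defaultdict(int))
    vocab = set()
    for features, verdict in labeled_alerts:
        class_counts[verdict] += 1
        for f in features:
            feature_counts[verdict][f] += 1
            vocab.add(f)
    return class_counts, feature_counts, vocab

def classify(features, model):
    """Return the verdict with the highest log-probability for this alert."""
    class_counts, feature_counts, vocab = model
    total = sum(class_counts.values())
    best, best_score = None, -math.inf
    for verdict, count in class_counts.items():
        score = math.log(count / total)
        denom = sum(feature_counts[verdict].values()) + len(vocab)
        for f in features:
            score += math.log((feature_counts[verdict][f] + 1) / denom)
        if score > best_score:
            best, best_score = verdict, score
    return best

model = train(history)
# A recurring admin-tool alert the team has repeatedly closed as benign
print(classify(("rule:psexec", "host:admin-ws", "user:it-admin"), model))  # "fp"
```

The point of the sketch is the workflow, not the model choice: alerts the team has consistently closed as false positives become routine cases the model can close on its own.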
By reducing the number of alerts requiring SOC analysts’ investigation by at least one quarter, the Autoanalyst saves team resources. Moreover, the Autoanalyst handles the most typical, routine alerts, allowing SOC analysts to focus on the most interesting cases that require deeper investigation by a human.
How does MDR benefit from AI/ML?
The present world is defined by an endless struggle between opposing forces, and the field of information security is no exception. On the one hand, we strive to minimize the risk of missing incidents by creating an increasing number of detection rules, including those based on AI/ML. This approach results in a high volume of alerts that require the attention of the SOC team, leading to more false positives and reducing the conversion rate of our detection logic (the share of alerts that turn out to be real incidents).
On the other hand, we aim to reduce false positives and lighten the analysts’ workloads. The obvious way to achieve this is to reduce the total number of alerts, but this raises the probability of missing an attack. As a result, we face the challenge of balancing detection quality with detection logic conversion: either we try to catch everything and end up drowning in false positives, or we eliminate false positives entirely, with a conversion rate close to 100%, but run the risk of missing some attacks.
In practice, as a rule, it is possible to strike a balance between these extremes, achieving high-quality detection of hidden attacks while reducing the number of false positives. The Autoanalyst serves as one such “toggle switch.” As its filtering rate increases, reducing the workload on the SOC team, the likelihood of classification error also increases: true positive alerts might be misclassified as false positives, and vice versa. Conversely, reducing classification errors means filtering fewer alerts, which leaves a higher volume of false positives for the team to process. Statistics show that SOC analysts can also make mistakes, so a small margin of error is acceptable for the Autoanalyst. In the case of Kaspersky MDR, the error probability does not exceed 2%. This 2% error rate defines the volume of false positive alerts that the Autoanalyst can filter while maintaining acceptable quality.
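One way to picture this trade-off is as a threshold calibration problem: on a validation set of alerts scored by the model and verified by analysts, pick the largest share of auto-closed alerts at which the error rate among them stays within the 2% budget. The function below is a hypothetical sketch of that calculation, not the actual MDR procedure.

```python
def max_filter_share(scored_alerts, max_error=0.02):
    """Find the largest share of alerts that can be auto-closed while keeping
    the error rate among auto-closed alerts within `max_error`.

    `scored_alerts` is a validation set of (fp_confidence, analyst_verdict)
    pairs; alerts are auto-closed in order of decreasing confidence.
    """
    ordered = sorted(scored_alerts, key=lambda a: a[0], reverse=True)
    best_share, errors = 0.0, 0
    for i, (confidence, verdict) in enumerate(ordered, start=1):
        if verdict == "tp":
            errors += 1  # a real attack would have been auto-closed here
        if errors / i <= max_error:
            best_share = i / len(ordered)
    return best_share

# Hypothetical validation set: 90 confidently scored false positives,
# 10 true positives the model scored low
scored = [(0.95, "fp")] * 40 + [(0.7, "fp")] * 50 + [(0.2, "tp")] * 10
print(max_filter_share(scored))  # 0.91
```

Tightening `max_error` lowers the share of alerts the model may close (fewer analyst-hours saved); loosening it raises that share at the cost of more auto-closed true positives, which is the “toggle switch” described above.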
The Autoanalyst’s work quality is dynamically monitored, and its share of filtered alerts is adjusted accordingly. This might sound amusing, but the Autoanalyst appears to have learned from SOC analysts not only how to recognize false positives, but also how to “get tired” and become “overworked,” leading to quality degradation. This issue is addressed through constant retraining of the model whenever its false positive classification error rate exceeds 2%.
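The monitoring loop described above can be sketched as a rolling comparison of the model’s verdicts against analyst spot-checks, with retraining triggered once the error rate crosses the 2% threshold. The class, window size, and sampling scheme here are assumptions for illustration.

```python
from collections import deque

class QualityMonitor:
    """Track the model's recent error rate on a rolling sample of alerts
    double-checked by human analysts; signal retraining when the rate
    exceeds the acceptable threshold (2% in this sketch)."""

    def __init__(self, window=500, max_error=0.02):
        self.window = deque(maxlen=window)  # True = model agreed with analyst
        self.max_error = max_error

    def record(self, model_verdict, analyst_verdict):
        self.window.append(model_verdict == analyst_verdict)

    def needs_retraining(self):
        if not self.window:
            return False
        error_rate = 1 - sum(self.window) / len(self.window)
        return error_rate > self.max_error

monitor = QualityMonitor(window=100)
for _ in range(97):
    monitor.record("fp", "fp")      # model and analyst agree
for _ in range(3):
    monitor.record("fp", "tp")      # model auto-closed real attacks
print(monitor.needs_retraining())   # 3% error rate -> True
```

A rolling window also captures the “fatigue” effect: a model that was fine last quarter can drift past the threshold as the environment changes, and the monitor flags it without waiting for a scheduled review.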