Practical Threat Hunting With Machine Learning

Machine learning technologies often have a high barrier to entry and require expertise from several disciplines (data science, data engineering, software engineering, security research, and security operations) that is not always readily available. One person from any one of these disciplines may not have the knowledge needed to operationalize machine learning models for threat hunting. We developed our 64 machine learning jobs (unsupervised models) for threat hunting by pairing a security researcher with data scientists and data engineers, and we found that this combination yielded the best results. I will describe our development methodology. The resulting 64 jobs have an operational model simple enough that security analysts can deploy and tune them without a data scientist. Because the tuning requirements are similar to those of conventional rules, a security or SOC team can consume the output of the models as alerts and hunt for interesting threats that search-based rules will often miss.

As threat actors continue to innovate in order to evade detection, ML techniques can be very useful for finding the few malicious events hidden among billions of similar events that differ only in nuance. While ML is not a replacement for human analysis, the size and gravity of modern logging and event data sets make it a valuable addition to conventional search rules and hunting techniques.

Case studies will cover high-value detections, including C2 detection using frequency and shape analysis of network events; DGA detection using frequency and shape analysis of DNS events; privilege elevation and exfiltration in cloud environments using frequency analysis of both single fields and pairs of field values; credentialed access relevant to ransomware scenarios using frequency analysis; and LPE exploit activity using frequency analysis and computation of relative rarity. Finally, work on risk-based detection clustering will be demonstrated. Clustering often produces high-confidence correlations, which makes actionable detections easier to see.
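To make the idea of frequency analysis over single fields and pairs of field values concrete, here is a minimal sketch. It is not the presenters' models: it assumes events are plain dictionaries, uses hypothetical field names (`user`, `action`) and made-up sample events, and scores rarity as surprisal, the negative log2 of a value's relative frequency.

```python
import math
from collections import Counter


def rarity_scores(events, fields):
    """Score each event by how rare its value combination is for `fields`.

    The score is surprisal in bits: -log2(relative frequency).
    Common combinations score near 0; rare ones score high.
    """
    keys = [tuple(event.get(field) for field in fields) for event in events]
    counts = Counter(keys)
    total = len(keys)
    return [-math.log2(counts[key] / total) for key in keys]


# Hypothetical sample data, for illustration only.
events = (
    [{"user": "alice", "action": "GetObject"}] * 6
    + [{"user": "bob", "action": "GetObject"}]
    + [{"user": "bob", "action": "AttachUserPolicy"}]
)

single = rarity_scores(events, ["action"])        # one field
pair = rarity_scores(events, ["user", "action"])  # pair of fields

for event, s, p in zip(events, single, pair):
    print(f"{event['user']:<6} {event['action']:<17} single={s:.2f} pair={p:.2f}")
```

In this toy data the `GetObject` action is common on its own, so the single-field score for bob's `GetObject` event is low, while the pair score flags it because that user rarely performs that action. A rare value is not necessarily a malicious one, so scores like these are a starting point for hunting rather than a verdict.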

#ThreatHuntingSummit #MachineLearning
Comments

This channel is the universe's greatest gift to this profession since caffeine.

jtoddcyber

The problem is that "rare" is, most of the time, not a threat. The alert fatigue is still there in this presentation once you consider a big company's log data. For example, a company's IT usage policy will be violated from time to time (and constantly if you have tens of thousands of assets): porn, games, file-sharing websites, and so on. Most of the time these policy violations are not threats. To tell the two apart you need to do more; geolocation, rare executables, rare IPs, or even command activity and time of day are not enough. Mathematically, if you measure the "outlyingness" of each event in a company as a number, say 0% to 100%, you get a probability distribution with very thick tails, so you need extremely high, dynamic thresholds or alert fatigue is inevitable. The solution is to measure maliciousness as well, which is something I am currently experimenting with. If you have anything for me on this, I would appreciate your expertise.

eduardoduarte

Love the content. Hard to listen to an "um" in every sentence, though.

LtGrandpoobah

Next up: fuzzy-hash scripts and script blocks, and alert if they are rare.

jakesj