CrowdStrike Releases the Root Cause Analysis (RCA)

Cyber Security News
Root Cause Analysis — Channel File 291
Comments

If I am understanding this correctly, it was a fully deterministic error: if you install the channel file, you get the error 100% of the time. So the incident could have been avoided by installing the update on just one test machine before releasing it to the "valued customers". Code inspection and "synthetic" testing are great for detecting errors before you deploy code, but you still have to test under actual conditions on a limited scale before pushing an update to thousands of systems.

DQSoft

- The error first occurred when a mismatch arose between what was validated (21 inputs) and what was provided after validation (only 20 inputs).
- Testing never exercised a non-wildcard value in the 21st field, so the problem went undetected until one was introduced on July 19, 2024.
We can safely say it was a faulty testing process and a miscommunication between the "Content Validator" and the "Content Interpreter", which left input 21 in an "out-of-bounds" state.
I would say this, though:
Thank you, CrowdStrike, for releasing the RCA quickly and in depth. It tells me you have the professional skills needed to solve problems. But this is clearly a management problem that could have been avoided with closer inspection. Give your skilled developers time and let them check all of your systems manually. Don't rush updates; always ask "What did we miss?" and double-check.

Aksel_
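
To make the 21-versus-20 mismatch in the comment above concrete, here is a minimal sketch of a pre-release check that would flag it. Every name in it (`validate_instance`, `SENSOR_INPUT_COUNT`, the `"*"` wildcard marker) is hypothetical and chosen for illustration; this is not CrowdStrike's Content Validator, only the idea of comparing what a rule needs against what the sensor supplies.

```python
# Hypothetical names throughout; an illustration, not CrowdStrike's code.
SENSOR_INPUT_COUNT = 20     # inputs the sensor actually supplies
TEMPLATE_FIELD_COUNT = 21   # fields the template type defines


def validate_instance(criteria, supplied_inputs=SENSOR_INPUT_COUNT):
    """Return a list of problems; an empty list means nothing was flagged."""
    problems = []
    for index, pattern in enumerate(criteria):
        # A non-wildcard pattern forces a read of the input at this index,
        # so that input must actually exist.
        if pattern != "*" and index >= supplied_inputs:
            problems.append(
                f"field {index + 1} needs an input, "
                f"but only {supplied_inputs} are supplied"
            )
    return problems


# All 21 fields are wildcards: no input is ever read, nothing is flagged.
wildcard_instance = ["*"] * TEMPLATE_FIELD_COUNT
# Field 21 carries a real pattern: the check reports the missing input.
faulty_instance = ["*"] * 20 + ["some-pattern"]

print(validate_instance(wildcard_instance))
print(validate_instance(faulty_instance))
```

The point of the sketch is that the check depends on knowing how many inputs the sensor really provides, not only how many fields the template declares.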

All I can say is they should have caught this. I worked in software dev and testing for a major telephone carrier before I retired, and that environment is even more critical than CrowdStrike's. We had to prove our test suite was as bulletproof as humanly possible. One set of testing was done by and sourced from our call center. These techs basically had no training on what they were testing, and we wanted it that way (by the way, this wasn't the only testing gate). If something was going to break, these techs could break it. Not on purpose, of course. If a piece of software required any external touching, reading, or writing of ANY file, you were expected to come up with a way to break it, then test for that later 14 ways from Sunday, to borrow a phrase. Sorry, I'm ranting a bit, but an outage like this was preventable, IMO.

parrottm

Microsoft has stated that some of the issues come from old systems that should have been updated years ago. So it makes sense that some companies will still be dealing with this.

alexcardosa

Weird, I have had daily updates from CrowdStrike.

drussthelegend

I suppose the big question most folks have after this debacle is: can they be trusted that this won't happen again? I... probably wouldn't bet on it.

Aikurisu

I still don't understand the part where the development and testing phase didn't detect the mismatch. If the content interpreter expected 21 and the sensor was only supplying 20, the wildcard pattern matcher wouldn't magically create a 21st field. Shouldn't it have crashed then too? Or did the wildcard match something in the sensor data and mark that as the 21st field, which, when it changed to a non-wildcard pattern, didn't match anything and then led to OOB access?

zoso
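
One way to picture the question above: a matcher that skips wildcard fields never reads the corresponding input, so a missing 21st input goes unnoticed until a real pattern forces the read. The sketch below assumes that behavior; the function names and the `"*"` marker are hypothetical, and Python's `IndexError` stands in for what would be an out-of-bounds memory read in a kernel driver.

```python
WILDCARD = "*"  # hypothetical marker for "match anything"


def interpret(criteria, inputs):
    """Naive matcher: reads inputs[i] only when criterion i is not a wildcard."""
    for i, pattern in enumerate(criteria):
        if pattern == WILDCARD:
            continue  # input is never read, so a missing one goes unnoticed
        if inputs[i] != pattern:  # raises IndexError when i is past the end
            return False
    return True


def interpret_checked(criteria, inputs):
    """Same matcher, but it refuses to run when inputs are missing."""
    if len(criteria) > len(inputs):
        return False
    return interpret(criteria, inputs)


inputs = [f"value{i}" for i in range(20)]      # only 20 inputs supplied
early_rule = [WILDCARD] * 21                   # 21st field is a wildcard
later_rule = [WILDCARD] * 20 + ["expected"]    # 21st field is a real pattern

print(interpret(early_rule, inputs))           # True: field 21 is never read
try:
    interpret(later_rule, inputs)
except IndexError as exc:
    print("out-of-bounds read:", exc)
print(interpret_checked(later_rule, inputs))   # False: rejected up front
```

So under this assumption the wildcard did not "create" a 21st field; it simply meant nothing ever looked for one.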

All these companies, including the one I work for, should not be using CrowdStrike, or all running the same AV/firewall software in general. The very idea sounds like a disaster waiting to happen, and when it happens, as it did, it affects the large majority running the same tool.
This is what happens when businesses and corporations go cheap and all rely on the same service.

vhnetwork

Looks like they're taking a modicum of accountability.

applejuicecity

Do their little white lies and subtle credit hogging for any minor role in rectifying their own mistake make you feel better?
Alongside your own constructive criticism and support, of course. A little complimentary misdirection is key in selling sincerity and hedging some of the public criticism leading up to the court case. :/

Crazy that a mismatch in the number of data points snowballs into ultimately nobody getting any at all...?

TriggerJim

The real question is why only a handful of companies control the space. It's kind of scary when one player has so much control.
If one day MS has the same problem, it's going to be a nightmare.
Well, we kind of had the same problem in the early Windows 10 days.

liaminwales