How Crowdstrike Caused Millions of Windows PCs to BSOD - Windows Kernel Panic

Recently an update pushed out by a security software vendor caused millions of PCs to kernel panic, resulting in chaos in airports, hospitals, train stations, banks and more! How did it happen? What causes a Blue Screen of Death? What causes a Windows kernel panic?
---

#garyexplains
Comments
Author

What an excellent explanation of BSOD. Your videos and simplified explanations are outstanding. I used to teach IT and I think your videos are an excellent resource for students.

johnsmith-vyvo
Author

Black screen of death - Linux
Pink screen of death - macOS
Blue screen of death - Windows

obinnaokafor
Author

It was much worse than described here. Please watch Dave's Garage for an in-depth analysis of what caused this. Basically, a signed kernel driver was loading unsigned, untested p-code and executing it. It was also a boot-start driver (for some reason), so Windows would always load it at startup. The video here explains the result, but not really the cause.

stuartajc
Author

I just love your easy to understand explanations.
Thanks Gary!

natjes
Author

If you really want to go down the rabbit hole on this one, check out Dave Plummer's excellent video on his Dave's Garage channel. Dave Plummer is the guy who wrote the Windows Task Manager. He goes into detail as to what exactly happened and why CrowdStrike needs to run in kernel mode.

paulbarnett
Author

The problem is that their software was installed as a driver needed for the system to boot properly, and it would download and run code in kernel space on live machines. Running application code on live machines without first testing it on a similarly configured test system isn't wise, or SOP in most places. But this security software completely ignored standard security best practice in a manner that made what happened inevitable, in a way that would make Thanos jealous, and snapped a big chunk of the internet. Another scary element is that they got WHQL certification for a driver that breaks security and does an end run around the point of WHQL certification. So MS owns a small bit of this snafu, IMHO.

kaseyboles
Author

So my understanding is that, to avoid having Microsoft re-sign the driver for every update, the driver they got signed is more like a bytecode interpreter.
Their updates are then bytecode blobs.

The issue is that one of those bytecode blobs was just full of zeros and the interpreter faulted, trying to parse the invalid bytecode.

Oddly enough they didn't seem to follow best practices in the interpreter, to sanity-check the input.

How close or far off is this description?

deth
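
The interpreter-plus-blob picture in the comment above is the commenter's own understanding, but the kind of input check it mentions can be sketched in a few lines. Below is a minimal Python sketch, assuming a made-up container format (a `BLOB` magic value, a payload length and a CRC32 of the payload). None of these names or fields come from CrowdStrike's actual file format; they only illustrate refusing a zero-filled or truncated update instead of trying to parse it.

```python
import struct
import zlib

# Hypothetical container layout, invented for this sketch:
# 4-byte magic, 4-byte payload length, 4-byte CRC32 of the payload.
MAGIC = b"BLOB"
HEADER = struct.Struct("<4sII")


def pack_blob(payload: bytes) -> bytes:
    """Wrap a payload in the hypothetical header."""
    return HEADER.pack(MAGIC, len(payload), zlib.crc32(payload)) + payload


def validate_blob(blob: bytes) -> bytes:
    """Return the payload if every check passes, otherwise raise ValueError."""
    if len(blob) < HEADER.size:
        raise ValueError("blob is shorter than the header")
    magic, length, crc = HEADER.unpack_from(blob)
    if magic != MAGIC:
        # A file full of zeros fails here: its "magic" is four zero bytes.
        raise ValueError("bad magic value")
    payload = blob[HEADER.size:]
    if length == 0 or len(payload) != length:
        raise ValueError("payload length does not match the header")
    if zlib.crc32(payload) != crc:
        raise ValueError("checksum mismatch")
    return payload


def load_update(blob: bytes, last_known_good: bytes) -> bytes:
    """Fall back to the previous payload rather than crash on bad input."""
    try:
        return validate_blob(blob)
    except ValueError:
        return last_known_good
```

With this layout, `load_update(bytes(64), old_payload)` returns `old_payload`, because 64 zero bytes do not carry the expected magic value. A real kernel component would be written in C or C++ and would need the same checks before touching any offsets read from the file.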
Author

Nicely explained. Adding to IT's woes, a properly locked-down device with BitLocker and a TPM would require special keys to get back into the OS. Many public-facing machines in airports and hospitals would require it.

test
Author

A bit creepy: the setting on that "greenscreen" makes it look like your ears are melted, and the face moves a bit like classic EIO on videos moving around too fast xD
As always interesting!

Iliyena
Author

From what I've seen, it seems that a file which was supposed to be valid instead contained 0's, and that's what got executed. So the major questions would be how a valid file became invalid when distributed, and why there was no mechanism for CrowdStrike to test their update using the ACTUAL distributed code. It doesn't seem to be a compiler or memory checker sort of thing in this case.

allanflippin
Author

Great explanation of what a BSOD is and why there was a kernel panic, but no explanation as to why the computers couldn't simply be rebooted, and why Windows couldn't do its usual rollback to a working state based on a recent restore point.

Also, us normies have never heard of CrowdStrike before. I consider myself an advanced power user of all kinds of devices over the past 30 years, and this was the first time I've ever heard the name.

How is this company no one has ever heard of before suddenly so important to our day to day interactions?

nicolaspeter
Author

My workplace is running at 1/2 availability; some of the usable computers weren't ready for the contract yet.

alaingraham
Author

Although it’s a technical issue, the problem, as always, is with greedy managers. What has happened to the so-called “business continuity” concept that critical systems must adhere to?
All the eggs are in the same basket. The critical systems and industries must be regulated to avoid disruption.

joelmamedov
Author

Nicely explained, but why did every IT department install it without testing it? I thought all software was tested before they rolled it out across all their systems.

tonysheerness
Author

Note: we are not supposed to use zero-terminated strings anymore; they are the basis for buffer overflows.

schmide
Author

CrowdStrike has authority to push these server updates automatically, so the question in my mind is what their quality control protocol is that was supposedly good enough to gain this trust, and was it followed???
The next question in my mind is why CrowdStrike continued to push out this update when every (or at least most) server that got it never came back online. This check might slow deployments slightly, but I would expect every kernel-level update to be cognizant of its effect.

skyak
Author

That made me wonder if it's a coincidence that in the JVM the NOP (no operation) instruction is encoded as the byte 00.

PeterRince
Author

All this security and anti-cheat software needs to run in kernel mode; it's a bit of a shame it has come to this rather than staying outside.

AndersHass
Author

Did Microsoft itself subscribe to CrowdStrike, or had the MS users whose computers were affected subscribed to CrowdStrike security?

DK-oxze
Author

Is it not common practice for maintenance companies such as CrowdStrike to issue updates to only a few particular computers of each customer, for testing purposes, before rolling out the updates to all customers' computers? Say... about four test computers for each of their customers, and then wait for feedback from the onsite techs.

plipogamez
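
The staged rollout asked about in the comment above, together with the earlier question about halting a push when machines do not come back online, can be sketched roughly as follows. This is a minimal Python sketch under stated assumptions: `deploy`, `is_healthy` and the batch size of four are hypothetical placeholders for illustration, not anything CrowdStrike or Windows provides.

```python
from typing import Callable, Dict, List, Sequence


def staged_rollout(
    hosts: Sequence[str],
    deploy: Callable[[str], None],      # hypothetical: pushes the update to one host
    is_healthy: Callable[[str], bool],  # hypothetical: did the host come back online?
    canary_count: int = 4,              # "about four test computers" per customer
    max_failures: int = 0,
) -> Dict[str, object]:
    """Update a few canary hosts first and stop if too many fail to recover."""
    canaries: List[str] = list(hosts[:canary_count])
    remaining: List[str] = list(hosts[canary_count:])

    for host in canaries:
        deploy(host)

    failed = [host for host in canaries if not is_healthy(host)]
    if len(failed) > max_failures:
        # Halt: the rest of the fleet never receives the bad update.
        return {"status": "halted", "updated": canaries, "failed": failed}

    for host in remaining:
        deploy(host)
    return {"status": "complete", "updated": canaries + remaining, "failed": failed}
```

In this sketch a single canary that fails its health check is enough to stop the rollout, so the damage is limited to the test machines. The trade-off the earlier commenter points out still applies: waiting for health feedback slows deployment slightly.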