I had a great conversation with Byron Acohido of The Last Watchdog and USA Today last night about how False Positives (FPs) occur in our industry. As AntiVirus vendors, we all live in fear of a major False Positive; it keeps us up at night. It has happened to everyone, including Symantec, McAfee, BitDefender and Kaspersky, to varying degrees over the past several years. FPs vary in severity, ranging from minor, such as triggering on a very obscure and rare program used by only a few, to severe, such as triggering on a core part of the Windows Operating System, which is exactly what happened in this case. Believe it or not, minor FPs happen to vendors several times a week, but they rarely cause widespread damage because they affect only the more obscure applications; that is why we never hear about them, they simply don't make the news.
All AntiVirus vendors use signatures as one form of detection; they are the most precise method by which to detect threats. Most AntiVirus vendors have a variety of signature formats, as well as a variety of detection engines, a requirement to keep up with the growing number of threats on the Internet today. Normally you would see FPs from 'generic signatures', behavioral engines, or heuristics, which are used to detect a complete family or strain of viruses. It's difficult to say what kind of detection was responsible for the recent SVCHOST.EXE False Positive that McAfee fell victim to, but it sounds like a very precise detection in this case, meaning a potential lapse in the quality assurance process.
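To make the distinction concrete, here is a minimal sketch of the difference between a precise signature (an exact match on one file, typically via a hash of its contents) and a generic signature (a byte pattern shared by a malware family). The hashes and patterns below are placeholders I invented for illustration, not real signatures from any vendor's database.

```python
import hashlib

# Placeholder "database": one exact-match hash and one family byte pattern.
# Both are illustrative values only, not real malware signatures.
KNOWN_BAD_HASHES = {
    "5d41402abc4b2a76b9719d911017c592",  # MD5 of the sample b"hello"
}
GENERIC_PATTERNS = [
    b"\x90\x90\xeb\xfe",  # invented byte sequence standing in for a family pattern
]

def scan(file_bytes: bytes) -> str:
    """Classify a file: exact signature first, then generic patterns."""
    digest = hashlib.md5(file_bytes).hexdigest()
    if digest in KNOWN_BAD_HASHES:
        return "malware (exact signature)"
    if any(pattern in file_bytes for pattern in GENERIC_PATTERNS):
        return "suspicious (generic signature)"
    return "clean"
```

An exact-hash detection can only fire on one specific file, which is why an FP from it on SVCHOST.EXE would point at the quality assurance step rather than an over-broad pattern.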
How Something Like This Can Happen
In order to understand how this can happen, it's useful to understand the process that AntiVirus vendors follow to generate detections. Samples typically reach a vendor from three sources:
- Their user base; through the use of behavioral or heuristic engines
- Honeypots and web crawlers
- AntiVirus Industry sample exchanges with other vendors
In order for SVCHOST.EXE to make its way into a vendor's sample collection, it must have come from one of these three sources. Any one of them could have been the culprit in this case, since a reasonable percentage of the files collected through all of these channels will be FPs.
FPs have historically been produced by both human analysts and automation technologies, so it is difficult to say which was the culprit in this case, but clearly SVCHOST.EXE made its way through the analysis process without being recognized as a clean file. It is quite possible for a junior analyst to misclassify a clean file as a virus.
As vendors become more aggressive and threats become more prevalent, the risk of FPs also increases. McAfee is certainly not alone with this problem. That being said, the traditional way of publishing detections leaves desktops over-exposed to situations like this. There are clear-cut safeguards that can be put in place to prevent these types of events, and newer cloud-based solutions have the benefit of detecting and mitigating FPs within minutes, limiting the damage when they do occur.
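One way a cloud-based safeguard could catch an FP within minutes is a telemetry tripwire: if a brand-new detection suddenly fires on a large share of the install base, it is more likely matching a common clean file than witnessing a genuine outbreak, so the signature is pulled for human review. The sketch below is my own illustration of that idea under assumed names and thresholds, not a description of any vendor's actual system.

```python
from collections import Counter

# Assumed threshold: a new signature firing on more than 10% of reporting
# machines in one window is treated as a likely false positive.
FP_SUSPECT_THRESHOLD = 0.10

def suspicious_signatures(reports, machine_count):
    """reports: iterable of (machine_id, signature_id) detection events.

    Returns signature IDs whose prevalence across the fleet is so high
    that they should be pulled for review before doing more damage.
    """
    hits = Counter()
    for machine_id, sig in set(reports):  # count each machine once per signature
        hits[sig] += 1
    return [sig for sig, n in hits.items()
            if n / machine_count > FP_SUSPECT_THRESHOLD]
```

A desktop-only deployment model has no equivalent feedback loop, which is why a bad signature can reach millions of machines before anyone notices.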
I would expect the industry to take note of this event and make a renewed effort to avoid False Positives in the future.
More about the Specific False Positive in the McAfee Case
SVCHOST.EXE is a core part of the Windows Operating System, and it is a well-known clean file. Interestingly enough, very basic files like this frequently show up in AntiVirus vendor sample collections. Consumer and enterprise users will often send almost anything suspicious to AntiVirus vendors for analysis, including many core Windows processes that are frequently seen running on the system via Task Manager. It doesn't help that some viruses actually masquerade as SVCHOST.EXE, leading to confusion and the submission of the legitimate SVCHOST.EXE process for analysis. As a result, the appearance of SVCHOST.EXE in a sample collection is not surprising at all. As for why the FP was not detected during QA, that is more difficult to answer. The file was either not part of the clean set, or the QA process failed somewhere along the way.
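The clean-set check mentioned above can be sketched very simply: before a new detection is published, the candidate file's hash is compared against a whitelist of known-good files (OS binaries, popular applications), and the detection is rejected on a match. This is a minimal illustration of the concept with invented contents, not McAfee's actual QA pipeline.

```python
import hashlib

# Hypothetical clean set: hashes of known-good files such as core
# Windows binaries. The contents below are stand-ins for illustration.
CLEAN_SET = {
    hashlib.sha256(b"contents of a known-good windows binary").hexdigest(),
}

def approve_detection(sample_bytes: bytes) -> bool:
    """Gate a candidate detection: reject it if the sample is a known clean file."""
    digest = hashlib.sha256(sample_bytes).hexdigest()
    return digest not in CLEAN_SET
```

Had SVCHOST.EXE been present in the clean set and had a gate like this run, the detection would have been rejected before it ever shipped, which is why its absence (or a failure in the check itself) is the central question.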
At the moment it looks like each affected machine requires manual intervention. It's impossible to say how many systems were affected, but this was not an isolated event. Peter Svensson of the Associated Press reports many were affected, including Intel, and CNET's Declan McCullagh reports, "the University of Michigan's medical school reported that 8,000 of its 25,000 computers crashed." Geek.com's Matthew Humphries reports much lower numbers, in the several hundreds. ZDNet's Ed Bott has some additional information worth a read too.
It would be difficult to estimate the monetary damage. Ironically, the best measure of widespread system failure that we have comes from the major virus outbreaks of the past decade. CodeRed in 2001, Slammer in 2003 and Sasser in 2004 might provide some baseline numbers on the effect of large-scale system failure.
I can definitely sympathize with McAfee; nobody wants to have this problem while striving to protect people better. They are working to correct the issue; Engadget's Nilay Patel quotes a first response from McAfee (worth a read).