Published 2026-04-16
Why Vulnerability Management Fails — and What to Do About It
Do you feel overwhelmed by vulnerability management? You aren’t alone. Vulnerability management is the process by which known technical vulnerabilities are identified and corrected. It sounds straightforward. But most organizations are failing at it, and technical vulnerabilities remain one of the top findings in organizational risk assessments and security tests.
The numbers tell the story: organizations take an average of 60 days to remediate critical vulnerabilities, only about 5% of known vulnerabilities are resolved per month, and 57% of observed vulnerabilities in the wild are more than two years old. This is not a skills problem or a staffing problem — it is a structural problem with how vulnerability management is practiced.
Why It Fails
Three forces combine to make vulnerability management unworkable at scale:
1. Volume
The rate of vulnerability disclosure has reached approximately 131 new CVEs per day — over 48,000 in 2025 alone. A vulnerability scanner applied to even a modest environment produces findings in the tens or hundreds of thousands. Consider two real-world examples:
- A 50-bed critical access hospital with 700 employees and seven IT staff runs a vulnerability scan that identifies over 150,000 findings, approximately 30,000 of which are rated high or critical.
- A multinational corporation with 160,000 employees and 3,000 IT staff runs a cloud security posture management (CSPM) assessment that generates 5 million findings, of which 1.2 million are rated high or critical.
These counts are after aggressive automated patching with WSUS, SCCM, or equivalent Linux tools. The volume is not a failure of patching — it is the natural result of scanning an environment where every application, library, protocol version, and configuration setting is a potential finding.
2. Poor Prioritization
Most vulnerability scanners report findings on a qualitative scale — low, medium, high, critical — or use CVSS base scores that cluster into a small number of severity bands. Approximately 30–38% of all CVEs are rated high or critical, which means a third of every scan’s output demands urgent attention. When a third of everything is urgent, nothing is.
CVSS base scores contribute to this problem. The NVD assigns scores using a worst-case-scenario approach when complete vulnerability information is unavailable, which systematically inflates severity. Organizations that rely solely on CVSS base scores — which is most organizations, because calculating environmental metrics at scale is impractical — are prioritizing based on theoretical worst-case severity rather than actual risk in their environment.
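The coarseness is mechanical, and easy to see in code. Here is a minimal sketch of the CVSS v3.x score-to-band mapping, with band boundaries taken from FIRST's qualitative severity rating scale:

```python
def cvss_band(score: float) -> str:
    """Map a CVSS v3.x base score to its qualitative severity band
    (boundaries per FIRST's v3.x qualitative severity rating scale)."""
    if score == 0.0:
        return "none"
    if score <= 3.9:
        return "low"
    if score <= 6.9:
        return "medium"
    if score <= 8.9:
        return "high"
    return "critical"

# Findings almost two full points apart collapse into the same bucket,
# so band-based triage cannot rank them against each other.
print(cvss_band(7.1), cvss_band(8.9))  # high high
```

Every score from 7.0 to 8.9 is simply "high," which is exactly why a third of a scan's output can end up looking equally urgent.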
3. Noise
Vulnerability scanners are designed to report everything. This is philosophically defensible — the scanner doesn’t know which findings matter in your environment — but operationally devastating. Many reported vulnerabilities are either irrelevant to the organization’s threat model, mitigated by existing controls, or assigned severity levels that exceed their actual exploitable impact.
By enumerating every possible vulnerability, scanner vendors do reduce their exposure to claims of negligence — though they do not eliminate it. The result is a structural asymmetry: vendors face costs for under-reporting (a missed vulnerability that turns into a breach can become a lawsuit) but face no comparable cost for over-reporting. The customer absorbs the prioritization burden. This is not a conspiracy; it is what happens when the producer’s worst case is “we missed something” and the consumer’s worst case is “we drowned in noise.”
The Patching Paradox
The obvious response to vulnerability findings is to patch. But patching itself introduces risk — and the most visible recent example was a security update, not a feature update. On 19 July 2024, CrowdStrike’s Falcon platform pushed an automated content update that grounded airlines, banks, hospitals, and emergency services worldwide. Microsoft estimated the outage affected 8.5 million Windows devices before the rollout was halted. The post-incident analysis traced the root cause to a mismatch between the number of input fields a sensor template expected (21) and the number actually provided (20) in a routine deployment that had no human-in-the-loop check. The very speed and automation that make modern security responses effective also amplified a single mistake into a global outage. The argument that “patching faster is always safer” did not survive contact with that morning.
That is the extreme case. The routine case is no less revealing. Across 2025, multiple Windows security updates produced operational failures documented on Microsoft’s own support channels: KB5058379 (May, Windows 10 22H2/21H2) sent affected systems into Recovery and produced blue-screen errors; KB5063878 (July, Windows 11 cumulative) was associated with BSoD reports; KB5068861 (November, Windows 11 security update) generated a recurring unloaded_without_cancelling_pending_operations stop code on a non-trivial population of devices. Each of these is a security update — exactly the patches organizations are most pressured to deploy fastest. The frequency of patch-related incidents is high enough that many organizations delay patches specifically to avoid operational disruption, which leaves the vulnerabilities open longer.
This creates a paradox: the cure (patching) carries its own risk of operational impact, but the disease (unpatched vulnerabilities) carries the risk of exploitation. Organizations caught between these risks often default to patching what is easy and deferring what is hard — which is not the same as patching what matters most.
The Accumulation Problem
Vulnerability management has only two immediate options for each finding: fix it or don’t fix it. If you don’t fix it, it reappears in the next scan. Because the inflow of new vulnerabilities exceeds most organizations’ capacity to remediate, unresolved findings accumulate over time. This is entropy: risk increasing because the energy applied to counter it is insufficient. The vulnerability backlog grows, the percentage of findings that are “old” increases, and each scan report becomes longer and less actionable than the last.
What to Do About It
Automate Patching with Rollback
Tools like Windows Autopatch, Automox, and Linux equivalents (unattended-upgrades, dnf-automatic) automate the deployment of security patches and can automatically roll back changes that cause failures. Autopatch targets >95% of managed devices on the latest quality update and monitors device telemetry to pause rollouts when failure rates spike. This addresses the patching paradox by reducing the human cost of patching while containing the blast radius of bad patches.
Automated patching does not eliminate the need for vulnerability management — it reduces the volume of findings that require manual attention.
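The pause-on-telemetry behavior can be sketched in a few lines. This is an illustrative ring-deployment check, not Autopatch's actual logic, and the 2% failure threshold is an assumed value:

```python
def should_pause_rollout(failures: int, deployed: int, threshold: float = 0.02) -> bool:
    """Halt a patch rollout when the early-ring failure rate spikes.

    A crude sketch of ring-based deployment: push the patch to a small
    first ring, watch failure telemetry, and pause before the broad
    ring. The 2% threshold is an assumption, not a vendor's value.
    """
    if deployed == 0:
        return False  # no telemetry yet
    return failures / deployed > threshold

# Hypothetical first ring of 500 devices:
print(should_pause_rollout(30, 500))  # 6% failure rate -> True, pause
print(should_pause_rollout(3, 500))   # 0.6% -> False, continue rollout
```

The design point is that the blast radius of a bad patch is bounded by the first ring's size, not the whole fleet — the missing safeguard in the CrowdStrike incident described above.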
Improve Prioritization
CVSS base scores alone are insufficient. Complement them with:
- EPSS (Exploit Prediction Scoring System) — predicts the probability that a vulnerability will be exploited in the wild within 30 days. EPSS shifts prioritization from “how bad could this be?” to “how likely is this to be exploited?” — a fundamentally more useful question for resource allocation.
- STORM vulnerability mode — provides 100 severity levels instead of 3–5 qualitative buckets. If your scanner provides CVSS scores, STORM’s CVSSA transform converts them to a continuous scale that supports meaningful comparison and trending. STORM in risk mode goes further, producing a probabilistic value between 0 and 1 that accounts for asset value, threat likelihood, and vulnerability exposure.
- Environmental context — a critical vulnerability in an Internet-facing application is not the same as a critical vulnerability in a system on an isolated network segment behind a firewall. Prioritization must account for exposure, not just severity.
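One way to combine these signals is a single continuous priority score. The weighting below is illustrative only, not STORM's actual transform or an EPSS recommendation, and the 0.4 exposure factor is an assumption:

```python
def priority(cvss: float, epss: float, internet_facing: bool) -> float:
    """Blend severity, exploit likelihood, and exposure into one score.

    An illustrative weighting: CVSS is normalized to 0-1, EPSS is
    already a 30-day exploitation probability, and exposure scales the
    result down for non-Internet-facing assets (0.4 is an assumption).
    """
    exposure = 1.0 if internet_facing else 0.4
    return round((cvss / 10.0) * epss * exposure, 4)

quiet_critical = priority(9.8, 0.01, False)  # severe on paper, rarely exploited
exploited_high = priority(7.5, 0.94, True)   # actively exploited, exposed
print(quiet_critical, exploited_high)  # 0.0039 0.705
```

Even this crude blend inverts the naive ordering: a "critical" that nobody exploits on an isolated host ranks two orders of magnitude below a "high" under active exploitation on an Internet-facing one.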
Manage the Backlog
Use tools that allow you to annotate, track, and disposition findings:
- Seen/Acknowledged — the finding has been reviewed and a decision is pending.
- Accepted — the risk has been evaluated and explicitly accepted, with documented rationale. This is a governance act, not neglect (see Risk Treatment in Guerilla Security, Chapter 18).
- Suppressed — the finding is irrelevant, mitigated by compensating controls, or a known false positive. Suppressing a finding removes it from future reports so it does not consume attention.
- Remediated — the finding has been fixed and verified.
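A minimal disposition model, sketched in Python with hypothetical names rather than any particular tool's schema, makes the governance rule enforceable in code: acceptance and suppression require a documented rationale, and suppressed or remediated findings drop out of reporting:

```python
from dataclasses import dataclass
from enum import Enum

class Disposition(Enum):
    SEEN = "seen"              # reviewed, decision pending
    ACCEPTED = "accepted"      # risk explicitly accepted
    SUPPRESSED = "suppressed"  # irrelevant, mitigated, or false positive
    REMEDIATED = "remediated"  # fixed and verified

@dataclass
class Finding:
    finding_id: str
    title: str
    disposition: Disposition = Disposition.SEEN
    rationale: str = ""

    def set_disposition(self, new: Disposition, rationale: str = "") -> None:
        # Acceptance and suppression are governance acts:
        # refuse them without a documented rationale.
        if new in (Disposition.ACCEPTED, Disposition.SUPPRESSED) and not rationale:
            raise ValueError("accepted/suppressed findings require a rationale")
        self.disposition = new
        self.rationale = rationale

def active_findings(findings):
    """Findings that should still appear in reports and consume attention."""
    closed = {Disposition.SUPPRESSED, Disposition.REMEDIATED}
    return [f for f in findings if f.disposition not in closed]
```

The payoff is that each scan report shrinks to the findings still awaiting a decision, instead of replaying the full history every cycle.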
Tools like AWS Security Hub, Tenable, and similar platforms support workflow-based finding management. For AWS environments, RESCOR developed HubAccelerator, which enables mass annotation and status updates across Security Hub findings — converting a manual, per-finding process into a bulk operation.
There is a legitimate concern about suppressing findings that might someday result in an exploit. But if you are managing hundreds of thousands of findings, the alternative is worse: a backlog so large that no individual finding receives meaningful attention, and the organization patches based on what is convenient rather than what matters.
Use AI — Carefully
Large language models can assist with vulnerability triage by analyzing finding descriptions, mapping them to your environment, and recommending dispositions. However, current research shows that all major LLMs — including GPT-4o, Claude, DeepSeek, and Gemini — exhibit a consistent tendency to over-prioritize vulnerabilities, generating more false positives than correct classifications. In a benchmark evaluation of vulnerability triage decisions, Claude produced 8,900 false positives, DeepSeek 26,600, and ChatGPT 6,300.
This means LLMs can reduce a human analyst’s workload by pre-screening and categorizing findings, but their output must be reviewed by a qualified human before action is taken. Treat LLM-assisted triage as a filter that reduces volume, not as a decision-maker that replaces judgment.
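That division of labor can be sketched as follows; llm_score here stands in for an assumed scoring wrapper around a model prompt, not a real API, and the cutoff is an assumption:

```python
def triage_queue(findings, llm_score, human_review_cutoff=0.1):
    """Use an LLM only to shrink the human queue, never to close findings.

    llm_score is assumed to be a callable returning a 0-1 relevance
    estimate per finding. Given the false-positive rates above, nothing
    is auto-accepted or auto-suppressed: low-scoring findings are merely
    deprioritized, and everything else goes to a human.
    """
    needs_human, deprioritized = [], []
    for finding in findings:
        if llm_score(finding) >= human_review_cutoff:
            needs_human.append(finding)
        else:
            deprioritized.append(finding)
    return needs_human, deprioritized
```

Note that the model's tendency to over-prioritize is harmless in this design: false positives inflate the human queue, but they never silently close a real finding.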
The Bottom Line
Vulnerability management fails not because organizations lack tools or diligence, but because the volume of findings exceeds human capacity, the prioritization mechanisms are too coarse, and the noise-to-signal ratio is too high. The solution is not more scanning or faster patching — it is better prioritization, smarter automation, deliberate backlog management, and the organizational discipline to accept that not every finding requires remediation.
RESCOR Can Help
RESCOR helps organizations build vulnerability management programs that produce actionable results — using STORM quantitative risk measurement, RAPID iterative governance, and StrongCOR subscription services to replace the noise with signal.
Schedule a consultation → | +1 863 SECURE1 (+1 863 732-8731)