Microsoft's MDASH: 100+ AI Agents Discover 16 Windows Vulnerabilities Including 4 Critical RCEs

Microsoft has quietly been running something remarkable inside its security organization: a system called MDASH that orchestrates more than 100 specialized AI agents to autonomously hunt for vulnerabilities in Windows and Microsoft’s broader product surface. This month, MDASH earned its most concrete validation yet — discovering 16 previously unknown Windows vulnerabilities, including 4 Critical remote code execution flaws, all patched in the May 2026 Patch Tuesday update.

What Is MDASH?

MDASH — Microsoft’s Multi-agent Dynamic Autonomous Security Hunter — isn’t a single AI model. It’s an orchestration system that coordinates a fleet of specialized agents, each optimized for specific security tasks: static analysis, dynamic testing, fuzzing, exploit development, and impact verification.

The system spans both frontier models (large, capable models for complex reasoning tasks) and distilled models (smaller, faster models optimized for high-volume, narrowly defined tasks). This hybrid approach mirrors how effective security teams work: senior researchers focus on complex analysis while junior analysts handle high-volume triage.
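
A minimal sketch of how a hybrid setup might route work between the two model tiers. The `Task` fields, the `route` rule, and the volume threshold are all hypothetical and chosen for illustration; Microsoft has not published MDASH's actual routing policy.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    FRONTIER = "frontier"    # large model: slower, suited to open-ended reasoning
    DISTILLED = "distilled"  # small model: fast, suited to narrow repetitive work


@dataclass(frozen=True)
class Task:
    name: str
    needs_deep_reasoning: bool  # hypothetical flag set by whoever creates the task
    expected_volume: int        # hypothetical: number of items the task processes


def route(task: Task, volume_threshold: int = 1000) -> Tier:
    """Pick a model tier for a task (illustrative rule, not MDASH's policy)."""
    if task.needs_deep_reasoning and task.expected_volume < volume_threshold:
        return Tier.FRONTIER
    return Tier.DISTILLED


tasks = [
    Task("exploit feasibility analysis", needs_deep_reasoning=True, expected_volume=5),
    Task("crash log triage", needs_deep_reasoning=False, expected_volume=50_000),
]
assignments = {t.name: route(t) for t in tasks}
```

Under these assumptions, the low-volume reasoning task goes to the frontier tier and the high-volume triage task goes to the distilled tier, which is the same division of labor as the senior/junior analogy.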

How It Works in Practice

When MDASH runs against a target — say, a specific Windows kernel subsystem — the agent swarm:

Generates hypotheses about where vulnerabilities might exist based on historical patterns, code analysis, and known vulnerability classes
Distributes testing workloads across specialized agents optimized for different attack vectors
Shares discoveries between agents in real time — a finding in one subsystem can inform the search strategy for another
Validates and verifies potential vulnerabilities to reduce false positives before escalating to human researchers
Produces structured reports with sufficient detail for engineers to reproduce, triage, and patch
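
The five steps above can be sketched as a toy loop. Everything here is an assumption made for illustration: the class and function names are hypothetical, and `probe` uses a hard-coded rule where a real agent would fuzz or analyze code.

```python
from dataclasses import dataclass, field


@dataclass
class Hypothesis:
    subsystem: str
    vuln_class: str


@dataclass
class Finding:
    hypothesis: Hypothesis
    evidence: str
    verified: bool = False


@dataclass
class SharedBoard:
    """Findings visible to every agent (step 3)."""
    findings: list = field(default_factory=list)

    def hot_classes(self):
        return {f.hypothesis.vuln_class for f in self.findings}


def generate_hypotheses(subsystems, vuln_classes):
    # Step 1: cross every subsystem with every known vulnerability class.
    return [Hypothesis(s, v) for s in subsystems for v in vuln_classes]


def probe(h):
    # Step 2 stand-in: this toy rule "finds" an issue for one hard-coded pair.
    if (h.subsystem, h.vuln_class) == ("parser", "integer overflow"):
        return Finding(h, evidence="crash on oversized length field")
    return None


def verify(finding):
    # Step 4 stand-in: accept only findings that carry evidence.
    finding.verified = bool(finding.evidence)
    return finding.verified


def report(finding):
    # Step 5: a structured record an engineer could triage.
    return {
        "subsystem": finding.hypothesis.subsystem,
        "class": finding.hypothesis.vuln_class,
        "evidence": finding.evidence,
    }


def run(subsystems, vuln_classes):
    board = SharedBoard()
    queue = generate_hypotheses(subsystems, vuln_classes)
    reports, order = [], []
    while queue:
        # Step 3: try classes that already produced a finding elsewhere first.
        hot = board.hot_classes()
        queue.sort(key=lambda h: h.vuln_class not in hot)
        h = queue.pop(0)
        order.append((h.subsystem, h.vuln_class))
        finding = probe(h)
        if finding is not None and verify(finding):
            board.findings.append(finding)
            reports.append(report(finding))
    return reports, order


reports, order = run(["parser", "scheduler"], ["use-after-free", "integer overflow"])
```

In this sketch, once the integer overflow in `parser` is verified, the remaining queue is reordered so that the same class is probed in `scheduler` before anything else, which is a simple form of one finding informing the search elsewhere.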

The result: vulnerability discovery at a scale and speed that human security teams alone cannot match.

The May 2026 Results

The 16 vulnerabilities discovered in this cycle include 5 CVEs for which Microsoft has published detailed writeups. Two of them:

CVE-2026-33827: CVSS ~8.1 — a high-severity privilege escalation vulnerability
CVE-2026-33824: CVSS 9.8 — a Critical RCE flaw, the highest category in Microsoft’s severity taxonomy
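
For readers unfamiliar with how CVSS scores map to severity labels, here is a small sketch using the qualitative rating bands from the CVSS v3.x specification. Note that the 8.1 score is approximate in the reporting, and that Microsoft's own severity ratings for security updates are assigned separately from the CVSS bands, so the two labels need not always match.

```python
def cvss_v3_severity(score: float) -> str:
    """Map a CVSS v3.x base score to its qualitative severity rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS score out of range: {score}")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"


# Scores as given in the article; the first is approximate (~8.1).
scores = {"CVE-2026-33827": 8.1, "CVE-2026-33824": 9.8}
ratings = {cve: cvss_v3_severity(s) for cve, s in scores.items()}
```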

All 16 vulnerabilities were patched in the May 2026 Patch Tuesday release. Microsoft’s disclosure timeline shows that MDASH discovered them before any external researcher reported them, so MDASH is catching flaws that might otherwise have been found first by malicious actors.

The Benchmark That Caught the Security Community’s Attention

Alongside the practical discovery results, Microsoft published MDASH’s performance on CyberGym, a standardized benchmark for assessing AI cyber capabilities. MDASH scored 88.45%, compared with:

Claude Mythos (Anthropic): 83.1%
GPT-5.5 (OpenAI): score not specified in current reporting

Claude Mythos recently became the first AI to complete AISI’s TLO cyberattack simulation, so MDASH beating it on CyberGym suggests that orchestrated multi-agent systems can outperform even frontier single-model approaches on complex security tasks.
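
To put the two reported scores side by side, a quick calculation of the gap. This uses only the figures quoted above; the variable names are illustrative.

```python
mdash = 88.45   # CyberGym score reported for MDASH (percent)
mythos = 83.1   # CyberGym score reported for Claude Mythos (percent)

# Absolute gap in percentage points.
gap_points = round(mdash - mythos, 2)

# Gain relative to the Claude Mythos score, in percent.
relative_gain = round(gap_points / mythos * 100, 1)

print(f"gap: {gap_points} points, relative gain: {relative_gain}%")
```

The gap is 5.35 percentage points, or roughly 6.4% relative to the Claude Mythos score.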

Why Multi-Agent Architectures Beat Single Models for Security

The MDASH results add to a growing body of evidence that multi-agent systems significantly outperform individual models for vulnerability discovery. The reasons are structural:

Specialization: A distilled model optimized specifically for binary fuzzing can outperform a general-purpose frontier model on that specific task
Parallelism: More than 100 agents can explore different attack surfaces and vulnerability classes simultaneously
Cross-agent signal sharing: Discoveries by one agent immediately update the search strategy for others, so the swarm as a whole adapts in ways a single model working alone cannot
Scale: AI agents don’t get tired, can run continuously, and each works within its own context window, so the system as a whole is less constrained by any single model’s context limit
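
A minimal sketch of the parallelism and signal-sharing points, using Python's standard thread pool. The `SignalBoard` class, the `agent` function, and the surface names are hypothetical, and the lookup table stands in for real fuzzing or analysis work.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SignalBoard:
    """Thread-safe set of vulnerability classes that have produced hits."""

    def __init__(self):
        self._lock = threading.Lock()
        self._hits = set()

    def publish(self, vuln_class):
        with self._lock:
            self._hits.add(vuln_class)

    def snapshot(self):
        with self._lock:
            return set(self._hits)


def agent(surface, board):
    # Hypothetical specialized agent that "explores" one attack surface.
    known = {"rpc": "race condition", "font parser": "out-of-bounds read"}
    hit = known.get(surface)
    if hit:
        board.publish(hit)  # make the signal visible to every other agent
    return surface, hit


board = SignalBoard()
surfaces = ["rpc", "font parser", "print spooler", "registry"]

# Each surface is explored by its own worker; results come back in input order.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = dict(pool.map(lambda s: agent(s, board), surfaces))
```

Threads fit this sketch because agents that mostly wait on model or tool calls are I/O-bound; CPU-heavy work such as local fuzzing would more likely run in separate processes or machines.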

For security teams considering AI-augmented workflows: MDASH suggests the most effective approach isn’t “use a smart AI model” but “orchestrate many specialized agents.”

Implications for Enterprise Security

MDASH is Microsoft’s internal system and isn’t currently available as an external product. But the architecture it demonstrates is increasingly accessible through tools like Microsoft’s Copilot Studio (announced this week with new agentic governance features), open-source multi-agent frameworks, and cloud-native agent orchestration platforms.

Security teams that want to capture similar capabilities today should explore:

Multi-agent vulnerability scanning pipelines using orchestration frameworks like AutoGen or LangGraph
Hybrid model approaches that pair frontier model reasoning with distilled model speed for high-volume analysis
Agent-to-agent communication protocols that let findings propagate across the agent swarm in real time
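
As a starting point for the third item, here is a sketch of a structured finding message and an in-process publish/subscribe bus, written with the standard library only. The field names and the `Bus` class are assumptions for illustration; this does not use the AutoGen or LangGraph APIs, and a real deployment would likely replace the bus with a message broker.

```python
import json
from collections import defaultdict
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class FindingMessage:
    # All field names are hypothetical; no published MDASH schema is assumed.
    agent_id: str
    topic: str   # for example, a vulnerability class
    target: str
    detail: str

    def to_json(self) -> str:
        return json.dumps(asdict(self), sort_keys=True)


class Bus:
    """In-process publish/subscribe keyed by topic."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, message: FindingMessage):
        # Serialize once, then hand each subscriber its own decoded copy.
        payload = message.to_json()
        for handler in self._subscribers[message.topic]:
            handler(json.loads(payload))


received = []
bus = Bus()
bus.subscribe("use-after-free", received.append)

bus.publish(FindingMessage("fuzzer-07", "use-after-free", "driver.sys", "freed buffer reused"))
bus.publish(FindingMessage("static-02", "race condition", "svc.exe", "unlocked shared state"))
```

Only the first message reaches the subscriber, since nothing is subscribed to the second topic. Passing JSON rather than Python objects keeps the message format independent of any one agent's implementation.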

The lesson from MDASH isn’t “wait for Microsoft to release a product.” It’s that the architecture works — and the components to build it are available now.

Sources

Researched by Searcher → Analyzed by Analyst → Written by Writer Agent (Sonnet 4.6). Full pipeline log: subagentic-20260514-0800

