When an AI Code Assistant Reviewed Its Own Work and Missed a Critical MCP Server JWT Time-bomb
Intro: The Audit That Looked Perfect—Until It Wasn’t
An off-the-shelf AI code assistant combed through our MCP server codebase, produced a long, polished report, and stamped the project “enterprise-grade secure.” Moments later, the DryRun Security analysis engine flagged two vulnerabilities, one of them a complete authentication bypass. If you’re betting prod uptime on generic AI code review, keep reading.
Executive Summary
This briefing compares a major AI leader’s security review of the DryRun Security Insights MCP application with the findings from DryRun Security’s own analysis engine. The AI review tool produced an impressively detailed analysis yet missed two severe issues in JWT verification and OIDC endpoint communication. The miss underscores the danger of over-relying on generic AI for security-critical reviews and highlights the need for specialized tooling like DryRun Security.
Main Themes & Key Findings
- Leading AI Coding Agent Capabilities
  - Requested an in-depth inspection of all project artifacts.
  - Correctly summarized:
    - Application behavior (AI-powered security analysis for code repositories).
    - Technology stack (Python 3.12+, FastMCP, Starlette, uvicorn, etc.).
    - Data stores & integrations (AWS S3, Pinecone, DynamoDB, OIDC, JWKS).
    - Security architecture (JWT validation, secret management, audit logging, multi-stage Docker builds).
  - Concluded that the app “demonstrates enterprise-grade security practices.”
- Critical AI Blind Spots
  - Claimed “comprehensive token validation and signature verification.”
  - Missed two subtle vulnerabilities, rated Critical and High by DryRun Security, in code that had already “passed human review.”
{{table9}}
Why Generic AI Review Stumbles Here
Recent research from UTSA shows that LLM-generated code is prone to nuanced security slips that also evade the same models when they operate in review mode. The models lack executable context and cannot reason about attacker creativity at the token level.
This case study vividly demonstrates that while AI code review tools can perform extensive structural and functional analysis, they currently fall short at identifying complex, subtle, and context-dependent security vulnerabilities, particularly those introduced by AI-generated code.
The missed JWT algorithm-confusion and insecure OIDC communication flaws were “catastrophic” and highlight a critical gap in AI’s current security assessment capabilities. This reinforces the indispensable role of specialized security analysis tools like DryRun Security in providing a robust defense against sophisticated threats that can bypass both AI and human review.
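To make the first flaw concrete, here is a minimal sketch of the algorithm-confusion pattern in Python. It assumes token verification via the PyJWT library; the function names are hypothetical, and the actual MCP server code may differ in detail:

```python
import jwt  # PyJWT

def verify_token_vulnerable(token: str, public_key_pem: str) -> dict:
    # BAD: the "alg" header is attacker-controlled. Honoring it lets a
    # caller downgrade RS256 to HS256 (forging tokens signed with the
    # *public* key as the HMAC secret) or request "none" to skip the
    # signature check. Recent PyJWT releases mitigate some downgrades,
    # but the pattern itself remains a time bomb.
    header = jwt.get_unverified_header(token)
    return jwt.decode(token, public_key_pem, algorithms=[header["alg"]])

def verify_token_pinned(token: str, public_key_pem: str) -> dict:
    # GOOD: pin the expected algorithm server-side so the token header
    # can never switch verification to a weaker scheme.
    return jwt.decode(token, public_key_pem, algorithms=["RS256"])
```

The difference is a single argument, which is exactly why a review that merely confirms the presence of jwt.decode can miss it.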
Relying on AI for security-critical code requires extreme caution and continued human oversight, augmented by advanced, targeted security tooling.
How DryRun Security Closed the Gap
The DryRun Security analysis engine pairs advanced inspection with context-aware attack simulations, a strong fit for the security-critical paths of an MCP server:
- Detects header-controlled algorithm switches.
- Flags network calls that bypass strict TLS (see the sketch after this list).
- Maps findings to exploit likelihood, not just pattern matching.
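For the TLS point above, here is a hedged sketch of what an insecure OIDC discovery call can look like in Python, using httpx; the URL and function names are illustrative, not the project’s actual code:

```python
import httpx

OIDC_DISCOVERY_URL = "https://idp.example.com/.well-known/openid-configuration"

def fetch_oidc_config_insecure() -> dict:
    # BAD: verify=False disables certificate validation, so a
    # man-in-the-middle can serve attacker-controlled JWKS endpoints
    # and signing keys.
    resp = httpx.get(OIDC_DISCOVERY_URL, verify=False)
    return resp.json()

def fetch_oidc_config_strict() -> dict:
    # GOOD: strict TLS (httpx's default) validates the IdP's
    # certificate chain; raise_for_status() surfaces transport and
    # HTTP failures loudly instead of silently degrading.
    resp = httpx.get(OIDC_DISCOVERY_URL)  # verify=True is the default
    resp.raise_for_status()
    return resp.json()
```

Once again, a single keyword argument separates a hardened call from an authentication bypass waiting to happen.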
The result: actionable alerts on critical issues before the commit hits main.
Ready to See What Your AI Missed?
Book a 30-minute demo and watch DryRun Security expose the gaps your code assistant leaves behind—no repo migration, no configuration gymnastics.
Protect your MCP server. Trust, but verify with DryRun Security!