If you only need strict style compliance, a linter will do. If you need to surface context‑dependent security flaws, you need more than the deterministic pattern matching found in traditional SAST.
AppSec teams keep echoing the same mandate: don’t chase theoretical perfection—deliver signal that matters. They want scanners that surface logic flaws nobody could spot before, not dashboards full of ghost bugs.
Developers already dread legacy SAST triage, and security engineers are burned out from patching brittle rules. Underneath all of this sits the debate between determinism and probabilism and their places in application security. In this post, I'd like to offer my thoughts on why this debate matters and how we're addressing it at DryRun Security.
Deterministic SAST: Strengths and Limits
Determinism assumes identical outputs for identical inputs. Classic SAST tools such as Fortify, Checkmarx, and Veracode encode this principle through rule engines and pattern matching.
The approach yields repeatable results, which is valuable for quality‑assurance checkpoints that cannot tolerate variance. Yet software development is fundamentally creative—even with AI‑assisted coding—so pattern matching quickly falters once the code drifts beyond narrowly defined rules.
Where deterministic SAST excels:
- Syntax and style violations
- Blocking banned functions and libraries
- Simple API misuse in monolithic codebases
Where deterministic SAST falls short:
- Microservices and serverless functions
- Multi‑repo applications or multi‑app monorepos
- Cross‑system logic, authorization, and business‑workflow flaws
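The determinism in question is easy to see in miniature. The sketch below is a toy rule engine (the rules and messages are hypothetical, not any vendor's actual implementation): it always returns the same findings for the same input, and it finds nothing its patterns don't literally describe.

```python
import re

# Hypothetical rule set in the style of a deterministic SAST engine:
# each rule is a regex plus a finding message. Identical input always
# yields identical findings.
RULES = [
    (re.compile(r"\bstrcpy\s*\("), "Use of banned function strcpy"),
    (re.compile(r"\beval\s*\("), "Use of banned function eval"),
]

def scan(source: str) -> list[str]:
    """Return every rule violation found in the given source text."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for pattern, message in RULES:
            if pattern.search(line):
                findings.append(f"line {lineno}: {message}")
    return findings

code = "result = eval(user_input)\nsafe = compute(user_input)\n"
print(scan(code))  # same input, same single finding, every run
```

The repeatability is the feature: a CI gate built on this never flaps. The limitation is equally visible: a logic flaw that doesn't match any regex simply doesn't exist as far as the scanner is concerned.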
The Accuracy Gap
Seasoned reviewers know deterministic SAST has never achieved perfect precision or recall. Rule sets flood dashboards with false positives while overlooking deeper issues. AppSec teams spend countless cycles updating those rules—often just to keep pace with the latest JavaScript/TypeScript craze or a fresh take on Spring.
Inaccuracy is almost guaranteed, because no generic rule set can fully understand proprietary libraries or domain‑specific frameworks.
As architectures have grown more distributed, the gulf between what deterministic SAST can model and what modern software actually does has only widened.
Key drivers of inaccuracy:
- Service decomposition
A single request can touch multiple repositories and runtimes.
- Externalized authorization
Access checks often live in shared libraries outside the local codebase.
- Business‑logic flaws
Vulnerabilities such as broken object‑level authorization rarely follow consistent syntactic patterns.
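To make the last point concrete, here is a hypothetical pair of handlers (illustrative names and a toy in-memory data layer, not real framework code) showing why broken object-level authorization has no syntactic signature a generic rule set could key on:

```python
class Db:
    """Tiny in-memory stand-in for a data-access layer (illustrative)."""
    def __init__(self, invoices):
        self.invoices = invoices  # list of row dicts

    def fetch_invoice(self, invoice_id, owner_id=None):
        for row in self.invoices:
            if row["id"] == invoice_id and (
                owner_id is None or row["owner_id"] == owner_id
            ):
                return row
        return None

def get_invoice(db, current_user_id, invoice_id):
    # Vulnerable (BOLA): the lookup ignores the requesting user, so
    # any authenticated caller can read any invoice.
    return db.fetch_invoice(invoice_id)

def get_invoice_fixed(db, current_user_id, invoice_id):
    # Fixed: ownership is enforced in the lookup. The two handlers are
    # syntactically near-identical -- no banned function, no tainted
    # sink, just a missing predicate only context can reveal.
    return db.fetch_invoice(invoice_id, owner_id=current_user_id)
```

Whether the first version is a critical flaw or perfectly fine depends on where authorization happens elsewhere in the system, which is exactly the kind of question a pattern matcher cannot ask.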
These accuracy gaps push teams into perpetual rule‑tuning cycles—operational toil that siphons time from real security improvements.
OWASP has repeatedly highlighted this maintenance burden. The OWASP Software Assurance Maturity Model (SAMM) estimates that maintaining SAST rule sets can consume up to 20 percent of an AppSec engineer’s time for every release—a figure many practitioners consider conservative. A 2023 session at OWASP Global AppSec Dublin, “SAST Rule Maintainability in DevSecOps,” captured the pain succinctly: “Current SAST tools are limited… and produce high numbers of false positives.” This echoes OWASP’s broader assessment of source‑code analysis (OWASP).
This burden exists because traditional tools can only follow the strict recipe cards they’ve been using for years, and every new variant and threat demands yet another card. To break out, we need to throw away the recipe cards and hire a skilled chef—one who reads the context of the restaurant (ingredients, customers, tools) and creates the best dish possible every time.
Probabilistic, AI‑Native Analysis
DryRun Security’s Contextual Security Analysis (CSA) adopts probabilistic techniques instead of static rules. By weighting context, relevance, validation likelihood, and continuous feedback, CSA applies advances in natural‑language processing to the unique challenges of secure code review—an approach rooted in the founders’ years of AppSec training and GitHub code‑review leadership.
We introduced this paradigm in 2023 with the publication of our Contextual Security Analysis Guide. From day one, DryRun has been AI‑first and AI‑native (we were founded after the transformer breakthrough), so we never had to retrofit a language model onto brittle pattern engines.
Because CSA was born in the LLM era, we systematically evaluated every major model, fine‑tuned domain‑specific variants, and built a library of Natural‑Language Code Policies (NLCPs) capable of reasoning across repositories.
We do not hide a legacy pattern matcher beneath a thin AI veneer; our pipeline is probabilistic from invocation to outcome.
Early experimentation taught us how to balance token budgets, model specialization, and human‑in‑the‑loop validation. Those lessons shaped the accuracy framework below.
Our Accuracy Framework
- Scoped context windows
Segment code into coherent chunks to preserve intent without exceeding token limits.
- Multi‑pass pipelines
Initial passes surface candidates; subsequent passes perform semantic validation.
- Quantitative evaluation
A regression harness measures every pipeline change.
- Agent‑based validation
Specialized model agents cross‑check one another’s conclusions.
- Model specialization
Each task routes to the language model that offers the best cost‑accuracy balance.
- Code‑aware queries
Agents navigate repositories to confirm security‑critical patterns, such as proper authorization checks.
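As an illustration of the first item, here is a minimal chunking sketch. The word-count token estimate and the top-level-definition boundary heuristic are simplifying assumptions; the real pipeline's segmentation is considerably more sophisticated.

```python
def chunk_source(source: str, max_tokens: int = 200) -> list[str]:
    """Split source into coherent chunks that fit a token budget.

    Chunks break at top-level definitions (a crude proxy for semantic
    units) or when the budget would be exceeded.
    """
    def est(text: str) -> int:
        return len(text.split())  # crude stand-in for a real tokenizer

    chunks, current = [], []
    for line in source.splitlines():
        starts_unit = line.startswith(("def ", "class "))
        over_budget = est("\n".join(current + [line])) > max_tokens
        if current and (starts_unit or over_budget):
            chunks.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current))
    return chunks
```

The point of chunking this way is that each chunk hands the model a self-contained unit of intent, rather than an arbitrary window that slices a function in half.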
Does Probabilistic Mean Less Certain?
“Probabilistic” does not mean “random.” It means each potential finding is scored by likelihood and then vetted by companion agents. Instead of a brittle yes/no rule, CSA delivers a confidence‑weighted verdict you can sort, filter, and dispute—complete with the contextual evidence that drove the call.
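A rough sketch of what a confidence-weighted finding can look like in practice follows; the field names and triage threshold are hypothetical, not DryRun's actual schema.

```python
from dataclasses import dataclass, field

@dataclass
class Finding:
    """One confidence-weighted finding with its supporting context."""
    title: str
    confidence: float                       # 0.0-1.0 likelihood score
    evidence: list[str] = field(default_factory=list)

def triage(findings, threshold=0.7):
    """Keep findings at or above the threshold, highest confidence first."""
    kept = [f for f in findings if f.confidence >= threshold]
    return sorted(kept, key=lambda f: f.confidence, reverse=True)

results = triage([
    Finding("Missing authz check on invoice route", 0.92,
            ["lookup lacks owner filter", "shared authz lib not invoked"]),
    Finding("Possible open redirect", 0.55,
            ["redirect target is user-supplied"]),
])
print([f.title for f in results])
```

Because each verdict carries a score and its evidence, teams can tune the threshold per repository or severity instead of living with a rule's binary yes/no.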
The payoff is visible in the numbers. In the 2025 SAST Accuracy Report, CSA surfaced 88 percent of critical vulnerabilities versus 45 percent for the best deterministic scanner. That 43‑point swing represents entire classes of business‑logic flaws finally appearing on dashboards instead of in post‑incident write‑ups.
For mature AppSec programs, adopting CSA is a pragmatic upgrade, not a moon‑shot experiment. It plugs into existing CI pipelines, replaces the noisiest step, and immediately cuts toil for both developers and security engineers. Lower noise, higher recall, and evidence you can act on—hard to call that anything but a sure bet.
Toward Evidence‑Based Security
The conversation is shifting from “prove your rules cover everything” to “prove your results with data.” Probabilistic methods can adapt alongside evolving codebases and attacker tactics, offering a sustainable path to better outcomes.
For a full breakdown of metrics and methodology, download the 2025 SAST Accuracy Report. The report compares CSA with leading pattern‑matching SAST tools and provides complete benchmark data—proof that in application security, context beats patterns every time.