Anatomy of an autonomous pentest scan

A technical walkthrough of Sekura's seven-phase multi-agent pipeline: SAST, recon, dynamic probing, exploit synthesis, chain analysis, post-quantum review, and reporting, with examples of what each phase produces.

A scan that cannot prove exploitation has not finished its job.

Most automated security tools stop at detection. They identify something that resembles a vulnerability pattern, file a finding, and stop there. We built Sekura differently. The entire pipeline is oriented toward one outcome: a working proof-of-exploit, or silence.

The Seven Phases

Every Sekura scan runs through seven phases, ordered by dependency. Each phase hands its output downstream. The order is not arbitrary: each phase narrows the search space so that by the time we reach exploit synthesis, we are working with a short list of high-confidence hypotheses, not a long list of theoretical candidates.

The Pipeline in Motion

The phases share state through a central evidence store, and some run concurrently where dependencies allow. SAST and recon start in parallel because neither consumes the other's output. Exploit synthesis waits until dynamic probing has confirmed a hypothesis. The post-quantum review runs independently and joins at reporting.

Here is what the data flow looks like for a single scan:

sequenceDiagram
    participant SAST as Phase 1: SAST
    participant Recon as Phase 2: Recon
    participant Probe as Phase 3: Dynamic Probing
    participant Synth as Phase 4: Exploit Synthesis
    participant Chain as Phase 5: Chain Analysis
    participant PQC as Phase 6: PQC Review
    participant Report as Phase 7: Reporting
    SAST->>Probe: candidate findings and call graph
    Recon->>Probe: live attack surface map
    Probe->>Synth: confirmed hypotheses
    Synth->>Chain: proven exploits
    Synth-->>Report: proven unchained exploits
    Chain->>Report: chained exploit sequences
    PQC->>Report: cryptographic risk flags
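One way to read the diagram is as a dependency graph: each arrow says which phase consumes which output. The sketch below is not Sekura's scheduler. It is a minimal illustration, using Python's standard `graphlib`, of how those arrows determine which phases can start together. The phase keys and the `execution_waves` helper are names assumed for this example.

```python
from graphlib import TopologicalSorter

# Each phase maps to the phases whose output it consumes.
# Edges are taken from the arrows in the sequence diagram;
# the dictionary keys are illustrative names, not Sekura identifiers.
DEPENDS_ON = {
    "sast": set(),
    "recon": set(),
    "pqc_review": set(),
    "dynamic_probing": {"sast", "recon"},
    "exploit_synthesis": {"dynamic_probing"},
    "chain_analysis": {"exploit_synthesis"},
    "reporting": {"exploit_synthesis", "chain_analysis", "pqc_review"},
}


def execution_waves(depends_on):
    """Group phases into waves; phases in one wave have no unmet dependencies."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything runnable right now
        waves.append(ready)
        sorter.done(*ready)  # mark the wave finished, unlocking its consumers
    return waves


for number, wave in enumerate(execution_waves(DEPENDS_ON), start=1):
    print(number, wave)
```

Under these assumed edges, SAST, recon, and the post-quantum review fall into the first wave, and reporting is the only phase in the last one, which matches the description above.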

What Exploit Synthesis Produces

The output of phase four is not a CVSS score. It is a runnable proof.

For a SQL injection finding, the synthesis agent constructs something like this:

curl -s "https://target.example.com/api/users/search?q=1%27%20OR%201%3D1--" \
  -H "Accept: application/json" | jq '.users[0]'

And the scan captures the actual response:

{
  "id": 1,
  "email": "[email protected]",
  "role": "admin",
  "password_hash": "$2b$12$aB3kP9xR..."
}

We attach the exact HTTP request, the live response excerpt, and the affected parameter. You can reproduce it yourself in under thirty seconds. That is the point.
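To make the three attached pieces concrete, here is a minimal sketch of a proof record that stores the request target, the parameter, the raw payload, and a response excerpt, and renders the reproducible command from them. The `ProofOfExploit` class and its field names are assumptions for illustration, not Sekura's report schema.

```python
import shlex
from dataclasses import dataclass
from urllib.parse import quote


@dataclass(frozen=True)
class ProofOfExploit:
    # Hypothetical record layout; field names are illustrative only.
    base_url: str
    parameter: str
    payload: str            # raw, unencoded payload
    response_excerpt: str   # part of the live response kept as evidence

    def url(self) -> str:
        # Percent-encode every reserved character in the payload.
        return f"{self.base_url}?{self.parameter}={quote(self.payload, safe='')}"

    def to_curl(self) -> str:
        # shlex.quote keeps the command safe to paste into a POSIX shell.
        header = shlex.quote("Accept: application/json")
        return f"curl -s {shlex.quote(self.url())} -H {header}"


proof = ProofOfExploit(
    base_url="https://target.example.com/api/users/search",
    parameter="q",
    payload="1' OR 1=1--",
    response_excerpt='{"id": 1, "role": "admin"}',
)
print(proof.to_curl())
```

Storing the raw payload and encoding it only at render time keeps the evidence readable while still producing the same encoded query string as the command shown earlier.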

Phase five then asks whether this access extends further. In this example it did: the admin session was reachable from the same connection, turning a data disclosure into a full account takeover. The report surfaces a chained finding, not two separate medium-severity items.
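Chain analysis can be pictured as a path search: each proven exploit moves the attacker from one state to another, and a chain is a path from the starting state to a high-impact one. The sketch below uses a plain breadth-first search to show the idea. The `ProvenExploit` type, the state labels, and the exploit names are hypothetical, and the source does not say which search Sekura actually uses.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class ProvenExploit:
    name: str
    requires: str  # attacker state needed before the exploit works
    grants: str    # attacker state after the exploit succeeds


def find_chain(exploits, start, goal):
    """Return the shortest list of exploits leading from start to goal, or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if state == goal:
            return path
        for exploit in exploits:
            if exploit.requires == state and exploit.grants not in seen:
                seen.add(exploit.grants)
                queue.append((exploit.grants, path + [exploit]))
    return None


# Illustrative states loosely modelled on the example in the text.
exploits = [
    ProvenExploit("sql-injection-user-search", "unauthenticated", "user-table-read"),
    ProvenExploit("admin-session-reuse", "user-table-read", "admin-account-takeover"),
]
chain = find_chain(exploits, "unauthenticated", "admin-account-takeover")
print([step.name for step in chain])
```

Reporting the path as one finding, not each edge separately, is what lets the report show the combined impact.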

This is where most teams are under-served by their current tooling. A list of findings with CVSS scores does not tell you what an attacker can actually reach in your environment right now.

Post-Quantum Review Is a Separate Track

Phase six runs against a different threat model: not what an attacker can do today, but what becomes practical when quantum-capable hardware matures.

We check for four categories:

  1. RSA keys under 4096 bits used for key exchange.
  2. ECDH on pre-quantum curves (P-256, P-384) in long-lived sessions.
  3. TLS 1.2 configurations, which cannot offer hybrid key exchange (X25519Kyber768 is a TLS 1.3 group).
  4. Symmetric keys hard-coded in source, regardless of algorithm or length.

These findings do not carry a proof-of-exploit. The report marks them clearly as forward-looking risks, not active compromises. We do not conflate the two, because conflating them makes both worse.
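The four categories above can be expressed as simple rules over an inventory of cryptographic usage. The sketch below assumes a hypothetical record format (a dictionary with keys such as `kind`, `bits`, `curve`, and `groups`) and hypothetical flag labels. It illustrates the rule shapes only and says nothing about how Sekura collects the inventory.

```python
HYBRID_GROUPS = {"X25519Kyber768"}
PRE_QUANTUM_CURVES = {"P-256", "P-384"}


def pqc_flags(use):
    """Return forward-looking risk labels for one cryptographic usage record.

    The record keys and the label strings are assumptions for this example.
    None of these labels implies a proof-of-exploit.
    """
    flags = []
    kind = use.get("kind")

    # 1. RSA keys under 4096 bits used for key exchange.
    if kind == "rsa_key_exchange" and use.get("bits", 0) < 4096:
        flags.append("rsa-key-exchange-under-4096")

    # 2. ECDH on pre-quantum curves in long-lived sessions.
    if kind == "ecdh" and use.get("curve") in PRE_QUANTUM_CURVES and use.get("long_lived", False):
        flags.append("pre-quantum-ecdh-long-lived")

    # 3. TLS 1.2 configurations without a hybrid key exchange group.
    offered = set(use.get("groups", []))
    if kind == "tls" and use.get("version") == "1.2" and not (HYBRID_GROUPS & offered):
        flags.append("tls-without-hybrid-key-exchange")

    # 4. Hard-coded symmetric keys, regardless of algorithm or length.
    if kind == "symmetric_key" and use.get("hard_coded", False):
        flags.append("hard-coded-symmetric-key")

    return flags


print(pqc_flags({"kind": "rsa_key_exchange", "bits": 2048}))
```

Keeping these labels in their own list, separate from proven exploits, mirrors the distinction the report draws between forward-looking risks and active compromises.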

The Report Is the Contract

Every finding in the report maps to a confirmed proof. Every proof is reproducible. Nothing ships without both halves.

Security tools have gotten good at generating volume. We are trying to do something different: generate a smaller document you can trust completely. A 40-finding report where every finding is confirmed is more useful than a 400-finding report where you do not know which ones are real.

The better question for any team is not how many findings the tool produced, but how many of those findings an attacker can actually use against you today. If your current tooling cannot answer that, the pipeline is not complete.

The shift from detection to proof is not a product feature. It is a different theory of what a security tool is for.

See the pipeline run against your own codebase at /poc/.