2026.06.10 · verification, uvm · 2 min read

A passing testbench proves nothing - until it can fail

Inside the Regex Accelerator UVM campaign: 66 tests, a hard protocol gate, three suspected RTL bugs that were not, and why the zero matters.

VoskenAI · Jun 10, 2026

The Regex Accelerator’s UVM campaign ended at 66 of 66 tests passing with zero RTL defects found. That sentence should make you suspicious. “All green, nothing wrong” is also what a vacuous testbench reports - one that drives no traffic, checks nothing, and passes by default. So this post is not about the 66; it is about the machinery that makes the zero believable.

Make the gate hard

A check that can be downgraded will be downgraded, eventually, by someone with a deadline. We bound Arm’s AXI protocol checker to both 512-bit AXI4 interfaces and configured its mandatory per-cycle checks as real errors - not warnings, no continue-on-failure flag, nothing soft. UVM scoreboards and 14 SVA bind modules run in the same regression with the same severity. A protocol violation anywhere in any test fails the run.
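
The shape of such a gate can be sketched in a few lines of Python. This is an illustration of the policy, not the campaign's actual regression scripts: the names `Finding`, `protocol_finding`, and `run_passes`, and the source label `"axi_protocol_checker"`, are all assumptions made for the example. The point is what the code leaves out: there is no waiver list and no parameter that could soften a protocol violation.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    INFO = "info"
    WARNING = "warning"
    ERROR = "error"


@dataclass(frozen=True)
class Finding:
    source: str          # hypothetical checker name
    severity: Severity
    message: str


def protocol_finding(message: str) -> Finding:
    """Record a protocol violation. The severity is fixed at ERROR:
    the function takes no argument that could downgrade it."""
    return Finding("axi_protocol_checker", Severity.ERROR, message)


def run_passes(findings: list[Finding]) -> bool:
    """Hard gate: one ERROR from any source, in any test, fails the run."""
    return not any(f.severity is Severity.ERROR for f in findings)
```

A gate like this only says that nothing failed. It does not say that anything was checked, which is a separate condition.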

The environment itself is non-trivial: three bus agents (AXI4-Lite control, AXI4 ingress, AXI4 match-ring responder with a memory model), two scoreboards, eight covergroups, and an interrupt monitor. Every test runs the real end-to-end flow - program the engine over the control interface, load a rule set into the runtime tables, stream bytes in, and check the match events that land in host memory.

Require the checkers to have worked

Passing is necessary, not sufficient. A test in this environment also fails if its scoreboards saw no transactions - a green result with idle checkers is treated as a broken test, not a good one. The strongest single data point in the campaign is an exact-count check at volume: a 2,048-byte adversarial stream producing exactly the 1,536 matches the golden model predicts. Not “roughly”, not “at least”. Exactly.
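
The two conditions in this section, "the checkers must have seen traffic" and "the count must be exact", can be sketched as a verdict function. This is a minimal Python illustration under stated assumptions: `ScoreboardReport`, `judge_test`, and the scoreboard names are hypothetical, and the real environment implements the equivalent logic inside its UVM components.

```python
from dataclasses import dataclass


@dataclass
class ScoreboardReport:
    name: str                  # hypothetical scoreboard name
    transactions_checked: int  # how many transactions the scoreboard compared
    mismatches: int            # how many of those disagreed with the model


def judge_test(scoreboards, expected_matches, observed_matches):
    """Return (passed, reason). A clean run with idle checkers is a failure."""
    if not scoreboards:
        return False, "no scoreboards reported"
    for sb in scoreboards:
        if sb.transactions_checked == 0:
            return False, f"{sb.name}: idle checker, vacuous pass"
        if sb.mismatches:
            return False, f"{sb.name}: {sb.mismatches} mismatches"
    # Exact equality against the golden model, not a threshold.
    if observed_matches != expected_matches:
        return False, f"match count {observed_matches} != expected {expected_matches}"
    return True, "ok"
```

With the numbers from the volume test, `judge_test(reports, 1536, 1536)` passes only if every scoreboard in `reports` compared at least one transaction, and a stream that yields 1,535 matches fails just as firmly as one that yields none.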

The three bugs that weren’t

During the campaign, three suspected RTL bugs were raised: an ingress FSM that looked stuck, a control interface that seemed to stall during ring drain, and an apparent 128-byte ingress limit. Each one was chased to a waveform, and each one turned out to live in the testbench, not the design.

It would have been easy to “fix” the RTL to make the symptoms disappear. The lineage discipline is what prevented that: every suspected bug had to be traced to a specific requirement and a specific signal before anyone was allowed to touch anything - and the trace kept ending inside the verification environment. The RTL was never modified. The testbench was, sixteen times.

That asymmetry is the finding. A verification campaign that hardens the testbench while the design holds is exactly what a disciplined generation pipeline should produce: the design decisions were made and reviewed upstream, so by the time UVM gets involved, what remains to find is mostly in the harness.

The honest ledger

Seventy-four of seventy-five requirements trace to passing tests. The seventy-fifth is a documented limitation, stated in the package rather than rounded up to 100%. We would rather ship a ledger with one named gap than a claim with none - you already know what kind of vendor behaviour a gapless claim predicts.
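
A ledger of this kind is easy to keep honest if the summary is built from counts and named gaps rather than a percentage. The Python sketch below shows one way to do that. The requirement IDs (`REQ-001` to `REQ-075`), the status strings, and the choice of `REQ-075` as the gap are placeholders for illustration, not the identifiers used in the release package.

```python
def summarize_ledger(status_by_requirement: dict[str, str]) -> str:
    """Report covered/total as integers and list every gap by name.

    No percentage is computed, so there is nothing to round up.
    """
    total = len(status_by_requirement)
    gaps = sorted(
        req for req, status in status_by_requirement.items() if status != "pass"
    )
    covered = total - len(gaps)
    summary = f"{covered} of {total} requirements trace to passing tests"
    if gaps:
        summary += "; documented gaps: " + ", ".join(gaps)
    return summary


# Placeholder ledger: 75 hypothetical requirement IDs, one marked as a limitation.
ledger = {f"REQ-{i:03d}": "pass" for i in range(1, 76)}
ledger["REQ-075"] = "limitation"
print(summarize_ledger(ledger))
# prints: 74 of 75 requirements trace to passing tests; documented gaps: REQ-075
```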

The full environment - tests, scoreboards, covergroups, SVA binds, and the verification plan they trace to - ships in the release package. As always: do not trust the summary. Open the package and check.
