Verification

Verification is where many agent workflows either become trustworthy or fall apart. A convincing explanation is not verification. A passing command, screenshot, or reproduced behavior is.

Agents do better when they have a feedback loop; tests, screenshots, and command results are that loop. At minimum, ask for:

  • relevant tests for behavior changes
  • formatting, linting, or type checks
  • screenshots or browser evidence for UI changes
  • logs or command output summaries for debugging work
  • an explicit statement of what was verified versus assumed

## Testing Rules
- Write or update tests when behavior changes.
- Run the relevant test command before finishing.
- For UI changes, capture visual evidence when possible.
- Never present an unrun check as if it passed.
- Say what remains unverified.
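One way to enforce "run the relevant checks before finishing" is to execute each check and record its real exit status and output, rather than asserting success. A minimal sketch, assuming Python is available; the demo commands are illustrative stand-ins for a project's actual test, lint, or type-check invocations:

```python
import subprocess
import sys

def run_check(name, cmd):
    """Run one check and return (name, passed, captured output)."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    return name, result.returncode == 0, result.stdout + result.stderr

# Placeholder commands; a real project would list its own
# pytest/lint/type-check invocations here.
checks = [
    ("python can import json", [sys.executable, "-c", "import json"]),
    ("deliberately failing check", [sys.executable, "-c", "raise SystemExit(1)"]),
]

for name, passed, output in (run_check(n, c) for n, c in checks):
    print(f"{name}: {'PASS' if passed else 'FAIL'}")
```

Because the result comes from the actual return code, a check that was never run simply has no entry, which makes it harder to present an unrun check as passing.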

For risky logic changes, request a test-first workflow explicitly:

Write the failing tests first, show that they fail, then implement until they pass.
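That workflow can be sketched in plain Python. The helper name `normalize_email` and its behavior are hypothetical, chosen only to illustrate the red-then-green sequence:

```python
# Step 1: write the test before the implementation exists.
def test_lowercases_and_strips():
    assert normalize_email("  Alice@Example.COM ") == "alice@example.com"

# Step 2: run it now -- it fails with a NameError, which is the
# "show that they fail" evidence the prompt asks for.

# Step 3: implement until the test passes.
def normalize_email(raw: str) -> str:
    return raw.strip().lower()

test_lowercases_and_strips()  # now passes silently
```

The failing run matters as much as the passing one: it proves the test can actually detect the behavior it claims to cover.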

Ask the agent to report back in two buckets:

  • What it actually ran, checked, or observed.
  • Anything it could not validate directly, such as environment-specific behavior or an unavailable integration.
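The two-bucket report can be kept as structured data and rendered at the end of a run. A minimal sketch; the field names and example entries are illustrative, not from the original text:

```python
# Two buckets: evidence actually gathered vs. things left unverified.
report = {
    "verified": [
        "ran the unit test suite -- all tests passed",
        "captured a screenshot of the login page after the change",
    ],
    "unverified": [
        "behavior against the production OAuth provider (no credentials locally)",
    ],
}

def render(report):
    lines = ["Verified:"]
    lines += [f"  - {item}" for item in report["verified"]]
    lines += ["Could not verify:"]
    lines += [f"  - {item}" for item in report["unverified"]]
    return "\n".join(lines)

print(render(report))
```

Keeping the "could not verify" bucket mandatory, even when empty, nudges the agent to state assumptions instead of silently omitting them.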

Before accepting a change, make sure you can answer:

  • Did the agent run the right checks, not just any check?
  • Did it add tests when the change affected behavior?
  • Did it verify the user-facing outcome, not only compile success?
  • Did it call out what remains uncertain?

For frontend work, screenshots are often the missing piece. A passing build is not enough if the page is visually wrong.

Coding agents fail most dangerously when they sound done before the work is truly validated. Verification turns a plausible change into a trustworthy one.