Snapshot testing
Use golden-file comparison to assert on CLI output format and catch accidental regressions when the output changes.
- Explain what snapshot testing is and when it is useful for CLIs
- Write a simple golden-string assertion for a CliRunner result
- Understand how to update a snapshot when the output intentionally changes
Unit tests check correctness: does the function return the right value? Snapshot tests check stability: did the output change? For CLIs, the output format is part of the contract. A tool that silently reorders columns, changes a status label, or drops a field breaks users who depend on that format — even if the underlying logic is unchanged.
What snapshot testing is
The pattern is simple:
- Run the command and record the exact output.
- Save that output as the expected value — the "golden file" or "snapshot."
- Future test runs compare the live output against the saved snapshot.
- When the output changes deliberately, update the snapshot; when it changes accidentally, the test fails.
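The four steps above can be sketched without any test framework. This is a minimal illustration under stated assumptions, not a library API: `check_snapshot`, its `update` flag, and the `status.txt` path are hypothetical names chosen for the example.

```python
import tempfile
from pathlib import Path


def check_snapshot(actual: str, snapshot: Path, update: bool = False) -> bool:
    """Return True when `actual` matches the stored snapshot.

    Hypothetical helper: it records the snapshot on first use, or
    overwrites it when `update` is set (a deliberate change).
    """
    if update or not snapshot.exists():
        snapshot.parent.mkdir(parents=True, exist_ok=True)
        snapshot.write_text(actual, encoding="utf-8")
        return True
    return actual == snapshot.read_text(encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    snap = Path(tmp) / "golden" / "status.txt"
    first_run = check_snapshot("status: ok\n", snap)              # records the snapshot
    unchanged = check_snapshot("status: ok\n", snap)              # same output: match
    regression = check_snapshot("status: OK\n", snap)             # label changed: mismatch
    accepted = check_snapshot("status: OK\n", snap, update=True)  # deliberate update
    after_update = check_snapshot("status: OK\n", snap)           # new baseline: match
```

Only `regression` is false here: the label changed and nobody asked for an update, which is exactly the case a snapshot test is meant to flag.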
The simplest form: a golden string
For short, deterministic output, the snapshot can live directly in the test as a string:
    from click.testing import CliRunner  # assumes a Click CLI; Typer provides typer.testing.CliRunner

    # report_cmd is the command under test; import it from your own package.

    EXPECTED = """\
    File Lines Blank
    main.py 12 2
    utils.py 8 0
    """


    def test_report_output():
        runner = CliRunner()
        result = runner.invoke(report_cmd, ["--dir", "/fixtures"])
        assert result.output == EXPECTED

If the report command adds a column, reorders rows, or renames a column, this test fails. The failure is a prompt to decide: was the change intentional? If yes, update EXPECTED. If no, fix the regression.
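When a golden-string test fails, a line-by-line diff makes it easier to judge whether the change was intentional. pytest generally prints a diff for failed string comparisons on its own; the sketch below shows the same idea with the standard library's `difflib`, which can help in custom failure messages or outside pytest. The `snapshot_diff` helper and the sample strings are hypothetical.

```python
import difflib


def snapshot_diff(expected: str, actual: str) -> str:
    """Return a unified diff between golden and live output ('' if identical)."""
    return "".join(
        difflib.unified_diff(
            expected.splitlines(keepends=True),
            actual.splitlines(keepends=True),
            fromfile="expected",
            tofile="actual",
        )
    )


expected = "File Lines Blank\nmain.py 12 2\nutils.py 8 0\n"
actual = "File Lines Blank\nmain.py 12 2\nutils.py 9 0\n"

# Non-empty diff: the utils.py row changed between the two strings.
diff = snapshot_diff(expected, actual)
print(diff)
```

In a test, `assert not diff, diff` would fail with the diff itself as the message.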
Golden files on disk
For longer output, the snapshot lives in a file:
    import pathlib

    GOLDEN = pathlib.Path("tests/golden/report.txt")


    def test_report_golden(tmp_path):
        runner = CliRunner()
        result = runner.invoke(report_cmd, ["--dir", str(tmp_path)])
        if not GOLDEN.exists():
            # First run: create the golden file (and its directory), then stop.
            GOLDEN.parent.mkdir(parents=True, exist_ok=True)
            GOLDEN.write_text(result.output)
            return
        assert result.output == GOLDEN.read_text()

To update a golden file after an intentional change, delete it and re-run the test once. That run recreates the file with the new output; subsequent runs compare against the new baseline. Because the run that creates the file passes without asserting anything, review the regenerated file before committing it.
Libraries like syrupy automate this workflow (update with --snapshot-update),
but the underlying concept is the same.
Snapshot tests are brittle for output that includes timestamps, durations,
or random values. Either strip those values before comparing
(re.sub(r"in \d+\.\d+s", "in X.Xs", output)) or use a different assertion
strategy for the dynamic parts. Snapshot the stable parts; assert the
dynamic parts separately.
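One way to apply this is to normalize the output before comparing it, then check the dynamic value with a looser assertion. The sketch below is illustrative only: the two patterns, the `normalize` helper, and the sample output are assumptions, so adjust them to whatever your CLI actually prints.

```python
import re

# Hypothetical patterns for values that differ on every run.
DURATION = re.compile(r"in \d+\.\d+s")
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}")


def normalize(output: str) -> str:
    """Replace run-specific values with fixed placeholders before comparing."""
    output = DURATION.sub("in X.Xs", output)
    return TIMESTAMP.sub("<TIMESTAMP>", output)


raw = "Scanned 2 files in 0.42s\nFinished at 2024-05-01T12:30:00\n"
stable = normalize(raw)

# Snapshot the stable part...
assert stable == "Scanned 2 files in X.Xs\nFinished at <TIMESTAMP>\n"

# ...and assert the dynamic part separately, with a loose check.
match = re.search(r"in (\d+\.\d+)s", raw)
assert match is not None and float(match.group(1)) >= 0
```

Keeping the placeholders in the snapshot still catches a regression where the duration or timestamp disappears from the output entirely.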
Where to go next
Next: lab — CLI coverage — write a test suite that covers the ds-tool
from the previous module: happy paths, error paths, stdin input, and env var
injection.