Retry in practice
Use the tenacity library to wrap a flaky function with exponential backoff and per-exception routing, and see it recover from transient failures automatically.
- Use the tenacity library to retry a flaky function with exponential backoff
- Distinguish retriable exceptions from permanent ones in the retry predicate
- Observe retry attempts and final success or failure in the output
The previous lesson explained the theory of exponential backoff, jitter, and transient vs permanent failures. Here you will see a working tenacity setup against a simulated flaky function, with deliberately short wait times so the demo runs quickly in the browser.
The setup
The demo defines two custom exceptions: TransientError (retriable) and
PermanentError (not). The flaky function fails with TransientError for a
configurable number of calls before succeeding. A second call deliberately raises
PermanentError to show that tenacity propagates it immediately without retrying.
Run the snippet and notice:
- Demo 1: three "retrying in …s" lines, then a success on the fourth call.
- Demo 2: the
PermanentErrorsurfaces immediately — the loop makes exactly one call and re-raises without sleeping.
Using the real tenacity library
In production code you would use tenacity directly:
from tenacity import (
retry,
wait_exponential,
stop_after_attempt,
retry_if_exception_type,
before_sleep_log,
)
import logging
logger = logging.getLogger(__name__)
@retry(
wait=wait_exponential(multiplier=1, min=1, max=60),
stop=stop_after_attempt(5),
retry=retry_if_exception_type(TransientError),
before_sleep=before_sleep_log(logger, logging.WARNING),
)
def fetch_report(url: str) -> dict:
response = requests.get(url, timeout=30)
response.raise_for_status()
return response.json()before_sleep_log logs each retry attempt to your logger automatically — no
manual print calls needed in production.
Cap your max_wait value. Without a ceiling, a cascade of failures can produce
waits of minutes or hours, leaving your pipeline stalled with no obvious sign of
activity. A maximum of 60 seconds is reasonable for most HTTP APIs; increase it
only for operations with inherently long recovery times.
Configuring the retry predicate
retry_if_exception_type is the most common predicate, but tenacity offers
others:
| Predicate | Retries when |
|---|---|
retry_if_exception_type(T) | exception is instance of T |
retry_if_result(predicate) | return value fails predicate(result) |
retry_if_exception_message(match=…) | exception message matches a pattern |
Combining predicates with | lets you retry on a 429 status code or a timeout
without needing a custom exception hierarchy.
Where to go next
Next: lab — harden a pipeline — put idempotency, atomic writes, and retry logic together to harden a brittle three-step pipeline that currently fails on almost every run.
Retry logic
Transient failures are a fact of life in networked pipelines. Learn when to retry, when not to, and how exponential backoff with jitter prevents a thundering herd.
Lab: harden a pipeline
Take a brittle three-step pipeline and make it production-ready — add idempotency checkpoints, atomic writes, exponential-backoff retry logic, and failure alerting.