Code of the Day
AdvancedRobust Pipelines

Retry in practice

Use the tenacity library to wrap a flaky function with exponential backoff and per-exception routing, and see it recover from transient failures automatically.

WorkflowAdvanced8 min read
Recommended first
By the end of this lesson you will be able to:
  • Use the tenacity library to retry a flaky function with exponential backoff
  • Distinguish retriable exceptions from permanent ones in the retry predicate
  • Observe retry attempts and final success or failure in the output

The previous lesson explained the theory of exponential backoff, jitter, and transient vs permanent failures. Here you will see a working tenacity setup against a simulated flaky function, with deliberately short wait times so the demo runs quickly in the browser.

The setup

The demo defines two custom exceptions: TransientError (retriable) and PermanentError (not). The flaky function fails with TransientError for a configurable number of calls before succeeding. A second call deliberately raises PermanentError to show that tenacity propagates it immediately without retrying.

Python — editable, runs in your browser

Run the snippet and notice:

  • Demo 1: three "retrying in …s" lines, then a success on the fourth call.
  • Demo 2: the PermanentError surfaces immediately — the loop makes exactly one call and re-raises without sleeping.

Using the real tenacity library

In production code you would use tenacity directly:

from tenacity import (
    retry,
    wait_exponential,
    stop_after_attempt,
    retry_if_exception_type,
    before_sleep_log,
)
import logging

logger = logging.getLogger(__name__)

@retry(
    wait=wait_exponential(multiplier=1, min=1, max=60),
    stop=stop_after_attempt(5),
    retry=retry_if_exception_type(TransientError),
    before_sleep=before_sleep_log(logger, logging.WARNING),
)
def fetch_report(url: str) -> dict:
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    return response.json()

before_sleep_log logs each retry attempt to your logger automatically — no manual print calls needed in production.

Cap your max_wait value. Without a ceiling, a cascade of failures can produce waits of minutes or hours, leaving your pipeline stalled with no obvious sign of activity. A maximum of 60 seconds is reasonable for most HTTP APIs; increase it only for operations with inherently long recovery times.

Configuring the retry predicate

retry_if_exception_type is the most common predicate, but tenacity offers others:

PredicateRetries when
retry_if_exception_type(T)exception is instance of T
retry_if_result(predicate)return value fails predicate(result)
retry_if_exception_message(match=…)exception message matches a pattern

Combining predicates with | lets you retry on a 429 status code or a timeout without needing a custom exception hierarchy.

Where to go next

Next: lab — harden a pipeline — put idempotency, atomic writes, and retry logic together to harden a brittle three-step pipeline that currently fails on almost every run.

Finished reading? Mark it complete to track your progress.

On this page