Retry in practice

Use the tenacity library to wrap a flaky function with exponential backoff and per-exception routing, and see it recover from transient failures automatically.

The previous lesson explained the theory of exponential backoff, jitter, and transient versus permanent failures. Here you will run a tenacity-style retry loop against a simulated flaky function, with deliberately short wait times so the demo finishes quickly in the browser, and then see the equivalent configuration with the real library.

The setup

The demo defines two custom exceptions: TransientError (retriable) and PermanentError (not). The flaky function raises TransientError on its first three calls and succeeds on the fourth. A second function always raises PermanentError, to show that the retry loop propagates it immediately without retrying, just as tenacity would.


import random
import time

# ── Simulate tenacity-style behaviour without the external package ──────────
# (Pyodide ships without tenacity; this replicates the core logic so the
# demo runs in-browser identically to what you would write with the real lib.)

class TransientError(Exception):
    pass

class PermanentError(Exception):
    pass

def retry_with_backoff(fn, max_attempts=5, base=0.05, max_wait=0.4,
                       retriable_types=(TransientError,)):
    """Exponential backoff with full jitter. Mirrors tenacity's core loop.

    Only exceptions in retriable_types are caught. Any other exception
    propagates immediately because no except clause matches it.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retriable_types as exc:
            if attempt == max_attempts:
                raise  # out of attempts: re-raise the last retriable error
            wait = random.uniform(0, min(base * (2 ** (attempt - 1)), max_wait))
            print(f"  attempt {attempt} failed ({exc}), retrying in {wait:.3f}s")
            time.sleep(wait)
# ── Flaky function — fails 3 times before succeeding ───────────────────────
_call_count = 0

def flaky_fetch():
    global _call_count
    _call_count += 1
    if _call_count <= 3:
        raise TransientError(f"connection reset (call {_call_count})")
    print(f"  attempt {_call_count} succeeded")
    return {"status": "ok", "records": 42}

# ── Demo 1: transient failure recovers ──────────────────────────────────────
print("=== Demo 1: recovers after 3 transient failures ===")
_call_count = 0
result = retry_with_backoff(flaky_fetch)
print("Result:", result)

# ── Demo 2: permanent failure propagates immediately ────────────────────────
print()
print("=== Demo 2: permanent failure is not retried ===")

_perm_count = 0

def always_fails():
    global _perm_count
    _perm_count += 1
    raise PermanentError("invalid API key")

try:
    retry_with_backoff(always_fails, retriable_types=(TransientError,))
except PermanentError as exc:
    print(f"Caught immediately after 1 attempt: {exc}")
    print(f"Total calls made: {_perm_count}")

Run the snippet and notice:

Demo 1: three "retrying in …s" lines, then a success on the fourth call.
Demo 2: the PermanentError surfaces immediately — the loop makes exactly one call and re-raises without sleeping.

Using the real tenacity library

In production code you would use tenacity directly:

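import logging

import requests
from tenacity import (
    retry,
    wait_exponential,
    stop_after_attempt,
    retry_if_exception_type,
    before_sleep_log,
)

logger = logging.getLogger(__name__)

class TransientError(Exception):
    """Retriable failure, as in the demo above."""

@retry(
    wait=wait_exponential(multiplier=1, min=1, max=60),
    stop=stop_after_attempt(5),
    retry=retry_if_exception_type(TransientError),
    before_sleep=before_sleep_log(logger, logging.WARNING),
)
def fetch_report(url: str) -> dict:
    try:
        response = requests.get(url, timeout=30)
    except (requests.ConnectionError, requests.Timeout) as exc:
        # Assumption: connection errors and timeouts count as transient.
        # Without this translation nothing raises TransientError and the
        # decorator would never retry.
        raise TransientError(str(exc)) from exc
    # HTTP error statuses raise requests.HTTPError, which is not retried here.
    response.raise_for_status()
    return response.json()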

before_sleep_log writes a log record to your logger before each retry's wait, so production code needs no manual print calls.

Cap the maximum wait (the max argument of wait_exponential, max_wait in the demo loop). Without a ceiling, exponential growth pushes individual waits past a minute within a handful of retries and past an hour if the attempt limit is generous, leaving your pipeline stalled with no obvious sign of activity. A maximum of 60 seconds is reasonable for most HTTP APIs; increase it only for operations with inherently long recovery times.
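
To make the effect of the cap concrete, the sketch below compares capped and uncapped schedules. It assumes the wait before retry n is multiplier * 2**(n - 1), clamped between the minimum and maximum; the exact formula may differ between tenacity versions, so treat the numbers as illustrative. backoff_schedule is a hypothetical helper, not part of tenacity.

```python
def backoff_schedule(attempts, multiplier=1.0, min_wait=1.0, max_wait=None):
    """Return the wait in seconds before each retry (no jitter)."""
    waits = []
    for n in range(1, attempts + 1):
        wait = multiplier * 2 ** (n - 1)   # assumed growth formula
        if max_wait is not None:
            wait = min(wait, max_wait)     # apply the ceiling
        waits.append(max(wait, min_wait))  # apply the floor
    return waits

uncapped = backoff_schedule(12)
capped = backoff_schedule(12, max_wait=60)

print("uncapped:", uncapped)
print("capped:  ", capped)
print(f"total uncapped: {sum(uncapped) / 60:.1f} min")
print(f"total capped:   {sum(capped) / 60:.1f} min")
```

Under these assumptions, twelve retries without a cap add up to more than an hour of waiting, while the 60-second ceiling keeps the total to roughly seven minutes.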

Configuring the retry predicate

retry_if_exception_type is the most common predicate, but tenacity offers others:

Predicate	Retries when
`retry_if_exception_type(T)`	the raised exception is an instance of `T`
`retry_if_result(predicate)`	`predicate(result)` returns true for the return value
`retry_if_exception_message(match=…)`	the exception message matches a pattern

Combining predicates with | lets you retry on a 429 status code or a timeout without needing a custom exception hierarchy.

Where to go next

Next: lab — harden a pipeline — put idempotency, atomic writes, and retry logic together to harden a brittle three-step pipeline that currently fails on almost every run.
