Implementing Retries and Timeouts

  • External services can be slow or unreliable, causing scripts to hang or fail unexpectedly.
  • Timeouts and retries help ensure your automation scripts remain responsive and resilient.

Timeouts

  • By default, requests applies no timeout, so a call can wait indefinitely for a response, which is risky in automation.
  • Use the timeout parameter with a single value for both connect and read, or a tuple (connect, read) for fine-grained control.
  • A ConnectTimeout is raised if the connection can’t be established in time; a ReadTimeout is raised if the server sends no data for longer than the read timeout.
import time

import requests

HTTPBIN_ENDPOINT = "https://httpbin.org"

delay_url = f"{HTTPBIN_ENDPOINT}/delay/5"  # Simulate a 5-second delay

start = time.perf_counter()

try:
    res = requests.get(delay_url, timeout=2)
    print(f"Completed in {time.perf_counter() - start:.2f}s, status {res.status_code}")
except (
    requests.exceptions.ConnectTimeout,
    requests.exceptions.ReadTimeout,
) as timeout_err:
    # The 2-second timeout is shorter than the 5-second delay,
    # so the read timeout normally fires and this branch runs
    print(f"Timeout after {time.perf_counter() - start:.2f}s: {timeout_err}")
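
The example above uses a single timeout value. As a minimal sketch of the tuple form, the helper below passes separate connect and read timeouts and reports which one fired. The name fetch_status, its default values, and the injectable get parameter are assumptions made for illustration; get exists only so the function can be exercised without a real network call.

```python
import requests


def fetch_status(url, connect_timeout=3.05, read_timeout=10, get=requests.get):
    """Return an (outcome, status_code) pair for a single GET request.

    Hypothetical helper: `get` defaults to requests.get and can be swapped
    for a stand-in callable when testing.
    """
    try:
        # A tuple is interpreted as (connect timeout, read timeout), in seconds
        res = get(url, timeout=(connect_timeout, read_timeout))
    except requests.exceptions.ConnectTimeout:
        # The connection could not be established within connect_timeout
        return ("connect_timeout", None)
    except requests.exceptions.ReadTimeout:
        # The server accepted the connection but sent no data within read_timeout
        return ("read_timeout", None)
    return ("ok", res.status_code)
```

Called as fetch_status(f"{HTTPBIN_ENDPOINT}/delay/5", read_timeout=2), this would be expected to report a read timeout, because the connection succeeds quickly but the response is delayed for longer than the read limit.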

Retries

  • Transient issues like network blips or server overloads may cause requests to fail temporarily.
  • Implement a simple retry loop that catches errors, retries on server-side (5xx) errors or network exceptions, and breaks on success or client errors.
  • Use a fixed delay between retries for simplicity, or an exponential backoff for a more robust approach.
  • Avoid retrying non-idempotent operations.
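import time

import requests

HTTPBIN_ENDPOINT = "https://httpbin.org"
flaky_url = f"{HTTPBIN_ENDPOINT}/status/200,500,503"  # Returns one of these statuses at random

max_retries = 3
delay = 2  # Fixed delay between attempts, in seconds

for attempt in range(1, max_retries + 1):
    print(f"Attempt {attempt}/{max_retries}...")

    try:
        res = requests.get(flaky_url, timeout=10)
        res.raise_for_status()
        print(f"Succeeded with status {res.status_code}")
        break
    except requests.exceptions.HTTPError as err:
        if err.response.status_code < 500:
            print(f"Failed with client error code {err.response.status_code}. Skipping retry.")
            break
        print(f"Failed with server error code {err.response.status_code}.")
    except (requests.exceptions.ConnectionError, requests.exceptions.Timeout) as err:
        # Network-level failures are usually transient, so retry them too
        print(f"Failed with network error: {err}")

    if attempt < max_retries:
        print(f"Waiting {delay}s before retry...")
        time.sleep(delay)
else:
    # Runs only when the loop finishes without hitting break
    print(f"All {max_retries} attempts failed!")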

Exponential Backoff with Jitter

  • Fixed delays can overwhelm a recovering server if many clients retry simultaneously.
  • Exponential backoff increases the wait time after each failure (e.g., 1s, 2s, 4s...).
  • Adding jitter (a small random offset) prevents synchronized retry spikes.
import random
import time

import requests

HTTPBIN_ENDPOINT = "https://httpbin.org"

def get_with_backoff(url, max_retries=3):
    delay = 1  # Base delay in seconds; doubles after each failure, capped at 30

    for attempt in range(1, max_retries + 1):
        print(f"Attempt {attempt}/{max_retries}...")

        try:
            res = requests.get(url, timeout=10)
            res.raise_for_status()
            print(f"Succeeded with status {res.status_code}")
            return res
        except requests.exceptions.HTTPError as err:
            if err.response.status_code < 500:
                print(f"Failed with client error code {err.response.status_code}. Skipping retry.")
                raise RuntimeError("Client error! Please review request.") from err
            print(f"  Failed with server error code {err.response.status_code}.")
        except (requests.exceptions.ConnectionError, requests.exceptions.Timeout) as err:
            print(f"  Failed with network error: {err}")

        # Sleep only if another attempt follows
        if attempt < max_retries:
            jitter = random.uniform(-0.1 * delay, 0.1 * delay)
            # delay = 1 -> jitter [-0.1, 0.1] -> wait between 0.9 and 1.1s
            # delay = 2 -> jitter [-0.2, 0.2] -> wait between 1.8 and 2.2s
            # delay = 4 -> jitter [-0.4, 0.4] -> wait between 3.6 and 4.4s
            wait = delay + jitter
            print(f"  Retrying in {wait:.2f}s")
            time.sleep(wait)
            delay = min(delay * 2, 30)
    raise RuntimeError(f"All retries to query {url} failed!")

try:
    res = get_with_backoff(
        f"{HTTPBIN_ENDPOINT}/status/503",
        max_retries=4,
    )
except RuntimeError as e:
    print(e)
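
To see the wait schedule on its own, without making any requests, the arithmetic can be pulled into a small generator. This is only a sketch: the name backoff_delays and its parameters (base, factor, cap, jitter_ratio, rng) are hypothetical, chosen to mirror the 1s, 2s, 4s... progression, the 30-second cap, and the ±10% jitter used in these notes.

```python
import random


def backoff_delays(retries, base=1.0, factor=2.0, cap=30.0, jitter_ratio=0.1, rng=None):
    """Yield one wait time per retry: exponential growth, capped, with +/- jitter."""
    if rng is None:
        rng = random.Random()
    delay = base
    for _ in range(retries):
        # Jitter is drawn from +/- jitter_ratio of the current nominal delay
        jitter = rng.uniform(-jitter_ratio * delay, jitter_ratio * delay)
        yield delay + jitter
        # Grow the nominal delay for the next retry, but never beyond the cap
        delay = min(delay * factor, cap)


# Print a sample schedule; exact values differ per run because of the jitter
for i, wait in enumerate(backoff_delays(retries=6), start=1):
    print(f"retry {i}: wait {wait:.2f}s")
```

With the defaults, the nominal delays are 1, 2, 4, 8, 16, and then 30 seconds once the cap applies, each shifted by up to 10% in either direction.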

Common Pitfalls & How to Avoid Them

  • Forgetting to set timeouts can cause scripts to hang indefinitely; always use timeout.
  • Retrying client errors (4xx) usually won’t help; only retry transient server errors (5xx) or network issues.
  • Retrying non-idempotent operations (e.g., POST) can cause duplicate actions; limit retries to idempotent methods such as GET, PUT, or DELETE.
  • Fixed retry delays can lead to synchronized retry spikes; use exponential backoff with jitter for production scenarios.
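
The retry rules in this list can be collected into a single decision function. The sketch below is one possible way to encode them, not a definitive policy: should_retry and IDEMPOTENT_METHODS are hypothetical names, and treating every 5xx status as transient is a simplifying assumption.

```python
# Methods defined as idempotent by the HTTP specification
IDEMPOTENT_METHODS = frozenset({"GET", "HEAD", "PUT", "DELETE", "OPTIONS", "TRACE"})


def should_retry(method, status_code=None, network_error=False):
    """Decide whether a failed request is worth retrying.

    method: HTTP method name, e.g. "GET"
    status_code: response status, or None if no response was received
    network_error: True if the failure was a connection error or timeout
    """
    # Never retry methods that may repeat a side effect (e.g. POST)
    if method.upper() not in IDEMPOTENT_METHODS:
        return False
    # Network-level failures are assumed to be transient
    if network_error:
        return True
    # Retry server errors (5xx); client errors (4xx) need a fixed request instead
    return status_code is not None and 500 <= status_code < 600
```

For example, should_retry("GET", 503) is True, while should_retry("POST", 503) and should_retry("GET", 404) are both False.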