Network calls flake. A request that succeeds 99 times out of 100 will fail eventually if you run it daily. The fix isn't to crash on the bad day — it's to retry.
The pattern: try the call. If it raises an exception, wait a bit, try again. Cap the number of attempts so you don't loop forever:
```python
import time

last_error = None
for attempt in range(1, 4):  # 3 attempts
    try:
        result = toolset.execute_action(Action.GMAIL_FETCH_EMAILS, {"max_results": 1})
        break  # success — exit the loop
    except Exception as e:
        last_error = e
        print(f"attempt {attempt} failed: {type(e).__name__}")
        time.sleep(2 ** attempt)  # 2s, 4s, 8s — exponential backoff
else:
    raise last_error  # all attempts failed — re-raise the last one
print("got result")
```

Why exponential — 2, 4, 8 — instead of a fixed 2 seconds each time?
Backoff. If the service is overloaded, hammering it every 2 seconds makes it worse. Doubling the wait gives the service room to recover. The first retry is fast (cheap to retry on a transient blip); successive retries get patient.
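To make the difference concrete, here is the wait schedule for three retries under each policy:

```python
# Wait schedule for three retries: fixed vs. exponential
fixed = [2] * 3
exponential = [2 ** attempt for attempt in range(1, 4)]

print(fixed)        # [2, 2, 2]
print(exponential)  # [2, 4, 8]
# Fixed keeps hitting an overloaded service at the same rate;
# exponential spaces the later retries further and further apart.
```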
And `for` / `else`?
Python quirk: an `else` on a `for` loop runs only if the loop completed without `break`. Here, success inside the `try` calls `break`. If no attempt succeeds, no `break`, so `else` runs — and we re-raise. Idiomatic for retry loops.
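A minimal illustration of the quirk outside the retry context (a toy `first_even` helper, purely for demonstration):

```python
def first_even(numbers):
    """Return the first even number, or None if there is none."""
    for n in numbers:
        if n % 2 == 0:
            break      # found one; the else below is skipped
    else:
        return None    # loop completed without break
    return n

print(first_even([1, 3, 4]))  # 4
print(first_even([1, 3, 5]))  # None
```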
```python
import time

last_error = None
for attempt in range(1, MAX_ATTEMPTS + 1):
    try:
        result = SOME_CALL()
        break
    except Exception as e:
        last_error = e
        print(f"attempt {attempt} failed: {type(e).__name__}")
        time.sleep(2 ** attempt)
else:
    raise last_error
```

Not every error should be retried. Retry helps when the failure is transient — likely to clear up if you wait. Don't retry on errors that would fail the same way every time.
| Error | Retry? | Why |
|---|---|---|
| `ConnectionError`, `TimeoutError` | yes | Network blip — usually clears |
| 429 Too Many Requests | yes | Rate-limited — wait then try again |
| 500 / 502 / 503 / 504 | yes | Service problem — usually clears |
| 401 Unauthorized | no | Credentials are wrong — won't fix itself |
| 403 Forbidden | no | Permission missing — won't fix itself |
| 400 Bad Request | no | Your call is malformed — fix the call |
| 404 Not Found | no | The thing doesn't exist — fix your reference |
When you can, catch the specific transient errors and re-raise everything else immediately. That avoids retrying on bugs.
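A sketch of that selective pattern, using Python's built-in exception types; the `fetch_with_retry` helper and its names are illustrative, not part of the snippet above:

```python
import time

# Transient errors worth retrying; anything else re-raises immediately.
TRANSIENT = (ConnectionError, TimeoutError)

def fetch_with_retry(fetch, max_attempts=3):
    """Call `fetch`, retrying only on transient errors."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except TRANSIENT as e:
            # Transient failure: log it, wait, and go around again.
            last_error = e
            print(f"attempt {attempt} failed: {type(e).__name__}")
            if attempt < max_attempts:
                time.sleep(2 ** attempt)
        # A ValueError, KeyError, etc. is not caught here, so it
        # propagates on the spot instead of being retried.
    raise last_error
```

A bug like a malformed request fails once, loudly, instead of costing three attempts and fourteen seconds of sleep.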
Three attempts is a sensible default. More than that and you're masking a real problem.

The `print(f"attempt {attempt} failed: {type(e).__name__}")` line lets you tell, after the fact, whether your script succeeded on attempt 1 (clean run) or attempt 3 (something is wrong — look). Without it you can't see the retry pattern at all.
- Fixed (`time.sleep(2)` each retry) — fine for local testing, hammers an overloaded service.
- Exponential (`2, 4, 8`) — gives the service room to recover.
- Exponential with jitter (`time.sleep(2 ** attempt + random.uniform(0, 1))`) — when many clients are retrying simultaneously, jitter prevents synchronised retry storms. Production-grade.
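The jittered variant can be sketched as a reusable helper; the function name and parameters here are illustrative:

```python
import random
import time

def call_with_jittered_retry(call, max_attempts=3):
    """Retry `call` with exponential backoff plus random jitter."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except Exception as e:
            last_error = e
            print(f"attempt {attempt} failed: {type(e).__name__}")
            if attempt < max_attempts:  # no point sleeping after the final failure
                time.sleep(2 ** attempt + random.uniform(0, 1))  # backoff + jitter
    raise last_error
```

Because each client adds up to a second of random delay, a fleet of clients that all failed at the same instant stops retrying in lockstep.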