You have a list of 5 short sentences and you want each classified as positive or negative. Same prompt shape, different input each call.
It's the for loop from Python L13; the only difference is that the loop body is an LLM call:
from pydantic_ai import Agent

model = "openai:gpt-4o-mini"  # assumption: any model id pydantic_ai accepts works here

sentences = [
    "This is the best day ever.",
    "I hate this.",
    "It's fine, I guess.",
    "Absolutely amazing.",
    "Terrible experience.",
]

results = []
for s in sentences:
    prompt = f'Classify the sentiment of this sentence as either "positive" or "negative". Reply with only that single word.\n\nSentence: {s}'
    result = Agent(model).run_sync(prompt)
    results.append(result.output.strip().lower())

for s, r in zip(sentences, results):
    print(f"{r}: {s}")

Five separate LLM calls. Five quota slots.
Right. Each call is one quota slot. A 5-item batch = 5 slots. A 100-item batch = 100 slots. Plan for it. We'll cover cost-aware batching in L26.
Could I cram all 5 sentences into one call?
Yes. Ask for a JSON array of 5 labels in one call. It's cheaper (1 slot instead of 5) but more failure-prone: the model might miscount, return malformed JSON, or skip items. The trade-off: per-item calls are reliable; batch-in-prompt calls are cheap. Most production code mixes both: bulk-classify in batches of 10-20, then fall back to per-item calls for anything that fails to parse.
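A minimal sketch of the bulk-in-prompt variant, reusing model from above. The prompt wording, the numbering scheme, and the length check are assumptions, not a fixed recipe:

import json

def classify_bulk(sentences: list[str]) -> list[str]:
    # One call for the whole batch; raises if the reply doesn't parse.
    numbered = "\n".join(f"{i + 1}. {s}" for i, s in enumerate(sentences))
    prompt = (
        'Classify each sentence below as "positive" or "negative". '
        f"Reply with a JSON array of exactly {len(sentences)} strings, "
        "in the same order, and nothing else.\n\n" + numbered
    )
    result = Agent(model).run_sync(prompt)
    labels = json.loads(result.output)  # malformed JSON raises here
    if len(labels) != len(sentences):  # the model may miscount
        raise ValueError("label count mismatch")
    return [label.strip().lower() for label in labels]

Anything that raises here can be routed to the per-item loop below.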
results = []
for item in items:
    prompt = TEMPLATE.format(item=item)
    result = Agent(model).run_sync(prompt)
    results.append(result.output)

Clean. One call, one slot, per item. The output list aligns with the input list: same length, same order.
5 items = 5 slots. 100 items = 100 slots. If you have 10,000 items to classify, that's a lot of API calls. Two mitigations, both covered below: batch several items into one prompt, and cache results so repeated items cost nothing on re-runs.
Use it when:

- the batch is small (tens of items, not thousands)
- you need reliable per-item results and exact input/output alignment

Move to bulk-in-prompt when:

- the batch is large enough that quota or cost dominates
- you can parse a structured reply and fall back to per-item calls when parsing fails

The sketch below combines the two.
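Putting the two modes together, a hedged sketch of that mixed strategy. classify_bulk is the batch helper sketched earlier; make_prompt is a hypothetical per-item prompt builder (it reappears in the error-handling example below); the chunk size of 10 is an assumption:

def classify_all(items: list[str], chunk_size: int = 10) -> list[str | None]:
    # Bulk-classify in chunks; fall back to per-item calls when a chunk fails.
    results: list[str | None] = []
    for start in range(0, len(items), chunk_size):
        chunk = items[start:start + chunk_size]
        try:
            results.extend(classify_bulk(chunk))
        except Exception:
            # Per-item fallback: one slot per item, but each result is reliable.
            for item in chunk:
                try:
                    results.append(Agent(model).run_sync(make_prompt(item)).output)
                except Exception:
                    results.append(None)
    return results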
results = []
for item in items:
    try:
        result = Agent(model).run_sync(make_prompt(item))
        results.append(result.output)
    except Exception as e:
        results.append(None)
        print(f"failed on item: {item} ({e})")

The whole batch shouldn't crash because one item failed. None placeholders preserve list alignment so you can retry just the missing slots later.
If you've already classified ["foo", "bar"] and run again with ["foo", "baz"], only "baz" needs a new call. Cache the lookup:
cache = {}
results = []
for item in items:
    if item in cache:
        results.append(cache[item])
        continue
    out = Agent(model).run_sync(make_prompt(item)).output
    cache[item] = out
    results.append(out)

Free quota saved on every re-run. Composes with everything else.
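One caveat: an in-memory dict dies with the process, so a fresh run starts cold. For the cache to pay off across runs it needs to live on disk; a minimal sketch using a JSON file (the filename is an assumption):

import json
from pathlib import Path

CACHE_PATH = Path("sentiment_cache.json")  # hypothetical location

# Load whatever earlier runs already classified.
cache = json.loads(CACHE_PATH.read_text()) if CACHE_PATH.exists() else {}

results = []
for item in items:
    if item not in cache:
        cache[item] = Agent(model).run_sync(make_prompt(item)).output
    results.append(cache[item])

# Persist for the next run.
CACHE_PATH.write_text(json.dumps(cache))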