Day 13's summarize_and_classify returned just the classification label. What if the caller also needs the intermediate summary?
Return a dict instead of a string — {'summary': summary_str, 'urgency': urgency_str}. Both the intermediate and final outputs are useful to the caller.
Exactly. The pipeline logic is identical to Day 13 — summarise first, classify the summary. The only change is the return shape: a dict instead of a bare string:
from typing import Literal

from pydantic_ai import Agent

def ai_pipeline(text: str) -> dict:
    # summarize_text and model come from the Day 13 setup
    summary = summarize_text(text)
    agent = Agent(model, output_type=Literal["high", "medium", "low"])
    urgency = agent.run_sync(summary).output
    return {"summary": summary, "urgency": urgency}

Why classify urgency on a summary instead of the original text?
The same reason as Day 13 — the summariser reduces noise. A 300-word response with scattered urgency signals is harder to classify reliably than a clean two-sentence summary that focuses on the core claim. The summariser acts as a noise filter before the classifier sees the text:
def ai_pipeline(text: str) -> dict:
    summary = summarize_text(text)
    agent = Agent(model, output_type=Literal["high", "medium", "low"])
    urgency = agent.run_sync(summary).output
    # Log a preview so questionable labels are easy to spot during a run
    print(f"Summary: {summary[:50]}... → {urgency}")
    return {"summary": summary, "urgency": urgency}

The dict output is perfect for a spreadsheet: each key becomes a column. I can batch this over 200 responses and get a DataFrame with summary and urgency columns.
That's the intended downstream use. The pipeline returns typed, structured data at every step. The dict makes both the intermediate and final outputs available for QA — you can review the summaries to validate that the urgency labels are reasonable.
I've been manually building urgency-labelled datasets for my methodology section. This function does it automatically and gives me the summary to verify each label.
Verify a stratified sample — 10 from each urgency level. That's a defensible validation methodology for AI-assisted coding, and it protects your thesis if anyone questions the approach.
def ai_pipeline(text: str) -> dict:
    summary = summarize_text(text)
    agent = Agent(model, output_type=Literal["high", "medium", "low"])
    urgency = agent.run_sync(summary).output
    return {"summary": summary, "urgency": urgency}

Return a dict when callers may need intermediate outputs for QA or downstream use. The summary lets you verify that the urgency label is reasonable: a one-line sanity check without an extra API call.
Review a stratified sample — 10 from each urgency level — to validate AI-assisted coding. That's a defensible methodology for research contexts where the classification process must be documented.
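That stratified sample can be drawn directly with pandas once the batch results are in a DataFrame. The `results` list below is hypothetical placeholder data standing in for a real ai_pipeline batch:

```python
import pandas as pd

# Hypothetical labelled results standing in for a real ai_pipeline batch
results = [
    {"summary": f"summary {i}", "urgency": level}
    for level in ("high", "medium", "low")
    for i in range(40)
]
df = pd.DataFrame(results)

# Draw 10 rows per urgency level for manual review
sample = df.groupby("urgency").sample(n=10, random_state=42)
print(sample["urgency"].value_counts())
```

Fixing `random_state` makes the sample reproducible, which matters if the validation procedure goes into a methodology section.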
Apply this function in a list comprehension over 200 responses, collect the list of dicts, then load it into a DataFrame with pd.DataFrame(results).
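The batching step looks like this; the stub ai_pipeline below stands in for the real API-backed function so the result shape is clear:

```python
import pandas as pd

# Stub standing in for the real API-backed ai_pipeline
def ai_pipeline(text: str) -> dict:
    return {"summary": text[:30], "urgency": "low"}

responses = ["First survey response ...", "Second survey response ..."]

# One dict per response; each dict key becomes a DataFrame column
results = [ai_pipeline(r) for r in responses]
df = pd.DataFrame(results)
print(df.columns.tolist())  # ['summary', 'urgency']
```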