In summarize_then_classify you chained two agents and got a string back each time. What happens when the value you need isn't a sentence — it's a name and an email address pulled from a paragraph of text?
You'd prompt the agent to return both, but it might format them differently every time — sometimes just the name first, sometimes with labels, sometimes comma-separated. Parsing that is a nightmare.
Exactly the problem. A string output shifts the parsing burden onto you. A Pydantic model shifts it onto the agent — and onto the type system. You define the shape you want, pass it as result_type, and the agent is contractually required to return that structure:
```python
from pydantic import BaseModel
from pydantic_ai import Agent

class Contact(BaseModel):
    name: str
    email: str

agent = Agent(model, result_type=Contact)  # model configured earlier in the lesson
result = agent.run_sync(text)
print(result.output.name, result.output.email)
```

So the LLM literally has to return this exact shape? What if it can't find an email in the text?
PydanticAI re-prompts the model until the output validates — or raises after retries. For well-formed inputs like email signatures or intro paragraphs, it works reliably. The field names in your model act as instructions: name and email tell the model what to look for. Once you have a validated Contact instance, .model_dump() converts it to a plain dict your code can use:
```python
def extract_contact(text: str) -> dict:
    result = Agent(model, result_type=Contact).run_sync(text)
    return result.output.model_dump()
```

That's it? The model reads the field names, fills them in, and I get {"name": "...", "email": "..."} back without writing a single regex?
No regex. No split-on-comma. No brittle string parsing. The model handles the extraction; Pydantic handles the validation. If you need a third field later — company, phone — you add one line to the class and the agent adapts automatically.
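Adding that later field really is one line per field. A minimal sketch (making the new fields `Optional` with a `None` default is one reasonable choice, so existing two-field extractions keep validating; the field names here are illustrative):

```python
from typing import Optional
from pydantic import BaseModel

class Contact(BaseModel):
    name: str
    email: str
    company: Optional[str] = None  # new fields are optional, so the model
    phone: Optional[str] = None    # may leave them empty without failing validation

# the old two-field data still validates; the new keys appear in the dump
c = Contact(name="Ada Lovelace", email="ada@example.com")
print(c.model_dump())
# {'name': 'Ada Lovelace', 'email': 'ada@example.com', 'company': None, 'phone': None}
```

Because the schema is regenerated from the class on every run, no prompt text needs to change when the model grows.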
I spent half a sprint writing a regex extractor for business card data. I could have defined two fields and called it done.
That regex breaks on every new format. The model-based extractor generalises because language models already understand what a name and an email are — you're just giving it a schema to fill.
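That "schema to fill" is inspectable: Pydantic can emit the JSON schema generated from the class, which is the same information PydanticAI derives its output instructions from. A quick look:

```python
from pydantic import BaseModel

class Contact(BaseModel):
    name: str
    email: str

# the JSON schema generated from the class
schema = Contact.model_json_schema()
print(schema["required"])                     # ['name', 'email']
print(schema["properties"]["email"]["type"])  # 'string'
```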
result_type Beats String Parsing

When you pass result_type=Contact to Agent, PydanticAI tells the underlying model to return JSON matching the model's schema. It validates the response and re-prompts on failure — you never see a malformed output.
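The re-prompt loop belongs to PydanticAI, but the validation check itself is plain Pydantic. A sketch of the kind of failure that would trigger a retry, using `model_validate` directly:

```python
from pydantic import BaseModel, ValidationError

class Contact(BaseModel):
    name: str
    email: str

# simulate a model reply that is missing a required field
try:
    Contact.model_validate({"name": "Ada Lovelace"})
except ValidationError as e:
    print(e.error_count())  # 1 error: email is missing
```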
The pattern in full:
```python
from pydantic import BaseModel
from pydantic_ai import Agent

class Contact(BaseModel):
    name: str
    email: str

def extract_contact(text: str) -> dict:
    result = Agent(model, result_type=Contact).run_sync(text)
    return result.output.model_dump()
```

result.output is a Contact instance (not a string). .model_dump() converts it to {"name": "...", "email": "..."} — a plain dict your code can index, store, or pass to the next step.
Field names matter: The Pydantic field names (name, email) become part of the prompt schema the model sees. Clear, conventional names improve extraction accuracy.
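When a bare field name is ambiguous, Pydantic's `Field(description=...)` lets you attach a per-field instruction; the description is carried into the generated JSON schema that the model sees. A sketch (the description strings are illustrative):

```python
from pydantic import BaseModel, Field

class Contact(BaseModel):
    name: str = Field(description="The person's full name")
    email: str = Field(description="The person's email address")

# descriptions land in the generated JSON schema
schema = Contact.model_json_schema()
print(schema["properties"]["name"]["description"])
# The person's full name
```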
Common mistake: Calling result.output and expecting a string — it's a Contact. Always chain .model_dump() when your function signature returns dict.
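The distinction is easy to verify without calling an LLM at all, since the output is just a Pydantic instance:

```python
from pydantic import BaseModel

class Contact(BaseModel):
    name: str
    email: str

# stand-in for result.output: a Contact instance, not a str
output = Contact(name="Ada Lovelace", email="ada@example.com")
assert not isinstance(output, str)

# chain .model_dump() to get the plain dict a `-> dict` function promises
print(output.model_dump())
# {'name': 'Ada Lovelace', 'email': 'ada@example.com'}
```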