Week 1's agents return strings. You want to extract the author name and email from a bibliography entry and use them in a spreadsheet. What's the problem with a string response?
I'd have to parse the string — find where the name ends and the email starts. The model might format it differently each time. I need a structured object with named fields.
Exactly. Define a Pydantic BaseModel with your fields, pass it as result_type, and the API enforces that shape. You get a typed Python object back — result.output.name gives you the name string directly:
```python
class Contact(BaseModel):
    name: str
    email: str

def extract_contact(text: str) -> dict:
    agent = Agent(model, result_type=Contact)
    result = agent.run_sync(text)
    return result.output.model_dump()
```

Where does BaseModel come from? I didn't see an import in the preamble.
The sandbox preamble injects BaseModel (from Pydantic), Agent, and model before your code runs. Same as Agent and model in Week 1 — you use them without importing them. Define the class in the function body or at module level, then pass it to Agent:
```python
def extract_contact(text: str) -> dict:
    class Contact(BaseModel):
        name: str
        email: str

    agent = Agent(model, result_type=Contact)
    result = agent.run_sync(text)
    data = result.output.model_dump()
    print(f"Extracted: {data}")
    return data
```

result.output is a Contact object — I can access .name and .email directly. And .model_dump() converts it to a dict for easy use in a spreadsheet or database.
That's the pipeline pattern. Every step that needs structured data uses a Pydantic model. The dict output from .model_dump() feeds directly into a DataFrame, a CSV row, or another function.
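A stdlib-only sketch of that handoff — the dicts below are hand-written stand-ins for .model_dump() output, since the sandbox's agent isn't available here:

```python
import csv
import io

# Hand-written stand-ins for Contact.model_dump() output.
rows = [
    {"name": "Ada Lovelace", "email": "ada@example.org"},
    {"name": "Alan Turing", "email": "alan@example.org"},
]

# Any dict with matching keys drops straight into a CSV row.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["name", "email"])
writer.writeheader()
writer.writerows(rows)
print(buf.getvalue())
```

Because every extraction step emits dicts with the same keys, the export code never changes when you swap in a different Pydantic model — only the `fieldnames` list does.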
My bibliography has 50 entries. I could extract every author and email into a clean spreadsheet in under a minute.
Structured output enforces the shape — but the model can still hallucinate a field value. If no email is in the text, the model may invent one. Validate a sample of your extractions manually before trusting the output in your research data.
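One cheap automated guard worth running before the manual pass (a sketch — email_grounded is a made-up helper name, not part of any library): a genuinely extracted value should appear verbatim in the source text.

```python
def email_grounded(source_text: str, extracted_email: str) -> bool:
    """True if the extracted email literally appears in the source text.

    Catches the common failure where the model invents a plausible address.
    It will not catch every hallucination, e.g. a real email copied from
    the wrong entry, so manual spot-checks are still needed.
    """
    return extracted_email.lower() in source_text.lower()

entry = "Smith, J. (2021). On Widgets. Contact: j.smith@example.edu"
print(email_grounded(entry, "j.smith@example.edu"))   # grounded in the text
print(email_grounded(entry, "jane.doe@example.com"))  # likely invented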
result_type with Pydantic BaseModel

```python
class Contact(BaseModel):
    name: str
    email: str

agent = Agent(model, result_type=Contact)
result = agent.run_sync(text)
data = result.output.model_dump()
```

result_type=Contact tells the API to return a validated object, not free text. The model must produce JSON that matches the schema — missing or extra fields cause a validation error.
.model_dump()

Converts the Pydantic object to a plain Python dict. Use it when you need JSON-serialisable output for downstream processing, logging, or spreadsheet export.
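A minimal stdlib illustration of why the dict form matters — the contact dict here is a stand-in for real .model_dump() output:

```python
import json

# Stand-in for result.output.model_dump().
contact = {"name": "Ada Lovelace", "email": "ada@example.org"}

# The plain dict serialises directly; the Pydantic object itself is not
# accepted by json.dumps without extra handling.
line = json.dumps(contact)
print(line)  # {"name": "Ada Lovelace", "email": "ada@example.org"}
```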
The field names and types are the extraction spec. Rename fields to change what you extract — full_name, email_address, institution for a richer contact card.