Yesterday's search_the_web returns a plain text result. How do you extract a specific structured fact from that text?
A second agent with result_type=Fact — define a Pydantic model with the fields I need, run it on the search result text. Same pattern as Day 10's extract_contact but on search output instead of a bibliography entry.
Exactly. search_the_web produces the raw text; a structured extraction agent pulls out the typed fields. Define class Fact(BaseModel) with fact: str and source: str fields, then run an agent with result_type=Fact:
```python
# Imports carried over from earlier days:
from pydantic import BaseModel
from pydantic_ai import Agent

def research_and_extract(topic: str) -> dict:
    search_result = search_the_web(topic)

    class Fact(BaseModel):
        fact: str
        source: str

    agent = Agent(model, result_type=Fact)
    result = agent.run_sync(
        f"Extract one key fact and its source from: {search_result}"
    )
    return result.output.model_dump()
```
The search result might be very long. Does the extraction agent handle long input well?
Extraction agents handle long input — but precision drops if the input is very noisy. For the capstone, the search result is typically a synthesised summary of 200–400 words, which is well within reliable extraction range. If you need to extract from very long documents, add a summarisation step before extraction:
```python
def research_and_extract(topic: str) -> dict:
    search_result = search_the_web(topic)

    # Summarisation step: condense very long input before extraction
    summariser = Agent(model)
    summary = summariser.run_sync(
        f"Summarise the key findings in under 300 words: {search_result}"
    ).output

    class Fact(BaseModel):
        fact: str
        source: str

    agent = Agent(model, result_type=Fact)
    result = agent.run_sync(
        f"Extract one key fact and its source from: {summary}"
    )
    data = result.output.model_dump()
    print(f"Extracted: {data}")
    return data
```
Search + extract = a two-step research pipeline. I ask a research question, the first agent searches the web, the second agent extracts a typed fact. I get a structured dict I can use directly in a table.
That two-step pattern is the foundation of Day 28's capstone — which extracts a list of three Finding objects with an additional year field. The code structure is identical; only the Pydantic model changes.
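The capstone's code isn't shown here, but as a sketch of "only the Pydantic model changes": add the year field to a Finding model and validation works exactly as it does for Fact (all sample values below are placeholders):

```python
from pydantic import BaseModel

class Finding(BaseModel):
    fact: str
    source: str
    year: int  # the additional field the capstone adds

# result_type=list[Finding] would have the agent return three of these.
# Validation is the same mechanism as for a single Fact — placeholder data:
sample = Finding.model_validate(
    {"fact": "Transformers scale well", "source": "example.org", "year": 2017}
)
print(sample.model_dump())
```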
My thesis methodology section will have a footnote: 'Initial literature triage performed with a two-agent pipeline.' That's going to be a very interesting defence conversation.
Disclose it accurately: 'AI-assisted search and extraction, validated against source documents.' That's defensible methodology in 2026.
```python
search_result = search_the_web(topic)

class Fact(BaseModel):
    fact: str
    source: str

agent = Agent(model, result_type=Fact)
result = agent.run_sync(f"Extract the key fact and source from: {search_result}")
return result.output.model_dump()
```
Step 1: search_the_web(topic) returns a plain-text search result. Step 2: a structured extraction agent reads that text and returns a validated Pydantic object.
Separation of concerns. The search agent finds information; the extraction agent formats it. Each is independently testable.
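One way to make that testability concrete is to inject each step as a parameter, so unit tests can stub both agents without any network or model calls. This is a sketch under the assumption of a refactored signature, not the course's exact code:

```python
from typing import Callable

def research_and_extract(topic: str,
                         search: Callable[[str], str],
                         extract: Callable[[str], dict]) -> dict:
    """Two-step pipeline with injectable steps: pass the real agents in
    production, cheap stubs in tests."""
    return extract(search(topic))

# Unit-test the wiring with stubs standing in for both agents:
fake_search = lambda topic: f"stub result about {topic}"
fake_extract = lambda text: {"fact": text, "source": "stub"}
out = research_and_extract("transformers", fake_search, fake_extract)
print(out)  # → {'fact': 'stub result about transformers', 'source': 'stub'}
```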
result_type=Fact guarantees the extraction returns a validated Fact instance with fact and source fields, and model_dump() turns it into a plain dict. No string parsing needed: Pydantic validates the shape before the result reaches your code.
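To see what that guarantee buys you, here is a stdlib-only stand-in for the shape check that result_type performs automatically (Pydantic additionally handles type coercion and nested models; this sketch is purely illustrative):

```python
import json

def validate_fact(raw: str) -> dict:
    """Stdlib stand-in for what result_type=Fact enforces: parse the
    model's JSON reply and require both typed fields."""
    data = json.loads(raw)
    missing = {"fact", "source"} - data.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if not all(isinstance(data[k], str) for k in ("fact", "source")):
        raise TypeError("fact and source must be strings")
    return data

ok = validate_fact('{"fact": "example fact", "source": "example source"}')
```

With result_type=Fact, all of this happens inside the agent call, which is why the returned object never needs string parsing.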