Four weeks of building blocks. Search, extraction, structured output, batch processing. What would it take to turn a research question into a formatted literature summary in one function call?
Chain everything. Run the Perplexity search agent on the topic, extract a Finding Pydantic model from each result — claim, authors, year, journal — then format all findings into a markdown mini-review grouped by year. Return the markdown string.
That's the capstone. Three stages, one function:
```python
from pydantic import BaseModel

class Finding(BaseModel):
    claim: str
    authors: str
    year: int
    journal: str
```
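As a quick, self-contained check of what the model buys you (the field values here are invented for the example), pydantic validates and coerces on construction, so a year that arrives as a string still ends up as an int:

```python
from pydantic import BaseModel

class Finding(BaseModel):
    claim: str
    authors: str
    year: int
    journal: str

# Lax-mode coercion: the string "2024" becomes the int 2024
f = Finding(claim="Example claim", authors="Smith et al.",
            year="2024", journal="Nature")
```

A malformed value (say, `year="twenty twenty-four"`) raises a `ValidationError` instead of silently passing through, which is what makes the extraction stage's output safe to sort and format.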
```python
search_result = Agent(model).run_sync(f"List the 5 most recent papers on {topic}").output
extract_agent = Agent(model, result_type=list[Finding])
findings = extract_agent.run_sync(search_result).output
lines = [f"**{f.year}** — {f.authors}: {f.claim}" for f in findings]
```

Splitting on "\n\n" assumes the search result separates papers by blank lines. What if the format is different?
Good catch. The Perplexity response format varies. A more robust approach: run the extraction agent on the full search result and use result_type=list[Finding] to extract all findings at once. The model parses whatever format the search returns. That's the production refinement — today's version demonstrates the pipeline architecture; robustness comes with iteration.
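To make the fragility concrete, here is a minimal illustration (the search output below is invented for the example) of what happens when papers are separated by single newlines rather than blank lines:

```python
# Hypothetical Perplexity-style output: one newline between papers,
# so a split on "\n\n" finds no boundaries at all
search_result = (
    "1. Smith et al. (2024), Nature: claim A\n"
    "2. Lee et al. (2023), Science: claim B"
)
chunks = search_result.split("\n\n")
print(len(chunks))  # 1 — both papers land in a single chunk
```

Handing the whole string to the extraction agent with result_type=list[Finding] sidesteps this entirely: the model, not a brittle delimiter, decides where one paper ends and the next begins.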
So the capstone is the skeleton and the robustness is the research iteration. Ship the pipeline, then improve the parsing.
That's how research tools get built. The pipeline runs; the edge cases surface in production; you fix them one at a time.
Research question in, structured markdown literature summary out. This is the tool I'd have paid for at the start of my PhD — and I built it in four weeks.
The final return ties it together:
```python
sorted_lines = sorted(lines, reverse=True)
return "## Literature Summary\n\n" + "\n".join(sorted_lines)
```

Verify sources before citing them. The model extracts from training data and search results, so authors, years, and journal names can contain hallucinations. Use it as a discovery tool; confirm the details in the original papers before they reach your methods section.
| Stage | Code | Output |
|---|---|---|
| 1. Search | Agent(model).run_sync(topic_query).output | Long string from Perplexity |
| 2. Extract | Agent(model, result_type=list[Finding]).run_sync(text).output | list[Finding] |
| 3. Format | Sort by year, join with markdown headers | Markdown string |
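Stage 3 is plain Python, so it can be sanity-checked offline. Here is a sketch, with `format_summary` as a hypothetical wrapper around the sort-and-join step (the sample lines are invented):

```python
def format_summary(lines: list[str]) -> str:
    # Each line starts with "**<year>**", so a reverse lexicographic
    # sort orders four-digit years newest-first
    sorted_lines = sorted(lines, reverse=True)
    return "## Literature Summary\n\n" + "\n".join(sorted_lines)

summary = format_summary([
    "**2021** — Smith et al.: claim A",
    "**2023** — Lee et al.: claim B",
])
```

Note the sort relies on every line sharing the same `**YYYY**` prefix; it holds for four-digit years, which is all this pipeline will ever see.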
Tip: Using result_type=list[Finding] on the full search result is more robust than splitting on blank lines — the model parses whatever format the search returns.
Verify sources before citing in your methods section — the model can hallucinate author names, years, or journal names.