Your agent knows what you put in the prompt and nothing newer than its training cutoff. What do you add so it can answer questions about last week's news?
Swap in a search-capable model? Or call a search API first and stuff the results into the prompt?
The second. search(query, count=n) returns a list of {title, url, snippet} dicts — fresh web results. Join the snippets into a context string, inject it as system_prompt, and the agent answers using that live context:
```python
results = search(question, count=3)
context = "\n".join(r["snippet"] for r in results)
agent = Agent(model, system_prompt=f"Use these sources:\n{context}")
reply = agent.run_sync(question).output
```

So RAG is just "fetch context, prepend it to the prompt, call the agent"? The whole pattern fits in four lines?
That is the minimum viable RAG. No vector database, no embeddings — just retrieval plus injection. The full function:
```python
def rag_answer(question: str) -> str:
    results = search(question, count=3)
    context = "\n".join(r["snippet"] for r in results)
    agent = Agent(model, system_prompt=f"Use these sources:\n{context}")
    reply = agent.run_sync(question).output
    return reply
```

Does the model really use the sources, or does it just fall back to training data when the answer is easy?
The system_prompt strongly biases the reply toward the provided snippets, especially for current events where training data has nothing. For safety-critical RAG you add a line like "Only use the sources provided" to constrain further, but the core move — retrieve, inject, ask — stays the same.
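A minimal sketch of that stricter injection, with `search` stubbed out as a hypothetical stand-in returning canned results so the prompt assembly can be seen on its own:

```python
def search(query, count=3):
    # Hypothetical stub standing in for a live search API.
    hits = [
        {"title": "A", "url": "https://example.com/a", "snippet": "Fact one."},
        {"title": "B", "url": "https://example.com/b", "snippet": "Fact two."},
    ]
    return hits[:count]

def strict_prompt(question: str) -> str:
    # Same retrieve-and-inject move, plus the constraining instruction.
    context = "\n".join(r["snippet"] for r in search(question, count=3))
    return f"Only use the sources provided.\nUse these sources:\n{context}"

print(strict_prompt("what happened last week?"))
```

The constraint line is just more text in the same `system_prompt`; nothing about the retrieve-inject-ask loop changes.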
So this is the upgrade to yesterday's agent_with_context: the context is not static facts anymore, it is fresh search results pulled live per question?
Fresh every call. Retrieval makes the agent as current as the web. Same injection pattern, new source of memory, big jump in answer quality.
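The "fresh every call" property is easy to see with a stand-in search whose index changes between calls (all names here are hypothetical, not part of the lesson's API):

```python
# Hypothetical in-memory index standing in for the live web.
index = [{"title": "Old", "url": "https://example.com/1", "snippet": "Launch planned."}]

def search(query, count=3):
    return index[:count]

def build_prompt(question: str) -> str:
    # Same injection pattern: retrieved snippets become the system prompt.
    context = "\n".join(r["snippet"] for r in search(question, count=3))
    return f"Use these sources:\n{context}"

first = build_prompt("launch status")
# The "web" changes; the next call picks it up with no code changes.
index.append({"title": "New", "url": "https://example.com/2", "snippet": "Launch succeeded."})
second = build_prompt("launch status")
```

Static facts baked into the prompt would never show the second snippet; retrieval does.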
TL;DR: search(...) feeds snippets into system_prompt; the agent answers with fresh context.
- `search(query, count=n)` — live web results, returns a list of dicts
- `"\n".join(r["snippet"] for r in results)` — joins the snippets into one context string
- `system_prompt` — injects the fresh context

| Source | Freshness |
|---|---|
| Static facts | as old as the code |
| `search()` snippets | as old as the web |
RAG-lite is the smallest shape that lets an agent answer about the world as it is today.