Single-turn calls send one prompt and get one response. Real chat is multi-turn — alternating user and assistant messages, with each call seeing the whole history.
from pydantic_ai import Agent

agent = Agent(model)  # `model` is whichever model you configured earlier
# turn 1
result1 = agent.run_sync("What's the capital of France?")
history = result1.all_messages()
# turn 2 — send the history along with the new question
result2 = agent.run_sync("What about Germany?", message_history=history)
print(result2.output) # "Berlin" — model knows we're asking about capitals

So all_messages() returns the whole conversation so far?
Right. Each run_sync returns a result whose .all_messages() gives you the complete history including this turn's user input and the model's reply. Pass that history to the next call's message_history= and the model has full context.
Under the hood, every "chat" is a list of message objects:
[
  {"role": "user", "content": "What's the capital of France?"},
  {"role": "assistant", "content": "Paris"},
  {"role": "user", "content": "What about Germany?"},
  {"role": "assistant", "content": "Berlin"}
]
The model sees all of it on each call. It can reference earlier turns — "What about Germany?" only makes sense given the prior "capital of France?" turn.
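This accumulation pattern can be sketched in plain Python (these dicts stand in for pydantic_ai's actual message classes; `send_to_model` is a hypothetical stand-in for a real model call):

```python
def send_to_model(messages):
    # Stand-in for a real model call: it can see every prior turn.
    return f"(reply given {len(messages)} messages of context)"

history = []

def turn(user_text):
    # Each turn appends a user message, calls the model with the FULL
    # list, then appends the assistant's reply.
    history.append({"role": "user", "content": user_text})
    reply = send_to_model(history)
    history.append({"role": "assistant", "content": reply})
    return reply

turn("What's the capital of France?")
turn("What about Germany?")
print(len(history))  # 4: two user + two assistant messages
```

The second call receives four messages of context, which is exactly why the follow-up question can be resolved.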
agent = Agent(model)
# turn 1
result1 = agent.run_sync("<user prompt>")
history = result1.all_messages() # both user prompt + assistant response
# turn 2
result2 = agent.run_sync("<next user prompt>", message_history=history)
history = result2.all_messages() # extended with turn-2 user + assistant
# turn 3
result3 = agent.run_sync("<next user prompt>", message_history=history)
# ...

The library serializes and deserializes messages between turns. You only ever pass the history list; pydantic_ai handles the wire format.
Every call sends the full history: input tokens for turn N = (sum of all prior turns) + (this turn's prompt), so cumulative input cost grows roughly quadratically with conversation length. Watch the count when conversations get long.
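The arithmetic is worth seeing once. Assuming an illustrative ~50 input tokens per turn (and ignoring response tokens, which are billed separately):

```python
# Turn N resends turns 1..N, so cumulative input grows quadratically.
per_turn = 50
total_input = sum(per_turn * turn for turn in range(1, 11))
print(total_input)  # 2750 input tokens billed across 10 turns, vs 50 for one turn
```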
For production multi-turn, you'd cap history growth by trimming or summarizing older turns once conversations get long. This lesson and day 16 keep history short (3 turns), so trimming isn't yet a concern.
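When trimming does become necessary, the simplest policy is "keep the last N messages." A sketch on plain dict-style messages (a hypothetical helper, not a pydantic_ai API; a real version would also preserve any system prompt and keep N even so user/assistant pairs stay intact):

```python
def trim_history(messages, keep_last=6):
    # Drop the oldest messages once the list exceeds keep_last.
    return messages[-keep_last:] if len(messages) > keep_last else messages

msgs = [{"role": "user", "content": f"turn {i}"} for i in range(10)]
print(len(trim_history(msgs)))  # 6: only the most recent messages survive
```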