Every LLM call has a cost in tokens. Two prompts of different sizes burn different amounts. The result object has a `.usage()` method that surfaces the count.
```python
from pydantic_ai import Agent  # assuming Pydantic AI, which this Agent/run_sync API matches

result = Agent(model).run_sync("Tell me a fact about Python")
usage = result.usage()
print(f"requests: {usage.requests}")
print(f"input tokens: {usage.input_tokens}")
print(f"output tokens: {usage.output_tokens}")
print(f"total: {usage.total_tokens}")
```

## Why care about token count when I'm on max-tier with a monthly quota?
Three reasons:
A token is roughly 3-4 characters or about 0.75 words of English. Different models tokenize differently; the model, not you, decides what counts.
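You can get a feel for this by counting tokens locally. A minimal sketch using OpenAI's `tiktoken` package (an assumption: your model may use a different tokenizer, so treat local counts as estimates):

```python
import tiktoken  # pip install tiktoken

# cl100k_base is one common encoding; other models count differently.
enc = tiktoken.get_encoding("cl100k_base")

for text in ["What is 2+2?", "Explain Python's async/await syntax."]:
    token_ids = enc.encode(text)
    print(f"{text!r} -> {len(token_ids)} tokens")
```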
## `.usage()` shape

```python
result = Agent(model).run_sync(prompt)
u = result.usage()

u.requests       # 1 (or more, for retries / tool calls)
u.input_tokens   # tokens in the prompt + system prompt + history
u.output_tokens  # tokens in the response
u.total_tokens   # input + output
```
```python
# short prompt, short response
short = Agent(model).run_sync("What is 2+2?")
print(short.usage().total_tokens)  # ~20

# long prompt, long response
long = Agent(model).run_sync("Explain Python's async/await syntax in 500 words")
print(long.usage().total_tokens)  # ~700
```

For lessons, keep an eye on the count: develop the habit before you're paying real money.
To reduce token cost, the main levers are sending less per call (shorter prompts, trimmed message history, as the sketch below shows) and routing simple tasks to cheaper models (AI Advanced covers model routing).
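As a sketch of the "send less" lever: input tokens include the message history you resend, so capping that history shrinks every later call. This assumes the agent API here is Pydantic AI's, whose `run_sync` accepts a `message_history` argument and whose results expose `all_messages()`; other frameworks have equivalents.

```python
agent = Agent(model)
history = []

for prompt in ["What is asyncio?", "Show an example", "What about trio?"]:
    # Resend only the most recent messages instead of the full conversation
    # (empty history becomes None on the first turn).
    result = agent.run_sync(prompt, message_history=history[-4:] or None)
    history = result.all_messages()
    print(result.usage().input_tokens)  # grows far slower than with full history
```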