vercel-ai-sdk
Vercel AI SDK (Python) - patterns for building LLM-powered apps with streaming, tools, hooks, and structured output
Source: vercel-labs/py-ai
Vercel AI SDK (Python)
bash
uv add vercel-ai-sdk
python
import vercel_ai_sdk as ai

Core workflow
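`ai.run(root, *args, checkpoint=None, cancel_on_hooks=False)` is the entry point. It sets up a `Runtime`, runs `root`, and yields `Message` objects to the caller. Functions such as `stream_step` and `execute_tool` are used inside `ai.run()`. The root function is any async function. If it declares a param typed `ai.Runtime`, it's auto-injected.
python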
import asyncio

@ai.tool
async def talk_to_mothership(question: str) -> str:
    """Contact the mothership for important decisions."""
    return "Soon."

async def agent(llm: ai.LanguageModel, query: str) -> ai.StreamResult:
    return await ai.stream_loop(
        llm,
        messages=ai.make_messages(system="You are a robot assistant.", user=query),
        tools=[talk_to_mothership],
    )

async def main() -> None:
    llm = ai.ai_gateway.GatewayModel(model="anthropic/claude-opus-4.6")
    # ai.run() is consumed as an async iterator, so it needs an async context.
    async for msg in ai.run(agent, llm, "When will the robots take over?"):
        print(msg.text_delta, end="")

asyncio.run(main())

`@ai.tool` turns an async function into a `Tool`. A tool can declare a `runtime: ai.Runtime` parameter to receive the runtime.
`ai.stream_step(llm, messages, tools=None, label=None, output_type=None)` makes one streaming LLM call and returns a `StreamResult` with `.text`, `.tool_calls`, `.output`, `.usage`, and `.last_message`.
`ai.stream_loop(llm, messages, tools, label=None, output_type=None)` runs the full tool loop and returns a `StreamResult`.
Both are thin convenience wrappers (not magical -- they could be reimplemented by the user). `stream_step` is a `@ai.stream`-decorated function that calls `llm.stream()`. `stream_loop` calls `stream_step` in a while loop with `ai.execute_tool()` between iterations.
`ai.execute_tool(tool_call, message=None)` executes a single tool call.

Multi-agent
Use `asyncio.gather` with labels to run agents in parallel:
python
# msgs1/msgs2 and t1/t2 are placeholders for each agent's messages and tools.
async def multi(llm: ai.LanguageModel, query: str) -> ai.StreamResult:
    r1, r2 = await asyncio.gather(
        ai.stream_loop(llm, msgs1, tools=[t1], label="researcher"),
        ai.stream_loop(llm, msgs2, tools=[t2], label="analyst"),
    )
    return await ai.stream_loop(
        llm,
        ai.make_messages(user=f"{r1.text}\n{r2.text}"),
        tools=[],
        label="summary",
    )

The `label` field on messages lets the consumer distinguish which agent produced output (e.g. `msg.label == "researcher"`).

Messages
`ai.make_messages(system=None, user=str)` builds the initial message list. A `Message` has a `role`, a list of `parts` (`TextPart | ToolPart | ReasoningPart | HookPart | StructuredOutputPart`), a `label`, and `usage`. Serialize with `msg.model_dump()` and restore with `ai.Message.model_validate(data)`.
Key properties for consuming streamed output:
- `msg.text_delta` -- current text chunk (use for live streaming display)
- `msg.text` -- full accumulated text
- `msg.tool_calls` -- list of `ToolPart` objects
- `msg.output` -- validated Pydantic instance (when using `output_type`)
- `msg.is_done` -- true when all parts finished streaming
- `msg.get_hook_part()` -- find a hook suspension part (for human-in-the-loop)
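To make the difference between `msg.text_delta` and `msg.text` concrete, here is a minimal sketch that does not import the SDK. `FakeMessage`, `fake_stream`, and `render` are hypothetical stand-ins invented for illustration, and the iterator is synchronous for brevity, whereas `ai.run()` is consumed with `async for`.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional


@dataclass
class FakeMessage:
    """Hypothetical stand-in for a streamed message; NOT the SDK's Message."""
    text_delta: str               # only the newest chunk
    text: str                     # everything accumulated so far
    is_done: bool                 # True on the final chunk
    label: Optional[str] = None   # which agent produced the chunk


def fake_stream(chunks: list, label: Optional[str] = None) -> Iterator[FakeMessage]:
    """Yield one FakeMessage per chunk, accumulating text like a stream would."""
    accumulated = ""
    for i, chunk in enumerate(chunks):
        accumulated += chunk
        yield FakeMessage(chunk, accumulated, i == len(chunks) - 1, label)


def render(messages: Iterable[FakeMessage]):
    """Collect deltas for live display and keep the full text per label."""
    live = []    # what a UI would print incrementally
    final = {}   # label -> complete text, stored once the stream is done
    for msg in messages:
        live.append(msg.text_delta)
        if msg.is_done:
            final[msg.label] = msg.text
    return "".join(live), final


live, final = render(fake_stream(["So", "on", "."], label="researcher"))
print(live)   # Soon.
print(final)  # {'researcher': 'Soon.'}
```

Printing deltas gives the incremental display, while the accumulated text is the value to store or pass on once `is_done` is true.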
Customization
Custom loop
When `stream_loop` doesn't fit (conditional tool execution, approval gates, custom routing), use `stream_step` in a manual loop:
python
async def agent(llm: ai.LanguageModel, query: str) -> ai.StreamResult:
    messages = ai.make_messages(system="...", user=query)
    tools = [get_weather, get_population]
    while True:
        result = await ai.stream_step(llm, messages, tools)
        if not result.tool_calls:
            return result
        messages.append(result.last_message)
        await asyncio.gather(
            *(ai.execute_tool(tc, message=result.last_message) for tc in result.tool_calls)
        )

Custom stream
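`@ai.stream` wraps an async generator that yields `Message` objects so that its output flows through `ai.run()`. Inside it you can call `llm.stream()` directly:
python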
from collections.abc import AsyncGenerator

@ai.stream
async def custom_step(llm: ai.LanguageModel, messages: list[ai.Message]) -> AsyncGenerator[ai.Message, None]:
    async for msg in llm.stream(messages=messages, tools=[...]):
        msg.label = "custom"
        yield msg

result = await custom_step(llm, messages)  # returns StreamResult

Tools can also stream intermediate progress via `runtime.put_message()`:
python
@ai.tool
async def long_task(input: str, runtime: ai.Runtime) -> str:
    """Streams progress back to the caller."""
    for step in ["Connecting...", "Processing..."]:
        await runtime.put_message(
            ai.Message(role="assistant", parts=[ai.TextPart(text=step, state="streaming")], label="progress")
        )
    return "final result"

Hooks
Hooks are typed suspension points for human-in-the-loop. Decorate a Pydantic model to define the resolution schema:
python
import pydantic

@ai.hook
class Approval(pydantic.BaseModel):
    granted: bool
    reason: str

Inside agent code -- blocks until resolved:
python
approval = await Approval.create("approve_send_email", metadata={"tool": "send_email"})
if approval.granted:
    await ai.execute_tool(tc, message=result.last_message)
else:
    tc.set_error(f"Rejected: {approval.reason}")

From outside (API handler, iterator loop):
python
Approval.resolve("approve_send_email", {"granted": True, "reason": "User approved"})
Approval.cancel("approve_send_email")

Long-running mode (`cancel_on_hooks=False`, default): `create()` blocks until `resolve()` or `cancel()` is called externally. Use for websocket/interactive UIs.
Serverless mode (`cancel_on_hooks=True`): unresolved hooks are cancelled and the run ends. Inspect `result.pending_hooks` and `result.checkpoint` to resume later.
Consuming hooks in the iterator:
python
async for msg in ai.run(agent, llm, query):
    if (hook := msg.get_hook_part()) and hook.status == "pending":
        answer = input(f"Approve {hook.hook_id}? [y/n] ")
        Approval.resolve(hook.hook_id, {"granted": answer == "y", "reason": "operator"})
        continue
    print(msg.text_delta, end="")

Checkpoints
A `Checkpoint` records completed work so that a later run can replay it:
python
data = result.checkpoint.model_dump()  # serialize (JSON-safe dict)
checkpoint = ai.Checkpoint.model_validate(data)  # restore
result = ai.run(agent, llm, query, checkpoint=checkpoint)  # replay completed work

Primary use case is serverless hook re-entry.
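The re-entry idea can be illustrated without the SDK. The sketch below is a conceptual model only, under stated assumptions: `SketchCheckpoint`, `HookPending`, `run_step`, and the `agent` function here are hypothetical names invented for this example, and nothing in it reflects how the library stores or replays work internally. It shows a first invocation that stops at an unresolved approval, persists its progress as JSON, and a second invocation that replays the finished step instead of running it again.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class SketchCheckpoint:
    """Hypothetical record of finished work; NOT the SDK's Checkpoint class."""
    steps: dict = field(default_factory=dict)  # step name -> recorded result
    hooks: dict = field(default_factory=dict)  # hook id -> resolution payload


class HookPending(Exception):
    """Raised when the run must stop and wait for an external decision."""

    def __init__(self, hook_id: str) -> None:
        super().__init__(hook_id)
        self.hook_id = hook_id


executed = []  # names of steps that really ran (as opposed to being replayed)


def run_step(cp, name, fn):
    """Return the recorded result if present; otherwise run fn and record it."""
    if name not in cp.steps:
        executed.append(name)
        cp.steps[name] = fn()
    return cp.steps[name]


def agent(cp):
    """Draft an email, wait for approval, then send it."""
    draft = run_step(cp, "draft", lambda: "Dear team, ...")
    decision = cp.hooks.get("approve_send_email")
    if decision is None:
        raise HookPending("approve_send_email")
    if decision["granted"]:
        return run_step(cp, "send", lambda: f"sent: {draft}")
    return f"rejected: {decision['reason']}"


# First invocation: the hook is unresolved, so the run ends early and the
# progress so far is serialized (for example into a database row).
first = SketchCheckpoint()
try:
    agent(first)
except HookPending as pending:
    pending_id = pending.hook_id
    stored = json.dumps(asdict(first))

# Second invocation: restore, attach the decision, and run again.
restored = SketchCheckpoint(**json.loads(stored))
restored.hooks[pending_id] = {"granted": True, "reason": "User approved"}
outcome = agent(restored)

print(outcome)   # sent: Dear team, ...
print(executed)  # ['draft', 'send']
```

The "draft" step appears once in `executed` even though `agent` ran twice, which is the property that makes re-entry cheap: completed work is read back from the checkpoint rather than recomputed.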
Adapters
Providers
python
# Vercel AI Gateway (recommended)
# Uses AI_GATEWAY_API_KEY env var
llm = ai.ai_gateway.GatewayModel(model="anthropic/claude-opus-4.6", thinking=True, budget_tokens=10000)
# Direct
llm = ai.openai.OpenAIModel(model="gpt-5")
llm = ai.anthropic.AnthropicModel(model="claude-opus-4-6", thinking=True, budget_tokens=10000)

All implement `LanguageModel` with `stream()` (async generator of `Message`) and `buffer()` (returns final `Message`). Gateway routes Anthropic models through the native Anthropic API for full feature support, and others through an OpenAI-compatible endpoint.

AI SDK UI
For streaming to an AI SDK frontend (`useChat`, etc.):
python
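from vercel_ai_sdk.ai_sdk_ui import to_sse_stream, to_messages, UI_MESSAGE_STREAM_HEADERS

# Inside an HTTP request handler. StreamingResponse is assumed to come from
# your web framework (e.g. Starlette or FastAPI).
messages = to_messages(request.messages)
return StreamingResponse(to_sse_stream(ai.run(agent, llm, query)), headers=UI_MESSAGE_STREAM_HEADERS)

Other features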
Structured output
Pass a Pydantic model as `output_type`:
python
class Forecast(pydantic.BaseModel):
    city: str
    temperature: float

result = await ai.stream_step(llm, messages, output_type=Forecast)
result.output.city  # validated Pydantic instance

# Also works directly on the model:
msg = await llm.buffer(messages, output_type=Forecast)

MCP
python
tools = await ai.mcp.get_http_tools("https://mcp.example.com/mcp", headers={...}, tool_prefix="docs")
tools = await ai.mcp.get_stdio_tools("npx", "-y", "@anthropic/mcp-server-filesystem", "/tmp", tool_prefix="fs")

Returns `Tool` objects usable in `stream_step`/`stream_loop`. Connections are pooled per `ai.run()` and cleaned up automatically.

Telemetry
python
ai.telemetry.enable() # OTel-based, emits gen_ai.* spans for runs/steps/tools