foundation-models-on-device
FoundationModels: On-Device LLM (iOS 26)
Patterns for integrating Apple's on-device language model into apps using the FoundationModels framework. Covers text generation, structured output with `@Generable`, custom tool calling, and snapshot streaming, all running on-device for privacy and offline support.
When to Activate
- Building AI-powered features using Apple Intelligence on-device
- Generating or summarizing text without cloud dependency
- Extracting structured data from natural language input
- Implementing custom tool calling for domain-specific AI actions
- Streaming structured responses for real-time UI updates
- Requiring privacy-preserving AI (no data leaves the device)
Core Pattern — Availability Check
Always check model availability before creating a session:
```swift
import SwiftUI
import FoundationModels

struct GenerativeView: View {
    // The system-provided on-device model
    private var model = SystemLanguageModel.default

    var body: some View {
        switch model.availability {
        case .available:
            // ContentView is the app's own view that uses the model
            ContentView()
        case .unavailable(.deviceNotEligible):
            Text("Device not eligible for Apple Intelligence")
        case .unavailable(.appleIntelligenceNotEnabled):
            Text("Please enable Apple Intelligence in Settings")
        case .unavailable(.modelNotReady):
            Text("Model is downloading or not ready")
        case .unavailable(let other):
            // Catch-all for any other reason; describe it as a plain string
            Text("Model unavailable: \(String(describing: other))")
        }
    }
}
```
Core Pattern — Basic Session
```swift
import FoundationModels

// Single-turn: create a new session each time
let singleTurnSession = LanguageModelSession()
let response = try await singleTurnSession.respond(to: "What's a good month to visit Paris?")
print(response.content)

// Multi-turn: reuse one session so it keeps the conversation context
let cookingSession = LanguageModelSession(instructions: """
    You are a cooking assistant.
    Provide recipe suggestions based on ingredients.
    Keep suggestions brief and practical.
    """)
let first = try await cookingSession.respond(to: "I have chicken and rice")
let followUp = try await cookingSession.respond(to: "What about a vegetarian option?")
```
Key points for instructions:
- Define the model's role ("You are a mentor")
- Specify what to do ("Help extract calendar events")
- Set style preferences ("Respond as briefly as possible")
- Add safety measures ("Respond with 'I can't help with that' for dangerous requests")
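The four points above can be kept as separate strings in code and joined only when the session is created. The following is a minimal sketch, assuming a hypothetical `InstructionParts` helper and `makeSession(from:)` function (neither is part of FoundationModels); the only framework call is `LanguageModelSession(instructions:)`, guarded so the file also compiles where the framework is missing.

```swift
import Foundation
#if canImport(FoundationModels)
import FoundationModels
#endif

/// Hypothetical helper (not part of FoundationModels) that keeps the four
/// instruction concerns separate until a session is created.
struct InstructionParts {
    var role: String
    var task: String
    var style: String
    var safety: String

    /// Joins the non-empty parts into one instruction string, one per line.
    var text: String {
        [role, task, style, safety]
            .map { $0.trimmingCharacters(in: .whitespacesAndNewlines) }
            .filter { !$0.isEmpty }
            .joined(separator: "\n")
    }
}

let cookingInstructions = InstructionParts(
    role: "You are a cooking assistant.",
    task: "Provide recipe suggestions based on ingredients.",
    style: "Keep suggestions brief and practical.",
    safety: "Respond with 'I can't help with that' for dangerous requests."
)

#if canImport(FoundationModels)
/// Creates a session from the composed instructions.
@available(iOS 26.0, macOS 26.0, *)
func makeSession(from parts: InstructionParts) -> LanguageModelSession {
    LanguageModelSession(instructions: parts.text)
}
#endif
```

Keeping the parts separate makes it easy to vary one concern, such as style, without retyping the safety wording.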
Core Pattern — Guided Generation with @Generable
Generate structured Swift types instead of raw strings:
1. Define a Generable Type
```swift
import FoundationModels

@Generable(description: "Basic profile information about a cat")
struct CatProfile {
    var name: String

    @Guide(description: "The age of the cat", .range(0...20))
    var age: Int

    @Guide(description: "A one sentence profile about the cat's personality")
    var profile: String
}
```
2. Request Structured Output
```swift
let session = LanguageModelSession()
let response = try await session.respond(
    to: "Generate a cute rescue cat",
    generating: CatProfile.self
)

// Access structured fields directly
print("Name: \(response.content.name)")
print("Age: \(response.content.age)")
print("Profile: \(response.content.profile)")
```
Supported @Guide Constraints
- `.range(0...20)` for numeric ranges
- `.count(3)` for array element counts
- `description:` for semantic guidance during generation
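The three constraints can be combined on one type. The sketch below assumes a hypothetical `WeekendTrip` type and a deployment target of iOS 26; only `@Generable`, `@Guide`, `.range`, and `.count` come from the framework.

```swift
#if canImport(FoundationModels)
import FoundationModels

// Hypothetical type for illustration only.
@Generable(description: "A short itinerary for a weekend trip")
struct WeekendTrip {
    // description: semantic guidance only, no hard constraint
    @Guide(description: "Name of the destination city")
    var destination: String

    // .range: the generated number stays within 1...3
    @Guide(description: "Number of nights to stay", .range(1...3))
    var nights: Int

    // .count: the generated array has exactly three elements
    @Guide(description: "Activities to do during the trip", .count(3))
    var activities: [String]
}
#endif
```

A request for this type would follow the same `respond(to:generating:)` call shown for `CatProfile`.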
Core Pattern — Tool Calling
Let the model invoke custom code for domain-specific tasks:
1. Define a Tool
```swift
import FoundationModels

struct RecipeSearchTool: Tool {
    let name = "recipe_search"
    let description = "Search for recipes matching a given term and return a list of results."

    @Generable
    struct Arguments {
        var searchTerm: String
        var numberOfResults: Int
    }

    // The String return type assumes the shipping iOS 26 SDK, where a tool returns
    // a prompt-representable value; early betas wrapped the result in ToolOutput.
    func call(arguments: Arguments) async throws -> String {
        // searchRecipes(term:limit:) is an app-defined helper, not a framework call
        let recipes = await searchRecipes(
            term: arguments.searchTerm,
            limit: arguments.numberOfResults
        )
        return recipes.map { "- \($0.name): \($0.description)" }.joined(separator: "\n")
    }
}
```
2. Create Session with Tools
```swift
let session = LanguageModelSession(tools: [RecipeSearchTool()])
let response = try await session.respond(to: "Find me some pasta recipes")
```
3. Handle Tool Errors
```swift
do {
    let answer = try await session.respond(to: "Find a recipe for tomato soup.")
    print(answer.content)
} catch let error as LanguageModelSession.ToolCallError {
    // Name of the tool that failed
    print(error.tool.name)
    // RecipeSearchToolError is an app-defined error type thrown by the tool
    if case .databaseIsEmpty = error.underlyingError as? RecipeSearchToolError {
        // Handle specific tool error
    }
}
```
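The handler above refers to `RecipeSearchToolError`, which the app has to define itself. The following is a sketch of what such a type and a user-facing mapping might look like; the second enum case and the `userMessage(forToolFailure:)` function are hypothetical names chosen for illustration.

```swift
import Foundation

/// Hypothetical error type thrown from inside RecipeSearchTool.call(arguments:).
enum RecipeSearchToolError: Error, Equatable {
    case databaseIsEmpty
    case invalidSearchTerm(String)
}

/// Maps the underlying error of a failed tool call to a user-facing message.
/// Intended to receive `error.underlyingError` from a caught ToolCallError.
func userMessage(forToolFailure underlying: any Error) -> String {
    // Anything that is not our own error type gets a generic message.
    guard let toolError = underlying as? RecipeSearchToolError else {
        return "Recipe search failed. Please try again."
    }
    switch toolError {
    case .databaseIsEmpty:
        return "No recipes are stored yet. Add some and try again."
    case .invalidSearchTerm(let term):
        return "Couldn't search for \"\(term)\". Try a different term."
    }
}
```

Inside the `catch` block, `userMessage(forToolFailure: error.underlyingError)` would then produce the text to show.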
Core Pattern — Snapshot Streaming
Stream structured responses for real-time UI with `PartiallyGenerated` types:
```swift
@Generable
struct TripIdeas {
    @Guide(description: "Ideas for upcoming trips")
    var ideas: [String]
}

let stream = session.streamResponse(
    to: "What are some exciting trip ideas?",
    generating: TripIdeas.self
)
for try await snapshot in stream {
    // Assumes the shipping iOS 26 SDK, where each element is a snapshot whose
    // `content` is a TripIdeas.PartiallyGenerated (all properties Optional)
    print(snapshot.content)
}
```
SwiftUI Integration
```swift
import SwiftUI
import FoundationModels

struct TripIdeasView: View {
    // Supplied by the caller; the view and property names are illustrative
    let session: LanguageModelSession
    let prompt: String

    @State private var partialResult: TripIdeas.PartiallyGenerated?
    @State private var errorMessage: String?

    var body: some View {
        List {
            ForEach(partialResult?.ideas ?? [], id: \.self) { idea in
                Text(idea)
            }
        }
        .overlay {
            if let errorMessage { Text(errorMessage).foregroundStyle(.red) }
        }
        .task {
            do {
                let stream = session.streamResponse(to: prompt, generating: TripIdeas.self)
                for try await snapshot in stream {
                    // Each snapshot carries the latest partial value in `content`
                    partialResult = snapshot.content
                }
            } catch {
                errorMessage = error.localizedDescription
            }
        }
    }
}
```
Key Design Decisions
| Decision | Rationale |
|---|---|
| On-device execution | Privacy — no data leaves the device; works offline |
| 4,096 token limit | On-device model constraint; chunk large data across sessions |
| Snapshot streaming (not deltas) | Structured output friendly; each snapshot is a complete partial state |
| `@Generable` macros | Compile-time safety for structured generation; auto-generates the `PartiallyGenerated` type |
| Single request per session | A session handles one request at a time; `isResponding` guards against concurrent requests |
| `response.content` (not `.output`) | Correct API: always access results via the `.content` property |
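The 4,096 token limit in the table means long inputs have to be split before they reach a session. The following is a minimal sketch of such a splitter, assuming a hypothetical `chunk(_:maxCharacters:)` helper. Character count is only a rough stand-in for tokens, so the budget should be chosen conservatively.

```swift
import Foundation

/// Hypothetical helper: splits text into chunks of at most `maxCharacters`,
/// breaking on blank-line paragraph boundaries where possible.
func chunk(_ text: String, maxCharacters: Int) -> [String] {
    precondition(maxCharacters > 0, "maxCharacters must be positive")
    var chunks: [String] = []
    var current = ""

    for paragraph in text.components(separatedBy: "\n\n") where !paragraph.isEmpty {
        // A single oversized paragraph is hard-split into fixed-size pieces.
        if paragraph.count > maxCharacters {
            if !current.isEmpty {
                chunks.append(current)
                current = ""
            }
            var rest = Substring(paragraph)
            while !rest.isEmpty {
                chunks.append(String(rest.prefix(maxCharacters)))
                rest = rest.dropFirst(maxCharacters)
            }
            continue
        }
        // Start a new chunk when adding this paragraph would exceed the budget.
        let separator = current.isEmpty ? "" : "\n\n"
        if current.count + separator.count + paragraph.count > maxCharacters {
            chunks.append(current)
            current = paragraph
        } else {
            current += separator + paragraph
        }
    }
    if !current.isEmpty { chunks.append(current) }
    return chunks
}
```

Each chunk can then go to its own fresh session, for example to summarize it, with the partial results combined afterwards.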
Best Practices
- Always check `model.availability` before creating a session, and handle all unavailability cases
- Use `instructions` to guide model behavior; they take priority over prompts
- Check `isResponding` before sending a new request; sessions handle one request at a time
- Access `response.content` for results, not `.output`
- Break large inputs into chunks; the 4,096 token limit applies to instructions + prompt + output combined
- Use `@Generable` for structured output; it gives stronger guarantees than parsing raw strings
- Use `GenerationOptions(temperature:)` to tune creativity (higher = more creative)
- Monitor with Instruments; use Xcode Instruments to profile request performance
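Two of these practices, checking `isResponding` and passing `GenerationOptions(temperature:)`, fit naturally into one small wrapper. The sketch below assumes a hypothetical `respondIfIdle` function; the framework calls it relies on are `isResponding`, `GenerationOptions(temperature:)`, and `respond(to:options:)`.

```swift
#if canImport(FoundationModels)
import FoundationModels

/// Hypothetical wrapper: sends a prompt only when the session is idle,
/// using a caller-chosen temperature. Returns nil when a request is
/// already in flight on this session.
@available(iOS 26.0, macOS 26.0, *)
func respondIfIdle(
    _ session: LanguageModelSession,
    to prompt: String,
    temperature: Double
) async throws -> String? {
    // A session handles one request at a time, so skip instead of overlapping.
    guard !session.isResponding else { return nil }

    // Higher temperature means more varied output.
    let options = GenerationOptions(temperature: temperature)
    let response = try await session.respond(to: prompt, options: options)
    return response.content
}
#endif
```

In a UI, the same `isResponding` check can disable the send button while a response is being generated.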
Anti-Patterns to Avoid
- Creating sessions without checking `model.availability` first
- Sending inputs exceeding the 4,096 token context window
- Attempting concurrent requests on a single session
- Using `.output` instead of `.content` to access response data
- Parsing raw string responses when `@Generable` structured output would work
- Building complex multi-step logic in a single prompt; break it into multiple focused prompts
- Assuming the model is always available; device eligibility and settings vary
When to Use
- On-device text generation for privacy-sensitive apps
- Structured data extraction from user input (forms, natural language commands)
- AI-assisted features that must work offline
- Streaming UI that progressively shows generated content
- Domain-specific AI actions via tool calling (search, compute, lookup)