# Core ML Swift Integration


Load, configure, and run Core ML models in iOS apps. This skill covers the Swift side: model loading, prediction, MLTensor, profiling, and deployment. Target iOS 26+ with Swift 6.2, backward-compatible to iOS 14 unless noted.

Scope boundary: Python-side model conversion, optimization (quantization, palettization, pruning), and framework selection live in the apple-on-device-ai skill. This skill owns Swift integration only.

See `references/coreml-swift-integration.md` for complete code patterns including actor-based caching, batch inference, image preprocessing, and testing.

## Loading Models

### Auto-Generated Classes

When you drag a `.mlpackage` or `.mlmodelc` into Xcode, it generates a Swift class with typed inputs and outputs. Use this whenever possible.

```swift
import CoreML

let config = MLModelConfiguration()
config.computeUnits = .all

let model = try MyImageClassifier(configuration: config)
```

### Manual Loading

Load from a URL when the model is downloaded at runtime or stored outside the bundle.

```swift
let modelURL = Bundle.main.url(
    forResource: "MyModel", withExtension: "mlmodelc"
)!
let model = try MLModel(contentsOf: modelURL, configuration: config)
```

### Async Loading (iOS 16+)

Load models without blocking the main thread. Prefer this for large models.

```swift
let model = try await MLModel.load(
    contentsOf: modelURL,
    configuration: config
)
```

### Compile at Runtime

Compile a `.mlpackage` or `.mlmodel` to `.mlmodelc` on device. Useful for models downloaded from a server.

```swift
let compiledURL = try await MLModel.compileModel(at: packageURL)
let model = try MLModel(contentsOf: compiledURL, configuration: config)
```

Cache the compiled URL -- recompiling on every launch wastes time. Copy `compiledURL` to a persistent location (e.g., Application Support).
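One way to wire the compile-then-cache flow together, sketched under assumptions (the `loadCachedModel` helper and the `MyModel.mlmodelc` cache name are illustrative, not Core ML API):

```swift
import CoreML
import Foundation

// Sketch: compile a downloaded .mlpackage once, cache the .mlmodelc in
// Application Support, and reuse it on later launches.
func loadCachedModel(packageURL: URL,
                     configuration: MLModelConfiguration) async throws -> MLModel {
    let support = try FileManager.default.url(
        for: .applicationSupportDirectory, in: .userDomainMask,
        appropriateFor: nil, create: true
    )
    let cachedURL = support.appendingPathComponent("MyModel.mlmodelc")

    if !FileManager.default.fileExists(atPath: cachedURL.path) {
        // compileModel(at:) writes to a temporary location; move it somewhere durable.
        let compiledURL = try await MLModel.compileModel(at: packageURL)
        try FileManager.default.moveItem(at: compiledURL, to: cachedURL)
    }
    return try await MLModel.load(contentsOf: cachedURL, configuration: configuration)
}
```

If the app is updated with a new model version, invalidate the cache (for example, by keying the file name to a model version) so stale compiled artifacts are not reused.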

## Model Configuration

`MLModelConfiguration` controls compute units, GPU access, and model parameters.

### Compute Units Decision Table

ValueUsesWhen to Choose
.all
CPU + GPU + Neural EngineDefault. Let the system decide.
.cpuOnly
CPUBackground tasks, audio sessions, or when GPU is busy.
.cpuAndGPU
CPU + GPUNeed GPU but model has ops unsupported by ANE.
.cpuAndNeuralEngine
CPU + Neural EngineBest energy efficiency for compatible models.
swift
let config = MLModelConfiguration()
config.computeUnits = .cpuAndNeuralEngine

// Allow low-priority background inference
config.computeUnits = .cpuOnly

### Configuration Properties

```swift
let config = MLModelConfiguration()
config.computeUnits = .all
config.allowLowPrecisionAccumulationOnGPU = true // faster, slight precision loss
```

## Making Predictions

### With Auto-Generated Classes

The generated class provides typed input/output structs.

```swift
let model = try MyImageClassifier(configuration: config)
let input = MyImageClassifierInput(image: pixelBuffer)
let output = try model.prediction(input: input)
print(output.classLabel)        // "golden_retriever"
print(output.classLabelProbs)   // ["golden_retriever": 0.95, ...]
```

### With MLDictionaryFeatureProvider

Use when inputs are dynamic or not known at compile time.

```swift
let inputFeatures = try MLDictionaryFeatureProvider(dictionary: [
    "image": MLFeatureValue(pixelBuffer: pixelBuffer),
    "confidence_threshold": MLFeatureValue(double: 0.5),
])
let output = try model.prediction(from: inputFeatures)
let label = output.featureValue(for: "classLabel")?.stringValue
```

### Async Prediction (iOS 17+)

```swift
let output = try await model.prediction(from: inputFeatures)
```

### Batch Prediction

Process multiple inputs in one call for better throughput.

```swift
let batchInputs = try MLArrayBatchProvider(array: inputs.map { input in
    try MLDictionaryFeatureProvider(dictionary: ["image": MLFeatureValue(pixelBuffer: input)])
})
let batchOutput = try model.predictions(from: batchInputs)
for i in 0..<batchOutput.count {
    let result = batchOutput.features(at: i)
    print(result.featureValue(for: "classLabel")?.stringValue ?? "unknown")
}
```

### Stateful Prediction (iOS 18+)

Use `MLState` for models that maintain state across predictions (sequence models, LLMs, audio accumulators). Create state once and pass it to each prediction call.

```swift
let state = model.makeState()

// Each prediction carries forward the internal model state
for frame in audioFrames {
    let input = try MLDictionaryFeatureProvider(dictionary: [
        "audio_features": MLFeatureValue(multiArray: frame)
    ])
    let output = try await model.prediction(from: input, using: state)
    let classification = output.featureValue(for: "label")?.stringValue
}
```

State is not `Sendable` -- use it from a single actor or task. Call `model.makeState()` to create independent state for concurrent streams.
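A minimal sketch of isolating state per concurrent stream (the `StreamClassifier` actor and the `"audio_features"`/`"label"` feature names are illustrative assumptions):

```swift
import CoreML

// Sketch: one actor per stream, each owning its own MLState.
// The loaded model can be shared; the state must not be.
actor StreamClassifier {
    private let model: MLModel
    private let state: MLState

    init(model: MLModel) {
        self.model = model
        self.state = model.makeState()  // independent state for this stream
    }

    func classify(_ frame: MLMultiArray) async throws -> String? {
        let input = try MLDictionaryFeatureProvider(dictionary: [
            "audio_features": MLFeatureValue(multiArray: frame)
        ])
        let output = try await model.prediction(from: input, using: state)
        return output.featureValue(for: "label")?.stringValue
    }
}
```

Because the actor serializes access, each stream's state is only ever touched from one isolation domain, which is what the non-`Sendable` constraint requires.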

## MLTensor (iOS 18+)

`MLTensor` is a Swift-native multidimensional array for pre/post-processing. Operations run lazily -- await `.shapedArray(of:)` to materialize results.

```swift
import CoreML

// Creation
let tensor = MLTensor([1.0, 2.0, 3.0, 4.0])
let zeros = MLTensor(zeros: [3, 224, 224], scalarType: Float.self)

// Reshaping
let reshaped = tensor.reshaped(to: [2, 2])

// Math operations
let softmaxed = tensor.softmax()
let normalized = (tensor - tensor.mean()) / tensor.standardDeviation()

// Interop with MLMultiArray
let multiArray = try MLMultiArray([1.0, 2.0, 3.0, 4.0])
let fromMultiArray = MLTensor(multiArray)
let backToArray = await tensor.shapedArray(of: Float.self) // materializes lazily computed results
```

## Working with MLMultiArray

`MLMultiArray` is the primary data exchange type for non-image model inputs and outputs. Use it when the auto-generated class expects array-type features.

```swift
// Create a 3D array: [batch, sequence, features]
let array = try MLMultiArray(shape: [1, 128, 768], dataType: .float32)

// Write values
for i in 0..<128 {
    array[[0, i, 0] as [NSNumber]] = NSNumber(value: Float(i))
}

// Read values
let value = array[[0, 0, 0] as [NSNumber]].floatValue

// Create from a data pointer for zero-copy interop.
// The backing memory must remain valid for the array's lifetime,
// so use a manually managed buffer, not a pointer into a Swift array.
let count = 3
let buffer = UnsafeMutablePointer<Float>.allocate(capacity: count)
buffer.initialize(from: [1.0, 2.0, 3.0], count: count)
let fromData = try MLMultiArray(dataPointer: UnsafeMutableRawPointer(buffer),
                                shape: [3],
                                dataType: .float32,
                                strides: [1],
                                deallocator: { $0.deallocate() })
```

See `references/coreml-swift-integration.md` for advanced MLMultiArray patterns including NLP tokenization and audio feature extraction.

## Image Preprocessing

Image models expect `CVPixelBuffer` input. Use `CGImage` conversion for photos from the camera or photo library. Vision's `VNCoreMLRequest` handles this automatically; manual conversion is needed only for direct `MLModel` prediction.

```swift
import CoreVideo

func createPixelBuffer(from cgImage: CGImage, width: Int, height: Int) -> CVPixelBuffer? {
    var pixelBuffer: CVPixelBuffer?
    let attrs: [CFString: Any] = [
        kCVPixelBufferCGImageCompatibilityKey: true,
        kCVPixelBufferCGBitmapContextCompatibilityKey: true,
    ]
    CVPixelBufferCreate(kCFAllocatorDefault, width, height,
                        kCVPixelFormatType_32ARGB, attrs as CFDictionary, &pixelBuffer)

    guard let buffer = pixelBuffer else { return nil }
    CVPixelBufferLockBaseAddress(buffer, [])
    let context = CGContext(
        data: CVPixelBufferGetBaseAddress(buffer),
        width: width, height: height,
        bitsPerComponent: 8, bytesPerRow: CVPixelBufferGetBytesPerRow(buffer),
        space: CGColorSpaceCreateDeviceRGB(),
        bitmapInfo: CGImageAlphaInfo.noneSkipFirst.rawValue
    )
    context?.draw(cgImage, in: CGRect(x: 0, y: 0, width: width, height: height))
    CVPixelBufferUnlockBaseAddress(buffer, [])
    return buffer
}
```

For additional preprocessing patterns (normalization, center-cropping), see `references/coreml-swift-integration.md`.
## Multi-Model Pipelines

Chain models when preprocessing or postprocessing requires a separate model.

```swift
// Sequential inference: preprocessor -> main model -> postprocessor
let preprocessed = try preprocessor.prediction(from: rawInput)
let mainOutput = try mainModel.prediction(from: preprocessed)
let finalOutput = try postprocessor.prediction(from: mainOutput)
```

For Xcode-managed pipelines, use the pipeline model type in the `.mlpackage`. Each sub-model runs on its optimal compute unit.

## Vision Integration

Use Vision to run Core ML image models with automatic image preprocessing (resizing, normalization, color space, orientation).

### Modern: CoreMLRequest (iOS 18+)

```swift
import Vision
import CoreML

let model = try MLModel(contentsOf: modelURL, configuration: config)
let container = try CoreMLModelContainer(model: model) // container init can throw
let request = CoreMLRequest(model: container)
let results = try await request.perform(on: cgImage)

if let classification = results.first as? ClassificationObservation {
    print("\(classification.identifier): \(classification.confidence)")
}
```

### Legacy: VNCoreMLRequest

```swift
let vnModel = try VNCoreMLModel(for: model)
let request = VNCoreMLRequest(model: vnModel) { request, error in
    guard let results = request.results as? [VNRecognizedObjectObservation] else { return }
    for observation in results {
        let label = observation.labels.first?.identifier ?? "unknown"
        let confidence = observation.labels.first?.confidence ?? 0
        let boundingBox = observation.boundingBox // normalized coordinates
        print("\(label): \(confidence) at \(boundingBox)")
    }
}
request.imageCropAndScaleOption = .scaleFill

let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer)
try handler.perform([request])
```

For complete Vision framework patterns (text recognition, barcode detection, document scanning), see the vision-framework skill.

## Performance Profiling

### MLComputePlan (iOS 17.4+)

Inspect which compute device each operation will use before running predictions.

```swift
let computePlan = try await MLComputePlan.load(
    contentsOf: modelURL, configuration: config
)
guard case let .program(program) = computePlan.modelStructure else { return }
guard let mainFunction = program.functions["main"] else { return }

for operation in mainFunction.block.operations {
    let deviceUsage = computePlan.deviceUsage(for: operation)
    let estimatedCost = computePlan.estimatedCost(of: operation)
    // preferredComputeDevice is not a String, so describe it explicitly
    let device = deviceUsage.map { "\($0.preferredComputeDevice)" } ?? "unknown"
    print("\(operation.operatorName): \(device), cost: \(String(describing: estimatedCost))")
}
```

### Instruments

Use the Core ML instrument template in Instruments to profile:

- Model load time
- Prediction latency (per-operation breakdown)
- Compute device dispatch (CPU/GPU/ANE per operation)
- Memory allocation

Run outside the debugger for accurate results (Xcode: Product > Profile).

## Model Deployment

### Bundle vs On-Demand Resources

| Strategy | Pros | Cons |
| --- | --- | --- |
| Bundle in app | Instant availability, works offline | Increases app download size |
| On-demand resources | Smaller initial download | Requires download before first use |
| Background Assets (iOS 16+) | Downloads ahead of time | More complex setup |
| CloudKit / server | Maximum flexibility | Requires network, longer setup |

### Size Considerations

- App Store limit: 4 GB for the app bundle
- Cellular download limit: 200 MB (can request an exception)
- Use ODR tags for models > 50 MB
- Pre-compile to `.mlmodelc` to skip on-device compilation

```swift
// On-demand resource loading
let request = NSBundleResourceRequest(tags: ["ml-model-v2"])
try await request.beginAccessingResources()
let modelURL = Bundle.main.url(forResource: "LargeModel", withExtension: "mlmodelc")!
let model = try await MLModel.load(contentsOf: modelURL, configuration: config)
// Call request.endAccessingResources() when done
```

## Memory Management

- Unload on background: Release model references when the app enters the background to free GPU/ANE memory. Reload on return to the foreground.
- Use `.cpuOnly` for background tasks: Background processing cannot use the GPU or ANE; setting `.cpuOnly` avoids silent fallback and resource contention.
- Share model instances: Never create multiple `MLModel` instances from the same compiled model. Use an actor to provide shared access.
- Monitor memory pressure: Large models (>100 MB) can trigger memory warnings. Register for `UIApplication.didReceiveMemoryWarningNotification` and release cached models when under pressure.

See `references/coreml-swift-integration.md` for an actor-based model manager with lifecycle-aware loading and cache eviction.
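As a rough sketch of these rules (not the reference implementation; `ModelManager` and its method names are illustrative):

```swift
import CoreML

// Sketch: a shared holder that loads the model lazily and can release it
// on background transition or memory pressure. Callers always go through
// the actor, so only one MLModel instance exists per compiled model.
actor ModelManager {
    private var model: MLModel?
    private let modelURL: URL
    private let config: MLModelConfiguration

    init(modelURL: URL, config: MLModelConfiguration) {
        self.modelURL = modelURL
        self.config = config
    }

    func shared() async throws -> MLModel {
        if let model { return model }
        let loaded = try await MLModel.load(contentsOf: modelURL, configuration: config)
        model = loaded
        return loaded
    }

    // Call from scene-phase / memory-warning observers to free GPU/ANE memory.
    func unload() {
        model = nil
    }
}
```

The next `shared()` call after `unload()` transparently reloads, so foreground-return handling reduces to simply using the model again.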

## Common Mistakes

DON'T: Load models on the main thread. DO: Use the async `MLModel.load(contentsOf:configuration:)` API or load on a background actor. Why: Large models can take seconds to load, freezing the UI.

DON'T: Recompile `.mlpackage` to `.mlmodelc` on every app launch. DO: Compile once with `MLModel.compileModel(at:)` and cache the compiled URL persistently. Why: Compilation is expensive. Cache the `.mlmodelc` in Application Support.

DON'T: Hardcode `.cpuOnly` unless you have a specific reason. DO: Use `.all` and let the system choose the optimal compute unit. Why: `.all` enables the Neural Engine and GPU, which are faster and more energy-efficient.

DON'T: Ignore `MLFeatureValue` type mismatches between input and model expectations. DO: Match types exactly -- use `MLFeatureValue(pixelBuffer:)` for images, not raw data. Why: Type mismatches cause cryptic runtime crashes or silently incorrect results.

DON'T: Create a new `MLModel` instance for every prediction. DO: Load once and reuse. Use an actor to manage the model lifecycle. Why: Model loading allocates significant memory and compute resources.

DON'T: Skip error handling for model loading and prediction. DO: Catch errors and provide fallback behavior when the model fails. Why: Models can fail to load on older devices or when resources are constrained.

DON'T: Assume all operations run on the Neural Engine. DO: Use `MLComputePlan` (iOS 17.4+) to verify device dispatch per operation. Why: Unsupported operations fall back to the CPU, which may bottleneck the pipeline.

DON'T: Process images manually before passing them to Vision + Core ML. DO: Use `CoreMLRequest` (iOS 18+) or `VNCoreMLRequest` (legacy) to let Vision handle preprocessing. Why: Vision handles orientation, scaling, and pixel format conversion correctly.
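The error-handling guidance can be sketched like this (the `classify` helper, feature name, and fallback string are illustrative assumptions):

```swift
import CoreML

// Sketch: degrade gracefully instead of crashing when the model
// failed to load or a prediction throws.
func classify(_ features: MLDictionaryFeatureProvider,
              with model: MLModel?) async -> String {
    guard let model else {
        // Fallback path when loading failed (older device, low resources).
        return "unavailable"
    }
    do {
        let output = try await model.prediction(from: features)
        return output.featureValue(for: "classLabel")?.stringValue ?? "unknown"
    } catch {
        // Log the error and fall back rather than taking down the feature.
        return "unavailable"
    }
}
```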

## Review Checklist

- Model loaded asynchronously (not blocking the main thread)
- `MLModelConfiguration.computeUnits` set appropriately for the use case
- Model instance reused across predictions (not recreated each time)
- Auto-generated class used when available (typed inputs/outputs)
- Error handling for model loading and prediction failures
- Compiled model cached persistently if compiled at runtime
- Image inputs use the Vision pipeline (`CoreMLRequest` on iOS 18+, or `VNCoreMLRequest`) for correct preprocessing
- `MLComputePlan` checked to verify compute device dispatch (iOS 17.4+)
- Batch predictions used when processing multiple inputs
- Model size appropriate for the deployment strategy (bundle vs ODR)
- Memory tested on target devices (especially older devices with less RAM)
- Predictions run outside the debugger for accurate performance measurement

## References

- Patterns and code: `references/coreml-swift-integration.md`
- Model conversion (Python): apple-on-device-ai skill, `../apple-on-device-ai/references/coreml-conversion.md`
- Model optimization: apple-on-device-ai skill, `../apple-on-device-ai/references/coreml-optimization.md`
- Apple docs: Core ML | MLModel | MLComputePlan