# Core ML Swift Integration


Load, configure, and run Core ML models in iOS apps. This skill covers the Swift side: model loading, prediction, MLTensor, profiling, and deployment. Target iOS 26+ with Swift 6.2, backward-compatible to iOS 14 unless noted.

Scope boundary: Python-side model conversion, optimization (quantization, palettization, pruning), and framework selection live in the apple-on-device-ai skill. This skill owns Swift integration only.

See `references/coreml-swift-integration.md` for complete code patterns including actor-based caching, batch inference, image preprocessing, and testing.

## Loading Models

### Auto-Generated Classes

When you drag a `.mlpackage` or `.mlmodelc` into Xcode, it generates a Swift class with typed inputs and outputs. Use this whenever possible.

```swift
import CoreML

let config = MLModelConfiguration()
config.computeUnits = .all

let model = try MyImageClassifier(configuration: config)
```

### Manual Loading

Load from a URL when the model is downloaded at runtime or stored outside the bundle.

```swift
let modelURL = Bundle.main.url(
    forResource: "MyModel", withExtension: "mlmodelc"
)!
let model = try MLModel(contentsOf: modelURL, configuration: config)
```

### Async Loading (iOS 16+)

Load models without blocking the main thread. Prefer this for large models.

```swift
let model = try await MLModel.load(
    contentsOf: modelURL,
    configuration: config
)
```

### Compile at Runtime

Compile a `.mlpackage` or `.mlmodel` to `.mlmodelc` on device. Useful for models downloaded from a server.

```swift
let compiledURL = try await MLModel.compileModel(at: packageURL)
let model = try MLModel(contentsOf: compiledURL, configuration: config)
```

Cache the compiled URL -- recompiling on every launch wastes time. Copy `compiledURL` to a persistent location (e.g., Application Support).
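One way to wire the compile-then-cache flow together, sketched under assumptions (the `loadCachedModel` helper and the `MyModel.mlmodelc` cache name are illustrative, not Core ML API):

```swift
import CoreML
import Foundation

// Sketch: compile a downloaded .mlpackage once, cache the .mlmodelc in
// Application Support, and reuse it on later launches.
func loadCachedModel(packageURL: URL,
                     configuration: MLModelConfiguration) async throws -> MLModel {
    let support = try FileManager.default.url(
        for: .applicationSupportDirectory, in: .userDomainMask,
        appropriateFor: nil, create: true
    )
    let cachedURL = support.appendingPathComponent("MyModel.mlmodelc")

    if !FileManager.default.fileExists(atPath: cachedURL.path) {
        // compileModel(at:) writes to a temporary location; move it somewhere durable.
        let compiledURL = try await MLModel.compileModel(at: packageURL)
        try FileManager.default.moveItem(at: compiledURL, to: cachedURL)
    }
    return try await MLModel.load(contentsOf: cachedURL, configuration: configuration)
}
```

If the app is updated with a new model version, invalidate the cache (for example, by keying the file name to a model version) so stale compiled artifacts are not reused.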

## Model Configuration

`MLModelConfiguration` controls compute units, GPU access, and model parameters.

### Compute Units Decision Table

ValueUsesWhen to Choose
.all
CPU + GPU + Neural EngineDefault. Let the system decide.
.cpuOnly
CPUBackground tasks, audio sessions, or when GPU is busy.
.cpuAndGPU
CPU + GPUNeed GPU but model has ops unsupported by ANE.
.cpuAndNeuralEngine
CPU + Neural EngineBest energy efficiency for compatible models.
swift
let config = MLModelConfiguration()
config.computeUnits = .cpuAndNeuralEngine

// Allow low-priority background inference
config.computeUnits = .cpuOnly

### Configuration Properties

```swift
let config = MLModelConfiguration()
config.computeUnits = .all
config.allowLowPrecisionAccumulationOnGPU = true // faster, slight precision loss
```

## Making Predictions

### With Auto-Generated Classes

The generated class provides typed input/output structs.

```swift
let model = try MyImageClassifier(configuration: config)
let input = MyImageClassifierInput(image: pixelBuffer)
let output = try model.prediction(input: input)
print(output.classLabel)        // "golden_retriever"
print(output.classLabelProbs)   // ["golden_retriever": 0.95, ...]
```

### With MLDictionaryFeatureProvider

Use when inputs are dynamic or not known at compile time.

```swift
let inputFeatures = try MLDictionaryFeatureProvider(dictionary: [
    "image": MLFeatureValue(pixelBuffer: pixelBuffer),
    "confidence_threshold": MLFeatureValue(double: 0.5),
])
let output = try model.prediction(from: inputFeatures)
let label = output.featureValue(for: "classLabel")?.stringValue
```

### Async Prediction (iOS 17+)

```swift
let output = try await model.prediction(from: inputFeatures)
```

### Batch Prediction

Process multiple inputs in one call for better throughput.

```swift
let batchInputs = try MLArrayBatchProvider(array: inputs.map { input in
    try MLDictionaryFeatureProvider(dictionary: ["image": MLFeatureValue(pixelBuffer: input)])
})
let batchOutput = try model.predictions(from: batchInputs)
for i in 0..<batchOutput.count {
    let result = batchOutput.features(at: i)
    print(result.featureValue(for: "classLabel")?.stringValue ?? "unknown")
}
```

### Stateful Prediction (iOS 18+)

Use `MLState` for models that maintain state across predictions (sequence models, LLMs, audio accumulators). Create state once and pass it to each prediction call.

```swift
let state = model.makeState()

// Each prediction carries forward the internal model state
for frame in audioFrames {
    let input = try MLDictionaryFeatureProvider(dictionary: [
        "audio_features": MLFeatureValue(multiArray: frame)
    ])
    let output = try await model.prediction(from: input, using: state)
    let classification = output.featureValue(for: "label")?.stringValue
}
```

State is not `Sendable` -- use it from a single actor or task. Call `model.makeState()` to create independent state for concurrent streams.
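A minimal sketch of isolating state per concurrent stream (the `StreamClassifier` actor and the `"audio_features"`/`"label"` feature names are illustrative assumptions):

```swift
import CoreML

// Sketch: one actor per stream, each owning its own MLState.
// The loaded model can be shared; the state must not be.
actor StreamClassifier {
    private let model: MLModel
    private let state: MLState

    init(model: MLModel) {
        self.model = model
        self.state = model.makeState()  // independent state for this stream
    }

    func classify(_ frame: MLMultiArray) async throws -> String? {
        let input = try MLDictionaryFeatureProvider(dictionary: [
            "audio_features": MLFeatureValue(multiArray: frame)
        ])
        let output = try await model.prediction(from: input, using: state)
        return output.featureValue(for: "label")?.stringValue
    }
}
```

Because the actor serializes access, each stream's state is only ever touched from one isolation domain, which is what the non-`Sendable` constraint requires.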

## MLTensor (iOS 18+)

`MLTensor` is a Swift-native multidimensional array for pre/post-processing. Operations run lazily -- await `.shapedArray(of:)` to materialize results.

```swift
import CoreML

// Creation
let tensor = MLTensor([1.0, 2.0, 3.0, 4.0])
let zeros = MLTensor(zeros: [3, 224, 224], scalarType: Float.self)

// Reshaping
let reshaped = tensor.reshaped(to: [2, 2])

// Math operations
let softmaxed = tensor.softmax()
let normalized = (tensor - tensor.mean()) / tensor.standardDeviation()

// Interop with MLMultiArray
let multiArray = try MLMultiArray([1.0, 2.0, 3.0, 4.0])
let fromMultiArray = MLTensor(multiArray)
let backToArray = await tensor.shapedArray(of: Float.self) // materializes lazily computed results
```

## Working with MLMultiArray

`MLMultiArray` is the primary data exchange type for non-image model inputs and outputs. Use it when the auto-generated class expects array-type features.

```swift
// Create a 3D array: [batch, sequence, features]
let array = try MLMultiArray(shape: [1, 128, 768], dataType: .float32)

// Write values
for i in 0..<128 {
    array[[0, i, 0] as [NSNumber]] = NSNumber(value: Float(i))
}

// Read values
let value = array[[0, 0, 0] as [NSNumber]].floatValue

// Create from a data pointer for zero-copy interop.
// The backing memory must remain valid for the array's lifetime,
// so use a manually managed buffer, not a pointer into a Swift array.
let count = 3
let buffer = UnsafeMutablePointer<Float>.allocate(capacity: count)
buffer.initialize(from: [1.0, 2.0, 3.0], count: count)
let fromData = try MLMultiArray(dataPointer: UnsafeMutableRawPointer(buffer),
                                shape: [3],
                                dataType: .float32,
                                strides: [1],
                                deallocator: { $0.deallocate() })
```

See `references/coreml-swift-integration.md` for advanced MLMultiArray patterns including NLP tokenization and audio feature extraction.

## Image Preprocessing

Image models expect `CVPixelBuffer` input. Use `CGImage` conversion for photos from the camera or photo library. Vision's `VNCoreMLRequest` handles this automatically; manual conversion is needed only for direct `MLModel` prediction.

```swift
import CoreVideo

func createPixelBuffer(from cgImage: CGImage, width: Int, height: Int) -> CVPixelBuffer? {
    var pixelBuffer: CVPixelBuffer?
    let attrs: [CFString: Any] = [
        kCVPixelBufferCGImageCompatibilityKey: true,
        kCVPixelBufferCGBitmapContextCompatibilityKey: true,
    ]
    CVPixelBufferCreate(kCFAllocatorDefault, width, height,
                        kCVPixelFormatType_32ARGB, attrs as CFDictionary, &pixelBuffer)

    guard let buffer = pixelBuffer else { return nil }
    CVPixelBufferLockBaseAddress(buffer, [])
    let context = CGContext(
        data: CVPixelBufferGetBaseAddress(buffer),
        width: width, height: height,
        bitsPerComponent: 8, bytesPerRow: CVPixelBufferGetBytesPerRow(buffer),
        space: CGColorSpaceCreateDeviceRGB(),
        bitmapInfo: CGImageAlphaInfo.noneSkipFirst.rawValue
    )
    context?.draw(cgImage, in: CGRect(x: 0, y: 0, width: width, height: height))
    CVPixelBufferUnlockBaseAddress(buffer, [])
    return buffer
}
```

For additional preprocessing patterns (normalization, center-cropping), see `references/coreml-swift-integration.md`.
## Multi-Model Pipelines

Chain models when preprocessing or postprocessing requires a separate model.

```swift
// Sequential inference: preprocessor -> main model -> postprocessor
let preprocessed = try preprocessor.prediction(from: rawInput)
let mainOutput = try mainModel.prediction(from: preprocessed)
let finalOutput = try postprocessor.prediction(from: mainOutput)
```

For Xcode-managed pipelines, use the pipeline model type in the `.mlpackage`. Each sub-model runs on its optimal compute unit.

## Vision Integration

Use Vision to run Core ML image models with automatic image preprocessing (resizing, normalization, color space, orientation).

### Modern: CoreMLRequest (iOS 18+)

```swift
import Vision
import CoreML

let model = try MLModel(contentsOf: modelURL, configuration: config)
let container = try CoreMLModelContainer(model: model) // container init can throw
let request = CoreMLRequest(model: container)
let results = try await request.perform(on: cgImage)

if let classification = results.first as? ClassificationObservation {
    print("\(classification.identifier): \(classification.confidence)")
}
```

### Legacy: VNCoreMLRequest

```swift
let vnModel = try VNCoreMLModel(for: model)
let request = VNCoreMLRequest(model: vnModel) { request, error in
    guard let results = request.results as? [VNRecognizedObjectObservation] else { return }
    for observation in results {
        let label = observation.labels.first?.identifier ?? "unknown"
        let confidence = observation.labels.first?.confidence ?? 0
        let boundingBox = observation.boundingBox // normalized coordinates
        print("\(label): \(confidence) at \(boundingBox)")
    }
}
request.imageCropAndScaleOption = .scaleFill

let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer)
try handler.perform([request])
```

For complete Vision framework patterns (text recognition, barcode detection, document scanning), see the vision-framework skill.

## Performance Profiling

### MLComputePlan (iOS 17.4+)

Inspect which compute device each operation will use before running predictions.

```swift
let computePlan = try await MLComputePlan.load(
    contentsOf: modelURL, configuration: config
)
guard case let .program(program) = computePlan.modelStructure else { return }
guard let mainFunction = program.functions["main"] else { return }

for operation in mainFunction.block.operations {
    let deviceUsage = computePlan.deviceUsage(for: operation)
    let estimatedCost = computePlan.estimatedCost(of: operation)
    // preferredComputeDevice is not a String, so describe it explicitly
    let device = deviceUsage.map { "\($0.preferredComputeDevice)" } ?? "unknown"
    print("\(operation.operatorName): \(device), cost: \(String(describing: estimatedCost))")
}
```

### Instruments

Use the Core ML instrument template in Instruments to profile:

- Model load time
- Prediction latency (per-operation breakdown)
- Compute device dispatch (CPU/GPU/ANE per operation)
- Memory allocation

Run outside the debugger for accurate results (Xcode: Product > Profile).

## Model Deployment

### Bundle vs On-Demand Resources

| Strategy | Pros | Cons |
| --- | --- | --- |
| Bundle in app | Instant availability, works offline | Increases app download size |
| On-demand resources | Smaller initial download | Requires download before first use |
| Background Assets (iOS 16+) | Downloads ahead of time | More complex setup |
| CloudKit / server | Maximum flexibility | Requires network, longer setup |

### Size Considerations

- App Store limit: 4 GB for the app bundle
- Cellular download limit: 200 MB (can request an exception)
- Use ODR tags for models > 50 MB
- Pre-compile to `.mlmodelc` to skip on-device compilation

```swift
// On-demand resource loading
let request = NSBundleResourceRequest(tags: ["ml-model-v2"])
try await request.beginAccessingResources()
let modelURL = Bundle.main.url(forResource: "LargeModel", withExtension: "mlmodelc")!
let model = try await MLModel.load(contentsOf: modelURL, configuration: config)
// Call request.endAccessingResources() when done
```

## Memory Management

- Unload on background: Release model references when the app enters the background to free GPU/ANE memory. Reload on return to the foreground.
- Use `.cpuOnly` for background tasks: Background processing cannot use the GPU or ANE; setting `.cpuOnly` avoids silent fallback and resource contention.
- Share model instances: Never create multiple `MLModel` instances from the same compiled model. Use an actor to provide shared access.
- Monitor memory pressure: Large models (>100 MB) can trigger memory warnings. Register for `UIApplication.didReceiveMemoryWarningNotification` and release cached models when under pressure.

See `references/coreml-swift-integration.md` for an actor-based model manager with lifecycle-aware loading and cache eviction.
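As a rough sketch of these rules (not the reference implementation; `ModelManager` and its method names are illustrative):

```swift
import CoreML

// Sketch: a shared holder that loads the model lazily and can release it
// on background transition or memory pressure. Callers always go through
// the actor, so only one MLModel instance exists per compiled model.
actor ModelManager {
    private var model: MLModel?
    private let modelURL: URL
    private let config: MLModelConfiguration

    init(modelURL: URL, config: MLModelConfiguration) {
        self.modelURL = modelURL
        self.config = config
    }

    func shared() async throws -> MLModel {
        if let model { return model }
        let loaded = try await MLModel.load(contentsOf: modelURL, configuration: config)
        model = loaded
        return loaded
    }

    // Call from scene-phase / memory-warning observers to free GPU/ANE memory.
    func unload() {
        model = nil
    }
}
```

The next `shared()` call after `unload()` transparently reloads, so foreground-return handling reduces to simply using the model again.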

## Common Mistakes

DON'T: Load models on the main thread. DO: Use the async `MLModel.load(contentsOf:configuration:)` API or load on a background actor. Why: Large models can take seconds to load, freezing the UI.

DON'T: Recompile `.mlpackage` to `.mlmodelc` on every app launch. DO: Compile once with `MLModel.compileModel(at:)` and cache the compiled URL persistently. Why: Compilation is expensive. Cache the `.mlmodelc` in Application Support.

DON'T: Hardcode `.cpuOnly` unless you have a specific reason. DO: Use `.all` and let the system choose the optimal compute unit. Why: `.all` enables the Neural Engine and GPU, which are faster and more energy-efficient.

DON'T: Ignore `MLFeatureValue` type mismatches between input and model expectations. DO: Match types exactly -- use `MLFeatureValue(pixelBuffer:)` for images, not raw data. Why: Type mismatches cause cryptic runtime crashes or silently incorrect results.

DON'T: Create a new `MLModel` instance for every prediction. DO: Load once and reuse. Use an actor to manage the model lifecycle. Why: Model loading allocates significant memory and compute resources.

DON'T: Skip error handling for model loading and prediction. DO: Catch errors and provide fallback behavior when the model fails. Why: Models can fail to load on older devices or when resources are constrained.

DON'T: Assume all operations run on the Neural Engine. DO: Use `MLComputePlan` (iOS 17.4+) to verify device dispatch per operation. Why: Unsupported operations fall back to the CPU, which may bottleneck the pipeline.

DON'T: Process images manually before passing them to Vision + Core ML. DO: Use `CoreMLRequest` (iOS 18+) or `VNCoreMLRequest` (legacy) to let Vision handle preprocessing. Why: Vision handles orientation, scaling, and pixel format conversion correctly.
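The error-handling guidance can be sketched like this (the `classify` helper, feature name, and fallback string are illustrative assumptions):

```swift
import CoreML

// Sketch: degrade gracefully instead of crashing when the model
// failed to load or a prediction throws.
func classify(_ features: MLDictionaryFeatureProvider,
              with model: MLModel?) async -> String {
    guard let model else {
        // Fallback path when loading failed (older device, low resources).
        return "unavailable"
    }
    do {
        let output = try await model.prediction(from: features)
        return output.featureValue(for: "classLabel")?.stringValue ?? "unknown"
    } catch {
        // Log the error and fall back rather than taking down the feature.
        return "unavailable"
    }
}
```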

## Review Checklist

- Model loaded asynchronously (not blocking the main thread)
- `MLModelConfiguration.computeUnits` set appropriately for the use case
- Model instance reused across predictions (not recreated each time)
- Auto-generated class used when available (typed inputs/outputs)
- Error handling for model loading and prediction failures
- Compiled model cached persistently if compiled at runtime
- Image inputs use the Vision pipeline (`CoreMLRequest` on iOS 18+, or `VNCoreMLRequest`) for correct preprocessing
- `MLComputePlan` checked to verify compute device dispatch (iOS 17.4+)
- Batch predictions used when processing multiple inputs
- Model size appropriate for the deployment strategy (bundle vs ODR)
- Memory tested on target devices (especially older devices with less RAM)
- Predictions run outside the debugger for accurate performance measurement

## References

- Patterns and code: `references/coreml-swift-integration.md`
- Model conversion (Python): apple-on-device-ai skill, `../apple-on-device-ai/references/coreml-conversion.md`
- Model optimization: apple-on-device-ai skill, `../apple-on-device-ai/references/coreml-optimization.md`
- Apple docs: Core ML | MLModel | MLComputePlan