<!-- Adapted from: claude-skills/engineering-team/senior-ml-engineer -->

ML Engineering Guide

Production-grade ML/AI systems, MLOps, and model deployment.

When to Use

  • Deploying ML models to production
  • Building ML platforms and infrastructure
  • Implementing MLOps pipelines
  • Integrating LLMs into production systems
  • Setting up model monitoring and drift detection

Tech Stack

  Category        Tools
  ML Frameworks   PyTorch, TensorFlow, Scikit-learn, XGBoost
  LLM Frameworks  LangChain, LlamaIndex, DSPy
  Data Tools      Spark, Airflow, dbt, Kafka, Databricks
  Deployment      Docker, Kubernetes, AWS/GCP/Azure
  Monitoring      MLflow, Weights & Biases, Prometheus
  Databases       PostgreSQL, BigQuery, Snowflake, Pinecone

Production Patterns

Model Deployment Pipeline

```python
# Model serving with FastAPI
from fastapi import FastAPI
import torch

app = FastAPI()
model = torch.load("model.pth")
model.eval()  # switch to inference mode (disables dropout, batchnorm updates)

@app.post("/predict")
async def predict(data: dict):
    tensor = preprocess(data)  # project-specific input preprocessing
    with torch.no_grad():
        prediction = model(tensor)
    return {"prediction": prediction.tolist()}
```

Feature Store Integration

```python
# Fetch online features for a single entity from a Feast feature store
from feast import FeatureStore

store = FeatureStore(repo_path=".")
features = store.get_online_features(
    features=["user_features:age", "user_features:location"],
    entity_rows=[{"user_id": 123}],
).to_dict()
```

Model Monitoring

```python
# Data drift detection with Evidently: compare current traffic
# against a reference (training) dataset
from evidently.report import Report
from evidently.metric_preset import DataDriftPreset

report = Report(metrics=[DataDriftPreset()])
report.run(reference_data=ref_df, current_data=curr_df)
```
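Drift can also be quantified without a dedicated library. A minimal sketch of the Population Stability Index (PSI) over equal-width histogram bins — the `psi` helper and the 0.2 threshold convention are illustrative, not a standard API:

```python
import math

def psi(expected, actual, bins=10):
    """Population Stability Index between two 1-D samples.

    Values above ~0.2 are commonly treated as significant drift
    (a rule of thumb, not a hard law).
    """
    lo = min(min(expected), min(actual))
    hi = max(max(expected), max(actual))
    width = (hi - lo) / bins or 1.0

    def frac(sample):
        counts = [0] * bins
        for x in sample:
            i = min(int((x - lo) / width), bins - 1)
            counts[i] += 1
        # floor at a small epsilon so empty bins don't produce log(0)
        return [max(c / len(sample), 1e-6) for c in counts]

    e, a = frac(expected), frac(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

# identical distributions give a PSI of zero
print(round(psi([1, 2, 3, 4, 5] * 20, [1, 2, 3, 4, 5] * 20), 6))  # prints 0.0
```

In production the reference histogram comes from the training set and the current one from a sliding window of live requests.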

MLOps Best Practices

Development

  • Test-driven development for ML pipelines
  • Version-control models and data
  • Reproducible experiments with MLflow
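Reproducibility starts with knowing exactly which configuration produced a run. A stdlib-only sketch (independent of MLflow; the `run_id` helper is illustrative) that derives a stable run ID from an experiment config:

```python
import hashlib
import json

def run_id(config: dict) -> str:
    """Deterministic short ID for an experiment configuration.

    Sorting keys makes the hash independent of dict insertion order,
    so the same config always maps to the same run ID.
    """
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()[:12]

a = run_id({"lr": 0.001, "epochs": 10, "model": "resnet18"})
b = run_id({"model": "resnet18", "epochs": 10, "lr": 0.001})
assert a == b  # key order does not change the ID
```

Tagging artifacts, metrics, and logs with this ID makes any result traceable back to its exact configuration.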

Production

  • A/B testing infrastructure
  • Canary deployments for models
  • Automated retraining pipelines
  • Model monitoring and drift detection
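A/B tests and canary rollouts both need stable traffic splitting. A minimal sketch (the `assign_bucket` helper is hypothetical, not tied to any framework) that buckets users deterministically by hashing their ID:

```python
import hashlib

def assign_bucket(user_id: str, canary_fraction: float = 0.05) -> str:
    """Deterministically route a user to 'canary' or 'stable'.

    Hashing (rather than random.random) means a user always hits the
    same model version across requests, which keeps metrics clean.
    """
    digest = hashlib.sha256(user_id.encode()).digest()
    # map the first 8 bytes of the digest to a float in [0, 1)
    fraction = int.from_bytes(digest[:8], "big") / 2**64
    return "canary" if fraction < canary_fraction else "stable"

buckets = [assign_bucket(f"user-{i}", 0.1) for i in range(10_000)]
share = buckets.count("canary") / len(buckets)  # close to 0.1
```

Promoting the canary is then just raising `canary_fraction` in config; rolling back drops it to zero without touching user assignments.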

Performance Targets

  Metric         Target
  P50 latency    < 50 ms
  P95 latency    < 100 ms
  P99 latency    < 200 ms
  Throughput     > 1000 RPS
  Availability   99.9%
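Checking these targets against real traffic means computing percentiles from recorded request times; a stdlib sketch (the synthetic Gaussian workload is just for illustration):

```python
import random
import statistics

def latency_percentiles(samples_ms):
    """Return (p50, p95, p99) from a list of latency samples in ms."""
    # quantiles with n=100 returns the 1st..99th percentile cut points
    q = statistics.quantiles(samples_ms, n=100)
    return q[49], q[94], q[98]

random.seed(0)
samples = [random.gauss(40, 10) for _ in range(10_000)]
p50, p95, p99 = latency_percentiles(samples)
# for this synthetic workload: p50 ≈ 40 ms, p95 ≈ 56 ms, p99 ≈ 63 ms
```

Real latency distributions are heavy-tailed rather than Gaussian, which is exactly why P99 is tracked separately from the median.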

LLM Integration Patterns

RAG System

```python
# Basic RAG with LangChain
from langchain.vectorstores import Pinecone
from langchain.embeddings import OpenAIEmbeddings
from langchain.chains import RetrievalQA

vectorstore = Pinecone.from_existing_index(
    index_name="docs",
    embedding=OpenAIEmbeddings(),
)
qa = RetrievalQA.from_chain_type(
    llm=llm,  # any LangChain-compatible LLM
    retriever=vectorstore.as_retriever(),
)
```
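Under the hood, the retrieval step is nearest-neighbor search over embedding vectors. A dependency-free sketch with toy hand-made vectors (a real system uses an embedding model and a vector index like Pinecone):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def retrieve(query_vec, docs, k=2):
    """Return the k documents whose vectors are most similar to the query."""
    ranked = sorted(docs, key=lambda d: cosine(query_vec, d["vec"]), reverse=True)
    return [d["text"] for d in ranked[:k]]

docs = [
    {"text": "deploy models with docker", "vec": [1.0, 0.1, 0.0]},
    {"text": "monitor drift in production", "vec": [0.0, 1.0, 0.2]},
    {"text": "containerize the api", "vec": [0.9, 0.2, 0.1]},
]
# the two docker/container docs are closest to the query direction
print(retrieve([1.0, 0.0, 0.0], docs, k=2))
```

The retrieved texts are then stuffed into the LLM prompt as context, which is all `RetrievalQA` does on top of this.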

Prompt Management

```python
# Structured prompts with DSPy
import dspy

class QA(dspy.Signature):
    """Answer questions based on context."""
    context = dspy.InputField()
    question = dspy.InputField()
    answer = dspy.OutputField()

qa = dspy.Predict(QA)
```
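Prompts deserve the same versioning discipline as models: an output is only reproducible if the exact prompt that produced it is kept. A stdlib-only sketch of a versioned template registry (the `PromptRegistry` class is a hypothetical helper, not part of DSPy):

```python
from string import Template

class PromptRegistry:
    """Keep every prompt version so past model outputs stay reproducible."""

    def __init__(self):
        self._versions: dict[str, list[str]] = {}

    def register(self, name: str, template: str) -> int:
        """Store a new version of a named prompt; returns its version number."""
        self._versions.setdefault(name, []).append(template)
        return len(self._versions[name])  # 1-based version number

    def render(self, name: str, version: int, **fields) -> str:
        """Fill in a specific stored version with concrete field values."""
        return Template(self._versions[name][version - 1]).substitute(**fields)

registry = PromptRegistry()
registry.register("qa", "Answer using only $context.\nQ: $question\nA:")
v2 = registry.register("qa", "Context: $context\nQuestion: $question\nAnswer:")
prompt = registry.render("qa", v2, context="docs", question="what is drift?")
```

Logging the `(name, version)` pair alongside each LLM call makes prompt changes auditable the same way model versions are.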

Common Commands

```bash
# Development
python -m pytest tests/ -v --cov
python -m black src/
python -m pylint src/

# Training
python scripts/train.py --config prod.yaml
mlflow run . -P epochs=10

# Deployment
docker build -t model:v1 .
kubectl apply -f k8s/model-serving.yaml

# Monitoring
mlflow ui --port 5000
```

Security & Compliance

  • Authentication for model endpoints
  • Data encryption (at rest & in transit)
  • PII handling and anonymization
  • GDPR/CCPA compliance
  • Model access audit logging
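Endpoint authentication can start as a shared API key checked in constant time; a minimal sketch (the header name and `MODEL_API_KEY` environment variable are illustrative choices):

```python
import hmac
import os

# in production the key comes from a secret manager, never a literal
API_KEY = os.environ.get("MODEL_API_KEY", "local-dev-key")

def is_authorized(headers: dict) -> bool:
    """Check the x-api-key header against the configured key.

    hmac.compare_digest runs in constant time, which avoids leaking
    key prefixes through response-timing differences.
    """
    supplied = headers.get("x-api-key", "")
    return hmac.compare_digest(supplied, API_KEY)

assert is_authorized({"x-api-key": API_KEY})
assert not is_authorized({"x-api-key": "wrong"})
assert not is_authorized({})
```

In a FastAPI service this check would live in a dependency applied to the prediction route, and each accept/reject decision would also feed the audit log.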