ml-engineering
<!-- Adapted from: claude-skills/engineering-team/senior-ml-engineer -->
# ML Engineering Guide

Production-grade ML/AI systems, MLOps, and model deployment.
## When to Use

- Deploying ML models to production
- Building ML platforms and infrastructure
- Implementing MLOps pipelines
- Integrating LLMs into production systems
- Setting up model monitoring and drift detection
## Tech Stack

| Category | Tools |
|---|---|
| ML Frameworks | PyTorch, TensorFlow, Scikit-learn, XGBoost |
| LLM Frameworks | LangChain, LlamaIndex, DSPy |
| Data Tools | Spark, Airflow, dbt, Kafka, Databricks |
| Deployment | Docker, Kubernetes, AWS/GCP/Azure |
| Monitoring | MLflow, Weights & Biases, Prometheus |
| Databases | PostgreSQL, BigQuery, Snowflake, Pinecone |
## Production Patterns

### Model Deployment Pipeline
```python
# Model serving with FastAPI
from fastapi import FastAPI
import torch

app = FastAPI()
model = torch.load("model.pth")
model.eval()  # inference mode: disable dropout/batch-norm updates

@app.post("/predict")
async def predict(data: dict):
    tensor = preprocess(data)  # preprocess: project-specific input transform
    with torch.no_grad():
        prediction = model(tensor)
    return {"prediction": prediction.tolist()}
```
### Feature Store Integration

```python
# Fetch online features from a Feast feature store
from feast import FeatureStore

store = FeatureStore(repo_path=".")
features = store.get_online_features(
    features=["user_features:age", "user_features:location"],
    entity_rows=[{"user_id": 123}],
).to_dict()
```
### Model Monitoring

```python
# Data drift detection with Evidently
from evidently.report import Report
from evidently.metric_preset import DataDriftPreset

report = Report(metrics=[DataDriftPreset()])
report.run(reference_data=ref_df, current_data=curr_df)
```
## MLOps Best Practices

### Development

- Test-driven development for ML pipelines
- Version control for models and data
- Reproducible experiments with MLflow
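In practice, test-driven development for ML pipelines means ordinary unit tests around each pipeline step. A minimal sketch, using a hypothetical `scale_features` preprocessing step (the function name and behavior are illustrative, not from this guide):

```python
# TDD sketch: write the tests first, then make the pipeline step pass them.
def scale_features(values):
    """Min-max scale a list of numbers into [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # degenerate case: constant input
    return [(v - lo) / (hi - lo) for v in values]

def test_scale_features_range():
    scaled = scale_features([10.0, 20.0, 30.0])
    assert min(scaled) == 0.0 and max(scaled) == 1.0

def test_scale_features_constant_input():
    # Edge case that frequently breaks naive implementations.
    assert scale_features([5.0, 5.0]) == [0.0, 0.0]
```

Edge cases like constant columns, NaNs, and empty batches are exactly where ML pipelines fail silently, so they deserve explicit tests.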
### Production

- A/B testing infrastructure
- Canary deployments for models
- Automated retraining pipelines
- Model monitoring and drift detection
## Performance Targets

| Metric | Target |
|---|---|
| P50 Latency | < 50 ms |
| P95 Latency | < 100 ms |
| P99 Latency | < 200 ms |
| Throughput | > 1000 RPS |
| Availability | 99.9% |
## LLM Integration Patterns

### RAG System

```python
# Basic RAG with LangChain
from langchain.vectorstores import Pinecone
from langchain.embeddings import OpenAIEmbeddings
from langchain.chains import RetrievalQA

vectorstore = Pinecone.from_existing_index(
    index_name="docs",
    embedding=OpenAIEmbeddings(),
)
qa = RetrievalQA.from_chain_type(
    llm=llm,  # llm: any LangChain-compatible LLM instance
    retriever=vectorstore.as_retriever(),
)
```
### Prompt Management

```python
# Structured prompts with DSPy
import dspy

class QA(dspy.Signature):
    """Answer questions based on context."""
    context = dspy.InputField()
    question = dspy.InputField()
    answer = dspy.OutputField()

qa = dspy.Predict(QA)
```
## Common Commands

```bash
# Development
python -m pytest tests/ -v --cov
python -m black src/
python -m pylint src/

# Training
python scripts/train.py --config prod.yaml
mlflow run . -P epochs=10

# Deployment
docker build -t model:v1 .
kubectl apply -f k8s/model-serving.yaml

# Monitoring
mlflow ui --port 5000
```
## Security & Compliance

- Authentication for model endpoints
- Data encryption (at rest and in transit)
- PII handling and anonymization
- GDPR/CCPA compliance
- Audit logging for model access
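PII handling usually starts with pseudonymizing stable identifiers before they reach logs, feature stores, or training data. A minimal sketch (keyed hashing; the salt handling is an assumption, and note that salted hashing is pseudonymization, which GDPR treats more leniently than raw PII but not as full anonymization):

```python
import hashlib
import hmac

def pseudonymize(identifier: str, salt: bytes) -> str:
    """Replace a PII identifier (email, user id) with a stable keyed hash.

    HMAC with a secret salt prevents dictionary attacks against the
    hashed values while keeping joins across datasets possible, since
    the same identifier always maps to the same token.
    """
    return hmac.new(salt, identifier.encode(), hashlib.sha256).hexdigest()
```

The salt must be stored in a secrets manager, not alongside the data: anyone holding both can re-identify users by hashing candidate identifiers.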