compare-models
Docs
- Reference: https://replicate.com/docs/llms.txt
- OpenAPI schema: https://api.replicate.com/openapi.json
- MCP server: https://mcp.replicate.com
- Per-model docs: https://replicate.com/{owner}/{model}/llms.txt
- `Accept: text/markdown`: set this header when requesting docs pages for Markdown responses.
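To get Markdown back from a docs page, set the `Accept` header on the request. A minimal sketch using Python's standard library; the owner/model pair is only an example:

```python
import urllib.request

def model_docs_request(owner: str, model: str) -> urllib.request.Request:
    """Build a request for a model's llms.txt docs, asking for Markdown."""
    url = f"https://replicate.com/{owner}/{model}/llms.txt"
    # The Accept header asks the docs endpoint for a Markdown response.
    return urllib.request.Request(url, headers={"Accept": "text/markdown"})

# Example model reference (illustrative); send with urllib.request.urlopen(req).
req = model_docs_request("black-forest-labs", "flux-schnell")
```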
Workflow
- Search or browse collections to build a shortlist of candidate models.
- Fetch each model's schema to compare inputs, outputs, and capabilities.
- Check pricing from model metadata or the Replicate website.
- Run a small batch of test predictions to compare output quality.
- Pick the model that best fits your constraints (cost, latency, quality).
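The last step, picking under constraints, can be sketched as a filter-then-rank over your shortlist. The model names, prices, latencies, and quality scores below are made up for illustration, not real Replicate data:

```python
candidates = [
    # (model, cost per run in USD, median latency in s, quality score 1-5)
    ("owner/model-turbo", 0.003, 1.2, 3.5),
    ("owner/model-pro", 0.055, 9.8, 4.8),
    ("owner/model-base", 0.012, 4.1, 4.2),
]

def pick(candidates, max_cost, max_latency):
    """Keep models within the cost/latency budget, then take the best quality."""
    in_budget = [c for c in candidates if c[1] <= max_cost and c[2] <= max_latency]
    return max(in_budget, key=lambda c: c[3])[0] if in_budget else None

print(pick(candidates, max_cost=0.02, max_latency=5.0))  # owner/model-base
```

Measure cost and latency from real test predictions (step 4) rather than guessing; the numbers are what make this ranking meaningful.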
What to compare
- Speed: Check `metrics.predict_time` on completed predictions for actual inference time. Official models are always warm. Community models can cold-boot.
- Cost: Official models have predictable per-run pricing. Community models charge by compute time (GPU-seconds). Run a few predictions and check the `metrics` field for actual cost.
- Quality: Run the same prompts through each model and compare outputs. Quality is subjective. Match it to your use case, not a leaderboard.
- Capabilities: Compare input schemas for supported features (reference images, masks, aspect ratios, streaming, multi-image input). Check output formats.
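For community models, a per-run cost estimate is the prediction's `metrics.predict_time` multiplied by the per-second rate of the hardware it ran on. A sketch with a hypothetical GPU rate; check Replicate's pricing page for real hardware rates:

```python
# Shape of a completed prediction's metrics field (values are illustrative).
prediction = {
    "metrics": {"predict_time": 3.7},  # seconds of inference
}

GPU_RATE_PER_SECOND = 0.000725  # hypothetical $/s for some GPU tier

cost = prediction["metrics"]["predict_time"] * GPU_RATE_PER_SECOND
print(f"~${cost:.6f} per run")
```

Averaging this over a handful of representative test predictions gives a more honest number than a single run, since predict_time varies with input size.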
Key tradeoffs
- Lowest cost: smaller/distilled models. Accept slower inference and lower quality.
- Lowest latency: official models or schnell/turbo variants. Accept higher cost per run.
- Highest quality: pro/max/quality variants. Accept slower inference and higher cost.
- Most control: models with ControlNet, masks, or reference images. Accept more complex input setup.
Official vs community models
- Official models: always warm, stable APIs, predictable pricing, maintained by Replicate.
- Community models: may cold-boot, require version pinning, maintained by the author.
- If a community model meets your needs and an official model doesn't, consider creating a deployment for consistent uptime.
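Version pinning for a community model means referencing it as `owner/model:version` rather than the bare `owner/model`, so the model can't change underneath you. A trivial helper; the owner, model, and version hash below are placeholders:

```python
def pinned_ref(owner: str, model: str, version: str) -> str:
    """Format a version-pinned model reference, e.g. for replicate.run()."""
    return f"{owner}/{model}:{version}"

ref = pinned_ref("some-author", "some-model", "abc123")  # hypothetical IDs
```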
Prompting guidance
For prompting techniques and task-specific guidance:
- Image generation and editing: see the prompt-images skill.
- Video generation: see the prompt-videos skill.