huggingface-spaces
Hugging Face Spaces
Hugging Face Spaces host machine-learning applications. There are 1M+ today; each Space is a git repo. This skill covers creating, building, debugging, and maintaining them.
0. Getting ready
Before anything else:
- Check the `hf` CLI is installed: `which hf`. If not, `pip install -U huggingface_hub`.
- Check the user is logged in: `hf auth whoami`. If not, ask them to run `! hf auth login` in this session; they'll need a write-scoped token from https://huggingface.co/settings/tokens.
- Note `whoami`'s `canPay` and `isPro` flags; they gate hardware choices below.

The `hf-cli` skill teaches an agent every `hf` command and is the recommended companion to this one. Install it with `hf skills add hf-cli` (add `--claude --global` to install for Claude Code as well, user-level).
1. What a Space is
A Space is a git repo with three possible SDKs:
- Gradio — most Spaces. Python, fast iteration, supports ZeroGPU.
- Docker — arbitrary container. Use when you need a non-Python stack or a pre-built template (Streamlit, Argilla, Shiny, etc. — full list at https://huggingface.co/docs/hub/spaces-sdks-docker). Does not support ZeroGPU.
- Static — plain HTML, or a React/Svelte/Vue project built at deploy time. Use for in-browser ML (transformers.js / WebGPU / WebAssembly / onnxruntime-web), project pages, interactive reports, or Spaces that orchestrate other Spaces. No hardware needed.
Hardware tiers
Free, no creator cost: `cpu-basic` and `zero-a10g` (ZeroGPU). Static Spaces are also free and don't need hardware.

ZeroGPU (`zero-a10g`): dynamic, per-request GPU allocation on NVIDIA RTX PRO 6000 Blackwell (sm_120). Two sizes: `large` (half MIG, 48 GB, 1× quota) and `xlarge` (full, 96 GB, 2× quota). Free for the Space creator; Space visitors consume their own daily quota (~5 min free / 40 min Pro / 60 min Enterprise). Gradio-only, PyTorch-first. Requires the creator to be on a PRO / Team / Enterprise plan.

Dedicated GPU (T4, L4, A10G, L40S, A100, H200): billed to the Space creator by the hour. List + pricing: `hf spaces hardware`. Only the creator can attach these, and only if `canPay=True`. Use when ZeroGPU genuinely doesn't fit (non-PyTorch main model with heavy init, very-large-model long-context inference, etc.).

If a non-PRO user has a use case that wants ZeroGPU, you can still build it: create a `cpu-basic` Space, code the app for ZeroGPU, push, then request a community grant. See [`references/grants.md`](references/grants.md).

For the authoritative reference: https://huggingface.co/docs/hub/spaces-overview
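The gating described above can be sketched as a small helper. This is an illustration only: it assumes the `whoami` payload is available as a dict with boolean `isPro` and `canPay` keys, and `dedicated-gpu` is a hypothetical label standing in for the paid flavors. Team and Enterprise plans may not be reflected by `isPro`, so treat the result as a first guess.

```python
def allowed_hardware(whoami: dict) -> list[str]:
    """Return the hardware families the creator can attach, cheapest first."""
    options = ["cpu-basic"]  # free for everyone
    if whoami.get("isPro", False):
        # ZeroGPU requires the creator to be on a paid plan
        options.append("zero-a10g")
    if whoami.get("canPay", False):
        # placeholder label (not a real flavor name) for hourly-billed GPUs
        options.append("dedicated-gpu")
    return options


print(allowed_hardware({"isPro": True, "canPay": False}))
```

A non-PRO user who still wants ZeroGPU falls through to `cpu-basic` here, which matches the community-grant path described above.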
2. Look for an existing demo first
Before deciding how to build anything, search for prior art:
```bash
hf spaces search "<model name or task>" --sdk gradio --limit 10
```
If someone has built a similar Space, read its `app.py` and `requirements.txt`; that gives you the working pattern. Saves a lot of blind iteration. Mention to the user what you found before committing to an approach.
3. Decide SDK and hardware
Follow the user's explicit request first. If they were vague:
- Default for a public ML demo: Gradio + ZeroGPU. Use this unless something below applies.
- The model's only inference path is non-PyTorch (ONNX / TF / JAX / vLLM as the MAIN model, with heavy init): dedicated GPU.
  - But: marginal non-torch tools (a small ONNX preprocessor, a TF utility) inside a torch-main pipeline are fine on ZeroGPU. The hijack only patches torch; init the non-torch lib inside `@spaces.GPU` and pay the short per-call init cost.
- Tiny / CPU-bound model, or API-proxy Space: `cpu-basic` (`hardware`-free isn't applicable to Gradio).
- Browser-side ML or project page: Static.
- Container with non-Python stack: Docker.
Sourcing the model
- GitHub repo: clone locally to read structure. If it already has a Gradio demo, the minimal viable path is to adapt it onto ZeroGPU (see [`references/zerogpu.md`](references/zerogpu.md)). Otherwise: read the README + inference code, prefer the PyTorch path, estimate VRAM (bf16 ≈ `params_B × 2` GB for the weights alone; 48 GB fits ≤24B params at bf16, or much larger with quantization; see [`references/zerogpu.md`](references/zerogpu.md) for quantization on ZeroGPU).
- HF model repo: read its README, follow any linked GitHub.
- Paper / blog post: look for an official or unofficial implementation. Don't reimplement unless trivial or the user explicitly asks.
- Vague request: search Spaces first; surface results.

If the model genuinely won't fit, check Inference Providers as an alternative: see [`references/inference-providers.md`](references/inference-providers.md). This avoids hosting the model at all.
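The VRAM rule of thumb above is simple arithmetic, sketched here. The function names and the `headroom_gb` parameter are illustrative choices, not part of any library, and the estimate covers weights only; activations and caches need extra room.

```python
def weights_gb(params_b: float, bytes_per_param: float = 2.0) -> float:
    """Weight footprint in GB: billions of params times bytes per param (bf16 = 2)."""
    return params_b * bytes_per_param


def fits(params_b: float, vram_gb: float = 48.0,
         bytes_per_param: float = 2.0, headroom_gb: float = 0.0) -> bool:
    """True if the weights plus a chosen headroom fit in the given VRAM."""
    return weights_gb(params_b, bytes_per_param) + headroom_gb <= vram_gb


print(weights_gb(7))                    # 7B at bf16
print(fits(24))                         # 24B at bf16 on the 48 GB size, no headroom
print(fits(24, headroom_gb=4))          # same model once headroom is reserved
print(fits(70, bytes_per_param=0.5))    # 70B at roughly 4-bit quantization
```

Reserving even a few GB of headroom pushes a 24B bf16 model over the 48 GB size, which is why the ≤24B figure is best read as an upper bound.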
4. Create the Space
```bash
hf repos create <namespace>/<name> --type space --space-sdk <gradio|docker|static> \
    [--flavor zero-a10g|cpu-basic|<paid-flavor>] \
    [--secrets KEY=val] [--env KEY=val] \
    --public|--private|--protected \
    --exist-ok
```
- `--space-sdk` is required.
- `--flavor` selects hardware. `zero-a10g` is the (legacy) identifier for ZeroGPU. Omit for `cpu-basic`. Run `hf spaces hardware` for the full paid list and pricing.
- Visibility: `--public` (anyone can view), `--private` (only you), `--protected` (app is reachable but git repo / Files tab is private).
- `--secrets KEY=val` becomes an environment variable inside the Space and is not visible to visitors. Use for API keys, gated-repo tokens (`HF_TOKEN=hf_…`), etc. Can also be set later via `hf spaces secrets set <id> KEY=val`.
- `--env KEY=val` is visible to visitors; use only for non-sensitive config (`GRADIO_SSR_MODE=false`, `PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True`, etc.).

Note: `hardware:` in the README YAML is silently ignored. Hardware is only set via `--flavor` at creation, or later via `hf spaces settings <id> --hardware <name>`.
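When an agent assembles this command programmatically, building it as an argument list avoids quoting mistakes. The sketch below uses only the flags listed above; the helper name and the repo id are hypothetical, and it assumes `--secrets` and `--env` may be repeated once per key, which is worth confirming against `hf repos create --help`.

```python
import shlex


def build_create_command(repo_id, sdk, flavor=None, visibility="private",
                         secrets=None, env=None):
    """Assemble the `hf repos create` argv for a Space (nothing is executed)."""
    if sdk not in {"gradio", "docker", "static"}:
        raise ValueError(f"unknown sdk: {sdk}")
    if visibility not in {"public", "private", "protected"}:
        raise ValueError(f"unknown visibility: {visibility}")
    cmd = ["hf", "repos", "create", repo_id, "--type", "space", "--space-sdk", sdk]
    if flavor:  # omit to get cpu-basic
        cmd += ["--flavor", flavor]
    for key, value in (secrets or {}).items():  # assumed repeatable
        cmd += ["--secrets", f"{key}={value}"]
    for key, value in (env or {}).items():      # assumed repeatable
        cmd += ["--env", f"{key}={value}"]
    cmd += [f"--{visibility}", "--exist-ok"]
    return cmd


argv = build_create_command("my-user/my-demo", "gradio", flavor="zero-a10g",
                            env={"GRADIO_SSR_MODE": "false"})
print(shlex.join(argv))
```

Passing `argv` to `subprocess.run` keeps secret values out of shell history expansion, though they still appear in the process list.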
5. Build the app
The Space now exists at `https://huggingface.co/spaces/<namespace>/<name>` but is empty.
README.md frontmatter
Always required:
```yaml
---
title: ...
emoji: 🚀                          # pick something representative
colorFrom: blue                    # red|yellow|green|blue|indigo|purple|pink|gray (only these)
colorTo: indigo
sdk: gradio                        # gradio | docker | static
sdk_version: 6.15.1                # latest stable unless you have a reason*
app_file: app.py                   # gradio only (docker / static use Dockerfile / index.html)
short_description: ...             # ≤ 60 chars (server rejects longer)
python_version: "3.12"             # ZeroGPU officially supports 3.10.13 and 3.12.12
startup_duration_timeout: 30m      # default; bump to 1h for big LLMs / heavy downloads
---
```
\* Reasons to use an older Gradio: a custom component pins it, or you're adapting an existing demo and don't want to rewrite for 5.x→6.x breaking changes. If you need a 5.x, pick `5.50.0` (latest of the series; still supports custom components).

All frontmatter options: https://huggingface.co/docs/hub/spaces-config-reference
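The constraints noted in the YAML comments above are easy to check before pushing. The helper below is a hypothetical pre-flight check, assuming the frontmatter has already been parsed into a dict; it encodes only the limits stated above and is not a substitute for the server's own validation.

```python
ALLOWED_COLORS = {"red", "yellow", "green", "blue", "indigo", "purple", "pink", "gray"}
ALLOWED_SDKS = {"gradio", "docker", "static"}


def check_frontmatter(meta: dict) -> list[str]:
    """Return a list of human-readable problems; empty means nothing obvious is wrong."""
    problems = []
    for key in ("colorFrom", "colorTo"):
        if meta.get(key) not in ALLOWED_COLORS:
            problems.append(f"{key} must be one of {sorted(ALLOWED_COLORS)}")
    if meta.get("sdk") not in ALLOWED_SDKS:
        problems.append(f"sdk must be one of {sorted(ALLOWED_SDKS)}")
    if len(meta.get("short_description", "")) > 60:
        problems.append("short_description exceeds 60 characters")
    return problems


print(check_frontmatter({
    "colorFrom": "teal", "colorTo": "indigo",
    "sdk": "gradio", "short_description": "Text-to-image demo",
}))
```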
Minimal ZeroGPU Gradio app
```python
import spaces                                  # MUST come before torch / diffusers / transformers
import torch
import gradio as gr
from diffusers import DiffusionPipeline

# Load at module scope and move to "cuda" eagerly; ZeroGPU intercepts the call.
pipe = DiffusionPipeline.from_pretrained("<repo>", torch_dtype=torch.bfloat16).to("cuda")

@spaces.GPU(duration=60)
def generate(prompt):
    return pipe(prompt).images[0]

gr.Interface(fn=generate, inputs=gr.Text(), outputs=gr.Image()).launch()
```
Three rules (full treatment in [`references/zerogpu.md`](references/zerogpu.md)):
- `import spaces` before torch / any CUDA-touching import. It monkey-patches `torch.cuda.*`; once CUDA is initialized in the main process, it's too late.
- Load the model at module scope, `.to("cuda")` eagerly. ZeroGPU intercepts the call, packs weights to disk, and streams them into VRAM on the first `@spaces.GPU` entry. Lazy loading inside the decorator costs every user.
- Decorate the function Gradio binds. Estimate `duration` to the realistic worst case (smaller = higher queue priority and tighter quota check). For input-dependent runtime, pass a callable.
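For the last rule, a callable `duration` might look like the sketch below. The timing constants are invented for illustration and should be measured on the real Space; the sketch also assumes the callable receives the same arguments as the decorated function, which is how the ZeroGPU documentation describes dynamic duration.

```python
def estimate_duration(prompt: str, steps: int = 30) -> int:
    """Worst-case GPU seconds for one request (hypothetical timing model)."""
    seconds_per_step = 1.5  # assumed; measure on the target hardware
    overhead = 10           # assumed fixed cost per call
    return min(120, int(overhead + steps * seconds_per_step))


# On a ZeroGPU Space this would be wired up roughly as:
#   @spaces.GPU(duration=estimate_duration)
#   def generate(prompt, steps=30): ...
print(estimate_duration("a cat", steps=20))
```

Capping the estimate keeps a single unusually large request from reserving an unbounded slice of the visitor's quota.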
requirements.txt
Short version:
- Do NOT list: `gradio`, `spaces`, `huggingface_hub` (preinstalled and platform-managed; pinning them causes resolution failures or silently breaks the ZeroGPU runtime).
- Do list if you use them: `torchvision`, `torchaudio` (not preinstalled), plus everything else (`diffusers`, `transformers`, `accelerate`, `sentencepiece`, …).
- ZeroGPU only accepts torch `2.8.0`, `2.9.1`, `2.10.0`, `2.11.0`. Default to leaving torch unpinned (the runtime preinstalls the latest). Only pin when a dep forces it.
- For prebuilt CUDA-extension wheels (`flash_attn`, `xformers`, `pytorch3d`, `nvdiffrast`, `diff_gaussian_rasterization`, `torchmcubes`): use the prebuilt Blackwell wheels at https://huggingface.co/datasets/multimodalart/zerogpu-blackwell-wheels/tree/main/wheels. Full mapping + caveats in [`references/requirements.md`](references/requirements.md).
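The first and third rules above can be checked mechanically before a push. The linter below is a hypothetical helper with a deliberately simple parser (name plus an optional `==` pin); it does not handle every requirement-file syntax, and the version list is copied from the text above.

```python
import re

PLATFORM_MANAGED = {"gradio", "spaces", "huggingface_hub"}
ACCEPTED_TORCH = {"2.8.0", "2.9.1", "2.10.0", "2.11.0"}
REQ_LINE = re.compile(r"^([A-Za-z0-9_.\-]+)\s*(?:==\s*([^\s;]+))?")


def lint_requirements(text: str) -> list[str]:
    """Flag platform-managed packages and torch pins outside the accepted set."""
    problems = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        match = REQ_LINE.match(line)
        if not match:
            continue
        name = match.group(1).lower().replace("-", "_")
        version = match.group(2)
        if name in PLATFORM_MANAGED:
            problems.append(f"remove {name}: platform-managed")
        elif name == "torch" and version and version not in ACCEPTED_TORCH:
            problems.append(f"torch=={version} is not an accepted ZeroGPU version")
    return problems


print(lint_requirements("gradio==5.0\ntorch==2.7.0\ntorchvision\n"))
```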
Per-SDK depth
- Gradio patterns (themes, `gr.Examples`, streaming, custom HTML components, `gr.Server`): [`references/gradio.md`](references/gradio.md).
- Docker: https://huggingface.co/docs/hub/spaces-sdks-docker. Examples: `hf spaces list --filter docker`.
- Static: https://huggingface.co/docs/hub/spaces-sdks-static. For built SPAs, set `app_build_command: npm run build` and `app_file: dist/index.html` in frontmatter.
- ZeroGPU specifics (decorator semantics, sizing, AoTI, generators, concurrency, pickle / `gr.State` across the worker boundary): [`references/zerogpu.md`](references/zerogpu.md). Read this whenever the Space targets ZeroGPU.
6. Iterate on the Space, not locally
Try to build a release candidate from the user's request locally and push it, then use the live URL as your test loop. The Space environment is the only one that matters; do not try to test locally. `python3 -m py_compile app.py` is the maximum local check worth doing before pushing.

Once pushed, pick the cheapest update mechanism for each change: hot-reload for pure Python edits, `hf upload` for code-only files hot-reload can't touch, full rebuild only when `requirements.txt` / `Dockerfile` / README frontmatter actually changed. Full ladder + footguns (hot-reload poisoning factory reboot, runtime.sha lag, etc.) in [`references/debugging.md`](references/debugging.md).
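The single local check mentioned above has a standard-library equivalent, which can be handy when an agent wants a boolean rather than a shell exit code. This is a minimal sketch; `syntax_ok` is a hypothetical helper name, and the temporary file only stands in for a real `app.py`.

```python
import os
import py_compile
import tempfile


def syntax_ok(path: str) -> bool:
    """Return True if the file byte-compiles, False on a syntax error."""
    try:
        # Write the .pyc to a throwaway directory so no __pycache__ is left behind.
        with tempfile.TemporaryDirectory() as tmp:
            py_compile.compile(path, cfile=os.path.join(tmp, "out.pyc"), doraise=True)
        return True
    except py_compile.PyCompileError:
        return False


with tempfile.TemporaryDirectory() as workdir:
    candidate = os.path.join(workdir, "app.py")
    with open(candidate, "w") as f:
        f.write("def generate(prompt):\n    return prompt.upper()\n")
    print(syntax_ok(candidate))
```

This only catches syntax errors; import failures and runtime problems still surface on the Space, which is the point of iterating there.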
7. Verify
Don't trust `RUNNING` alone: the app can be running but broken. Four steps, in order:

**A. Alive?** Stage + hardware:
```bash
hf spaces info <ns>/<name> --expand runtime
```
**B. Logs clean post-boot?** Read the run log to confirm startup finished without warnings or silent fallbacks:
```bash
hf spaces logs <ns>/<name> --tail 200
```
Look for model-load completion, no import warnings, no "falling back to CPU" / dtype downgrade messages, no `RUNNING` masking a half-broken app.

**C. API actually responds.** With logs still tailing in another terminal (`hf spaces logs <ns>/<name> --follow`), call the endpoint:
```python
import os

from gradio_client import Client, handle_file  # wrap file inputs with handle_file(path)

c = Client("<ns>/<name>", token=os.environ["HF_TOKEN"], httpx_kwargs={"timeout": 600})
print(c.view_api())     # discover endpoints; don't guess
result = c.predict(..., api_name="/generate")
```
**D. Sniff output AND logs.** HTTP 200 ≠ correct output. Check both:
```python
head = open(result, "rb").read(16)
# glTF → glb / \x89PNG → png / \xff\xd8\xff → jpg / RIFF…WEBP → webp / RIFF…WAVE → wav / [4:8]==b"ftyp" → mp4
```
And look at the run log emitted during the call: silent fallbacks (model snapping to a different size, missing optional dep, dtype downgrade) only show up there.

Full smoke-test patterns (streaming endpoints, OAuth-gated Spaces, `gr.Server` custom routes): [`references/debugging.md`](references/debugging.md).
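The magic-byte check in step D can be wrapped in a small function so a smoke test can assert on the file type instead of eyeballing bytes. This is a sketch covering only the formats named above; `sniff` is a hypothetical helper, and the `ftyp` test matches the whole ISO media family (mp4, mov, and similar), not mp4 alone.

```python
def sniff(head: bytes) -> str:
    """Guess a file type from its first 16 bytes; returns "unknown" if nothing matches."""
    if head[:4] == b"glTF":
        return "glb"
    if head[:8] == b"\x89PNG\r\n\x1a\n":
        return "png"
    if head[:3] == b"\xff\xd8\xff":
        return "jpg"
    if head[:4] == b"RIFF" and head[8:12] == b"WEBP":
        return "webp"
    if head[:4] == b"RIFF" and head[8:12] == b"WAVE":
        return "wav"
    if head[4:8] == b"ftyp":
        return "mp4"
    return "unknown"


# An HTML error page returned with HTTP 200 is a typical false success.
print(sniff(b"<!DOCTYPE html>"))
```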
8. Permanent storage (buckets)
Spaces are stateless: `/data` is wiped on restart. If the Space needs to persist user uploads, generations, logs, or interact with a long-lived store, mount a bucket:
```bash
hf buckets create <ns>/<bucket-name>           # --private optional
hf spaces volumes set <ns>/<space> -v hf://buckets/<ns>/<bucket-name>:/data   # read-write at /data
```
Buckets are paid storage; check `canPay` and confirm with the user. Full patterns (read-fast / write-durable, public bucket URLs, model-cache anti-pattern): [`references/buckets.md`](references/buckets.md).
9. When things break
Order of operations:
- Read the logs: `hf spaces logs <id> --build --follow` (build error) or `hf spaces logs <id> --follow` (runtime error). Find the first error, not the last.
- Grep [`references/known-errors.md`](references/known-errors.md) for the error string. Check if this is a known issue before trying your own fix; most common ZeroGPU / Gradio / dependency errors have a 1–2 line fix there.
- Iterate using the cheapest rung from [`references/debugging.md`](references/debugging.md). The vast majority of issues resolve with log-reading + smoke-test loops; interactive dev mode + SSH is a heavy-hammer last resort.
If you solve an error that wasn't in the known-errors list, suggest the user PR it back to this skill so future runs benefit.
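The advice to find the first error, not the last, matters because later errors are often knock-on effects. A minimal sketch of that scan follows, assuming the log has been captured as text; the regular expression is a guess at common Python error markers and will need tuning for real Space logs.

```python
import re

# Hypothetical pattern: a traceback header, any *Error class name, or an ERROR level tag.
ERROR_PATTERN = re.compile(r"Traceback \(most recent call last\)|\b\w*Error\b|\bERROR\b")


def first_error(log_text: str, context: int = 2) -> list[str]:
    """Return the first matching log line with `context` lines on each side."""
    lines = log_text.splitlines()
    for i, line in enumerate(lines):
        if ERROR_PATTERN.search(line):
            return lines[max(0, i - context): i + context + 1]
    return []


log = "boot\nloading model\nImportError: no module named foo\nretrying\nRuntimeError: CUDA\n"
print(first_error(log, context=1))
```

The string found this way is what to search for in the known-errors reference.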
Reference index
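| When to read | File |
|---|---|
| How ZeroGPU works + correct patterns (decorator, sizing, pickle, generators, real-time, AoTI) | [`references/zerogpu.md`](references/zerogpu.md) |
| Iterate + debug: logs, rung ladder, smoke testing (and dev mode + SSH as a last resort) | [`references/debugging.md`](references/debugging.md) |
| Error-string lookup: the single place for all error symptoms (Spaces, ZeroGPU, Gradio, deps) | [`references/known-errors.md`](references/known-errors.md) |
| Pinning deps, picking wheels, torch-family alignment | [`references/requirements.md`](references/requirements.md) |
| Gradio patterns (themes, `gr.Examples`, streaming, custom HTML components, `gr.Server`) | [`references/gradio.md`](references/gradio.md) |
| Persistent storage, public bucket URLs | [`references/buckets.md`](references/buckets.md) |
| Community grant requests (non-PRO needing ZeroGPU) | [`references/grants.md`](references/grants.md) |
| Provider proxy (zero-VRAM big LLM via Cerebras / Fireworks / Together / etc.) | [`references/inference-providers.md`](references/inference-providers.md) |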