Hugging Face Spaces


Hugging Face Spaces host machine-learning applications. There are 1M+ today; each Space is a git repo. This skill covers creating, building, debugging, and maintaining them.

0. Getting ready


Before anything else:
  1. Check the `hf` CLI is installed: `which hf`. If not, `pip install -U huggingface_hub`.
  2. Check the user is logged in: `hf auth whoami`. If not, ask them to run `! hf auth login` in this session; they'll need a write-scoped token from https://huggingface.co/settings/tokens.
  3. Note `whoami`'s `canPay` and `isPro` flags; they gate hardware choices below.
The `hf-cli` skill teaches an agent every `hf` command and is the recommended companion to this one. Install it with `hf skills add hf-cli` (add `--claude --global` to install for Claude Code as well, user-level).

1. What a Space is


A Space is a git repo with three possible SDKs:
  • Gradio — most Spaces. Python, fast iteration, supports ZeroGPU.
  • Docker — arbitrary container. Use when you need a non-Python stack or a pre-built template (Streamlit, Argilla, Shiny, etc. — full list at https://huggingface.co/docs/hub/spaces-sdks-docker). Does not support ZeroGPU.
  • Static — plain HTML, or a React/Svelte/Vue project built at deploy time. Use for in-browser ML (transformers.js / WebGPU / WebAssembly / onnxruntime-web), project pages, interactive reports, or Spaces that orchestrate other Spaces. No hardware needed.

Hardware tiers


Free, no creator cost: `cpu-basic` and `zero-a10g` (ZeroGPU). Static Spaces are also free and don't need hardware.
`cpu-basic`: 2 vCPU / 16 GB. For data viz, API-proxy Spaces, small CPU-bound models.
ZeroGPU (`zero-a10g`): dynamic, per-request GPU allocation on NVIDIA RTX PRO 6000 Blackwell (sm_120). Two sizes: `large` (half MIG, 48 GB, 1× quota) and `xlarge` (full, 96 GB, 2× quota). Free for the Space creator; Space visitors consume their own daily quota (~5 min free / 40 min Pro / 60 min Enterprise). Gradio-only, PyTorch-first. Requires the creator to be on a PRO / Team / Enterprise plan.
Dedicated GPU (T4, L4, A10G, L40S, A100, H200): billed to the Space creator by the hour. List + pricing: `hf spaces hardware`. Only the creator can attach these, and only if `canPay=True`. Use when ZeroGPU genuinely doesn't fit: non-PyTorch main model with heavy init, very-large-model long-context inference, etc.
If a non-PRO user has a use case that wants ZeroGPU, you can still build it: create a `cpu-basic` Space, code the app for ZeroGPU, push, then request a community grant. See `references/grants.md`.
For the authoritative reference: https://huggingface.co/docs/hub/spaces-overview
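The way the `canPay` and `isPro` flags gate these tiers can be sketched as a small lookup. This is a minimal illustration, not an official API: `hardware_options` is a hypothetical helper, the input is assumed to be a plain dict carrying the two flags, and `"dedicated-gpu"` is a placeholder label rather than a real flavor name. It also assumes `isPro` is the only plan signal available, so Team / Enterprise membership is not detected.

```python
from typing import Dict, List


def hardware_options(whoami: Dict[str, object]) -> List[str]:
    """Map the canPay / isPro flags to the hardware tiers described above."""
    options = ["cpu-basic"]              # free tier, always available
    if whoami.get("isPro"):
        options.append("zero-a10g")      # ZeroGPU needs a PRO-level plan
    if whoami.get("canPay"):
        options.append("dedicated-gpu")  # placeholder label for hourly-billed GPUs
    return options
```

A user with neither flag set still gets `cpu-basic`, which is the starting point for the community-grant route described above.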
2. Look for an existing demo first


Before deciding how to build anything, search for prior art:
```bash
hf spaces search "<model name or task>" --sdk gradio --limit 10
```
If someone has built a similar Space, read its `app.py` and `requirements.txt`: that gives you the working pattern and saves a lot of blind iteration. Mention to the user what you found before committing to an approach.

3. Decide SDK and hardware


Follow the user's explicit request first. If they were vague:
  • Default for a public ML demo: Gradio + ZeroGPU. Use this unless something below applies.
  • The model's only inference path is non-PyTorch (ONNX / TF / JAX / vLLM as the MAIN model, with heavy init): dedicated GPU.
    • But: marginal non-torch tools (a small ONNX preprocessor, a TF utility) inside a torch-main pipeline are fine on ZeroGPU. The hijack only patches torch; init the non-torch lib inside `@spaces.GPU` and pay the short per-call init cost.
  • Tiny / CPU-bound model, or API-proxy Space: `cpu-basic` (a Gradio Space always needs hardware; only Static Spaces run without it).
  • Browser-side ML or project page: Static.
  • Container with non-Python stack: Docker.
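The defaults above can be read as a decision function. The sketch below is one possible encoding, under assumptions the text does not state: the `Request` fields and `choose` are hypothetical names, the order in which the checks run is a guess, `"dedicated-gpu"` is a placeholder label rather than a real flavor, and `cpu-basic` for Docker is only an assumed starting point. An explicit user request always overrides it.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Request:
    browser_side: bool = False          # in-browser ML or a project page
    non_python_stack: bool = False      # needs an arbitrary container
    cpu_only: bool = False              # tiny / CPU-bound model or API proxy
    non_torch_main_model: bool = False  # ONNX / TF / JAX / vLLM main model, heavy init


def choose(req: Request) -> Tuple[str, str]:
    """Return (sdk, hardware) for a vague request, following the bullets above."""
    if req.browser_side:
        return ("static", "none")
    if req.non_python_stack:
        return ("docker", "cpu-basic")
    if req.cpu_only:
        return ("gradio", "cpu-basic")
    if req.non_torch_main_model:
        return ("gradio", "dedicated-gpu")
    return ("gradio", "zero-a10g")      # default for a public ML demo
```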

Sourcing the model


  • GitHub repo: clone locally to read structure. If it already has a Gradio demo, the minimal viable path is to adapt it onto ZeroGPU (see `references/zerogpu.md`). Otherwise: read the README + inference code, prefer the PyTorch path, estimate VRAM (bf16 weights ≈ `params_B × 2` GB; 48 GB fits ≤24B params at bf16 counting weights only, before activations and KV cache, or much larger with quantization; see `references/zerogpu.md` for quantization on ZeroGPU).
  • HF model repo: read its README, follow any linked GitHub.
  • Paper / blog post: look for an official or unofficial implementation. Don't reimplement unless trivial or the user explicitly asks.
  • Vague request: search Spaces first; surface results.
If the model genuinely won't fit, check Inference Providers as an alternative: see `references/inference-providers.md`. This avoids hosting the model at all.
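The VRAM rule of thumb above is simple arithmetic, sketched here for quick checks. The function names and the `headroom_gb` parameter are illustrative assumptions; the only figures taken from the text are 2 bytes per parameter at bf16 and the 48 GB / 96 GB ZeroGPU sizes.

```python
def bf16_weights_gb(params_billion: float) -> float:
    """Approximate weight size in GB at bf16 (2 bytes per parameter)."""
    return params_billion * 2.0


def fits(params_billion: float, vram_gb: float = 48.0, headroom_gb: float = 0.0) -> bool:
    """True if the bf16 weights plus the reserved headroom fit in vram_gb."""
    return bf16_weights_gb(params_billion) + headroom_gb <= vram_gb
```

For example, `fits(24)` holds only with zero headroom, so in practice a model that close to the limit is a candidate for quantization or the `xlarge` size.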

4. Create the Space


```bash
hf repos create <namespace>/<name> --type space --space-sdk <gradio|docker|static> \
    [--flavor zero-a10g|cpu-basic|<paid-flavor>] \
    [--secrets KEY=val] [--env KEY=val] \
    --public|--private|--protected \
    --exist-ok
```
  • `--space-sdk` is required.
  • `--flavor` selects hardware. `zero-a10g` is the (legacy) identifier for ZeroGPU. Omit for `cpu-basic`. Run `hf spaces hardware` for the full paid list and pricing.
  • Visibility: `--public` (anyone can view), `--private` (only you), `--protected` (app is reachable but git repo / Files tab is private).
  • `--secrets KEY=val` becomes an environment variable inside the Space and is not visible to visitors. Use for API keys, gated-repo tokens (`HF_TOKEN=hf_…`), etc. Can also be set later via `hf spaces secrets set <id> KEY=val`.
  • `--env KEY=val` is visible to visitors; use only for non-sensitive config (`GRADIO_SSR_MODE=false`, `PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True`, etc.).
Note: `hardware:` in the README YAML is silently ignored; hardware is only set via `--flavor` at creation, or later via `hf spaces settings <id> --hardware <name>`.
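When the command is assembled from a script, building the argument list explicitly keeps required flags from being dropped. A minimal sketch, assuming only the flags shown above: `create_space_argv` is a hypothetical helper, and repeating `--secrets` / `--env` once per key is an assumption about how the CLI accepts several values.

```python
from typing import Dict, List, Optional


def create_space_argv(
    repo_id: str,
    sdk: str,
    flavor: Optional[str] = None,
    visibility: str = "public",
    secrets: Optional[Dict[str, str]] = None,
    env: Optional[Dict[str, str]] = None,
) -> List[str]:
    """Build the argv for `hf repos create` using only the flags listed above."""
    if sdk not in {"gradio", "docker", "static"}:
        raise ValueError(f"unsupported sdk: {sdk!r}")
    if visibility not in {"public", "private", "protected"}:
        raise ValueError(f"unsupported visibility: {visibility!r}")
    argv = ["hf", "repos", "create", repo_id, "--type", "space", "--space-sdk", sdk]
    if flavor:  # omitted flavor means cpu-basic
        argv += ["--flavor", flavor]
    for key, value in (secrets or {}).items():
        argv += ["--secrets", f"{key}={value}"]  # assumption: one flag per secret
    for key, value in (env or {}).items():
        argv += ["--env", f"{key}={value}"]      # assumption: one flag per variable
    argv += [f"--{visibility}", "--exist-ok"]
    return argv
```

The returned list can be passed to `subprocess.run` directly, which avoids shell-quoting problems with secret values.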

5. Build the app


The Space now exists at `https://huggingface.co/spaces/<namespace>/<name>` but is empty.

README.md frontmatter


Always required:
```yaml
---
title: ...
emoji: 🚀                # pick something representative
colorFrom: blue          # red|yellow|green|blue|indigo|purple|pink|gray (only these)
colorTo: indigo
sdk: gradio              # gradio | docker | static
sdk_version: 6.15.1      # latest stable unless you have a reason*
app_file: app.py         # gradio only (docker / static use Dockerfile / index.html)
short_description: ...   # ≤ 60 chars (server rejects longer)
python_version: "3.12"   # ZeroGPU officially supports 3.10.13 and 3.12.12
startup_duration_timeout: 30m   # default; bump to 1h for big LLMs / heavy downloads
---
```
* Reasons to use an older Gradio: a custom component pins it, or you're adapting an existing demo and don't want to rewrite for 5.x→6.x breaking changes. If you need a 5.x, pick `5.50.0` (latest of the series; still supports custom components).
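Two of the constraints in the frontmatter comments (the fixed color list and the 60-character description limit) are easy to check before pushing. A minimal sketch, assuming the frontmatter has already been parsed into a dict; `frontmatter_problems` is a hypothetical helper and covers only the rules stated above.

```python
from typing import Dict, List

ALLOWED_COLORS = {"red", "yellow", "green", "blue", "indigo", "purple", "pink", "gray"}
ALLOWED_SDKS = {"gradio", "docker", "static"}


def frontmatter_problems(meta: Dict[str, str]) -> List[str]:
    """Return human-readable problems; an empty list means the checks passed."""
    problems = []
    for key in ("colorFrom", "colorTo"):
        if meta.get(key) not in ALLOWED_COLORS:
            problems.append(f"{key}: unsupported color {meta.get(key)!r}")
    if meta.get("sdk") not in ALLOWED_SDKS:
        problems.append(f"sdk: unsupported value {meta.get('sdk')!r}")
    if len(meta.get("short_description", "")) > 60:
        problems.append("short_description: longer than 60 characters")
    return problems
```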

Minimal ZeroGPU Gradio app


```python
import spaces           # MUST come before torch / diffusers / transformers
import torch
import gradio as gr
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained("<repo>", torch_dtype=torch.bfloat16).to("cuda")

@spaces.GPU(duration=60)
def generate(prompt):
    return pipe(prompt).images[0]

gr.Interface(fn=generate, inputs=gr.Text(), outputs=gr.Image()).launch()
```
Three rules (full treatment in `references/zerogpu.md`):
  1. `import spaces` before torch / any CUDA-touching import. It monkey-patches `torch.cuda.*`; once CUDA is initialized in the main process, it's too late.
  2. Load the model at module scope, `.to("cuda")` eagerly. ZeroGPU intercepts the call, packs weights to disk, and streams them into VRAM on the first `@spaces.GPU` entry. Lazy loading inside the decorator costs every user.
  3. Decorate the function Gradio binds. Estimate `duration` to the realistic worst case (smaller = higher queue priority and tighter quota check). For input-dependent runtime, pass a callable.
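Rule 3 mentions passing a callable when runtime depends on the input. A minimal sketch of such an estimator follows; it assumes the callable receives the same arguments as the decorated function, and the per-step and overhead figures are hypothetical numbers to be replaced with timings measured on the live Space. The decorator usage is shown as a comment so the snippet runs without the `spaces` package.

```python
def estimate_duration(prompt: str, steps: int = 30) -> int:
    """Worst-case seconds for one call, scaled by the number of inference steps."""
    seconds_per_step = 1.5   # hypothetical; measure on the Space
    overhead_seconds = 10    # hypothetical; covers weight streaming and decoding
    return int(overhead_seconds + steps * seconds_per_step)


# On a ZeroGPU Space the callable takes the place of the fixed number:
#
#   @spaces.GPU(duration=estimate_duration)
#   def generate(prompt, steps=30):
#       return pipe(prompt, num_inference_steps=steps).images[0]
```

Keeping the estimate tight matters for the reason given in rule 3: a smaller duration means higher queue priority and a tighter quota check.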

requirements.txt


Short version:
  • Do NOT list: `gradio`, `spaces`, `huggingface_hub` (preinstalled and platform-managed; pinning them causes resolution failures or silently breaks the ZeroGPU runtime).
  • Do list if you use them: `torchvision`, `torchaudio` (not preinstalled), plus everything else (`diffusers`, `transformers`, `accelerate`, `sentencepiece`, …).
  • ZeroGPU only accepts torch `2.8.0`, `2.9.1`, `2.10.0`, `2.11.0`. Default to leaving torch unpinned (the runtime preinstalls the latest). Only pin when a dep forces it.
  • For prebuilt CUDA-extension wheels (`flash_attn`, `xformers`, `pytorch3d`, `nvdiffrast`, `diff_gaussian_rasterization`, `torchmcubes`): use the prebuilt Blackwell wheels at https://huggingface.co/datasets/multimodalart/zerogpu-blackwell-wheels/tree/main/wheels. Full mapping + caveats in `references/requirements.md`.
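The first and third bullets can be checked mechanically before a push. The sketch below is a hedged illustration: `lint_requirements` is a hypothetical helper, it only inspects exact `==` pins of torch, and it does not parse every requirement syntax (URLs, extras, and environment markers are passed over).

```python
import re
from typing import List

PLATFORM_MANAGED = {"gradio", "spaces", "huggingface_hub"}
ZEROGPU_TORCH = {"2.8.0", "2.9.1", "2.10.0", "2.11.0"}


def lint_requirements(text: str) -> List[str]:
    """Flag platform-managed packages and torch pins outside the accepted set."""
    problems = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()      # drop comments and whitespace
        if not line:
            continue
        match = re.match(r"([A-Za-z0-9_.\-]+)\s*(?:==\s*([^\s;]+))?", line)
        if match is None:
            continue
        name = match.group(1).lower().replace("-", "_")
        version = match.group(2)
        if name in PLATFORM_MANAGED:
            problems.append(f"{name}: platform-managed, remove it")
        elif name == "torch" and version:
            base = version.split("+", 1)[0]      # ignore local tags such as +cu128
            if base not in ZEROGPU_TORCH:
                problems.append(f"torch=={version}: not accepted on ZeroGPU")
    return problems
```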

Per-SDK depth


  • Gradio patterns (themes, `gr.Examples`, streaming, custom HTML components, `gr.Server`): `references/gradio.md`.
  • Docker: https://huggingface.co/docs/hub/spaces-sdks-docker. Examples: `hf spaces list --filter docker`.
  • Static: https://huggingface.co/docs/hub/spaces-sdks-static. For built SPAs, set `app_build_command: npm run build` and `app_file: dist/index.html` in frontmatter.
  • ZeroGPU specifics (decorator semantics, sizing, AoTI, generators, concurrency, pickle / `gr.State` across the worker boundary): `references/zerogpu.md`. Read this whenever the Space targets ZeroGPU.

6. Iterate on the Space, not locally


Build a release candidate from the user's request locally and push it, then use the live URL as your test loop. The Space environment is the only one that matters; do not try to test locally. `python3 -m py_compile app.py` is the maximum local check worth doing before pushing.
Once pushed, pick the cheapest update mechanism for each change: hot-reload for pure Python edits, `hf upload` for code-only files hot-reload can't touch, full rebuild only when `requirements.txt` / `Dockerfile` / README frontmatter actually changed. Full ladder + footguns (hot-reload poisoning factory reboot, runtime.sha lag, etc.) in `references/debugging.md`.
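The choice of update mechanism can be sketched as a function of the changed file list. This is an approximation under stated assumptions: `cheapest_update` is a hypothetical helper, any change to `README.md` is treated conservatively as a frontmatter change, `.py` files stand in for "pure Python edits", and everything else is routed to `hf upload`. The authoritative ladder is in `references/debugging.md`.

```python
from pathlib import PurePosixPath
from typing import Iterable

REBUILD_FILES = {"requirements.txt", "Dockerfile", "README.md"}


def cheapest_update(changed: Iterable[str]) -> str:
    """Pick the cheapest rung that covers every changed file."""
    names = [PurePosixPath(path).name for path in changed]
    if not names:
        return "none"
    if any(name in REBUILD_FILES for name in names):
        return "rebuild"        # dependency, container, or frontmatter change
    if all(name.endswith(".py") for name in names):
        return "hot-reload"     # pure Python edits
    return "hf upload"          # other files hot-reload can't touch
```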
7. Verify


Don't trust `RUNNING` alone: the app can be running but broken. Four steps, in order:
A. Alive? Stage + hardware:
```bash
hf spaces info <ns>/<name> --expand runtime
```
B. Logs clean post-boot? Read the run log to confirm startup finished without warnings or silent fallbacks:
```bash
hf spaces logs <ns>/<name> --tail 200
```
Look for model-load completion, no import warnings, no "falling back to CPU" / dtype downgrade messages, no `RUNNING` masking a half-broken app.
C. API actually responds. With logs still tailing in another terminal (`hf spaces logs <ns>/<name> --follow`), call the endpoint:
```python
from gradio_client import Client, handle_file
import os
c = Client("<ns>/<name>", token=os.environ["HF_TOKEN"], httpx_kwargs={"timeout": 600})
print(c.view_api())                    # discover endpoints; don't guess
result = c.predict(..., api_name="/generate")
```
D. Sniff output AND logs. HTTP 200 ≠ correct output. Check both:
```python
head = open(result, "rb").read(16)
# glTF / \x89PNG / RIFF…WEBP / RIFF…WAVE / [4:8]==b"ftyp" → png/jpg/webp/wav/mp4
```

And look at the run log emitted during the call — silent fallbacks (model snapping to a different size, missing optional dep, dtype downgrade) only show up there.

Full smoke-test patterns (streaming endpoints, OAuth-gated Spaces, `gr.Server` custom routes): [`references/debugging.md`](references/debugging.md).
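The magic-byte check in step D can be spelled out as a small function. A minimal sketch: `sniff` is a hypothetical helper that takes the first 16 bytes of the returned file. The JPEG signature (`\xff\xd8\xff`) and the `glb` label for the `glTF` magic are additions not listed in the step above, and the `ftyp` test matches the whole ISO base media family, not only MP4.

```python
def sniff(head: bytes) -> str:
    """Guess a file type from its first 16 bytes; returns "unknown" if nothing matches."""
    if head.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if head.startswith(b"\xff\xd8\xff"):
        return "jpg"
    if head[:4] == b"RIFF" and head[8:12] == b"WEBP":
        return "webp"
    if head[:4] == b"RIFF" and head[8:12] == b"WAVE":
        return "wav"
    if head[4:8] == b"ftyp":
        return "mp4"
    if head.startswith(b"glTF"):
        return "glb"
    return "unknown"
```

An `"unknown"` result on an endpoint that should return media is a strong hint that the file holds an error page or an empty payload, which is worth cross-checking against the run log.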

8. Permanent storage (buckets)


Spaces are stateless: `/data` is wiped on restart. If the Space needs to persist user uploads, generations, logs, or interact with a long-lived store, mount a bucket:
```bash
hf buckets create <ns>/<bucket-name>                                          # --private optional
hf spaces volumes set <ns>/<space> -v hf://buckets/<ns>/<bucket-name>:/data   # read-write at /data
```
Buckets are paid storage; check `canPay` and confirm with the user. Full patterns (read-fast / write-durable, public bucket URLs, model-cache anti-pattern): `references/buckets.md`.
9. When things break


Order of operations:
  1. Read the logs: `hf spaces logs <id> --build --follow` (build error) or `hf spaces logs <id> --follow` (runtime error). Find the first error, not the last.
  2. Grep `references/known-errors.md` for the error string. Check if this is a known issue before trying your own fix; most common ZeroGPU / Gradio / dependency errors have a 1–2 line fix there.
  3. Iterate using the cheapest rung from `references/debugging.md`. The vast majority of issues resolve with log-reading + smoke-test loops; interactive dev mode + SSH is a heavy-hammer last resort.
If you solve an error that wasn't in the known-errors list, suggest the user PR it back to this skill so future runs benefit.
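Step 1 says to find the first error, not the last, because later errors are often consequences of the first. A minimal sketch of that scan over captured log text follows; `first_error` is a hypothetical helper and the pattern is a rough heuristic for Python-style logs, not a format the platform guarantees.

```python
import re
from typing import Optional

# Heuristic: a traceback header, a Python exception name, or an upper-case ERROR tag.
ERROR_PATTERN = re.compile(
    r"Traceback \(most recent call last\)|\b\w*(?:Error|Exception)\b|\bERROR\b"
)


def first_error(log_text: str) -> Optional[str]:
    """Return the first log line that looks like an error, or None if there is none."""
    for line in log_text.splitlines():
        if ERROR_PATTERN.search(line):
            return line.strip()
    return None
```

The returned line is the string to grep for in `references/known-errors.md`.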


Reference index


| When to read | File |
| --- | --- |
| How ZeroGPU works + correct patterns (decorator, sizing, pickle, generators, real-time, AoTI) | `references/zerogpu.md` |
| Iterate + debug: logs, rung ladder, smoke testing (and dev mode + SSH as a last resort) | `references/debugging.md` |
| Error-string lookup: the single place for all error symptoms (Spaces, ZeroGPU, Gradio, deps) | `references/known-errors.md` |
| Pinning deps, picking wheels, torch-family alignment | `references/requirements.md` |
| `gr.Examples` caching, themes, custom HTML components, `gr.Server` | `references/gradio.md` |
| Persistent storage, public bucket URLs | `references/buckets.md` |
| Community grant requests (non-PRO needing ZeroGPU) | `references/grants.md` |
| Provider proxy (zero-VRAM big LLM via Cerebras / Fireworks / Together / etc.) | `references/inference-providers.md` |