huggingface-spaces

Hugging Face Spaces

Hugging Face Spaces host machine-learning applications. There are 1M+ today; each Space is a git repo. This skill covers creating, building, debugging, and maintaining them.

0. Getting ready

Before anything else:

- Check the `hf` CLI is installed: `which hf`. If not, `pip install -U huggingface_hub`.
- Check the user is logged in: `hf auth whoami`. If not, ask them to run `! hf auth login` in this session; they'll need a write-scoped token from https://huggingface.co/settings/tokens.
- Note the `canPay` and `isPro` flags in the `whoami` output; they gate hardware choices below.

The `hf-cli` skill teaches an agent every `hf` command and is the recommended companion to this one. Install it with `hf skills add hf-cli` (add `--claude --global` to install for Claude Code as well, user-level).

1. What a Space is

A Space is a git repo with three possible SDKs:

- Gradio: used by most Spaces. Python, fast iteration, supports ZeroGPU.
- Docker: arbitrary container. Use when you need a non-Python stack or a pre-built template (Streamlit, Argilla, Shiny, etc.; full list at https://huggingface.co/docs/hub/spaces-sdks-docker). Does not support ZeroGPU.
- Static: plain HTML, or a React/Svelte/Vue project built at deploy time. Use for in-browser ML (transformers.js / WebGPU / WebAssembly / onnxruntime-web), project pages, interactive reports, or Spaces that orchestrate other Spaces. No hardware needed.

Hardware tiers

Free, no creator cost: `cpu-basic` and `zero-a10g` (ZeroGPU). Static Spaces are also free and don't need hardware.

- `cpu-basic`: 2 vCPU / 16 GB. For data viz, API-proxy Spaces, small CPU-bound models.
- ZeroGPU (`zero-a10g`): dynamic, per-request GPU allocation on NVIDIA RTX PRO 6000 Blackwell (sm_120). Two sizes: `large` (half MIG, 48 GB, 1× quota) and `xlarge` (full, 96 GB, 2× quota). Free for the Space creator; Space visitors consume their own daily quota (~5 min free / 40 min Pro / 60 min Enterprise). Gradio-only, PyTorch-first. Requires the creator to be on a PRO / Team / Enterprise plan.
- Dedicated GPU (T4, L4, A10G, L40S, A100, H200): billed to the Space creator by the hour. List + pricing: `hf spaces hardware`. Only the creator can attach these, and only if `canPay=True`. Use when ZeroGPU genuinely doesn't fit: non-PyTorch main model with heavy init, very-large-model long-context inference, etc.

If a non-PRO user has a use case that wants ZeroGPU, you can still build it: create a `cpu-basic` Space, code the app for ZeroGPU, push, then request a community grant. See `references/grants.md`.

For the authoritative reference: https://huggingface.co/docs/hub/spaces-overview
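
The gating rules above can be sketched as a small helper that takes the `whoami` payload as a plain dict. This is only an illustration: the function name `hardware_options` and the `"dedicated-gpu"` label are hypothetical, and treating `isPro` as the sole ZeroGPU gate is a simplifying assumption (the text above also allows Team and Enterprise plans, which this sketch does not detect).

```python
def hardware_options(whoami: dict) -> list:
    """Return the hardware tiers a creator could pick, per the rules above.

    `whoami` is assumed to be a dict carrying the `isPro` and `canPay` flags.
    """
    options = ["cpu-basic"]                # always free for the creator
    if whoami.get("isPro", False):         # assumption: isPro stands in for PRO/Team/Enterprise
        options.append("zero-a10g")        # ZeroGPU
    if whoami.get("canPay", False):
        options.append("dedicated-gpu")    # placeholder label, not a real flavor name
    return options


if __name__ == "__main__":
    print(hardware_options({"isPro": True, "canPay": False}))
```

A non-PRO user without billing gets only `cpu-basic` from this helper, which matches the community-grant path described above.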

2. Look for an existing demo first

Before deciding how to build anything, search for prior art:

```bash
hf spaces search "<model name or task>" --sdk gradio --limit 10
```

If someone has built a similar Space, read its `app.py` and `requirements.txt`; that gives you the working pattern and saves a lot of blind iteration. Mention to the user what you found before committing to an approach.

3. Decide SDK and hardware

Follow the user's explicit request first. If they were vague:

- Default for a public ML demo: Gradio + ZeroGPU. Use this unless something below applies.
- The model's only inference path is non-PyTorch (ONNX / TF / JAX / vLLM as the MAIN model, with heavy init): dedicated GPU.
  - But: marginal non-torch tools (a small ONNX preprocessor, a TF utility) inside a torch-main pipeline are fine on ZeroGPU. The hijack only patches torch; init the non-torch lib inside `@spaces.GPU` and pay the short per-call init cost.
- Tiny / CPU-bound model, or API-proxy Space: `cpu-basic` (hardware-free isn't applicable to Gradio).
- Browser-side ML or project page: Static.
- Container with non-Python stack: Docker.
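
As a rough sketch, the defaults above can be written as one decision function. The function name `choose_stack`, its boolean parameters, and the order in which the rules are checked are assumptions made for illustration; `"dedicated-gpu"` is a placeholder label rather than a real flavor, and an explicit user request always overrides the result.

```python
from typing import Optional, Tuple


def choose_stack(
    browser_side: bool = False,
    non_python_stack: bool = False,
    main_model_non_torch: bool = False,
    cpu_bound_or_proxy: bool = False,
) -> Tuple[str, Optional[str]]:
    """Return (sdk, hardware) following the defaults listed above."""
    if browser_side:
        return ("static", None)             # Static Spaces need no hardware
    if non_python_stack:
        return ("docker", None)             # hardware for Docker is decided separately
    if main_model_non_torch:
        return ("gradio", "dedicated-gpu")  # placeholder for a paid GPU flavor
    if cpu_bound_or_proxy:
        return ("gradio", "cpu-basic")
    return ("gradio", "zero-a10g")          # default: Gradio + ZeroGPU


if __name__ == "__main__":
    print(choose_stack())
    print(choose_stack(cpu_bound_or_proxy=True))
```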

Sourcing the model

- GitHub repo: clone locally to read structure. If it already has a Gradio demo, the minimal viable path is to adapt it onto ZeroGPU (see `references/zerogpu.md`). Otherwise: read the README + inference code, prefer the PyTorch path, estimate VRAM (bf16 ≈ `params_B × 2` GB; 48 GB fits ≤24B params at bf16, or much larger with quantization; see `references/zerogpu.md` for quantization on ZeroGPU).
- HF model repo: read its README, follow any linked GitHub.
- Paper / blog post: look for an official or unofficial implementation. Don't reimplement unless trivial or the user explicitly asks.
- Vague request: search Spaces first; surface results.

If the model genuinely won't fit, check Inference Providers as an alternative: see `references/inference-providers.md`. This avoids hosting the model at all.
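
The bf16 rule of thumb from the GitHub-repo item above (`params_B × 2` GB) is simple enough to check in code. The sketch below counts weights only; activations, KV cache, and framework overhead are ignored, so treat a borderline "fits" as optimistic. The function names are illustrative, and the sizes are the `large` / `xlarge` figures quoted earlier.

```python
ZEROGPU_VRAM_GB = {"large": 48, "xlarge": 96}  # sizes quoted in the hardware tiers section


def bf16_weights_gb(params_billions: float) -> float:
    """Approximate weight memory in GB at bf16 (2 bytes per parameter)."""
    return params_billions * 2.0


def fits_zerogpu(params_billions: float, size: str = "large") -> bool:
    """True if the bf16 weights alone fit in the given ZeroGPU size."""
    return bf16_weights_gb(params_billions) <= ZEROGPU_VRAM_GB[size]


if __name__ == "__main__":
    for n in (7, 24, 30):
        print(n, "B params:", bf16_weights_gb(n), "GB, fits large:", fits_zerogpu(n))
```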

4. Create the Space

```bash
hf repos create <namespace>/<name> --type space --space-sdk <gradio|docker|static> \
    [--flavor zero-a10g|cpu-basic|<paid-flavor>] \
    [--secrets KEY=val] [--env KEY=val] \
    --public|--private|--protected \
    --exist-ok
```

- `--space-sdk` is required.
- `--flavor` selects hardware. `zero-a10g` is the (legacy) identifier for ZeroGPU. Omit for `cpu-basic`. Run `hf spaces hardware` for the full paid list and pricing.
- Visibility: `--public` (anyone can view), `--private` (only you), `--protected` (app is reachable but git repo / Files tab is private).
- `--secrets KEY=val` becomes an environment variable inside the Space and is not visible to visitors. Use for API keys, gated-repo tokens (`HF_TOKEN=hf_…`), etc. Can also be set later via `hf spaces secrets set <id> KEY=val`.
- `--env KEY=val` is visible to visitors; use only for non-sensitive config (`GRADIO_SSR_MODE=false`, `PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True`, etc.).

Note: `hardware:` in the README YAML is silently ignored. Hardware is only set via `--flavor` at creation, or later via `hf spaces settings <id> --hardware <name>`.
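
When the command is assembled programmatically, a small builder keeps the flag rules above in one place. This is a sketch, not part of the `hf` CLI: the helper name `build_create_command` is hypothetical, the repo id in the usage line is made up, and repeating `--secrets` / `--env` once per key is an assumption (the synopsis above shows a single pair of each).

```python
import shlex

SDKS = {"gradio", "docker", "static"}
VISIBILITIES = {"public", "private", "protected"}


def build_create_command(repo_id, sdk, flavor=None, secrets=None, env=None,
                         visibility="public"):
    """Build the argv list for `hf repos create` using only the flags shown above."""
    if sdk not in SDKS:
        raise ValueError(f"unknown sdk: {sdk}")
    if visibility not in VISIBILITIES:
        raise ValueError(f"unknown visibility: {visibility}")
    cmd = ["hf", "repos", "create", repo_id, "--type", "space", "--space-sdk", sdk]
    if flavor:                                   # omit for cpu-basic
        cmd += ["--flavor", flavor]
    for key, value in (secrets or {}).items():   # assumption: one flag per key
        cmd += ["--secrets", f"{key}={value}"]
    for key, value in (env or {}).items():       # visible to visitors
        cmd += ["--env", f"{key}={value}"]
    cmd += [f"--{visibility}", "--exist-ok"]
    return cmd


if __name__ == "__main__":
    argv = build_create_command("someuser/demo", "gradio", flavor="zero-a10g",
                                env={"GRADIO_SSR_MODE": "false"})
    print(shlex.join(argv))  # pass argv to subprocess.run to execute it
```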

5. Build the app

The Space now exists at `https://huggingface.co/spaces/<namespace>/<name>` but is empty.

README.md frontmatter

Always required:

```yaml
---
title: ...
emoji: 🚀                # pick something representative
colorFrom: blue          # red|yellow|green|blue|indigo|purple|pink|gray (only these)
colorTo: indigo
sdk: gradio              # gradio | docker | static
sdk_version: 6.15.1      # latest stable unless you have a reason*
app_file: app.py         # gradio only (docker / static use Dockerfile / index.html)
short_description: ...   # ≤ 60 chars (server rejects longer)
python_version: "3.12"   # ZeroGPU officially supports 3.10.13 and 3.12.12
startup_duration_timeout: 30m   # default; bump to 1h for big LLMs / heavy downloads
---
```

\* Reasons to use an older Gradio: a custom component pins it, or you're adapting an existing demo and don't want to rewrite for 5.x→6.x breaking changes. If you need a 5.x, pick `5.50.0` (latest of the series; still supports custom components).

All frontmatter options: https://huggingface.co/docs/hub/spaces-config-reference
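
Two of the constraints in the frontmatter above (the fixed color list and the 60-character `short_description` limit) are easy to check before pushing. The sketch below assumes the frontmatter has already been parsed into a dict; the helper name `check_frontmatter` is hypothetical and it covers only the rules stated in the comments above, not the full config reference.

```python
ALLOWED_COLORS = {"red", "yellow", "green", "blue", "indigo", "purple", "pink", "gray"}
ALLOWED_SDKS = {"gradio", "docker", "static"}


def check_frontmatter(meta: dict) -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for key in ("colorFrom", "colorTo"):
        if meta.get(key) not in ALLOWED_COLORS:
            problems.append(f"{key} must be one of {sorted(ALLOWED_COLORS)}")
    if meta.get("sdk") not in ALLOWED_SDKS:
        problems.append("sdk must be gradio, docker, or static")
    if len(meta.get("short_description", "")) > 60:
        problems.append("short_description is longer than 60 characters")
    return problems


if __name__ == "__main__":
    meta = {"colorFrom": "blue", "colorTo": "teal", "sdk": "gradio",
            "short_description": "Text-to-image demo"}
    print(check_frontmatter(meta))  # reports the unsupported colorTo value
```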

Minimal ZeroGPU Gradio app

```python
import spaces           # MUST come before torch / diffusers / transformers
import torch
import gradio as gr
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained("<repo>", torch_dtype=torch.bfloat16).to("cuda")

@spaces.GPU(duration=60)
def generate(prompt):
    return pipe(prompt).images[0]

gr.Interface(fn=generate, inputs=gr.Text(), outputs=gr.Image()).launch()
```

Three rules (full treatment in `references/zerogpu.md`):

1. `import spaces` before torch / any CUDA-touching import. It monkey-patches `torch.cuda.*`; once CUDA is initialized in the main process, it's too late.
2. Load the model at module scope, `.to("cuda")` eagerly. ZeroGPU intercepts the call, packs weights to disk, and streams them into VRAM on the first `@spaces.GPU` entry. Lazy loading inside the decorator costs every user.
3. Decorate the function Gradio binds. Estimate `duration` to the realistic worst case (smaller = higher queue priority and tighter quota check). For input-dependent runtime, pass a callable.
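
For the "pass a callable" case in the third rule, the callable is expected to take the same arguments as the decorated function and return a number of seconds. The sketch below shows only that estimator so it runs anywhere; the name `estimate_duration`, the `steps` parameter, and every timing constant are hypothetical and would need measuring on the real Space.

```python
import math

BASE_SECONDS = 10        # assumed fixed overhead per call
SECONDS_PER_STEP = 1.5   # assumed cost of one denoising step
MAX_SECONDS = 120        # assumed cap so a bad input cannot request unbounded time


def estimate_duration(prompt: str, steps: int = 30) -> int:
    """Estimate GPU seconds for one call from its inputs (illustrative cost model)."""
    return min(MAX_SECONDS, math.ceil(BASE_SECONDS + steps * SECONDS_PER_STEP))


# On a ZeroGPU Space this would be wired up roughly as:
#   @spaces.GPU(duration=estimate_duration)
#   def generate(prompt, steps=30): ...

if __name__ == "__main__":
    print(estimate_duration("a cat", steps=30))
```

Keeping the estimate close to the realistic worst case matters for the reason given above: a smaller request gets higher queue priority and a tighter quota check.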

requirements.txt

Short version:

- Do NOT list: `gradio`, `spaces`, `huggingface_hub` (preinstalled and platform-managed; pinning them causes resolution failures or silently breaks the ZeroGPU runtime).
- Do list if you use them: `torchvision`, `torchaudio` (not preinstalled), plus everything else (`diffusers`, `transformers`, `accelerate`, `sentencepiece`, …).
- ZeroGPU only accepts torch `2.8.0`, `2.9.1`, `2.10.0`, `2.11.0`. Default to leaving torch unpinned (the runtime preinstalls the latest). Only pin when a dep forces it.

For prebuilt CUDA-extension wheels (`flash_attn`, `xformers`, `pytorch3d`, `nvdiffrast`, `diff_gaussian_rasterization`, `torchmcubes`): use the prebuilt Blackwell wheels at https://huggingface.co/datasets/multimodalart/zerogpu-blackwell-wheels/tree/main/wheels. Full mapping + caveats in `references/requirements.md`.
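
The short-version rules above lend themselves to a quick pre-push lint. The following is a minimal sketch, not an official tool: the helper name `lint_requirements` is hypothetical, it only understands plain `name` and `name==version` lines, and it checks just the platform-managed packages and exact torch pins named above.

```python
import re

PLATFORM_MANAGED = {"gradio", "spaces", "huggingface_hub"}
ACCEPTED_TORCH = {"2.8.0", "2.9.1", "2.10.0", "2.11.0"}
LINE = re.compile(r"^([A-Za-z0-9_.\-]+)\s*(?:==\s*([^\s;]+))?")


def lint_requirements(text: str) -> list:
    """Return a list of problems found in the contents of a requirements.txt."""
    problems = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()      # drop comments and whitespace
        match = LINE.match(line)
        if not match:
            continue                             # blank or unparsed line
        name = match.group(1).lower().replace("-", "_")
        pinned = match.group(2)                  # None unless pinned with ==
        if name in PLATFORM_MANAGED:
            problems.append(f"{name}: platform-managed, remove it")
        elif name == "torch" and pinned is not None and pinned not in ACCEPTED_TORCH:
            problems.append(f"torch=={pinned}: not in the accepted ZeroGPU versions")
    return problems


if __name__ == "__main__":
    print(lint_requirements("gradio==5.0\ntorch==2.7.1\ntorchvision\ndiffusers\n"))
```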

Per-SDK depth

- Gradio patterns (themes, `gr.Examples`, streaming, custom HTML components, `gr.Server`): `references/gradio.md`.
- Docker: https://huggingface.co/docs/hub/spaces-sdks-docker. Examples: `hf spaces list --filter docker`.
- Static: https://huggingface.co/docs/hub/spaces-sdks-static. For built SPAs, set `app_build_command: npm run build` and `app_file: dist/index.html` in frontmatter.
- ZeroGPU specifics (decorator semantics, sizing, AoTI, generators, concurrency, pickle / `gr.State` across the worker boundary): `references/zerogpu.md`. Read this whenever the Space targets ZeroGPU.

6. Iterate on the Space, not locally

Try to build a release candidate from the user's request locally and push it, then use the live URL as your test loop. The Space environment is the only one that matters; do not try to test locally. `python3 -m py_compile app.py` is the maximum local check worth doing before pushing.

Once pushed, pick the cheapest update mechanism for each change: hot-reload for pure Python edits, `hf upload` for code-only files hot-reload can't touch, full rebuild only when `requirements.txt` / `Dockerfile` / README frontmatter actually changed. Full ladder + footguns (hot-reload poisoning factory reboot, runtime.sha lag, etc.) in `references/debugging.md`.

7. Verify

Don't trust `RUNNING` alone: the app can be running but broken. Four steps, in order:

A. Alive? Stage + hardware:

```bash
hf spaces info <ns>/<name> --expand runtime
```

B. Logs clean post-boot? Read the run log to confirm startup finished without warnings or silent fallbacks:

```bash
hf spaces logs <ns>/<name> --tail 200
```

Look for model-load completion, no import warnings, no "falling back to CPU" / dtype downgrade messages, no `RUNNING` masking a half-broken app.

C. API actually responds. With logs still tailing in another terminal (`hf spaces logs <ns>/<name> --follow`), call the endpoint:

```python
from gradio_client import Client, handle_file
import os
c = Client("<ns>/<name>", token=os.environ["HF_TOKEN"], httpx_kwargs={"timeout": 600})
print(c.view_api())                    # discover endpoints; don't guess
result = c.predict(..., api_name="/generate")
```

D. Sniff output AND logs. HTTP 200 ≠ correct output. Check both:

```python
head = open(result, "rb").read(16)
# glTF / \x89PNG / RIFF…WEBP / RIFF…WAVE / [4:8]==b"ftyp" → png/jpg/webp/wav/mp4
```

And look at the run log emitted during the call; silent fallbacks (model snapping to a different size, missing optional dep, dtype downgrade) only show up there.

Full smoke-test patterns (streaming endpoints, OAuth-gated Spaces, `gr.Server` custom routes): [`references/debugging.md`](references/debugging.md).
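
The magic-byte check from step D can be wrapped in a small function so a smoke test can assert on the result. This is a sketch: the name `sniff_media_type` and the returned labels are illustrative, and the JPEG signature (`\xff\xd8\xff`) is added here from the JPEG format itself since the step D comment does not spell it out.

```python
def sniff_media_type(head: bytes) -> str:
    """Guess a media type from the first 16 bytes of a file."""
    if head.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if head.startswith(b"\xff\xd8\xff"):
        return "jpg"
    if head[:4] == b"RIFF" and head[8:12] == b"WEBP":
        return "webp"
    if head[:4] == b"RIFF" and head[8:12] == b"WAVE":
        return "wav"
    if head[4:8] == b"ftyp":
        return "mp4"
    if head.startswith(b"glTF"):
        return "glb"                      # binary glTF container
    return "unknown"                      # e.g. an HTML error page saved as .png


if __name__ == "__main__":
    print(sniff_media_type(b"\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR"))
```

An `unknown` result on a file the endpoint claims is an image is a strong hint that the call returned an error payload rather than real output.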

8. Permanent storage (buckets)

Spaces are stateless: `/data` is wiped on restart. If the Space needs to persist user uploads, generations, logs, or interact with a long-lived store, mount a bucket:

```bash
hf buckets create <ns>/<bucket-name>                                          # --private optional
hf spaces volumes set <ns>/<space> -v hf://buckets/<ns>/<bucket-name>:/data   # read-write at /data
```

Buckets are paid storage; check `canPay` and confirm with the user. Full patterns (read-fast / write-durable, public bucket URLs, model-cache anti-pattern): `references/buckets.md`.

9. When things break

Order of operations:

1. Read the logs: `hf spaces logs <id> --build --follow` (build error) or `hf spaces logs <id> --follow` (runtime error). Find the first error, not the last.
2. Grep `references/known-errors.md` for the error string. Check if this is a known issue before trying your own fix; most common ZeroGPU / Gradio / dependency errors have a 1–2 line fix there.
3. Iterate using the cheapest rung from `references/debugging.md`. The vast majority of issues resolve with log-reading + smoke-test loops; interactive dev mode + SSH is a heavy-hammer last resort.

If you solve an error that wasn't in the known-errors list, suggest the user PR it back to this skill so future runs benefit.
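
"Find the first error, not the last" can be approximated with a short scan over saved log text. The sketch below is only a heuristic: the function name `first_error` is hypothetical and the pattern assumes Python-style tracebacks and exception names, so build logs from other toolchains may need a different expression.

```python
import re

# Assumption: errors show up as a Python traceback header or a *Error / *Exception name.
ERROR_PATTERN = re.compile(r"Traceback \(most recent call last\)|\b\w*(?:Error|Exception)\b")


def first_error(log_text: str):
    """Return (line_number, line) of the first error-looking line, or None."""
    for number, line in enumerate(log_text.splitlines(), start=1):
        if ERROR_PATTERN.search(line):
            return number, line
    return None


if __name__ == "__main__":
    sample = "Loading model\nImportError: missing dep\nRuntimeError: later failure\n"
    print(first_error(sample))
```

The string found this way is what to grep for in `references/known-errors.md`; later errors are often just consequences of the first one.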

Reference index

| When to read | File |
| --- | --- |
| How ZeroGPU works + correct patterns (decorator, sizing, pickle, generators, real-time, AoTI) | `references/zerogpu.md` |
| Iterate + debug: logs, rung ladder, smoke testing (and dev mode + SSH as a last resort) | `references/debugging.md` |
| Error-string lookup: the single place for all error symptoms (Spaces, ZeroGPU, Gradio, deps) | `references/known-errors.md` |
| Pinning deps, picking wheels, torch-family alignment | `references/requirements.md` |
| `gr.Examples` caching, themes, custom HTML components, `gr.Server` | `references/gradio.md` |
| Persistent storage, public bucket URLs | `references/buckets.md` |
| Community grant requests (non-PRO needing ZeroGPU) | `references/grants.md` |
| Provider proxy (zero-VRAM big LLM via Cerebras / Fireworks / Together / etc.) | `references/inference-providers.md` |
