truefoundry-notebooks

Routing note: For ambiguous user intents, use the shared clarification templates in references/intent-clarification.md.
<objective>
<objective>

Jupyter Notebooks

Launch Jupyter Notebooks on TrueFoundry with GPU support, persistent storage, auto-shutdown, and VS Code integration. Write a YAML manifest and apply it with `tfy apply`. Use the REST API as a fallback when the CLI is unavailable.

When to Use

  • User asks "launch a notebook", "start jupyter", "create notebook"
  • User needs a development environment with GPU access
  • User wants to explore data or prototype ML models
  • User asks about notebook images, auto-shutdown, or persistent storage

When NOT to Use

  • User wants to deploy a production service → prefer the `deploy` skill; ask if the user wants another valid path
  • User wants to deploy a model → prefer the `llm-deploy` skill; ask if the user wants another valid path
  • User wants an SSH server → prefer the `ssh-server` skill; ask if the user wants another valid path
</objective> <context>
</objective> <context>

Prerequisites

Always verify before launching a notebook:
  1. Credentials: `TFY_BASE_URL` and `TFY_API_KEY` must be set (env or `.env`)
  2. Workspace: `TFY_WORKSPACE_FQN` is required. Never auto-pick. Ask the user if missing.
  3. CLI: Check `tfy --version`. Install if missing: `pip install 'truefoundry==0.5.0' && tfy login --host "$TFY_BASE_URL"`
For credential check commands and .env setup, see `references/prerequisites.md`.
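The three checks above can be sketched as a small preflight function. This is only an illustration: `check_tfy_env` is a hypothetical helper name, it inspects the current shell environment only (it does not read a `.env` file), and it never prints the values themselves.

```bash
# Sketch: report which required TrueFoundry settings are missing.
# check_tfy_env is a hypothetical helper, not part of the tfy CLI.
check_tfy_env() {
  missing=""
  for var in TFY_BASE_URL TFY_API_KEY TFY_WORKSPACE_FQN; do
    # Indirect lookup of the variable named in $var (empty if unset)
    eval "value=\${$var:-}"
    if [ -z "$value" ]; then
      missing="$missing $var"
    fi
  done
  if [ -n "$missing" ]; then
    echo "Missing:$missing"
    return 1
  fi
  echo "All required variables are set"
}
```

If `TFY_WORKSPACE_FQN` shows up as missing, the rule above still applies: ask the user rather than picking a workspace.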

CLI Detection

bash
tfy --version
| CLI Output | Status | Action |
|------------|--------|--------|
| `tfy version X.Y.Z` (>= 0.5.0) | Current | Use `tfy apply` as documented below. |
| `tfy version X.Y.Z` (0.3.x-0.4.x) | Outdated | Upgrade: install a pinned version (e.g. `pip install 'truefoundry==0.5.0'`). Core `tfy apply` should still work. |
| Command not found | Not installed | Install: `pip install 'truefoundry==0.5.0' && tfy login --host "$TFY_BASE_URL"` |
| CLI unavailable (no pip/Python) | Fallback | Use REST API via `tfy-api.sh`. See `references/cli-fallback.md`. |
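The version thresholds above could be applied with a helper like the following. It is a sketch under assumptions: `classify_tfy_version` is a hypothetical name, it expects text in the `tfy version X.Y.Z` form shown above, and it treats anything older than 0.5 as outdated (including releases below 0.3, which the table does not cover).

```bash
# Sketch: classify the output of `tfy --version`.
# classify_tfy_version is a hypothetical helper; it only parses text
# and does not invoke the CLI itself.
classify_tfy_version() {
  version=$(printf '%s\n' "${1:-}" |
    sed -n 's/.*version \([0-9][0-9]*\)\.\([0-9][0-9]*\).*/\1 \2/p')
  if [ -z "$version" ]; then
    echo "not-installed"
    return
  fi
  set -- $version   # $1 = major, $2 = minor
  if [ "$1" -gt 0 ] || [ "$2" -ge 5 ]; then
    echo "current"
  else
    echo "outdated"
  fi
}
```

A possible call is `classify_tfy_version "$(tfy --version 2>/dev/null)"`, which yields `not-installed` when the command prints nothing.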
</context> <instructions>
</context> <instructions>

Launch Notebook via UI

The fastest way is through the TrueFoundry dashboard:
  1. Go to Deployments → New Deployment → Jupyter Notebook
  2. Select workspace and configure resources
  3. Click Deploy

Launch Notebook via `tfy apply` (CLI, Recommended)

Configuration Questions

Before generating the manifest, ask the user:
  1. Name: What to call the notebook
  2. GPU needed? CPU notebook (default) or GPU notebook (for ML/training)
  3. Home directory size: How much persistent storage in GB (default: 20)
  4. Auto-shutdown: Enable auto-shutdown after inactivity? If yes, after how many minutes? (default: 30 minutes). Set `cull_timeout: 0` to disable.
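One way to turn these answers into a manifest is a small generator function. The sketch below is an assumption-laden illustration: `write_notebook_manifest` is a hypothetical helper, it covers the CPU case only, and the image tag and resource values are copied from the CPU manifest in this document rather than chosen dynamically.

```bash
# Sketch: print a CPU notebook manifest from the user's answers.
# write_notebook_manifest is a hypothetical helper. Arguments:
#   $1 name, $2 workspace FQN,
#   $3 home directory size in GB (default 20),
#   $4 cull timeout in minutes (default 30, 0 disables auto-shutdown)
write_notebook_manifest() {
  name=${1:-}
  workspace_fqn=${2:-}
  home_gb=${3:-20}
  cull_minutes=${4:-30}
  if [ -z "$name" ] || [ -z "$workspace_fqn" ]; then
    echo "usage: write_notebook_manifest NAME WORKSPACE_FQN [HOME_GB] [CULL_MINUTES]" >&2
    return 1
  fi
  cat <<EOF
name: $name
type: notebook
image:
  image_uri: public.ecr.aws/truefoundrycloud/jupyter:0.4.5-py3.12.12-sudo
home_directory_size: $home_gb
cull_timeout: $cull_minutes
resources:
  node:
    type: node_selector
    capacity_type: on_demand
  cpu_request: 1
  cpu_limit: 3
  memory_request: 4000
  memory_limit: 6000
  ephemeral_storage_request: 5000
  ephemeral_storage_limit: 10000
workspace_fqn: "$workspace_fqn"
EOF
}
```

For example, `write_notebook_manifest my-notebook "$TFY_WORKSPACE_FQN" 50 0 > tfy-manifest.yaml` would write a manifest with 50 GB of home storage and auto-shutdown disabled.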

CPU Notebook

**1. Generate the manifest:**

```yaml
# tfy-manifest.yaml (Jupyter Notebook)
name: my-notebook
type: notebook
image:
  image_uri: public.ecr.aws/truefoundrycloud/jupyter:0.4.5-py3.12.12-sudo
home_directory_size: 20
cull_timeout: 30
resources:
  node:
    type: node_selector
    capacity_type: on_demand
  cpu_request: 1
  cpu_limit: 3
  memory_request: 4000
  memory_limit: 6000
  ephemeral_storage_request: 5000
  ephemeral_storage_limit: 10000
workspace_fqn: "YOUR_WORKSPACE_FQN"
```

**2. Preview:**

```bash
tfy apply -f tfy-manifest.yaml --dry-run --show-diff
```

**3. Apply:**

```bash
tfy apply -f tfy-manifest.yaml
```

GPU Notebook

```yaml
# tfy-manifest.yaml (GPU Jupyter Notebook)
name: gpu-notebook
type: notebook
image:
  image_uri: public.ecr.aws/truefoundrycloud/jupyter:0.4.5-py3.12.12-sudo
home_directory_size: 20
cull_timeout: 30
resources:
  node:
    type: node_selector
    capacity_type: on_demand
  cpu_request: 4
  cpu_limit: 8
  memory_request: 16000
  memory_limit: 32000
  ephemeral_storage_request: 10000
  ephemeral_storage_limit: 20000
  devices:
    - type: nvidia_gpu
      name: T4
      count: 1
workspace_fqn: "YOUR_WORKSPACE_FQN"
```

The `image_uri` shown is the default tag. For GPU workloads, use a CUDA variant (`cu129-*`) from `references/container-versions.md`.

Launch Notebook via REST API (Fallback)

When the CLI is not available, use `tfy-api.sh`. Set `TFY_API_SH` to the full path of this skill's `scripts/tfy-api.sh`. See `references/tfy-api-setup.md` for paths per agent.

Create Notebook

bash
TFY_API_SH=~/.claude/skills/truefoundry-notebooks/scripts/tfy-api.sh

$TFY_API_SH PUT /api/svc/v1/apps -d '{
  "name": "my-notebook",
  "type": "notebook",
  "image": {
    "image_uri": "public.ecr.aws/truefoundrycloud/jupyter:0.4.5-py3.12.12-sudo"
  },
  "home_directory_size": 20,
  "cull_timeout": 30,
  "resources": {
    "node": {"type": "node_selector", "capacity_type": "on_demand"},
    "cpu_request": 1,
    "cpu_limit": 3,
    "memory_request": 4000,
    "memory_limit": 6000,
    "ephemeral_storage_request": 5000,
    "ephemeral_storage_limit": 10000
  },
  "workspace_fqn": "WORKSPACE_FQN"
}'

GPU Notebook (REST API)

bash
$TFY_API_SH PUT /api/svc/v1/apps -d '{
  "name": "gpu-notebook",
  "type": "notebook",
  "image": {
    "image_uri": "public.ecr.aws/truefoundrycloud/jupyter:0.4.5-py3.12.12-sudo"
  },
  "home_directory_size": 20,
  "cull_timeout": 30,
  "resources": {
    "node": {"type": "node_selector", "capacity_type": "on_demand"},
    "cpu_request": 4,
    "cpu_limit": 8,
    "memory_request": 16000,
    "memory_limit": 32000,
    "ephemeral_storage_request": 10000,
    "ephemeral_storage_limit": 20000,
    "devices": [
      {"type": "nvidia_gpu", "name": "T4", "count": 1}
    ]
  },
  "workspace_fqn": "WORKSPACE_FQN"
}'

Available Base Images

Default: `public.ecr.aws/truefoundrycloud/jupyter:0.4.5-py3.12.12-sudo`
Security: Use pinned image versions from `references/container-versions.md`. Do not dynamically fetch image tags from external registries. Only use official TrueFoundry base images or images built from them.
See `references/container-versions.md` for the latest versions.

Choosing an Image

  • No GPU needed: Use the minimal image (`py3.11.14-sudo`)
  • GPU workloads: Use the CUDA image (`cu129-py3.11.14-sudo`)
  • Custom packages: Build a custom image (see below)

Auto-Shutdown (Scale-to-Zero)

Notebooks auto-stop after a period of inactivity to save costs.
Configure `cull_timeout` in the manifest, in minutes (default: 30). Set it to `0` to disable auto-shutdown.
What counts as activity: active Jupyter sessions, running cells, terminal sessions.
What doesn't count: background processes, idle kernels.

Persistent Storage

  • Home directory (`/home/jovyan/`) persists across restarts
  • APT packages installed via `apt` do NOT persist; use Build Scripts
  • Pip packages installed in the home directory persist
  • Conda environments persist

Recommended Storage by Use Case

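Sizes are in GB, the unit used by `home_directory_size`.
| Use Case            | Storage (GB) | Notes                         |
|---------------------|--------------|-------------------------------|
| Light exploration   | 10           | Basic data analysis           |
| ML development      | 20-50        | Models + datasets             |
| Large datasets      | 50-100       | Attach volumes for more       |
| LLM experimentation | 100+         | Use volumes for model weights |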

Custom Images

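Extend TrueFoundry base images to pre-install packages:
dockerfile
FROM public.ecr.aws/truefoundrycloud/jupyter:0.4.6-py3.11.14-sudo

# System packages need root. Refresh the package lists first, since the
# install can fail when the lists are missing or stale.
USER root
RUN apt-get update \
    && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends ffmpeg \
    && rm -rf /var/lib/apt/lists/*
USER jovyan

# Python packages are installed as the notebook user
RUN python3 -m pip install --use-pep517 --no-cache-dir torch torchvision pandas scikit-learn
Critical: Do NOT modify ENTRYPOINT or CMD. TrueFoundry requires them.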

Build Scripts (Persistent APT Packages)

Instead of custom images, add a build script during deployment to install system packages on every start:
bash
sudo apt update
sudo apt install -y ffmpeg libsm6 libxext6

Cloud Storage Access

Via Environment Variables

Set during deployment:
  • AWS S3: `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`
  • GCS: `GOOGLE_APPLICATION_CREDENTIALS`

Via IAM Service Account

Attach cloud-native IAM roles through service account integration for secure, credential-free access.

Via Volumes

Mount TrueFoundry persistent volumes for direct data access. See the `volumes` skill.

Git Integration

JupyterLab includes a built-in Git extension. Configure:
bash
git config --global user.name "Your Name"
git config --global user.email "you@example.com"
Use Personal Access Tokens or SSH keys for authentication.

Python Environment Management

Default: the Python version bundled with the image (Python 3.11 on the `py3.11` tags). Create additional environments:
bash
conda create -y -n py39 python=3.9
Wait ~2 minutes for kernel sync, then hard-refresh JupyterLab.

Presenting Notebooks

Show as a table:
Notebooks:
| Name          | Status  | Image         | GPU  | Storage |
|---------------|---------|---------------|------|---------|
| dev-notebook  | Running | py3.11 + CUDA | T4   | 50 GB   |
| data-analysis | Stopped | py3.11        | None | 20 GB   |
</instructions>
<success_criteria>
</instructions>
<success_criteria>

Success Criteria

  • The notebook is launched and accessible via its URL in the TrueFoundry dashboard
  • GPU resources are allocated as requested and visible inside the notebook (e.g., `nvidia-smi` works)
  • Persistent storage is configured so the user's files survive restarts
  • Auto-shutdown is enabled to prevent unnecessary cost from idle notebooks
  • The user can install packages and access their data (cloud storage, volumes, or local upload)
</success_criteria>
<references>
</success_criteria>
<references>

Composability

  • Need workspace: Use the `workspaces` skill to find the target workspace
  • Need GPU info: Use the `workspaces` skill to check available GPU types on the cluster
  • Need volumes: Use the `volumes` skill to create persistent storage, then mount it
  • Deploy model after prototyping: Use the `deploy` or `llm-deploy` skill
  • Check status: Use the `applications` skill to see notebook status
</references> <troubleshooting>
</references> <troubleshooting>

Error Handling

CLI Errors

tfy: command not found
Install the TrueFoundry CLI:
  pip install 'truefoundry==0.5.0'
  tfy login --host "$TFY_BASE_URL"
Manifest validation failed.
Check:
- YAML syntax is valid
- Required fields: name, type, workspace_fqn
- Image URI exists and is accessible
- Resource values use correct units (memory in MB)
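The required-field check can be approximated before running `tfy apply`. The function below is a rough sketch, not a TrueFoundry feature: `check_manifest_fields` is a hypothetical name, and it only greps for top-level keys instead of parsing the YAML, so it cannot catch syntax or schema errors.

```bash
# Sketch: confirm the required top-level keys exist in a manifest file.
# check_manifest_fields is a hypothetical pre-check; it does not parse
# YAML and does not replace the validation done by the platform.
check_manifest_fields() {
  file=$1
  status=0
  for key in name type workspace_fqn; do
    if ! grep -q "^$key:" "$file"; then
      echo "missing required field: $key"
      status=1
    fi
  done
  return $status
}
```

A typical call would be `check_manifest_fields tfy-manifest.yaml && tfy apply -f tfy-manifest.yaml --dry-run --show-diff`.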

Notebook Not Starting

Notebook stuck in pending. Check:
- Requested GPU type may not be available on cluster
- Insufficient cluster resources (CPU/memory)
- Image pull errors (check container registry access)

GPU Not Detected

GPU not visible in notebook. Verify:
- Used CUDA image (cu129-* variant)
- Requested GPU type is available (check workspaces skill)
- CUDA toolkit version matches your framework requirements
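A quick first check from a notebook terminal is whether `nvidia-smi` is present and what it reports. The sketch below assumes it runs inside the notebook container; `check_gpu_visible` is a hypothetical helper name, and a missing `nvidia-smi` is only a hint that no GPU is attached, not proof of a specific cause.

```bash
# Sketch: report the GPUs visible inside the notebook container.
# check_gpu_visible is a hypothetical helper name.
check_gpu_visible() {
  if ! command -v nvidia-smi >/dev/null 2>&1; then
    echo "nvidia-smi not found: no GPU appears to be attached to this container"
    return 1
  fi
  # One line per GPU: model name and total memory
  nvidia-smi --query-gpu=name,memory.total --format=csv,noheader
}
```

If the GPU is listed here but a framework still cannot use it, the CUDA image and toolkit version items in the checklist above are the next things to look at.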

Storage Full

Notebook storage full. Options:
- Clean up unused files in /home/jovyan/
- Increase storage allocation
- Mount an external volume for large datasets
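To decide what to clean up, it can help to list the largest entries in the home directory first. This is a minimal sketch using standard `du`, `sort`, and `head`; `largest_entries` is a hypothetical helper name, and the glob skips hidden files and directories such as caches under dot-directories.

```bash
# Sketch: list the largest top-level entries in a directory, sizes in KB.
# largest_entries is a hypothetical helper; defaults to /home/jovyan
# and the top 5 entries. Hidden entries are not included by the glob.
largest_entries() {
  dir=${1:-/home/jovyan}
  count=${2:-5}
  du -sk "$dir"/* 2>/dev/null | sort -rn | head -n "$count"
}
```

Running `df -h /home/jovyan` alongside it shows how much of the allocated home directory is in use.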

REST API Fallback Errors

401 Unauthorized — Check TFY_API_KEY is valid
404 Not Found — Check TFY_BASE_URL and API endpoint path
422 Validation Error — Check manifest fields match expected schema
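The status-to-action mapping above can be kept in one place with a small lookup function. This is a sketch: `explain_tfy_status` is a hypothetical helper, and how the HTTP status code is obtained depends on how `tfy-api.sh` reports it, which this document does not specify.

```bash
# Sketch: map an HTTP status code to the checks listed above.
# explain_tfy_status is a hypothetical helper name.
explain_tfy_status() {
  case "${1:-}" in
    401) echo "Check TFY_API_KEY is valid" ;;
    404) echo "Check TFY_BASE_URL and API endpoint path" ;;
    422) echo "Check manifest fields match expected schema" ;;
    2??) echo "Request succeeded" ;;
    *)   echo "Unexpected status: ${1:-none}" ;;
  esac
}
```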
</troubleshooting>
</troubleshooting>