AgentDeploy Deploy

Use this skill when the user wants an application deployed onto AgentDeploy, or when an existing AgentDeploy deployment needs to be updated or debugged.

What this skill covers

  • infer the right split between `SharedInfra` and `Service`
  • choose the correct workload type and minimum infrastructure
  • validate and dry-run before changing live state
  • deploy with `agentdeploy` through the Platform API when available, then poll structured status
  • debug policy, auth, infrastructure, and rollout failures

Read references/service-contract.md when writing or editing `SharedInfra` or `Service`. Read references/operations.md when running the CLI or handling failures.

Use the templates in assets/ as the starting point:
  • assets/shared-infra.yaml
  • assets/service-web.yaml
  • assets/service-api.yaml
  • assets/service-worker.yaml
  • assets/service-cron.yaml
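
As a shape reference, a minimal two-record split might look like the sketch below. Only the dotted field paths named in this skill (`metadata.application`, `spec.infrastructure.databases.<name>`, `valueFrom.infrastructureRef`) are known; the surrounding YAML layout, the `kind` values, and every name are illustrative. Always start from the assets/ templates and confirm the real schema in references/service-contract.md.

```yaml
# Illustrative shape only; start from assets/shared-infra.yaml and a service
# template, and treat references/service-contract.md as the schema of record.
kind: SharedInfra
metadata:
  name: expense            # hypothetical application name
spec:
  infrastructure:
    databases:
      main: {}             # declared under spec.infrastructure.databases.<name>
---
kind: Service
metadata:
  name: expense-api
  application: expense     # metadata.application ties the service to the SharedInfra
spec:
  env:                     # env wiring layout is assumed, not schema-accurate
    - name: DATABASE_URL
      valueFrom:
        infrastructureRef:
          name: main
```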

Prerequisite: CLI availability

Before using this skill, make sure `agentdeploy` is installed and on `PATH`.
Current supported user install path:

```bash
command -v brew
```

If `brew` is missing, stop and ask the user to install Homebrew themselves from [brew.sh](https://brew.sh/).


Continue only after `brew` is available on `PATH`.


```bash
command -v gh
```

If `gh` is missing:

```bash
brew install gh
gh auth login
gh auth setup-git
gh auth status
brew tap elementx-ai/tap https://github.com/elementx-ai/homebrew-tap
brew install --HEAD elementx-ai/tap/agentdeploy
```

Or, if it is already installed:

```bash
brew upgrade --fetch-HEAD elementx-ai/tap/agentdeploy
```

This is the current private macOS install path. If `brew` is missing, direct the user to [brew.sh](https://brew.sh/) and wait for them to finish that install themselves before continuing. If `gh`, GitHub auth, or `agentdeploy` is still unavailable after that, stop and report the install blocker before attempting deploy commands.

Before debugging any feature mismatch between docs and the installed CLI, run:

```bash
agentdeploy version
```

The current CLI already carries the prototype Platform API URL and Entra scope by default. On the first API-backed command it will run its own Entra device-code login flow and cache the session token until it expires. You only need to set API flags or environment variables when overriding the installation defaults.

Treat `AGENTDEPLOY_CONFIG_REPO_REMOTE` as an explicit fallback only for intentional direct GitOps mode without the Platform API.


Workflow

工作流程

  1. Confirm the CLI is available, then inspect the app before writing contracts.
    • Run `command -v agentdeploy` before using the workflow.
    • Look for whether it serves HTTP, runs background work, or is purely scheduled.
    • Check whether it already expects `PORT`, `DATABASE_URL`, `REDIS_URL`, or migration commands.
    • Prefer reusing an existing immutable image digest. Use a `build:` block only when the user wants AgentDeploy to build from source.
  2. Choose the application shape.
    • If the app has exactly one service, no shared state, no `valueFrom.infrastructureRef`, and no `valueFrom.serviceRef`, a single `Service` is enough.
    • In that standalone shape, AgentDeploy can bootstrap `Namespace`, `ResourceQuota`, and `LimitRange` directly from the `Service` record.
    • If multiple services need to share a namespace, PostgreSQL, Redis, object storage, or service-to-service wiring, create a `SharedInfra` record first and then point each service at it through `metadata.application`.
    • Treat `SharedInfra` as the only place that owns built-in DB, Redis, or object-storage resources. Do not put `spec.infrastructure` on a `Service`; that model is gone.
  3. Choose the workload shape for each service.
    • `web`: browser-facing app with HTTP ingress.
    • `api`: browser-consumed backend with HTTP ingress.
    • `worker`: no ingress, background process.
    • `cron`: scheduled job, no ingress.
  4. Create or update the contracts.
    • Start from assets/shared-infra.yaml when the app needs shared DB/Redis or multiple services.
    • Start each service from the closest service template in assets/.
    • Keep the app name DNS-safe.
    • If the repo does not already declare deploy metadata, derive a unique app name and subdomain from the repo or directory name so the deployment does not collide with an existing app.
    • If `owner` is missing from repo context, prefer a real maintainer email from git config, docs, or project context. Only invent a synthetic owner for an explicit smoke test.
    • If `team` is missing from repo context, prefer an obvious team name from the repo, parent directory, or surrounding project docs. If none exists, choose a clearly temporary team name for a smoke test and call out the assumption.
    • Default to `dataClassification: internal` unless the user explicitly says the data is more sensitive.
    • Keep `metadata.application` explicit on services when more than one service shares the same app.
    • For a tiny standalone app, letting `metadata.application` default to `metadata.name` is fine.
    • Default to `visibility: internal` unless the user explicitly needs public reachability.
    • Default to `authorization.mode: group-based` unless the user explicitly wants `org-wide` and policy allows it.
    • Prefer dedicated Entra security groups for app access. Use broader team-wide groups only when the whole team should be able to use the app.
    • For smoke tests or lab installs without real group IDs, `org-wide` is acceptable only for `internal` apps and only when the installation policy allows it. Do not use it for sensitive data.
    • If the user explicitly wants a fully public app with no shared auth at all, warn them first that the app will be reachable anonymously on the public internet and will not receive any shared identity headers.
    • For that mode, use `spec.dataClassification: public`, `spec.access.auth: none`, and `spec.access.authorization.mode: none`.
    • In the current policy set, unauthenticated app access is allowed only for `dataClassification: public`. Call out that the app will not receive `X-Auth-Request-*` identity headers in that mode.
    • If the app needs PostgreSQL, declare it on `SharedInfra` under `spec.infrastructure.databases.<name>` and wire one of the DB env variants from `valueFrom.infrastructureRef`:
      • `DATABASE_URL` or `DATABASE_URL_SYNC` for libpq / sync clients
      • `DATABASE_URL_ASYNC` for common async Python stacks
    • If the app needs Redis, declare it on `SharedInfra` under `spec.infrastructure.redis.<name>` and wire `REDIS_URL` or `REDIS_URL_TLS` from `valueFrom.infrastructureRef`.
    • If the app needs shared upload or document handoff storage across services, declare it on `SharedInfra` under `spec.infrastructure.objects.<name>` and wire it from `valueFrom.infrastructureRef` with `kind: objectStorage`.
    • For split API/worker document flows, prefer object storage plus object keys over local-path handoff. Keep local disk only for scratch via `runtime.filesystem.writablePaths`.
    • If one service needs to call another service in the same application, wire that URL with `valueFrom.serviceRef` instead of hardcoding domains or patching manifests.
    • In the current prototype, Redis is only supported for `dataClassification: internal`.
    • Make sure the process actually listens on `runtime.port`. If the app expects a `PORT` env var, set it explicitly.
    • `runtime.command` and `runtime.args` are supported. Use them when the workload needs an explicit startup command instead of baking a wrapper image only for process launch.
    • If the app needs writable ephemeral directories, use `runtime.filesystem.writablePaths` instead of asking users to patch manifests by hand.
    • Only set `runtime.filesystem.readOnlyRootFilesystem: false` when the app genuinely cannot work with explicit writable mounts. Treat that as a security tradeoff and call it out.
  5. Validate before deploying.
    • Prefer the API-owned path when the installation provides it. The current CLI already defaults to the prototype API URL and Entra scope, so only set flags or environment variables when you need to override those defaults.
    • Validate `SharedInfra` first when present, then validate each dependent `Service`.
    • Run `agentdeploy validate --file <contract>.yaml`.
    • If you intentionally want offline local-engine validation instead of the hosted API path, pass `--api-url=`.
    • Inspect `effective_service` or `effective_infra`, plus `manifest_files` and `warnings`, in the response. They are the fastest way to catch dropped or mismatched fields before a deploy.
    • For a standalone service-only app, `manifest_files` should include `namespace.yaml`, `resourcequota.yaml`, and `limitrange.yaml`. If those files are absent, the app is no longer on the standalone bootstrap path.
    • Treat `QUOTA_*` errors as pre-flight failures against the current rendered namespace limits, not as generic rollout failures.
    • Treat capacity-related warnings as best-effort scheduler signals. They do not block deploy by themselves, but they mean the cluster may be too full to place the new pods.
    • Fix errors by following the exact `field`, `allowed_values`, and `suggested_value` in the JSON response.
    • Then run `agentdeploy deploy --file <contract>.yaml --dry-run`.
    • In the current prototype, `deploy --dry-run` can still return `status: accepted` and an `operation_id`. Treat it as preview-only. Review `preview_only`, `effective_service` or `effective_infra`, `manifest_files`, and `warnings` rather than assuming a live deployment started.
  6. Deploy for real.
    • Deploy `SharedInfra` first when present, then deploy each dependent `Service`.
    • Run `agentdeploy deploy --file <contract>.yaml`.
    • Capture `operation_id`, the reported record name, and the initial phase.
    • Do not assume `git_commit` or revision are returned immediately. Live deploys are queued and executed asynchronously.
    • If local CLI mode returns `DEPLOY_NO_LIVE_TARGET`, stop. Use the Platform API path or configure a real GitOps remote. Only use `AGENTDEPLOY_ALLOW_LOCAL_GITOPS=true` when the user explicitly wants local-only GitOps testing.
    • If the deploy returns `DEPLOY_MISSING_SHARED_INFRA`, the app is not a true standalone service. Either deploy `SharedInfra` first or remove the extra shared-state / service-ref coupling.
    • If the deploy is rejected with `DEPLOY_OPERATION_ALREADY_IN_PROGRESS`, check whether the active operation is still desirable. If it is stuck or obsolete, run `agentdeploy cancel <record>` and then submit the replacement deploy.
    • Poll with `agentdeploy status <record>` until the phase is `live` or an error is returned.
  7. Verify the result.
    • Start with the aggregate application view for multi-service apps:
      • `agentdeploy applications`
      • `agentdeploy app-status <team> <application>`
      • `agentdeploy app-explain <team> <application>`
    • Use the aggregate view to confirm `SharedInfra` plus all dependent services are converging together before drilling into a single record.
    • Use the URLs returned by `status` or `explain` for services. Do not hardcode domains because each installation can differ.
    • For a complete state dump, run `agentdeploy explain <record>`.
    • For runtime debugging without `kubectl`, use:
      • `agentdeploy describe <record>` for pod names, restart counts, waiting or termination reasons, image digests, requests, limits, and service or endpoint visibility
      • `agentdeploy events <record>` for missing secrets, quota failures, probe failures, and scheduling errors
      • `agentdeploy logs <record> [--follow] [--previous] [--pod <name>] [--container <name>] [--tail N]` for live or previous container logs
    • In `explain`, inspect the live `infrastructureRef` and `serviceRef` sections when secret wiring or same-namespace traffic looks wrong.
    • Compare `requested_revision` against `observed_revision`. If they differ, the control plane has accepted a newer revision than ArgoCD has actually reconciled in the cluster.
    • Treat a stale-reconciliation warning as a real GitOps signal. The platform now requests a targeted Argo refresh automatically, and you can also run `agentdeploy refresh <app>` if the warning persists.
    • If the app depends on PostgreSQL, confirm `SharedInfra` is healthy, the service injects the DB variant it actually uses, and the app-level readiness check matches the app's real dependencies.
    • If the app depends on Redis, confirm `SharedInfra` is healthy, the service injects `REDIS_URL` or `REDIS_URL_TLS`, and the app-level readiness check actually exercises Redis.
    • If the app depends on shared object storage, confirm `SharedInfra` is healthy and the service injects the object-store keys it actually uses. Prefer `OBJECT_STORE_*` for portable app wiring and fall back to `AZURE_STORAGE_*` only when the runtime still needs provider-specific compatibility.
    • Remember that `list`, `status`, and `explain` are usually filtered by team-scoped control-plane RBAC, not by app owner alone.
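
The env wiring that steps 4 through 7 keep coming back to can be sketched on a worker like this. The dotted paths (`valueFrom.infrastructureRef`, `valueFrom.serviceRef`, `runtime.filesystem.writablePaths`) are quoted from this skill; every name, the env-list layout, and the overall record shape are assumptions, so verify against references/service-contract.md.

```yaml
# Hedged wiring sketch, not an authoritative contract.
kind: Service
metadata:
  name: expense-worker
  application: expense       # same application as the owning SharedInfra record
spec:
  env:
    - name: REDIS_URL_TLS    # injected from the SharedInfra Redis claim
      valueFrom:
        infrastructureRef:
          name: cache
    - name: EXPENSE_API_URL  # hypothetical variable; serviceRef resolves to an in-namespace URL
      valueFrom:
        serviceRef:
          name: expense-api
  runtime:
    filesystem:
      writablePaths:
        - /tmp/scratch       # emptyDir mount; image contents at this path are hidden
```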

Default decisions

默认决策

  • Prefer immutable image digests over tags.
  • Prefer the smallest viable CPU and memory values; only raise them when the app clearly needs more.
  • Prefer PostgreSQL only when the app actually needs persistent relational storage.
  • Prefer Redis only when the app actually needs cache or ephemeral key-value state.
  • Prefer one service per app unless there is a clear need for a multi-service application with shared namespace and shared infra.
  • Prefer the standalone `Service` bootstrap only for a true one-service app. The moment the app needs shared state or a sibling service, switch to explicit `SharedInfra`.
  • Prefer internal ingress and group-based authorization for enterprise apps.
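
Taken together, these defaults suggest a standalone `Service` roughly like the sketch below. Only the leaf keys (`dataClassification`, `visibility`, `authorization.mode`, `allowedGroups`, `runtime.port`) and the digest rule come from this document; the nesting, the `image` key, and every concrete value are placeholders. Start from assets/service-web.yaml rather than this block.

```yaml
# Default-leaning standalone Service sketch; field layout is assumed.
kind: Service
metadata:
  name: notes                # standalone: metadata.application defaults to this
spec:
  image: registry.example.com/notes@sha256:aaaaaaaa...   # placeholder digest; never a mutable tag
  dataClassification: internal
  visibility: internal
  access:
    authorization:
      mode: group-based
      allowedGroups:
        - 00000000-0000-0000-0000-000000000000   # stable Entra group object ID, not a name
  runtime:
    port: 8080               # the process must actually listen here
```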

High-value gotchas

高价值注意事项

  • Mutable image tags are rejected. Use `repo@sha256:...` or let AgentDeploy build the image.
  • `allowedGroups` must contain stable group IDs, not human-readable names.
  • Only the `internal` data classification can use `visibility: public`.
  • `confidential` and `restricted` cannot use `org-wide`.
  • `api` means browser-consumed HTTP API in this platform, not general service-to-service auth.
  • Public apps use shared auth by default, but an app can explicitly opt out with `spec.dataClassification: public`, `spec.access.auth: none`, and `spec.access.authorization.mode: none`.
  • For `group-based` access, `allowedGroups` should be the stable Entra object IDs of the groups that should be able to pass the shared auth proxy.
  • If more than one user set should be allowed, list all of their group object IDs in `allowedGroups` and keep the scope intentional. Prefer app-specific access groups over broad org-wide groups.
  • The app receives identity through `X-Auth-Request-*` headers, not raw Entra bearer tokens, by default.
  • The concrete headers are `X-Auth-Request-Email`, `X-Auth-Request-Groups`, `X-Auth-Request-Preferred-Username`, and `X-Auth-Request-User`.
  • Shared ingress auth no longer forwards bearer `Authorization` headers into apps by default. If an app expects raw OAuth access tokens, call that out as a platform mismatch instead of assuming they will be present.
  • `auth: none` means no shared ingress auth and no injected identity headers. In the current policy set, that is allowed only for `dataClassification: public`.
  • If PostgreSQL is declared, it must live on `SharedInfra`, not `Service`. Wire the correct DB env variant with `valueFrom.infrastructureRef`. `DATABASE_URL` / `DATABASE_URL_SYNC` are libpq-style, while `DATABASE_URL_ASYNC` is meant for common async Python stacks.
  • If Redis is declared, it must live on `SharedInfra`, not `Service`. Wire `REDIS_URL` or `REDIS_URL_TLS` with `valueFrom.infrastructureRef`. The platform now includes `ssl_cert_reqs=required`, so most `redis-py` and Celery clients should not need app-side query rewriting.
  • If one service needs another service's base URL, use `valueFrom.serviceRef`. That resolves to a stable in-namespace URL like `http://expense-api/api` and avoids hardcoding installation domains.
  • A single standalone `Service` can bootstrap its own namespace policy, but that only works when there are no `infrastructureRef` or `serviceRef` bindings and no other services in the same application.
  • `DEPLOY_MISSING_SHARED_INFRA` means the app has outgrown the standalone path and now needs an explicit `SharedInfra` owner.
  • In the current prototype, Redis is an `internal`-only infrastructure option and is exposed over TLS on port `6380`.
  • `runtime.filesystem.writablePaths` creates `emptyDir` mounts at those paths. Existing image contents at those paths will be hidden at runtime.
  • `runtime.filesystem.readOnlyRootFilesystem` defaults to `true`. Turning it off is a real security relaxation and should be deliberate.
  • If the app does not listen on `runtime.port`, the deployment will roll out but ingress health and readiness will still fail.
  • `validate`, `deploy --dry-run`, and `explain` expose the effective normalized contract. Use that output to verify that infrastructure ownership, env wiring, and runtime overrides survived normalization.
  • In the intended product mode, deployers should use the Platform API path. They should not need direct Git push access or direct Kubernetes read access for normal lifecycle commands.
  • Team visibility is usually team-scoped, not owner-scoped. A caller typically sees all apps for teams they can view.
  • A second real deploy for the same app may be rejected while another non-terminal operation is queued or running.
  • `agentdeploy cancel <app>` is the current escape hatch for a stuck or obsolete live operation. It cancels the active operation record so a replacement deploy can be accepted.
  • `requested_revision` is the last revision the control plane accepted. `observed_revision` is the revision ArgoCD currently reports from the cluster. Treat them as different signals.
  • `describe`, `events`, and `logs` depend on a recent hosted `platform-api` build when you are using `AGENTDEPLOY_API_URL`. If they return `HTTP_NOT_FOUND`, the CLI is newer than the live control plane.
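
For the explicit public opt-out described above, the three access settings combine as in the sketch below. The `spec.*` paths are quoted from this document; the rest of the record, including the app name, is illustrative.

```yaml
# Fully public, no shared auth: warn the user before using this shape.
kind: Service
metadata:
  name: status-page          # hypothetical fully public app
spec:
  dataClassification: public # the only classification allowed with auth: none
  access:
    auth: none               # no shared ingress auth
    authorization:
      mode: none             # no X-Auth-Request-* identity headers injected
```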

Debug loop

调试流程

  1. Read `agentdeploy status <record>` first.
  2. If the phase is not obviously actionable, read `agentdeploy explain <record>`.
  3. For rollout or runtime failures, read `agentdeploy describe <record>`.
  4. If the cause still is not obvious, read `agentdeploy events <record>`.
  5. Use `agentdeploy logs <record> --previous` for crash loops and `agentdeploy logs <record> --follow` for live request or worker debugging.
  6. Use the error code prefix to choose the next action:
    • `SCHEMA_*`: fix the contract.
    • `POLICY_*`: change the requested shape or access mode.
    • `AUTH_*`: fix group IDs or auth assumptions.
    • `INFRA_*`: inspect database or Redis claim and secret readiness.
    • `DEPLOY_*`: inspect the workload rollout and health checks.
    • `QUOTA_*`: lower requests or ask for a higher app tier.

Feedback loop

反馈流程

  • If a real deployment exposes a high-value platform bug, contract gap, or reliability issue, raise that feedback rather than treating it as one-off local friction.
  • If you have access to elementx-ai/agentdeploy, open or update a GitHub issue with:
    • the affected app, record type, and workload type
    • the relevant `SharedInfra` or `Service` shape
    • operation ID, requested revision, and observed revision when available
    • the exact failure mode, impact, and the smallest useful fix
  • Prefer issues for meaningful fixes or improvements. Do not create noise for already-documented prototype limitations unless the observed behavior is worse than documented.

Output expectations

输出要求

When doing deployment work with this skill:
  • keep the contracts small and explicit
  • explain which workload type, application shape, and data classification you chose
  • surface the exact CLI commands you ran
  • quote the operation ID first, then the revision or Git commit once `status` or `explain` reports it
  • prefer actionable remediation over generic advice