sync-to-prod

Sync to Production Skill

This skill provides workflows for synchronizing Kubernetes kustomization configurations from the staging environment to the production environment in the simplex-gitops repository.

⚠️ CRITICAL: Production Deployment Policy

Production deployments must be performed manually; automatic sync is prohibited.
The workflow is:
  1. ✅ Update kustomization.yaml (can be automated)
  2. ✅ Commit and push to GitLab (can be automated)
  3. ArgoCD sync to production cluster - MUST BE MANUAL
After pushing changes, inform the user:
  • Changes are pushed to the repository
  • Production ArgoCD app will detect the changes but will NOT auto-sync
  • User must manually trigger sync via ArgoCD UI or CLI when ready
```bash
# View pending changes (safe, read-only)
argocd app get simplex-aws-prod
argocd app diff simplex-aws-prod

# Manual sync (ONLY when user explicitly requests)
argocd app sync simplex-aws-prod
```

**NEVER run `argocd app sync simplex-aws-prod` automatically.**
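
The policy above could be enforced in any automation that wraps the ArgoCD CLI. The following is a minimal sketch, not part of the skill itself: the function `argocd_command` and the `user_requested_sync` flag are hypothetical names; only the `get`, `diff`, and `sync` subcommands and the app name come from this document.

```python
APP = "simplex-aws-prod"

# Subcommands this document describes as read-only and safe.
READ_ONLY = {"get", "diff"}


def argocd_command(action: str, user_requested_sync: bool = False) -> list[str]:
    """Build an argocd argument list, refusing `sync` unless explicitly requested."""
    if action in READ_ONLY:
        return ["argocd", "app", action, APP]
    if action == "sync":
        if not user_requested_sync:
            # Hypothetical guard: automation must never reach this path on its own.
            raise PermissionError(
                "production sync must be requested explicitly by the user"
            )
        return ["argocd", "app", "sync", APP]
    raise ValueError(f"unsupported action: {action}")
```

The returned list can be passed to `subprocess.run`; keeping the guard in the command builder means a caller cannot trigger a production sync by accident.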

File Locations

kubernetes/overlays/aws-staging/kustomization.yaml  # Staging config
kubernetes/overlays/aws-prod/kustomization.yaml     # Production config

Quick Commands

View Image Differences

```bash
# Using the sync script
python3 ~/.claude/skills/sync-to-prod/scripts/sync_images.py --diff

# Or using make target (if in kubernetes/ directory)
make compare-images
```

Sync Images

```bash
# Sync specific services
python3 ~/.claude/skills/sync-to-prod/scripts/sync_images.py --images front,anotherme-agent

# Sync all images (dry-run first)
python3 ~/.claude/skills/sync-to-prod/scripts/sync_images.py --all --dry-run

# Sync all images (apply changes)
python3 ~/.claude/skills/sync-to-prod/scripts/sync_images.py --all
```

Sync Workflow

Step 1: Compare Environments

Run the diff command to see what's different between staging and production:
```bash
python3 ~/.claude/skills/sync-to-prod/scripts/sync_images.py --diff
```
This shows:
  • 🔄 DIFFERENT TAGS: Services with different versions
  • SAME TAGS: Services already in sync
  • ⚠️ STAGING ONLY: Services only in staging
  • ⚠️ PROD ONLY: Services only in production
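
The four categories in the diff output can be illustrated with a short sketch. This is not the code of `sync_images.py`, whose internals are not shown here; it assumes each environment has already been reduced to a hypothetical `{service: tag}` mapping.

```python
def classify(staging: dict[str, str], prod: dict[str, str]) -> dict[str, list[str]]:
    """Group service names by how their image tags compare across environments."""
    report: dict[str, list[str]] = {
        "DIFFERENT TAGS": [],
        "SAME TAGS": [],
        "STAGING ONLY": [],
        "PROD ONLY": [],
    }
    # Walk the union of both environments so that one-sided services are reported.
    for name in sorted(staging.keys() | prod.keys()):
        if name not in prod:
            report["STAGING ONLY"].append(name)
        elif name not in staging:
            report["PROD ONLY"].append(name)
        elif staging[name] != prod[name]:
            report["DIFFERENT TAGS"].append(name)
        else:
            report["SAME TAGS"].append(name)
    return report


# Illustrative tags only; real values live in the two kustomization.yaml files.
example = classify(
    {"front": "v2", "anotherme-api": "v1", "crawler": "v1"},
    {"front": "v1", "anotherme-api": "v1", "litellm": "v3"},
)
```

Services in DIFFERENT TAGS are the usual candidates for promotion; the two one-sided groups deserve a manual look before syncing anything.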

Step 2: Review and Select Services

Decide which services to promote. Common patterns:
```bash
# Promote a single critical service
python3 ~/.claude/skills/sync-to-prod/scripts/sync_images.py --images front --dry-run

# Promote frontend services
python3 ~/.claude/skills/sync-to-prod/scripts/sync_images.py --images front,front-homepage --dry-run

# Promote all AI services
python3 ~/.claude/skills/sync-to-prod/scripts/sync_images.py --images anotherme-agent,anotherme-api,anotherme-search,anotherme-worker --dry-run

# Promote everything
python3 ~/.claude/skills/sync-to-prod/scripts/sync_images.py --all --dry-run
```

Step 3: Apply Changes

After reviewing dry-run output, apply the changes:
```bash
python3 ~/.claude/skills/sync-to-prod/scripts/sync_images.py --images <services>
```
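
The dry-run-then-apply pattern used in Steps 2 and 3 can be sketched as follows. This is an illustration under assumptions, not the script's actual implementation: `promote` and its `{service: tag}` inputs are hypothetical.

```python
def promote(staging, prod, services, dry_run=True):
    """Return (new_prod, changes); with dry_run the production tags are left unchanged."""
    changes = []
    updated = dict(prod)  # never mutate the caller's mapping
    for name in services:
        if name not in staging:
            raise KeyError(f"{name} not found in staging")
        old, new = prod.get(name), staging[name]
        if old != new:
            changes.append((name, old, new))
            if not dry_run:
                updated[name] = new
    return updated, changes


# Illustrative tags only.
staging_tags = {"front": "v2", "front-homepage": "v5"}
prod_tags = {"front": "v1", "front-homepage": "v5"}

preview, planned = promote(staging_tags, prod_tags, ["front", "front-homepage"])
applied, done = promote(staging_tags, prod_tags, ["front", "front-homepage"], dry_run=False)
```

Both calls report the same list of changes, which is the point of a dry run: the preview shows exactly what the real run would write.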

Step 4: Commit and Push

```bash
cd /path/to/simplex-gitops
git add kubernetes/overlays/aws-prod/kustomization.yaml
git commit -m "chore: promote <services> to production"
git push
```
Important: after the push, ArgoCD will detect the changes but will NOT automatically sync them to the production cluster.

Step 5: Manual Production Sync (User Action Required)

After the push completes, the user must manually trigger the production sync:
```bash
# View pending changes
argocd app get simplex-aws-prod
argocd app diff simplex-aws-prod

# Sync manually after the user confirms
argocd app sync simplex-aws-prod
```

Or click the Sync button in the ArgoCD Web UI:
- URL: http://192.168.10.117:31006
- Find the `simplex-aws-prod` application
- Click the "SYNC" button

Configuration Sections That May Need Sync

Beyond image tags, these sections may differ between environments:

1. Image Tags (Primary Sync Target)

Located in the `images:` section. This is what the sync script handles.
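
For orientation, a kustomize `images:` entry carries a `name` plus an optional `newName`, `newTag`, or `digest`. The sketch below reads tags from an already-parsed kustomization; it assumes the file was loaded into a plain dict (for example with a YAML parser), and the helper name `image_tags` and the registry host are hypothetical.

```python
def image_tags(kustomization: dict) -> dict[str, str]:
    """Map each image name in the `images:` list to its newTag (or digest if no tag)."""
    tags = {}
    for entry in kustomization.get("images") or []:
        tags[entry["name"]] = entry.get("newTag") or entry.get("digest", "")
    return tags


# Hand-written stand-in for a parsed kustomization.yaml; values are illustrative.
parsed = {
    "resources": ["../../base"],
    "images": [
        {"name": "registry.example.com/simplex/front", "newTag": "v2"},
        {"name": "registry.example.com/simplex/crawler", "digest": "sha256:abc"},
    ],
}
tags = image_tags(parsed)
```

Running the same helper over the staging and production overlays yields the two mappings that a tag comparison needs.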

2. ConfigMap Patches

Files in the `patches/` directory may contain environment-specific values:

| Patch File | Purpose | Sync Consideration |
|---|---|---|
| `api-cm0-configmap.yaml` | API config | Usually environment-specific, don't sync |
| `gateway-cm0-configmap.yaml` | Gateway config | Usually environment-specific |
| `anotherme-agent-env-configmap.yaml` | Agent config | May need selective sync |
| `anotherme-agent-secrets.yaml` | Agent secrets | Never sync, environment-specific |
| `anotherme-search-env-configmap.yaml` | Search config | May need selective sync |
| `simplex-cron-env-configmap.yaml` | Cron config | Usually environment-specific |
| `simplex-router-cm0-configmap.yaml` | Router config | Usually environment-specific |
| `frontend-env.yaml` | Frontend env vars | Usually environment-specific |
| `ingress.yaml` | Ingress rules | Never sync, different domains |
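
If the guidance in this table needs to be checked by tooling, it could be expressed as a lookup. This is only a sketch: the `Policy` enum and `may_copy` helper are hypothetical, and treating unlisted files as never-sync is a conservative assumption rather than a rule stated in this document.

```python
from enum import Enum


class Policy(Enum):
    NEVER = "never sync"
    ENV_SPECIFIC = "usually environment-specific"
    SELECTIVE = "may need selective sync"


# Mirrors the table above, one entry per patch file.
PATCH_POLICY = {
    "api-cm0-configmap.yaml": Policy.ENV_SPECIFIC,
    "gateway-cm0-configmap.yaml": Policy.ENV_SPECIFIC,
    "anotherme-agent-env-configmap.yaml": Policy.SELECTIVE,
    "anotherme-agent-secrets.yaml": Policy.NEVER,
    "anotherme-search-env-configmap.yaml": Policy.SELECTIVE,
    "simplex-cron-env-configmap.yaml": Policy.ENV_SPECIFIC,
    "simplex-router-cm0-configmap.yaml": Policy.ENV_SPECIFIC,
    "frontend-env.yaml": Policy.ENV_SPECIFIC,
    "ingress.yaml": Policy.NEVER,
}


def may_copy(filename: str) -> bool:
    """Only patches marked SELECTIVE are candidates for a reviewed, manual copy."""
    # Assumption: files missing from the table are treated as NEVER.
    return PATCH_POLICY.get(filename, Policy.NEVER) is Policy.SELECTIVE
```

Even for files where `may_copy` returns true, the copy still needs a human review of each value.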

3. Replica Counts

Staging often runs with fewer replicas. Production uses base defaults or higher. This is intentional and should NOT be synced.

4. Node Pool Assignments

  • Staging: `karpenter.sh/nodepool: staging` / `singleton-staging`
  • Production: `karpenter.sh/nodepool: production` / `singleton-production`
These are environment-specific and should NOT be synced.

5. Storage Classes

Both environments use similar patterns, but production uses `gp3` while staging uses `ebs-gp3-auto`. Usually no sync is needed.

6. High Availability Settings

Production has additional HA configurations:
  • `topologySpreadConstraints` for cross-AZ distribution
  • `terminationGracePeriodSeconds: 60` for graceful shutdown
These are production-specific optimizations and should NOT be synced to staging.

Manual Sync Patterns

For configurations not handled by the script:

Sync a Specific ConfigMap Patch

```bash
# Compare
diff kubernetes/overlays/aws-staging/patches/anotherme-agent-env-configmap.yaml \
     kubernetes/overlays/aws-prod/patches/anotherme-agent-env-configmap.yaml

# Copy if needed (carefully review first!)
cp kubernetes/overlays/aws-staging/patches/anotherme-agent-env-configmap.yaml \
   kubernetes/overlays/aws-prod/patches/anotherme-agent-env-configmap.yaml
```

Sync New Resources

If staging has new resources (PV, PVC, etc.) that production needs:
  1. Check the staging `resources:` section for new entries
  2. Copy the resource files to aws-prod
  3. Add them to the `resources:` section of the aws-prod `kustomization.yaml`
  4. Adjust environment-specific values (namespace, labels, etc.)
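
The first step of this list amounts to a set difference over the two `resources:` lists. A minimal sketch, assuming both kustomization files have been parsed into dicts; the helper `missing_resources` and the file names in the example are hypothetical.

```python
def missing_resources(staging: dict, prod: dict) -> list[str]:
    """List `resources:` entries present in staging but absent from production."""
    prod_entries = set(prod.get("resources") or [])
    # Preserve staging order so the result is easy to compare against the file.
    return [r for r in (staging.get("resources") or []) if r not in prod_entries]


# Illustrative file names only.
staging_k = {"resources": ["../../base", "cache-pv.yaml", "cache-pvc.yaml"]}
prod_k = {"resources": ["../../base"]}
new_entries = missing_resources(staging_k, prod_k)
```

The result is only a starting list: each file still has to be copied and adjusted for production by hand, as steps 2 to 4 describe.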

Verification After Sync

Check ArgoCD Status (Read-Only, Safe)

```bash
# View application status and pending changes
argocd app get simplex-aws-prod
argocd app diff simplex-aws-prod
```

Manual Sync (User Must Explicitly Request)

```bash
# ⛔ Run ONLY when the user explicitly requests it
argocd app sync simplex-aws-prod
```

Check Deployed Versions

```bash
# Production namespace
k1 get pods -n production -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[*].image}{"\n"}{end}'

# Staging namespace
k2 get pods -n staging -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[*].image}{"\n"}{end}'
```
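
The jsonpath template above prints one line per pod: the pod name, a tab, then the container images separated by spaces. If that output needs to be compared programmatically, it could be parsed as in this sketch; the helper `parse_pod_images` and the sample pod names and images are hypothetical.

```python
def parse_pod_images(output: str) -> dict[str, list[str]]:
    """Parse 'pod<TAB>image [image ...]' lines into {pod: [images]}."""
    pods = {}
    for line in output.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, _, images = line.partition("\t")
        pods[name] = images.split()
    return pods


# Made-up sample in the shape the command prints.
sample = (
    "front-7d9f\tregistry.example.com/front:v2\n"
    "api-5c4b\tregistry.example.com/api:v1 registry.example.com/sidecar:v3\n"
)
pods = parse_pod_images(sample)
```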

Validate Manifests

```bash
kubectl kustomize kubernetes/overlays/aws-prod > /tmp/prod-manifests.yaml
kubectl kustomize kubernetes/overlays/aws-staging > /tmp/staging-manifests.yaml
diff /tmp/staging-manifests.yaml /tmp/prod-manifests.yaml
```

Troubleshooting

Script Not Finding Repository

Ensure you're in the simplex-gitops directory or set the path explicitly:
```bash
cd /path/to/simplex-gitops
python3 ~/.claude/skills/sync-to-prod/scripts/sync_images.py --diff
```

Image Not Found in Staging

The service may use a different image name format (Aliyun vs ECR). Check both formats in the kustomization files.
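
One way to match the same service across registries is to compare only the last path component of the image reference. The sketch below assumes that both the Aliyun and ECR names end in the service name, which this document does not confirm; the helper `service_name` and the registry hosts in the example are hypothetical.

```python
def service_name(image: str) -> str:
    """Strip registry host, repository path, digest, and tag from an image reference."""
    last = image.rsplit("/", 1)[-1]  # drop registry host (and port) plus path
    last = last.split("@", 1)[0]     # drop a digest such as @sha256:...
    return last.split(":", 1)[0]     # drop a tag such as :v2


# Made-up references in two different registry styles.
a = service_name("registry.example.com/simplex/front:v2")
b = service_name("123456789012.dkr.ecr.example.com/front@sha256:abc")
```

If both calls return the same name, the two entries most likely refer to the same service under different registries.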

ArgoCD Not Syncing

```bash
# View application status (read-only)
argocd app get simplex-aws-prod --show-operation

# Refresh the application to detect the latest changes (read-only, safe)
argocd app get simplex-aws-prod --refresh

# ⛔ Manual sync - run ONLY when the user explicitly requests it
argocd app sync simplex-aws-prod
```

Service Categories Reference

| Category | Services |
|---|---|
| AI Core | `anotherme-agent`, `anotherme-api`, `anotherme-search`, `anotherme-worker` |
| Frontend | `front`, `front-homepage` |
| Backend | `simplex-cron`, `simplex-gateway-api`, `simplex-gateway-worker` |
| Data | `data-search-api`, `crawler` |
| Infrastructure | `litellm`, `node-server`, `simplex-router`, `simplex-router-backend`, `simplex-router-fronted` |