cloud-infrastructure

ABOUTME: AWS/GCP cloud infrastructure patterns and best practices

ABOUTME: Well-Architected, security, cost optimization, observability

Cloud Infrastructure (AWS & GCP)

Quick Reference

```bash
# AWS
aws sts get-caller-identity
aws --profile staging ecs list-services

# GCP
gcloud auth list
gcloud config set project PROJECT_ID

# Security scanning
trivy config . && checkov -d .
```

**See:** `terraform/SKILL.md` | `_PATTERNS.md`

---

AWS Well-Architected (6 Pillars)

| Pillar | Key Practices |
| --- | --- |
| Operational Excellence | IaC, runbooks, observability, chaos engineering |
| Security | Least privilege IAM, GuardDuty/Security Hub, KMS encryption, SCPs |
| Reliability | Multi-AZ, auto-scaling, RTO/RPO backups |
| Performance | Right-size, caching, serverless, read replicas |
| Cost | Reserved/Savings Plans, Spot, tagging |
| Sustainability | Optimize utilization, Graviton |
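
As an illustration of the least-privilege IAM practice listed under Security, the following Terraform sketch grants a single action on a single bucket prefix. The bucket name `example-app-bucket` and the policy name `app-read-only` are hypothetical placeholders, not names from this document.

```hcl
# Least privilege: one action on one resource prefix, nothing broader.
# Bucket and policy names are hypothetical placeholders.
data "aws_iam_policy_document" "app_read" {
  statement {
    sid       = "ReadAppObjects"
    effect    = "Allow"
    actions   = ["s3:GetObject"]
    resources = ["arn:aws:s3:::example-app-bucket/*"]
  }
}

resource "aws_iam_policy" "app_read" {
  name   = "app-read-only"
  policy = data.aws_iam_policy_document.app_read.json
}
```

Building the document with `aws_iam_policy_document` instead of a raw JSON string lets `terraform validate` catch structural mistakes before the policy reaches AWS.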

AWS ECS vs EKS

| Factor | ECS | EKS |
| --- | --- | --- |
| Complexity | Lower | Higher (K8s) |
| Multi-cloud | No | Yes |
| Cost | Free control plane | $0.10/hr/cluster |

ECS Task Definition

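```hcl
resource "aws_ecs_task_definition" "app" {
  family                   = "app"
  requires_compatibilities = ["FARGATE"]
  network_mode             = "awsvpc" # required for Fargate
  cpu                      = 256      # 0.25 vCPU
  memory                   = 512      # MiB; must be a valid Fargate pairing with cpu
  # Fargate needs a task execution role to pull images from ECR.
  # var.execution_role_arn is an assumed variable, not part of the original snippet.
  execution_role_arn = var.execution_role_arn

  container_definitions = jsonencode([{
    name      = "app"
    image     = "${var.ecr_repo}:${var.image_tag}"
    essential = true
    # The health check runs inside the container, so the image must include curl.
    healthCheck = { command = ["CMD-SHELL", "curl -f http://localhost/health || exit 1"] }
  }])
}
```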

GCP Cloud Run

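```yaml
# Knative-style Service manifest, e.g. for `gcloud run services replace`.
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: app # placeholder name, not in the original snippet
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/minScale: "1"   # keep one instance warm
        autoscaling.knative.dev/maxScale: "100" # cap scale-out
    spec:
      containerConcurrency: 80 # max concurrent requests per instance
      containers:
        - image: gcr.io/project/image
          resources: { limits: { cpu: "1", memory: "512Mi" } }
```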

Cost Optimization

| Tool | Use |
| --- | --- |
| Cost Explorer / Compute Optimizer | AWS cost analysis |
| Infracost | IaC cost estimates in PRs |
| GCP FinOps Hub | Gemini recommendations |

- **Compute:** Reserved (up to 72% savings vs. On-Demand), Spot (up to 90%), Graviton, auto-scaling
- **Storage:** S3 Intelligent-Tiering, lifecycle policies
- **Networking:** VPC endpoints (avoid NAT costs)
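
The storage item above (Intelligent-Tiering plus lifecycle policies) might look like this in Terraform. This is a minimal sketch: the bucket name and the 30-day and 365-day thresholds are illustrative assumptions, not recommendations from this document.

```hcl
# Hypothetical bucket; the name is a placeholder.
resource "aws_s3_bucket" "logs" {
  bucket = "example-app-logs"
}

resource "aws_s3_bucket_lifecycle_configuration" "logs" {
  bucket = aws_s3_bucket.logs.id

  rule {
    id     = "tier-then-expire"
    status = "Enabled"

    filter {} # empty filter: the rule applies to every object in the bucket

    # Move objects to Intelligent-Tiering after 30 days (assumed threshold).
    transition {
      days          = 30
      storage_class = "INTELLIGENT_TIERING"
    }

    # Delete objects after one year (assumed retention period).
    expiration {
      days = 365
    }
  }
}
```

Retention periods are usually dictated by compliance or debugging needs, so treat the numbers here as knobs to adjust per bucket.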

Observability (OpenTelemetry)

**Why OTEL:** Vendor-agnostic, unified traces/metrics/logs

```yaml
receivers:
  otlp: { protocols: { grpc: { endpoint: 0.0.0.0:4317 } } }
processors:
  batch: { timeout: 1s }
exporters:
  awsxray: { region: us-east-1 }
service:
  pipelines:
    traces: { receivers: [otlp], processors: [batch], exporters: [awsxray] }
```

Networking

AWS VPC Design

```
VPC (10.0.0.0/16)
├── Public → ALB, NAT
├── Private → ECS/EKS, Lambda
└── Isolated → RDS (no internet)
```
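
One way to express the three-tier layout above in Terraform is to carve /24 subnets out of the /16 with `cidrsubnet`. This is a sketch only: the two availability zone names and the numbering scheme (0-9 public, 10-19 private, 20-29 isolated) are assumptions made for illustration.

```hcl
# AZ names are illustrative assumptions.
variable "azs" {
  type    = list(string)
  default = ["us-east-1a", "us-east-1b"]
}

resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16"
}

# Public tier (ALB, NAT): 10.0.0.0/24, 10.0.1.0/24
resource "aws_subnet" "public" {
  count                   = length(var.azs)
  vpc_id                  = aws_vpc.main.id
  availability_zone       = var.azs[count.index]
  cidr_block              = cidrsubnet(aws_vpc.main.cidr_block, 8, count.index)
  map_public_ip_on_launch = true
}

# Private tier (ECS/EKS, Lambda): 10.0.10.0/24, 10.0.11.0/24
resource "aws_subnet" "private" {
  count             = length(var.azs)
  vpc_id            = aws_vpc.main.id
  availability_zone = var.azs[count.index]
  cidr_block        = cidrsubnet(aws_vpc.main.cidr_block, 8, 10 + count.index)
}

# Isolated tier (RDS): 10.0.20.0/24, 10.0.21.0/24
resource "aws_subnet" "isolated" {
  count             = length(var.azs)
  vpc_id            = aws_vpc.main.id
  availability_zone = var.azs[count.index]
  cidr_block        = cidrsubnet(aws_vpc.main.cidr_block, 8, 20 + count.index)
}
```

Route tables are omitted here, and they are what actually makes a tier public, private, or isolated: public subnets route to an internet gateway, private subnets route outbound through NAT, and isolated subnets have no route to the internet at all.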

VPC Endpoints

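```hcl
resource "aws_vpc_endpoint" "s3" {
  vpc_id            = aws_vpc.main.id
  service_name      = "com.amazonaws.${var.region}.s3"
  vpc_endpoint_type = "Gateway" # the default; S3 gateway endpoints carry no hourly charge
  # Set route_table_ids to the private route tables so traffic actually uses the endpoint.
}
```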
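
Services other than S3 and DynamoDB use interface endpoints rather than gateway endpoints. As a sketch, Fargate tasks in private subnets can pull images from ECR without a NAT gateway if the two ECR endpoints below exist alongside the S3 gateway endpoint. `var.private_subnet_ids` and `var.endpoint_security_group_id` are hypothetical variables introduced for this example.

```hcl
# Interface endpoints for ECR (API and Docker registry).
# var.private_subnet_ids and var.endpoint_security_group_id are assumed inputs.
resource "aws_vpc_endpoint" "ecr" {
  for_each = toset(["ecr.api", "ecr.dkr"])

  vpc_id              = aws_vpc.main.id
  service_name        = "com.amazonaws.${var.region}.${each.key}"
  vpc_endpoint_type   = "Interface"
  subnet_ids          = var.private_subnet_ids
  security_group_ids  = [var.endpoint_security_group_id]
  private_dns_enabled = true # resolve the standard ECR hostnames to the endpoint
}
```

Unlike gateway endpoints, interface endpoints are billed per hour and per GB, so whether they beat NAT on cost depends on traffic volume.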

Code Review Checklist

| Category | Checks |
| --- | --- |
| Security | No hardcoded secrets, least privilege IAM, KMS encryption, logging enabled |
| Cost | Tagged resources, right-sized, auto-scaling |
| Reliability | Multi-AZ, health checks, backups |
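
For the "tagged resources" check, one low-effort approach is the AWS provider's `default_tags` block, which applies the same tags to every taggable resource the provider creates. The tag keys and values below are hypothetical examples.

```hcl
provider "aws" {
  region = var.region

  # Applied to every taggable resource created through this provider.
  # Tag keys and values are placeholders for illustration.
  default_tags {
    tags = {
      Project     = "app"
      Environment = "staging"
      ManagedBy   = "terraform"
    }
  }
}
```

A `tags` argument on an individual resource overrides a default tag with the same key, so per-resource exceptions remain possible.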

Resources
