kubernetes-operations
Kubernetes Operations
Deployment Manifest
yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
  labels:
    app: api-server
    version: v1
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app: api-server
  template:
    metadata:
      labels:
        app: api-server
        version: v1
    spec:
      containers:
        - name: api
          image: registry.example.com/api:1.2.0
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 500m
              memory: 512Mi
          livenessProbe:
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 15
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: db-credentials
                  key: url
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: kubernetes.io/hostname
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: api-server

Always set resource requests and limits. Use topology spread constraints for high availability.
Helm Chart Structure
chart/
  Chart.yaml
  values.yaml
  values-staging.yaml
  values-production.yaml
  templates/
    deployment.yaml
    service.yaml
    ingress.yaml
    hpa.yaml
    _helpers.tpl
values.yaml
yaml
replicaCount: 2
image:
  repository: registry.example.com/api
  tag: "1.2.0"
  pullPolicy: IfNotPresent
resources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    cpu: 500m
    memory: 512Mi
autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilization: 70
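
Since the chart layout lists a values-production.yaml next to values.yaml, a per-environment override file might look like the sketch below. Every number here is an illustrative assumption, not a recommendation from the original chart; only the key names are taken from values.yaml above.

```yaml
# values-production.yaml (hypothetical overrides; all numbers are illustrative)
# Only the keys that differ from values.yaml need to appear here,
# because Helm merges nested maps from multiple values files.
replicaCount: 3
resources:
  requests:
    cpu: 250m
    memory: 256Mi
  limits:
    cpu: "1"
    memory: 1Gi
autoscaling:
  minReplicas: 3
  maxReplicas: 20
```

An override file like this is typically passed with the -f flag, for example `helm upgrade --install <release> ./chart -f chart/values-production.yaml`, where the release name and paths are placeholders. Keys not listed in the override (such as image.tag) keep their values.yaml defaults.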
HorizontalPodAutoscaler
yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-server
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-server
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
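
To connect this manifest to the chart's autoscaling values, a templates/hpa.yaml could be written roughly as follows. This is a minimal sketch under two assumptions: the Deployment is named after the release (`.Release.Name`), whereas a real chart would more likely use a naming helper from _helpers.tpl, and only the CPU metric is templated because values.yaml defines no memory target.

```yaml
{{- if .Values.autoscaling.enabled }}
# Sketch of templates/hpa.yaml; rendered only when autoscaling.enabled is true.
# Assumption: the Deployment shares the release name.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: {{ .Release.Name }}
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {{ .Release.Name }}
  minReplicas: {{ .Values.autoscaling.minReplicas }}
  maxReplicas: {{ .Values.autoscaling.maxReplicas }}
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: {{ .Values.autoscaling.targetCPUUtilization }}
{{- end }}
```

When the HPA is enabled, it is common to leave spec.replicas out of the Deployment template so that Helm upgrades do not reset the replica count the autoscaler has chosen.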
Troubleshooting Commands
bash
# Pod diagnostics
kubectl describe pod <pod-name> -n <namespace>
kubectl logs <pod-name> -c <container> --previous
kubectl exec -it <pod-name> -- /bin/sh

# Resource usage
kubectl top pods -n <namespace> --sort-by=memory
kubectl top nodes

# Network debugging
kubectl run debug --image=nicolaka/netshoot --rm -it -- bash
nslookup <service-name>.<namespace>.svc.cluster.local

# Events sorted by time
kubectl get events -n <namespace> --sort-by='.lastTimestamp'

# Find pods not running
kubectl get pods -A --field-selector=status.phase!=Running
Anti-Patterns
- Running containers as root without securityContext.runAsNonRoot: true
- Missing resource requests/limits (causes scheduling issues and noisy neighbors)
- Using the latest tag instead of pinned image versions
- Not setting a PodDisruptionBudget for critical workloads
- Storing secrets in ConfigMaps instead of Secrets (or external secret managers)
- Ignoring pod anti-affinity for replicated deployments
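
As an illustration of avoiding the first anti-pattern, the pod template of the api-server Deployment could carry a securityContext along these lines. This is a sketch, not part of the original manifest: the UID 10001 is a hypothetical value, and the image must actually be able to run as a non-root user for the pod to start.

```yaml
# Fragment of spec.template.spec in the Deployment (sketch, not a full manifest).
spec:
  securityContext:
    runAsNonRoot: true        # kubelet refuses to start containers that would run as UID 0
    runAsUser: 10001          # hypothetical non-root UID
  containers:
    - name: api
      image: registry.example.com/api:1.2.0
      securityContext:
        allowPrivilegeEscalation: false
        capabilities:
          drop: ["ALL"]       # drop Linux capabilities the API process does not need
```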
Checklist
- All containers have resource requests and limits
- Liveness and readiness probes configured
- Images use specific version tags, not latest
- Secrets stored in Kubernetes Secrets or external vault
- PodDisruptionBudget set for production workloads
- NetworkPolicies restrict traffic between namespaces
- Topology spread constraints or anti-affinity for HA
- Helm values split per environment (staging, production)
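
Two checklist items, the PodDisruptionBudget and the NetworkPolicy, have no manifest elsewhere in this document. The sketch below shows one possible shape for each, reusing the app: api-server label and port 8080 from the Deployment. The policy names and the ingress-nginx namespace are hypothetical, and both objects are assumed to live in the same namespace as the workload.

```yaml
# Allow at most one api-server pod to be evicted at a time during
# voluntary disruptions such as node drains.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-server
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: api-server
---
# Only accept traffic to api-server pods from one other namespace.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: api-server-ingress          # hypothetical name
spec:
  podSelector:
    matchLabels:
      app: api-server
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ingress-nginx   # hypothetical source namespace
      ports:
        - protocol: TCP
          port: 8080
```

maxUnavailable is used here rather than minAvailable so the budget still permits evictions when the autoscaler has scaled the Deployment down to its minimum of two replicas. NetworkPolicy objects only take effect if the cluster's network plugin enforces them.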