CI/CD Workflows Skill
Overview
Comprehensive CI/CD patterns for Databricks using GitHub Actions, including automated testing, quality gates, multi-environment deployments, and rollback strategies.
Key Benefits:
- Automated testing and validation
- Consistent deployments
- Environment promotion workflows
- Quality gates enforcement
- Rollback capabilities
- Audit trails
When to Use This Skill
Use CI/CD workflows when you need to:
- Automate deployment processes
- Enforce quality standards
- Deploy across multiple environments
- Implement approval gates
- Track deployment history
- Enable rapid iterations
- Reduce manual errors
Core Concepts
1. CI Pipeline
Continuous Integration Workflow:

```yaml
# .github/workflows/ci.yml
name: Continuous Integration

on:
  push:
    branches: [develop, main]
  pull_request:
    branches: [develop, main]

jobs:
  lint:
    name: Code Quality Checks
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.10'
      - name: Install dependencies
        run: |
          pip install black ruff mypy
          pip install -r requirements.txt
      - name: Run Black
        run: black --check src/
      - name: Run Ruff
        run: ruff check src/
      - name: Run MyPy
        run: mypy src/

  test:
    name: Unit Tests
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.10'
      - name: Install dependencies
        run: |
          pip install pytest pytest-cov pytest-spark
          pip install -r requirements.txt
      - name: Run Tests
        run: pytest tests/ -v --cov=src --cov-report=xml
      - name: Upload Coverage
        uses: codecov/codecov-action@v3
        with:
          file: ./coverage.xml
          fail_ci_if_error: true

  validate-bundle:
    name: Validate Databricks Bundle
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Install Databricks CLI
        run: |
          curl -fsSL https://raw.githubusercontent.com/databricks/setup-cli/main/install.sh | sh
      - name: Validate Bundle
        env:
          DATABRICKS_HOST: ${{ secrets.DATABRICKS_HOST }}
          DATABRICKS_TOKEN: ${{ secrets.DATABRICKS_TOKEN }}
        run: databricks bundle validate -t dev
```
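The `test` job expects a pytest suite under tests/. A minimal sketch of the kind of unit test it runs; the `normalize_columns` helper is hypothetical and defined inline to keep the example self-contained (in a real repo it would live in src/ and be imported):

```python
# tests/test_normalize.py -- illustrative unit test for the CI `test` job.

def normalize_columns(columns):
    """Lower-case and underscore column names, a typical pre-ingest cleanup step."""
    return [c.strip().lower().replace(" ", "_") for c in columns]

def test_normalize_columns():
    # Leading/trailing whitespace is stripped before normalization.
    assert normalize_columns([" Order ID", "Total Amount "]) == ["order_id", "total_amount"]
```

Run with `pytest tests/ -v --cov=src --cov-report=xml`, exactly as the workflow does.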
2. CD Pipeline
Continuous Deployment Workflow:

```yaml
# .github/workflows/cd.yml
name: Continuous Deployment

on:
  push:
    branches: [main]
  workflow_dispatch:
    inputs:
      environment:
        description: 'Target environment'
        required: true
        type: choice
        options:
          - dev
          - staging
          - prod

jobs:
  deploy-dev:
    name: Deploy to Development
    if: github.event_name == 'push' || github.event.inputs.environment == 'dev'
    runs-on: ubuntu-latest
    environment: development
    steps:
      - uses: actions/checkout@v3
      - name: Install Databricks CLI
        run: |
          curl -fsSL https://raw.githubusercontent.com/databricks/setup-cli/main/install.sh | sh
      - name: Deploy Bundle
        env:
          DATABRICKS_HOST: ${{ secrets.DEV_DATABRICKS_HOST }}
          DATABRICKS_TOKEN: ${{ secrets.DEV_DATABRICKS_TOKEN }}
        run: |
          databricks bundle deploy -t dev
          databricks bundle run -t dev integration_tests

  deploy-staging:
    name: Deploy to Staging
    needs: deploy-dev
    if: github.event_name == 'push' || github.event.inputs.environment == 'staging'
    runs-on: ubuntu-latest
    environment: staging
    steps:
      - uses: actions/checkout@v3
      - name: Install Databricks CLI
        run: |
          curl -fsSL https://raw.githubusercontent.com/databricks/setup-cli/main/install.sh | sh
      - name: Deploy Bundle
        env:
          DATABRICKS_HOST: ${{ secrets.STAGING_DATABRICKS_HOST }}
          DATABRICKS_TOKEN: ${{ secrets.STAGING_DATABRICKS_TOKEN }}
        run: databricks bundle deploy -t staging
      - name: Run Smoke Tests
        env:
          DATABRICKS_HOST: ${{ secrets.STAGING_DATABRICKS_HOST }}
          DATABRICKS_TOKEN: ${{ secrets.STAGING_DATABRICKS_TOKEN }}
        run: databricks bundle run -t staging smoke_tests

  deploy-prod:
    name: Deploy to Production
    needs: deploy-staging
    if: github.event_name == 'push' || github.event.inputs.environment == 'prod'
    runs-on: ubuntu-latest
    environment:
      name: production
      url: https://prod-workspace.databricks.com
    steps:
      - uses: actions/checkout@v3
      - name: Install Databricks CLI
        run: |
          curl -fsSL https://raw.githubusercontent.com/databricks/setup-cli/main/install.sh | sh
      - name: Create Backup
        env:
          DATABRICKS_HOST: ${{ secrets.PROD_DATABRICKS_HOST }}
          DATABRICKS_TOKEN: ${{ secrets.PROD_DATABRICKS_TOKEN }}
        run: |
          # Backup current state
          databricks bundle generate deployment-state --target prod > backup-state.json
      - name: Deploy Bundle
        env:
          DATABRICKS_HOST: ${{ secrets.PROD_DATABRICKS_HOST }}
          DATABRICKS_TOKEN: ${{ secrets.PROD_DATABRICKS_TOKEN }}
        run: databricks bundle deploy -t prod
      - name: Run Health Checks
        env:
          DATABRICKS_HOST: ${{ secrets.PROD_DATABRICKS_HOST }}
          DATABRICKS_TOKEN: ${{ secrets.PROD_DATABRICKS_TOKEN }}
        run: databricks bundle run -t prod health_check
      - name: Notify Deployment
        if: success()
        run: |
          echo "Production deployment successful"
          # Send notification (Slack, email, etc.)
```
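The staging job gates promotion on smoke_tests, a bundle job whose contents are not shown. The core of such a check can be as simple as asserting that key output tables are non-empty after a deploy; a hypothetical sketch (function name and input shape are illustrative):

```python
# Hypothetical smoke check run after a staging deploy: fail fast on empty tables.
def run_smoke_checks(row_counts):
    """row_counts: mapping of table name -> row count, gathered by the deployed job."""
    failures = [table for table, count in row_counts.items() if count == 0]
    if failures:
        raise RuntimeError(f"Smoke test failed, empty tables: {failures}")
    return True
```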
3. Quality Gates
Enforce Quality Standards:

```yaml
# .github/workflows/quality-gates.yml
name: Quality Gates

on:
  pull_request:
    branches: [main, develop]

jobs:
  security-scan:
    name: Security Vulnerability Scan
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Run Bandit Security Scan
        run: |
          pip install bandit
          bandit -r src/ -f json -o bandit-report.json
      - name: Run Safety Check
        run: |
          pip install safety
          safety check --json

  code-coverage:
    name: Code Coverage Gate
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.10'
      - name: Run Tests with Coverage
        run: |
          pip install pytest pytest-cov
          pip install -r requirements.txt
          pytest tests/ --cov=src --cov-report=term --cov-fail-under=80

  data-quality:
    name: Data Quality Checks
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Validate Data Contracts
        run: |
          python scripts/validate_contracts.py
      - name: Check DLT Expectations
        run: |
          python scripts/validate_dlt_expectations.py
```
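The `data-quality` job calls scripts/validate_contracts.py, which is not shown here. A minimal sketch of the idea, assuming contracts are expressed as column-to-type mappings (the function name and contract shape are illustrative, not the repo's actual API):

```python
# Hypothetical core of scripts/validate_contracts.py: compare a table's actual
# schema against its declared contract and report every violation found.
def validate_contract(contract: dict, actual: dict) -> list:
    """Return human-readable violations; an empty list means the contract holds."""
    violations = []
    for column, expected_type in contract.items():
        if column not in actual:
            violations.append(f"missing column: {column}")
        elif actual[column] != expected_type:
            violations.append(f"{column}: expected {expected_type}, got {actual[column]}")
    return violations
```

In the workflow, the script would exit non-zero when the list is non-empty so the gate fails the pull request.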
4. Rollback Strategy
Automated Rollback:

```yaml
# .github/workflows/rollback.yml
name: Rollback Deployment

on:
  workflow_dispatch:
    inputs:
      environment:
        description: 'Environment to rollback'
        required: true
        type: choice
        options:
          - dev
          - staging
          - prod
      version:
        description: 'Version to rollback to (commit SHA or tag)'
        required: true
        type: string

jobs:
  rollback:
    name: Rollback to Previous Version
    runs-on: ubuntu-latest
    environment: ${{ github.event.inputs.environment }}
    steps:
      - uses: actions/checkout@v3
        with:
          ref: ${{ github.event.inputs.version }}
      - name: Install Databricks CLI
        run: |
          curl -fsSL https://raw.githubusercontent.com/databricks/setup-cli/main/install.sh | sh
      - name: Validate Rollback Version
        run: |
          echo "Rolling back to version: ${{ github.event.inputs.version }}"
          databricks bundle validate -t ${{ github.event.inputs.environment }}
      - name: Deploy Previous Version
        env:
          DATABRICKS_HOST: ${{ secrets.DATABRICKS_HOST }}
          DATABRICKS_TOKEN: ${{ secrets.DATABRICKS_TOKEN }}
        run: |
          databricks bundle deploy -t ${{ github.event.inputs.environment }}
      - name: Verify Rollback
        env:
          DATABRICKS_HOST: ${{ secrets.DATABRICKS_HOST }}
          DATABRICKS_TOKEN: ${{ secrets.DATABRICKS_TOKEN }}
        run: |
          databricks bundle run -t ${{ github.event.inputs.environment }} health_check
```
Implementation Patterns
Pattern 1: Matrix Testing
Test Across Multiple Versions:

```yaml
name: Matrix Testing

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ['3.9', '3.10', '3.11']
        spark-version: ['3.3.0', '3.4.0']
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v4
        with:
          python-version: ${{ matrix.python-version }}
      - name: Install PySpark ${{ matrix.spark-version }}
        run: |
          pip install pyspark==${{ matrix.spark-version }}
          pip install -r requirements.txt
      - name: Run Tests
        run: pytest tests/ -v
```
Pattern 2: Blue-Green Deployment
Zero-Downtime Deployments:

```yaml
name: Blue-Green Deployment

on:
  workflow_dispatch:
    inputs:
      color:
        description: 'Deployment color'
        required: true
        type: choice
        options:
          - blue
          - green

jobs:
  deploy:
    runs-on: ubuntu-latest
    # Job-level env so every step (deploy, health check, switch, cleanup) sees it
    env:
      DATABRICKS_HOST: ${{ secrets.DATABRICKS_HOST }}
      DATABRICKS_TOKEN: ${{ secrets.DATABRICKS_TOKEN }}
      COLOR: ${{ github.event.inputs.color }}
    steps:
      - uses: actions/checkout@v3
      - name: Deploy to ${{ github.event.inputs.color }}
        run: |
          # Deploy to color-specific namespace
          databricks bundle deploy -t prod --var environment_color=$COLOR
      - name: Run Health Checks
        run: |
          # Verify new deployment
          python scripts/health_check.py --color $COLOR
      - name: Switch Traffic
        if: success()
        run: |
          # Update routing to new color
          python scripts/switch_traffic.py --to $COLOR
      - name: Cleanup Old Deployment
        run: |
          # Remove old color after successful switch
          OLD_COLOR=$([ "$COLOR" == "blue" ] && echo "green" || echo "blue")
          databricks bundle destroy -t prod --var environment_color=$OLD_COLOR --auto-approve
```
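scripts/switch_traffic.py is referenced but not shown; its essential logic is a flip of the active/standby routing record that only happens after health checks pass. A hedged sketch, assuming routing is represented as a small dict (the representation is an assumption for illustration):

```python
# Hypothetical core of scripts/switch_traffic.py: flip the active color and
# demote the previous one to standby, validating the requested color first.
def switch_traffic(routing: dict, to_color: str) -> dict:
    if to_color not in ("blue", "green"):
        raise ValueError(f"unknown color: {to_color}")
    updated = dict(routing)  # never mutate the current routing in place
    updated["active"] = to_color
    updated["standby"] = "green" if to_color == "blue" else "blue"
    return updated
```

Returning a new dict rather than mutating keeps the old routing available if the switch itself needs to be undone.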
Pattern 3: Canary Deployment
Gradual Rollout:

```yaml
name: Canary Deployment

on:
  workflow_dispatch:
    inputs:
      canary_percentage:
        description: 'Percentage of traffic for canary'
        required: true
        type: choice
        options:
          - '10'
          - '25'
          - '50'
          - '100'

jobs:
  canary-deploy:
    runs-on: ubuntu-latest
    # Job-level env so the promote step can also call the Databricks CLI
    env:
      DATABRICKS_HOST: ${{ secrets.DATABRICKS_HOST }}
      DATABRICKS_TOKEN: ${{ secrets.DATABRICKS_TOKEN }}
    steps:
      - uses: actions/checkout@v3
      - name: Deploy Canary
        run: |
          databricks bundle deploy -t prod-canary
      - name: Route Traffic to Canary
        run: |
          python scripts/route_traffic.py \
            --canary-percentage ${{ github.event.inputs.canary_percentage }}
      - name: Monitor Canary
        timeout-minutes: 30
        run: |
          python scripts/monitor_canary.py \
            --duration 30 \
            --error-threshold 0.01
      - name: Promote or Rollback
        run: |
          if python scripts/check_canary_health.py; then
            echo "Canary healthy - promoting"
            databricks bundle deploy -t prod
          else
            echo "Canary unhealthy - rolling back"
            python scripts/rollback_canary.py
            exit 1
          fi
```
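The promote-or-rollback step relies on scripts/check_canary_health.py. A minimal sketch of the health rule, assuming it compares an observed error rate against the same 0.01 threshold passed to the monitor (the function and its inputs are illustrative):

```python
# Hypothetical health rule for scripts/check_canary_health.py: the canary is
# healthy only when its error rate stays strictly under the threshold.
def canary_is_healthy(errors: int, requests: int, threshold: float = 0.01) -> bool:
    if requests == 0:
        return False  # no traffic reached the canary, so nothing was verified
    return errors / requests < threshold
```

Treating zero traffic as unhealthy is deliberate: a canary that received no requests proves nothing, so promotion should not proceed.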
Pattern 4: Scheduled Deployments
Time-Based Deployment Windows:

```yaml
name: Scheduled Production Deployment

on:
  schedule:
    # Deploy every Tuesday at 2 AM UTC
    - cron: '0 2 * * 2'
  workflow_dispatch:

jobs:
  check-deployment-window:
    runs-on: ubuntu-latest
    outputs:
      should_deploy: ${{ steps.check.outputs.should_deploy }}
    steps:
      - name: Check if in deployment window
        id: check
        run: |
          # Check if current time is within deployment window
          HOUR=$(date +%H)
          DAY=$(date +%u)
          if [ "$DAY" == "2" ] && [ "$HOUR" -ge "2" ] && [ "$HOUR" -le "4" ]; then
            echo "should_deploy=true" >> $GITHUB_OUTPUT
          else
            echo "should_deploy=false" >> $GITHUB_OUTPUT
          fi

  deploy:
    needs: check-deployment-window
    if: needs.check-deployment-window.outputs.should_deploy == 'true'
    runs-on: ubuntu-latest
    environment: production
    steps:
      - uses: actions/checkout@v3
      - name: Install Databricks CLI
        run: |
          curl -fsSL https://raw.githubusercontent.com/databricks/setup-cli/main/install.sh | sh
      - name: Deploy to Production
        env:
          DATABRICKS_HOST: ${{ secrets.PROD_DATABRICKS_HOST }}
          DATABRICKS_TOKEN: ${{ secrets.PROD_DATABRICKS_TOKEN }}
        run: |
          databricks bundle deploy -t prod
      - name: Send Notification
        if: always()
        run: |
          python scripts/send_notification.py \
            --status ${{ job.status }} \
            --environment production
```
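The bash window check above is easy to break silently (quoting, numeric comparison, weekday numbering); restating it as a pure function makes it unit-testable. ISO weekday 2 is Tuesday and the window is hours 02-04 UTC, matching the cron trigger:

```python
# The deployment-window rule from the bash step, as a testable pure function.
def in_deployment_window(iso_weekday: int, hour: int) -> bool:
    """True only on Tuesday (ISO weekday 2) between 02:00 and 04:59 UTC."""
    return iso_weekday == 2 and 2 <= hour <= 4
```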
Pattern 5: Integration Testing
Automated Integration Tests:

```yaml
name: Integration Tests

on:
  push:
    branches: [develop]
  schedule:
    - cron: '0 */4 * * *'  # Every 4 hours

jobs:
  integration-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.10'
      - name: Install dependencies
        run: |
          pip install pytest databricks-connect databricks-sdk
          pip install -r requirements.txt
      - name: Run Integration Tests
        env:
          DATABRICKS_HOST: ${{ secrets.TEST_DATABRICKS_HOST }}
          DATABRICKS_TOKEN: ${{ secrets.TEST_DATABRICKS_TOKEN }}
        run: |
          pytest tests/integration/ -v \
            --junit-xml=test-results.xml
      - name: Publish Test Results
        if: always()
        uses: EnricoMi/publish-unit-test-result-action@v2
        with:
          files: test-results.xml
      - name: Cleanup Test Data
        if: always()
        run: python scripts/cleanup_test_data.py
```
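The Cleanup Test Data step runs scripts/cleanup_test_data.py, which is not shown. One hedged sketch of its selection logic, assuming integration runs namespace their tables with an `itest_` prefix (an illustrative convention, not necessarily the repo's):

```python
# Hypothetical selection logic for scripts/cleanup_test_data.py: only tables
# created by integration runs (prefix "itest_") are candidates for dropping.
def tables_to_drop(all_tables, prefix="itest_"):
    """Return a sorted list of test tables, leaving production tables untouched."""
    return sorted(t for t in all_tables if t.startswith(prefix))
```

Filtering by an explicit prefix, rather than by age or a deny-list, keeps the cleanup safe to run unconditionally with `if: always()`.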
Best Practices
1. Secrets Management

```yaml
# Use GitHub Secrets for credentials
env:
  DATABRICKS_HOST: ${{ secrets.DATABRICKS_HOST }}
  DATABRICKS_TOKEN: ${{ secrets.DATABRICKS_TOKEN }}

# Never commit secrets to the repository
# Use environment-specific secrets
```

2. Approval Gates
```yaml
# Require manual approval for production
environment:
  name: production
  url: https://prod-workspace.databricks.com

# Configure required reviewers in GitHub settings
```

3. Deployment Notifications
```yaml
- name: Notify Slack
  if: always()
  uses: 8398a7/action-slack@v3
  with:
    status: ${{ job.status }}
    text: 'Deployment to ${{ github.event.inputs.environment }}'
    webhook_url: ${{ secrets.SLACK_WEBHOOK }}
```

4. Artifact Management
```yaml
- name: Upload Bundle Artifact
  uses: actions/upload-artifact@v3
  with:
    name: databricks-bundle
    path: |
      databricks.yml
      resources/
      src/
    retention-days: 90
```

Common Pitfalls to Avoid
Don't:
- Skip testing in CI pipeline
- Deploy without validation
- Hard-code secrets
- Deploy directly to production
- Skip rollback planning
Do:
- Test every change
- Validate before deploying
- Use secret management
- Use staging environments
- Plan rollback strategy
Complete Examples
See the /examples/ directory for:
- full_cicd_pipeline/ : Complete CI/CD setup
- blue_green_deployment/ : Zero-downtime deployment
Related Skills
- databricks-asset-bundles : Bundle deployment
- testing-patterns : Automated testing
- data-quality : Quality validation
- delta-live-tables : Pipeline deployment