
CI/CD Workflows Skill


Overview


Comprehensive CI/CD patterns for Databricks using GitHub Actions, including automated testing, quality gates, multi-environment deployments, and rollback strategies.
Key Benefits:
  • Automated testing and validation
  • Consistent deployments
  • Environment promotion workflows
  • Quality gates enforcement
  • Rollback capabilities
  • Audit trails

When to Use This Skill


Use CI/CD workflows when you need to:
  • Automate deployment processes
  • Enforce quality standards
  • Deploy across multiple environments
  • Implement approval gates
  • Track deployment history
  • Enable rapid iterations
  • Reduce manual errors

Core Concepts


1. CI Pipeline


Continuous Integration Workflow:
yaml
# .github/workflows/ci.yml

name: Continuous Integration

on:
  push:
    branches: [develop, main]
  pull_request:
    branches: [develop, main]

jobs:
  lint:
    name: Code Quality Checks
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.10'

      - name: Install dependencies
        run: |
          pip install black ruff mypy
          pip install -r requirements.txt

      - name: Run Black
        run: black --check src/

      - name: Run Ruff
        run: ruff check src/

      - name: Run MyPy
        run: mypy src/

  test:
    name: Unit Tests
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.10'

      - name: Install dependencies
        run: |
          pip install pytest pytest-cov pytest-spark
          pip install -r requirements.txt

      - name: Run Tests
        run: pytest tests/ -v --cov=src --cov-report=xml

      - name: Upload Coverage
        uses: codecov/codecov-action@v3
        with:
          file: ./coverage.xml
          fail_ci_if_error: true

  validate-bundle:
    name: Validate Databricks Bundle
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Install Databricks CLI
        run: |
          curl -fsSL https://raw.githubusercontent.com/databricks/setup-cli/main/install.sh | sh

      - name: Validate Bundle
        env:
          DATABRICKS_HOST: ${{ secrets.DATABRICKS_HOST }}
          DATABRICKS_TOKEN: ${{ secrets.DATABRICKS_TOKEN }}
        run: databricks bundle validate -t dev

2. CD Pipeline


Continuous Deployment Workflow:
yaml
# .github/workflows/cd.yml

name: Continuous Deployment

on:
  push:
    branches: [main]
  workflow_dispatch:
    inputs:
      environment:
        description: 'Target environment'
        required: true
        type: choice
        options:
          - dev
          - staging
          - prod

jobs:
  deploy-dev:
    name: Deploy to Development
    if: github.event_name == 'push' || github.event.inputs.environment == 'dev'
    runs-on: ubuntu-latest
    environment: development
    steps:
      - uses: actions/checkout@v3

      - name: Install Databricks CLI
        run: |
          curl -fsSL https://raw.githubusercontent.com/databricks/setup-cli/main/install.sh | sh

      - name: Deploy Bundle
        env:
          DATABRICKS_HOST: ${{ secrets.DEV_DATABRICKS_HOST }}
          DATABRICKS_TOKEN: ${{ secrets.DEV_DATABRICKS_TOKEN }}
        run: |
          databricks bundle deploy -t dev
          databricks bundle run -t dev integration_tests

  deploy-staging:
    name: Deploy to Staging
    needs: deploy-dev
    if: github.event_name == 'push' || github.event.inputs.environment == 'staging'
    runs-on: ubuntu-latest
    environment: staging
    steps:
      - uses: actions/checkout@v3

      - name: Install Databricks CLI
        run: |
          curl -fsSL https://raw.githubusercontent.com/databricks/setup-cli/main/install.sh | sh

      - name: Deploy Bundle
        env:
          DATABRICKS_HOST: ${{ secrets.STAGING_DATABRICKS_HOST }}
          DATABRICKS_TOKEN: ${{ secrets.STAGING_DATABRICKS_TOKEN }}
        run: databricks bundle deploy -t staging

      - name: Run Smoke Tests
        env:
          DATABRICKS_HOST: ${{ secrets.STAGING_DATABRICKS_HOST }}
          DATABRICKS_TOKEN: ${{ secrets.STAGING_DATABRICKS_TOKEN }}
        run: databricks bundle run -t staging smoke_tests

  deploy-prod:
    name: Deploy to Production
    needs: deploy-staging
    if: github.event_name == 'push' || github.event.inputs.environment == 'prod'
    runs-on: ubuntu-latest
    environment:
      name: production
      url: https://prod-workspace.databricks.com
    steps:
      - uses: actions/checkout@v3

      - name: Install Databricks CLI
        run: |
          curl -fsSL https://raw.githubusercontent.com/databricks/setup-cli/main/install.sh | sh

      - name: Create Backup
        env:
          DATABRICKS_HOST: ${{ secrets.PROD_DATABRICKS_HOST }}
          DATABRICKS_TOKEN: ${{ secrets.PROD_DATABRICKS_TOKEN }}
        run: |
          # Backup current state
          databricks bundle generate deployment-state --target prod > backup-state.json

      - name: Deploy Bundle
        env:
          DATABRICKS_HOST: ${{ secrets.PROD_DATABRICKS_HOST }}
          DATABRICKS_TOKEN: ${{ secrets.PROD_DATABRICKS_TOKEN }}
        run: databricks bundle deploy -t prod

      - name: Run Health Checks
        env:
          DATABRICKS_HOST: ${{ secrets.PROD_DATABRICKS_HOST }}
          DATABRICKS_TOKEN: ${{ secrets.PROD_DATABRICKS_TOKEN }}
        run: databricks bundle run -t prod health_check

      - name: Notify Deployment
        if: success()
        run: |
          echo "Production deployment successful"
          # Send notification (Slack, email, etc.)

3. Quality Gates


Enforce Quality Standards:
yaml
# .github/workflows/quality-gates.yml

name: Quality Gates

on:
  pull_request:
    branches: [main, develop]

jobs:
  security-scan:
    name: Security Vulnerability Scan
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Run Bandit Security Scan
        run: |
          pip install bandit
          bandit -r src/ -f json -o bandit-report.json

      - name: Run Safety Check
        run: |
          pip install safety
          safety check --json

  code-coverage:
    name: Code Coverage Gate
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.10'

      - name: Run Tests with Coverage
        run: |
          pip install pytest pytest-cov
          pip install -r requirements.txt
          pytest tests/ --cov=src --cov-report=term --cov-fail-under=80

  data-quality:
    name: Data Quality Checks
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Validate Data Contracts
        run: |
          python scripts/validate_contracts.py

      - name: Check DLT Expectations
        run: |
          python scripts/validate_dlt_expectations.py
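The data-quality job calls scripts/validate_contracts.py, which is not included in this skill. As a hedged sketch of what such a script could check, the function below compares a table's actual schema against a declared contract and returns the violations; the contract format, the `strict` flag, and the column names in the usage note are illustrative assumptions, not part of the workflow above.

```python
def validate_contract(contract: dict, actual_schema: dict) -> list[str]:
    """Return a list of violations; an empty list means the schema honors the contract."""
    violations = []
    for column, expected_type in contract["columns"].items():
        if column not in actual_schema:
            violations.append(f"missing column: {column}")
        elif actual_schema[column] != expected_type:
            violations.append(
                f"type mismatch for {column}: "
                f"expected {expected_type}, got {actual_schema[column]}"
            )
    # Extra columns are tolerated unless the contract opts into strict mode.
    if contract.get("strict"):
        for column in actual_schema:
            if column not in contract["columns"]:
                violations.append(f"unexpected column: {column}")
    return violations
```

A CI gate would call this per table and exit non-zero when any violations come back, so the pull request fails before deployment.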

4. Rollback Strategy


Automated Rollback:
yaml
# .github/workflows/rollback.yml

name: Rollback Deployment

on:
  workflow_dispatch:
    inputs:
      environment:
        description: 'Environment to rollback'
        required: true
        type: choice
        options:
          - dev
          - staging
          - prod
      version:
        description: 'Version to rollback to (commit SHA or tag)'
        required: true
        type: string

jobs:
  rollback:
    name: Rollback to Previous Version
    runs-on: ubuntu-latest
    environment: ${{ github.event.inputs.environment }}
    steps:
      - uses: actions/checkout@v3
        with:
          ref: ${{ github.event.inputs.version }}

      - name: Install Databricks CLI
        run: |
          curl -fsSL https://raw.githubusercontent.com/databricks/setup-cli/main/install.sh | sh

      - name: Validate Rollback Version
        env:
          DATABRICKS_HOST: ${{ secrets.DATABRICKS_HOST }}
          DATABRICKS_TOKEN: ${{ secrets.DATABRICKS_TOKEN }}
        run: |
          echo "Rolling back to version: ${{ github.event.inputs.version }}"
          databricks bundle validate -t ${{ github.event.inputs.environment }}

      - name: Deploy Previous Version
        env:
          DATABRICKS_HOST: ${{ secrets.DATABRICKS_HOST }}
          DATABRICKS_TOKEN: ${{ secrets.DATABRICKS_TOKEN }}
        run: |
          databricks bundle deploy -t ${{ github.event.inputs.environment }}

      - name: Verify Rollback
        env:
          DATABRICKS_HOST: ${{ secrets.DATABRICKS_HOST }}
          DATABRICKS_TOKEN: ${{ secrets.DATABRICKS_TOKEN }}
        run: |
          databricks bundle run -t ${{ github.event.inputs.environment }} health_check
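The rollback workflow takes the target version as a manual input. If deployment history is recorded (for example as a newest-first list of version/status records), picking the rollback target can also be automated; the sketch below is a minimal, hypothetical helper, and the record fields are assumptions rather than an existing Databricks API.

```python
def rollback_target(history: list[dict], bad_version: str) -> str:
    """Pick the most recent successful deployment that is older than the bad version.

    `history` is assumed to be ordered newest first, each record shaped like
    {"version": "<sha-or-tag>", "status": "success" | "failed"}.
    """
    seen_bad = False
    for record in history:
        if record["version"] == bad_version:
            seen_bad = True
            continue
        if seen_bad and record["status"] == "success":
            return record["version"]
    raise LookupError("no earlier successful deployment to roll back to")
```

The returned version would feed the workflow's `version` input, so operators do not have to dig through the deployment log under pressure.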

Implementation Patterns


Pattern 1: Matrix Testing


Test Across Multiple Versions:
yaml
name: Matrix Testing

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ['3.9', '3.10', '3.11']
        spark-version: ['3.3.0', '3.4.0']

    steps:
      - uses: actions/checkout@v3

      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v4
        with:
          python-version: ${{ matrix.python-version }}

      - name: Install PySpark ${{ matrix.spark-version }}
        run: |
          pip install pyspark==${{ matrix.spark-version }}
          pip install -r requirements.txt

      - name: Run Tests
        run: pytest tests/ -v

Pattern 2: Blue-Green Deployment


Zero-Downtime Deployments:
yaml
name: Blue-Green Deployment

on:
  workflow_dispatch:
    inputs:
      color:
        description: 'Deployment color'
        required: true
        type: choice
        options:
          - blue
          - green

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Deploy to ${{ github.event.inputs.color }}
        env:
          DATABRICKS_HOST: ${{ secrets.DATABRICKS_HOST }}
          DATABRICKS_TOKEN: ${{ secrets.DATABRICKS_TOKEN }}
          COLOR: ${{ github.event.inputs.color }}
        run: |
          # Deploy to color-specific namespace
          databricks bundle deploy -t prod --var environment_color=$COLOR

      - name: Run Health Checks
        run: |
          # Verify new deployment
          python scripts/health_check.py --color $COLOR

      - name: Switch Traffic
        if: success()
        run: |
          # Update routing to new color
          python scripts/switch_traffic.py --to $COLOR

      - name: Cleanup Old Deployment
        run: |
          # Remove old color after successful switch
          OLD_COLOR=$([ "$COLOR" == "blue" ] && echo "green" || echo "blue")
          databricks bundle destroy -t prod --var environment_color=$OLD_COLOR --auto-approve
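The cleanup step derives the color to retire with a shell conditional. The same bookkeeping, plus the promote-or-hold decision a hypothetical scripts/switch_traffic.py would make after the health check, can be sketched in Python; the function names and the returned plan format are assumptions, not part of the workflow above.

```python
def other_color(color: str) -> str:
    """Return the opposite deployment color, mirroring the shell one-liner above."""
    if color not in ("blue", "green"):
        raise ValueError(f"unknown deployment color: {color}")
    return "green" if color == "blue" else "blue"


def plan_switch(new_color: str, healthy: bool) -> dict:
    """Decide the cutover: route to the new color and retire the old one only if healthy."""
    if not healthy:
        # Keep traffic on the currently live color; destroy nothing, investigate instead.
        return {"route_to": other_color(new_color), "destroy": None, "promoted": False}
    return {"route_to": new_color, "destroy": other_color(new_color), "promoted": True}
```

Keeping this decision in one pure function makes the cutover logic unit-testable, unlike inline shell in the workflow file.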

Pattern 3: Canary Deployment


Gradual Rollout:
yaml
name: Canary Deployment

on:
  workflow_dispatch:
    inputs:
      canary_percentage:
        description: 'Percentage of traffic for canary'
        required: true
        type: choice
        options:
          - '10'
          - '25'
          - '50'
          - '100'

jobs:
  canary-deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Deploy Canary
        env:
          DATABRICKS_HOST: ${{ secrets.DATABRICKS_HOST }}
          DATABRICKS_TOKEN: ${{ secrets.DATABRICKS_TOKEN }}
        run: |
          databricks bundle deploy -t prod-canary

      - name: Route Traffic to Canary
        run: |
          python scripts/route_traffic.py \
            --canary-percentage ${{ github.event.inputs.canary_percentage }}

      - name: Monitor Canary
        timeout-minutes: 30
        run: |
          python scripts/monitor_canary.py \
            --duration 30 \
            --error-threshold 0.01

      - name: Promote or Rollback
        run: |
          if python scripts/check_canary_health.py; then
            echo "Canary healthy - promoting"
            databricks bundle deploy -t prod
          else
            echo "Canary unhealthy - rolling back"
            python scripts/rollback_canary.py
            exit 1
          fi
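The monitoring and health-check scripts referenced above (scripts/monitor_canary.py, scripts/check_canary_health.py) are not shown. Their core decision logic can be sketched as two small pure functions; the sampling format, the default threshold, and the rollout steps are illustrative assumptions matching the workflow's inputs.

```python
def canary_healthy(error_rates: list[float], error_threshold: float = 0.01) -> bool:
    """A canary passes only if every sampled error rate stays under the threshold."""
    return all(rate < error_threshold for rate in error_rates)


def next_traffic_step(current: int, steps: tuple = (10, 25, 50, 100)) -> int:
    """Return the next rollout percentage after a healthy check, capped at 100."""
    for step in steps:
        if step > current:
            return step
    return 100
```

The promote-or-rollback step then reduces to: if `canary_healthy(...)`, re-run the workflow at `next_traffic_step(...)`; otherwise roll back and fail the job.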

Pattern 4: Scheduled Deployments


Time-Based Deployment Windows:
yaml
name: Scheduled Production Deployment

on:
  schedule:
    # Deploy every Tuesday at 2 AM UTC
    - cron: '0 2 * * 2'
  workflow_dispatch:

jobs:
  check-deployment-window:
    runs-on: ubuntu-latest
    outputs:
      should_deploy: ${{ steps.check.outputs.should_deploy }}
    steps:
      - name: Check if in deployment window
        id: check
        run: |
          # Check if current time is within deployment window
          HOUR=$(date +%H)
          DAY=$(date +%u)
          if [ "$DAY" == "2" ] && [ "$HOUR" -ge "2" ] && [ "$HOUR" -le "4" ]; then
            echo "should_deploy=true" >> $GITHUB_OUTPUT
          else
            echo "should_deploy=false" >> $GITHUB_OUTPUT
          fi

  deploy:
    needs: check-deployment-window
    if: needs.check-deployment-window.outputs.should_deploy == 'true'
    runs-on: ubuntu-latest
    environment: production
    steps:
      - uses: actions/checkout@v3

      - name: Install Databricks CLI
        run: |
          curl -fsSL https://raw.githubusercontent.com/databricks/setup-cli/main/install.sh | sh

      - name: Deploy to Production
        env:
          DATABRICKS_HOST: ${{ secrets.PROD_DATABRICKS_HOST }}
          DATABRICKS_TOKEN: ${{ secrets.PROD_DATABRICKS_TOKEN }}
        run: |
          databricks bundle deploy -t prod

      - name: Send Notification
        if: always()
        run: |
          python scripts/send_notification.py \
            --status ${{ job.status }} \
            --environment production
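The shell check above parses `date` output by hand, which is easy to get subtly wrong. The same window test (Tuesday, 02:00-04:59 UTC) can be expressed in Python and unit-tested against fixed timestamps; this is an equivalent sketch, not a drop-in replacement the workflow requires.

```python
from datetime import datetime, timezone


def in_deployment_window(now: datetime) -> bool:
    """True only on Tuesdays (ISO weekday 2) between 02:00 and 04:59 UTC."""
    return now.isoweekday() == 2 and 2 <= now.hour <= 4
```

Passing `now` as an argument (rather than calling `datetime.now()` inside) is what makes the window rule trivially testable.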

Pattern 5: Integration Testing


Automated Integration Tests:
yaml
name: Integration Tests

on:
  push:
    branches: [develop]
  schedule:
    - cron: '0 */4 * * *'  # Every 4 hours

jobs:
  integration-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.10'

      - name: Install dependencies
        run: |
          pip install pytest databricks-connect databricks-sdk
          pip install -r requirements.txt

      - name: Run Integration Tests
        env:
          DATABRICKS_HOST: ${{ secrets.TEST_DATABRICKS_HOST }}
          DATABRICKS_TOKEN: ${{ secrets.TEST_DATABRICKS_TOKEN }}
        run: |
          pytest tests/integration/ -v \
            --junit-xml=test-results.xml

      - name: Publish Test Results
        if: always()
        uses: EnricoMi/publish-unit-test-result-action@v2
        with:
          files: test-results.xml

      - name: Cleanup Test Data
        if: always()
        run: python scripts/cleanup_test_data.py
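The final step calls scripts/cleanup_test_data.py, which is not shown. A safe cleanup typically scopes test objects by a naming prefix so production tables can never be matched; the sketch below shows that selection logic, and the `citest_` prefix is an assumption for illustration.

```python
def tables_to_drop(tables: list[str], prefix: str = "citest_") -> list[str]:
    """Select only run-scoped test tables (by table-name prefix) for deletion.

    Works on fully qualified names like "catalog.schema.table"; only the final
    component is checked, so a schema named citest_x does not match by accident.
    """
    return [t for t in tables if t.split(".")[-1].startswith(prefix)]
```

The actual script would list tables via the Databricks SDK and drop only what this filter returns, making the destructive step auditable.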

Best Practices


1. Secrets Management


yaml
# Use GitHub Secrets for credentials
env:
  DATABRICKS_HOST: ${{ secrets.DATABRICKS_HOST }}
  DATABRICKS_TOKEN: ${{ secrets.DATABRICKS_TOKEN }}

# Never commit secrets to the repository
# Use environment-specific secrets

2. Approval Gates


yaml
# Require manual approval for production
environment:
  name: production
  url: https://prod-workspace.databricks.com

# Configure required reviewers in GitHub settings

3. Deployment Notifications


yaml
- name: Notify Slack
  if: always()
  uses: 8398a7/action-slack@v3
  with:
    status: ${{ job.status }}
    text: 'Deployment to ${{ github.event.inputs.environment }}'
    webhook_url: ${{ secrets.SLACK_WEBHOOK }}

4. Artifact Management


yaml
- name: Upload Bundle Artifact
  uses: actions/upload-artifact@v3
  with:
    name: databricks-bundle
    path: |
      databricks.yml
      resources/
      src/
    retention-days: 90

Common Pitfalls to Avoid


Don't:
  • Skip testing in CI pipeline
  • Deploy without validation
  • Hard-code secrets
  • Deploy directly to production
  • Skip rollback planning
Do:
  • Test every change
  • Validate before deploying
  • Use secret management
  • Use staging environments
  • Plan rollback strategy

Complete Examples


See the /examples/ directory for:
  • full_cicd_pipeline/: Complete CI/CD setup
  • blue_green_deployment/: Zero-downtime deployment

Related Skills


  • databricks-asset-bundles: Bundle deployment
  • testing-patterns: Automated testing
  • data-quality: Quality validation
  • delta-live-tables: Pipeline deployment

References
