This skill helps the agent generate or update orchestration pipeline definitions for Google Cloud Composer: it can initialize a new orchestration pipeline or update an existing definition that orchestrates various data pipelines, such as dbt pipelines, notebooks, Spark jobs, Dataform, Python scripts, or inline BigQuery SQL queries. It also helps deploy and trigger orchestration pipelines.
```bash
npx skill4agent add gemini-cli-extensions/data-agent-kit-starter-pack gcp-pipeline-orchestration
```

| Function/Use Case | Required Reference File | Capabilities & Intent Keywords |
|---|---|---|
| orchestration-pipelines schema | `references/orchestration-pipelines-schema.md` | orchestrate, generate, create, update |
1. `Orchestration File` (e.g., `orchestration-pipeline.yaml`, `test-pipeline.yaml`): Defines the pipeline's logic, tasks, and schedule. **IMPORTANT:** Check if a `deployment.yaml` file exists and references an existing orchestration file. If it does, you **must update the existing orchestration file** (e.g., `test-pipeline.yaml`) instead of creating a new one. The filename can be customized but must be referenced in the `deployment.yaml` file.
2. `deployment.yaml`: Defines the environment-specific configurations (e.g., `dev`, `prod`). `deployment.yaml` should only exist in the repository root and must be named `deployment.yaml`.

If no `deployment.yaml` exists yet, generate one with the `init` command:

```bash
# Replace <ORCHESTRATION_PIPELINE_NAME> with the actual name
# Replace <ENV_NAME> with the actual environment name
gcloud beta orchestration-pipelines init <ORCHESTRATION_PIPELINE_NAME> --environment=<ENV_NAME>
```

> [!IMPORTANT]
> While the internal pipeline models are defined using protobuf (which typically uses `snake_case`), the YAML configuration expects `camelCase` for almost all field names.
> Mapping Rule: Always convert `snake_case` proto fields (e.g., `pipeline_id`) to `camelCase` in YAML (e.g., `pipelineId`).
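As an illustration of the mapping rule, here is a minimal sketch of camelCase fields in an orchestration YAML. The exact structure is defined in `references/orchestration-pipelines-schema.md`; the layout below is an assumption, and only the field names `pipelineId`, `startTime`, `endTime`, and `tags` come from this document:

```yaml
# Sketch only: proto snake_case names rendered as camelCase in YAML.
# pipeline_id -> pipelineId, start_time -> startTime, end_time -> endTime
pipelineId: example_dbt_pipeline   # hypothetical pipeline name
startTime: "2025-10-01T00:00:00"   # no trailing Z (see the Time Format note below)
endTime: "2026-01-01T00:00:00"
tags: ["job:datacloud:other"]      # one of the tag values listed below
```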
See `references/orchestration-pipelines-schema.md` for the full schema. Set `tags` to one of `["job:datacloud:antigravity"]`, `["job:datacloud:vscode"]`, or `["job:datacloud:other"]`, depending on the client. In `deployment.yaml`, the key fields are `environments` (with `project`, `region`, `composer_environment`, and `artifact_storage` containing `bucket` and `path_prefix`), `pipelines` (a list of `- source` entries), and `variables`.

> [!TIP]
> If the user doesn't provide specific paths for scripts, dbt projects, or GCP details (Project ID, Region), use tools like `find_by_name` to search the repository and `gcloud` commands (e.g., `gcloud config get-value project`) to retrieve the necessary information.
To find available Dataproc clusters to reference in `deployment.yaml`:

```bash
# Replace <PROJECT_ID> with the actual project_id
# Replace <REGION> with the actual region
gcloud dataproc clusters list \
    --project <PROJECT_ID> \
    --region <REGION>
```

> [!TIP]
> Running the command without `--format=yaml` provides a clear, tabular output that is easier to read.

> [!IMPORTANT]
> A Composer environment is not a Dataproc cluster. If no Dataproc clusters are available, do not use a Composer environment for the `sparkHistoryServerConfig`. It is better to omit this configuration if a dedicated Spark History Server is not available.
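Where a `sparkHistoryServerConfig` is used, it typically points at an existing Dataproc cluster. A minimal sketch, assuming the same shape as Dataproc's `SparkHistoryServerConfig` (verify against the schema reference):

```yaml
# Assumption: dataprocCluster takes a full cluster resource name.
sparkHistoryServerConfig:
  dataprocCluster: projects/<PROJECT_ID>/regions/<REGION>/clusters/<CLUSTER_NAME>
```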
To list Composer environments for `deployment.yaml`:

```bash
# Replace <PROJECT_ID> with the actual project_id
# Replace <REGION> with the actual region
gcloud composer environments list \
    --project <PROJECT_ID> \
    --locations <REGION>
```

For `pyspark` tasks, check the environment's image version and installed PyPI packages:

```bash
# Replace <ENVIRONMENT_NAME> with the Composer environment name
# Replace <REGION> with the region
gcloud composer environments describe <ENVIRONMENT_NAME> \
    --location <REGION> \
    --format="json(config.softwareConfig.imageVersion, config.softwareConfig.pypiPackages)"
```

After running `init`, review the generated `deployment.yaml` for `# TODO:` placeholders (e.g., `YOUR_BUCKET` under `artifact_storage`) and replace them with real values. If you rename an orchestration file (e.g., `dbt_clean_pipeline.yaml` to `new_name.yaml`), update the matching `source` entry under `pipelines` in `deployment.yaml`, as shown in the sketch below.
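A sketch of the updated `pipelines` entry after such a rename (filenames are the examples from above):

```yaml
pipelines:
  - source: 'new_name.yaml' # was: 'dbt_clean_pipeline.yaml'
```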
> [!IMPORTANT]
> Time Format: Do NOT include the `Z` suffix in `startTime` and `endTime`. Use the format `"YYYY-MM-DDTHH:MM:SS"` (e.g., `"2025-10-01T00:00:00"`).

Use `gcloud beta orchestration-pipelines validate` to validate the orchestration file and `deployment.yaml`. Run `validate` from the directory containing `deployment.yaml`:

```bash
# Replace <ENV_NAME> with the identified environment name
gcloud beta orchestration-pipelines validate --environment=<ENV_NAME>
```

`deployment.yaml` follows this structure:

```yaml
environments:
  <environment_name>: # e.g., dev, prod
    project: <PROJECT_ID>
    region: <REGION>
    composer_environment: <COMPOSER_ENVIRONMENT_NAME>
    gcs_bucket: "" # Optional
    artifact_storage:
      bucket: <ARTIFACT_BUCKET_NAME>
      path_prefix: "<prefix>-" # e.g., namespace or username prefix
pipelines:
  - source: '<orchestration-pipeline.yaml>' # e.g., list of pipeline yaml names
```
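For illustration, a filled-in `deployment.yaml`; every concrete value here (project, region, environment, bucket, prefix, filename) is a made-up example:

```yaml
# Hypothetical example values; replace with your own project details.
environments:
  dev:
    project: my-gcp-project
    region: us-central1
    composer_environment: my-composer-env
    gcs_bucket: "" # Optional
    artifact_storage:
      bucket: my-artifact-bucket
      path_prefix: "alice-" # e.g., username prefix
pipelines:
  - source: 'orchestration-pipeline.yaml'
```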
Deploy to the target environment from `deployment.yaml` (e.g., `dev`) using the `--local` flag. Note the `pipelineId` from the orchestration file; you will need it to trigger the pipeline:

```bash
# Replace <ENV_NAME> with the target environment
# Replace <PIPELINE_SOURCE> with the orchestration YAML filename
gcloud beta orchestration-pipelines deploy \
    --environment=<ENV_NAME> --local
```

A successful deploy prints a message like `Pipeline deployment successful for version local-b32d15e307b5`; the version string (`local-b32d15e307b5`) is the bundle ID.

> [!IMPORTANT]
> `--local` deployments now default to `--paused=true`. The deployed DAG will be visible in Airflow as a paused DAG without a schedule. It will not auto-run. Use Step 7 to trigger it.
Verify the deployment in the target environment (e.g., `dev` from `deployment.yaml`) using the bundle ID from the deploy output and the `pipelineId` from the orchestration file. Poll until the pipeline appears:
```bash
# Initial delay: wait 30 seconds after deploy
sleep 30
# Poll every 15 seconds, up to 2 minutes total
# Replace <ENV_NAME>, <BUNDLE_ID> with actual values
gcloud beta orchestration-pipelines list \
    --environment=<ENV_NAME> \
    --bundle=<BUNDLE_ID>
```

Once the pipeline is listed, trigger it:

```bash
# Replace <ENV_NAME>, <BUNDLE_ID>, <PIPELINE_ID> with actual values
gcloud beta orchestration-pipelines trigger \
--environment=<ENV_NAME> \
--bundle=<BUNDLE_ID> \
    --pipeline=<PIPELINE_ID>
```

Check the run status:

```bash
gcloud beta orchestration-pipelines runs list \
    --environment=<ENV_NAME> \
    --bundle=<BUNDLE_ID> \
    --pipeline=<PIPELINE_ID>
```

> [!TIP]
> Trigger-only (no deploy): If the user wants to trigger an already-deployed pipeline, skip Step 6. Use `gcloud beta orchestration-pipelines list --environment=<ENV_NAME>` to find the bundle ID, then trigger directly with Step 7.4.
> [!IMPORTANT]
> Fallback: If the `gcloud` trigger command fails, use the bundled script. Run the script with `--help` to discover and learn its interface.

```bash
python scripts/trigger/airflow_trigger.py \
    --project <PROJECT_ID> \
    --location <REGION> \
    --environment <COMPOSER_ENV> \
    --dag_id <PIPELINE_ID>
```

Get `project`, `region`, and `composer_environment` from `deployment.yaml`.
Quick reference:

1. Create or update `deployment.yaml` and the orchestration file (e.g., `orchestration_pipeline.yaml`); double-check the `startTime`/`endTime` format.
2. `gcloud beta orchestration-pipelines validate --environment=<ENV_NAME>`
3. `gcloud beta orchestration-pipelines deploy --environment=<ENV_NAME> --local`
4. `gcloud beta orchestration-pipelines list`
5. `gcloud beta orchestration-pipelines trigger`
6. `gcloud beta orchestration-pipelines runs list`

To pause a deployed pipeline:

```bash
# Replace <ENV_NAME>, <BUNDLE_ID>, <PIPELINE_ID> with actual values
gcloud beta orchestration-pipelines pause \
--environment=<ENV_NAME> \
--bundle=<BUNDLE_ID> \
    --pipeline=<PIPELINE_ID>
```

To unpause it:

```bash
# Replace <ENV_NAME>, <BUNDLE_ID>, <PIPELINE_ID> with actual values
gcloud beta orchestration-pipelines unpause \
    --environment=<ENV_NAME> \
    --bundle=<BUNDLE_ID> \
    --pipeline=<PIPELINE_ID>
```