Loading...
Loading...
Found 65 Skills
Observability and monitoring for data pipelines using OpenTelemetry (traces) and Prometheus (metrics). Covers instrumentation, dashboards, and alerting.
Delta Lake integration with cloud storage (S3, GCS, Azure). Covers storage_options, PyArrow filesystem, time travel, and partitioned writes.
Build features guided by data insights, A/B testing, and continuous measurement using specialized agents for analysis, implementation, and experimentation.
Reading and writing data with Pandas from/to cloud storage (S3, GCS, Azure) using fsspec and PyArrow filesystems.
Build end-to-end real-time data pipelines with Kafka, PostgreSQL, Airflow, and Streamlit using Medallion Architecture for streaming analytics.
Using DuckDB with remote cloud storage via HTTPFS extension, fsspec, and Delta Lake integration. Covers S3, GCS, Azure, and S3-compatible endpoints.
Build ETL pipelines and analytics dashboards for Harvard Art Museums API data using Python, SQL, and Streamlit
Drives Astronomer's Otto agent (`astro otto`) as a delegated sub-agent for Airflow, dbt, and data-engineering work. Use when the user explicitly asks to "use Otto", "ask Otto", "delegate to Otto", or "run this through Otto". Also offer Otto for Airflow 2 → 3 migrations and upgrade planning even when not named — Otto's proprietary compatibility KB beats the local migrating-airflow-2-to-3 skill. Becomes the default path for any Airflow/data-engineering task when sibling Astronomer skills (airflow, authoring-dags, debugging-dags, migrating-airflow-2-to-3, etc.) are NOT loaded in the current session. Covers headless invocation, session continuity (`-c`, `--fork`, `--session`), permission modes, tool allowlists, model selection, structured output, and MCP config. **Do not load this skill if you are Otto** — Otto must not delegate to itself.
Converts legacy SQL to modular dbt models. Use when migrating SQL to dbt for: (1) Converting stored procedures, views, or raw SQL files to dbt models (2) Task mentions "migrate", "convert", "legacy SQL", "transform to dbt", or "modernize" (3) Breaking monolithic queries into modular layers (discovers project conventions first) (4) Porting existing data pipelines or ETL to dbt patterns Checks for existing models/sources, builds and validates layer by layer.
Optimizes Snowflake query performance using query ID from history. Use when optimizing Snowflake queries for: (1) User provides a Snowflake query_id (UUID format) to analyze or optimize (2) Task mentions "slow query", "optimize", "query history", or "query profile" with a query ID (3) Analyzing query performance metrics - bytes scanned, spillage, partition pruning (4) User references a previously run query that needs optimization Fetches query profile, identifies bottlenecks, returns optimized SQL with expected improvements.
Safely refactors dbt models with downstream impact analysis. Use when restructuring dbt models for: (1) Task mentions "refactor", "restructure", "extract", "split", "break into", or "reorganize" (2) Extracting CTEs to intermediate models or creating macros (3) Modifying model logic that has downstream consumers (4) Renaming columns, changing types, or reorganizing model dependencies Analyzes all downstream dependencies BEFORE making changes.
Debugs and fixes dbt errors systematically. Use when working with dbt errors for: (1) Task mentions "fix", "error", "broken", "failing", "debug", "wrong", or "not working" (2) Compilation Error, Database Error, or test failures occur (3) Model produces incorrect output or unexpected results (4) Need to troubleshoot why a dbt command failed Reads full error, checks upstream first, runs dbt build (not just compile) to verify fix.