Found 63 Skills
Complete guide for Apache Airflow orchestration including DAGs, operators, sensors, XComs, task dependencies, dynamic workflows, and production deployment
Expert-level Apache Airflow orchestration, DAGs, operators, sensors, XComs, task dependencies, and scheduling
Reporting pipelines for CSV/JSON/Markdown exports with timestamped outputs, summaries, and post-processing.
Data Quality Checker - Auto-activating skill for Data Pipelines. Triggers on: "data quality checker". Part of the Data Pipelines skill category.
Database development and operations workflow covering SQL, NoSQL, database design, migrations, optimization, and data engineering.
Interactive tutorial that teaches Snowflake Dynamic Tables hands-on. The agent guides users step-by-step through building data pipelines with automatic refresh, incremental processing, and CDC patterns. Use when the user wants to learn dynamic tables, build a DT pipeline, or understand DT vs streams/tasks/materialized views.
Data processing expertise covering parsing, transformation, and validation.
Guides understanding and working with Apache Beam runners (Direct, Dataflow, Flink, Spark, etc.). Use when configuring pipelines for different execution environments or debugging runner-specific issues.
Consult this skill when designing data pipelines or transformation workflows. Use when data flows through a fixed sequence of transformations, stages can be independently developed and tested, or parallel processing of stages is beneficial. Do not use when selecting from multiple paradigms; use architecture-paradigms first. Do not use when the data flow is not sequential or predictable, or when complex branching/merging logic dominates.
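The fixed-sequence pattern this skill describes can be sketched as a compose helper over independently testable stage functions (the helper and stage names below are illustrative assumptions, not part of the skill):

```python
# Sequential-pipeline sketch: each stage is a plain function testable on
# its own; pipeline() chains them left to right. Names are illustrative.
from functools import reduce


def pipeline(*stages):
    """Chain stages left-to-right into a single callable."""
    return lambda data: reduce(lambda acc, stage: stage(acc), stages, data)


def parse(raw):
    return [int(x) for x in raw.split(",")]


def transform(nums):
    return [n * 2 for n in nums]


def validate(nums):
    # Fail fast if a stage upstream produced bad data.
    assert all(n >= 0 for n in nums), "negative value after transform"
    return nums


etl = pipeline(parse, transform, validate)
print(etl("1,2,3"))  # → [2, 4, 6]
```

Because each stage takes and returns plain data, stages can be unit-tested in isolation or run in parallel across batches.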
Schema Validator - Auto-activating skill for Data Pipelines. Triggers on: "schema validator". Part of the Data Pipelines skill category.
Pipeline state management for Goldsky Turbo — pause, resume, restart, and delete commands with their rules and safety behavior. Use this skill when the user asks: will deleting my pipeline lose the data already in my Postgres/ClickHouse table; how do I pause a pipeline during database maintenance; how do I restart from block zero to reprocess all historical data; can I update a running streaming pipeline in place, or must I delete and redeploy; will resuming a paused pipeline pick up from where it left off (checkpoint); how do I re-run a completed job pipeline from the beginning; can I pause or restart a job-mode pipeline. Also covers what happens to checkpoint state on delete, and job auto-deletion 1 hour after termination. For actively diagnosing why a pipeline is broken or erroring, use /turbo-doctor instead.
This skill should be used when the user asks to "validate a DataFrame with pandera", "write a pandera schema", "use pandera DataFrameModel", "add data validation to a pipeline", or needs guidance on pandera best practices for data quality.