Found 107 Skills
Data pipeline and ETL automation - extract, transform, load workflows for data integration and analytics
Design ETL workflows with data validation using tools like Pandas, Dask, or PySpark. Use when building robust data processing systems in Python.
Use this skill when building data pipelines, ETL/ELT workflows, or data transformation layers. Triggers on Airflow DAG design, dbt model creation, Spark job optimization, streaming vs batch architecture decisions, data ingestion, data quality checks, pipeline orchestration, incremental loads, CDC (change data capture), schema evolution, and data warehouse modeling. Acts as a senior data engineer advisor for building reliable, scalable data infrastructure.
You are a data pipeline architecture expert specializing in scalable, reliable, and cost-effective data pipelines for batch and streaming data processing.
Data pipeline expert for ETL, Apache Spark, Airflow, dbt, and data quality
Patterns for efficient ML data pipelines using Polars, Arrow, and ClickHouse. TRIGGERS - data pipeline, polars vs pandas, arrow format, clickhouse ml, efficient loading, zero-copy, memory optimization.
Design data pipelines covering ETL vs ELT architectures, data source integration, scheduling, quality checks, and warehouse design. Use this skill when the user needs to move data between systems, build a data warehouse, automate data processing, or improve data reliability — even if they say 'move data from X to Y', 'build an ETL pipeline', 'our data is a mess', or 'set up a data warehouse'.
Use this skill when the user asks to "set up parsing", "create parsing rule", "extract fields from logs", "regex extraction", "log parsing", "enrich logs", "add context to logs", "custom enrichment table", "lookup table", "geo enrichment", "create metric from logs", "events to metrics", "convert logs to metrics", "generate metrics from events", "recording rule", "precomputed metrics", "PromQL recording", "configure data pipeline", "transform log data", "data processing rules", "rule group", "enrichment settings", "E2M definition", "labels cardinality", "bulk delete rules", "enrichment limits", "search enrichment table", or wants to configure how Coralogix processes, enriches, or transforms ingested data.
Use this skill for data pipeline work — ingestion with dlt, transformations with sqlmesh, analytics with DuckDB/MotherDuck, DataFrames with polars, notebooks with marimo, and project management with uv.
Primary entry point for building, managing, and orchestrating data pipelines on Google Cloud. Guides users to the appropriate skill for dbt, Dataflow (Apache Beam), Dataform, Spark (Dataproc Serverless), BigQuery Data Transfer Service (DTS), or pipeline orchestration with Cloud Composer. Clarifies requirements and resolves ambiguity when creating, updating, and running data pipelines.
Create efficient data pipelines with tf.data
Designs and builds ETL/ELT data pipelines. Takes data sources, a destination, and transformation requirements. Generates pipeline code (Python/SQL), scheduling config, error handling, monitoring setup, and data quality checks. Outputs data-pipeline-spec.md + implementation files.