Found 44 Skills
Develop Microsoft Fabric Spark/data engineering workflows with intelligent routing to specialized resources. Provides core workspace/lakehouse management and routes to: data engineering patterns, development workflow, or infrastructure orchestration. Use when the user wants to: (1) manage Fabric workspaces and resources, (2) develop notebooks and PySpark applications, (3) design data pipelines and orchestration, (4) provision infrastructure as code. Triggers: "develop notebook", "data engineering", "workspace setup", "pipeline design", "infrastructure provisioning", "Delta Lake patterns", "Spark development", "lakehouse configuration", "organize lakehouse tables", "create Livy session", "notebook deployment".
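The Delta Lake and lakehouse triggers above correspond to ordinary PySpark operations inside a Fabric notebook. A minimal sketch, assuming a notebook with an attached lakehouse; the file path and table name are hypothetical:

```python
# Sketch of a Delta Lake write in a Microsoft Fabric notebook.
# The CSV path and table name below are illustrative placeholders.
from pyspark.sql import SparkSession, functions as F

# Fabric notebooks provide `spark` already; getOrCreate() reuses it
# (and keeps this sketch runnable in a plain PySpark environment too).
spark = SparkSession.builder.getOrCreate()

raw = spark.read.option("header", "true").csv("Files/raw/orders.csv")

cleaned = (
    raw.withColumn("order_ts", F.to_timestamp("order_ts"))
       .dropDuplicates(["order_id"])
       .filter(F.col("amount").isNotNull())
)

# Write as a managed Delta table so downstream notebooks and
# pipelines can query it by name.
cleaned.write.format("delta").mode("overwrite").saveAsTable("orders_clean")
```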
Build reliable data pipelines and analytics-ready datasets. USE when cleaning data, designing ETL/ELT, defining contracts, or shipping reproducible data workflows.
You are a **Data Engineer**, an expert in designing, building, and operating the data infrastructure that powers analytics, AI, and business intelligence. You turn raw, messy data from diverse sour...
Builds data infrastructure — ETL/ELT pipelines, data warehousing, stream processing, data quality, orchestration (Airflow/Dagster), and analytics engineering (dbt). Use when the user asks to build data pipelines, set up ETL/ELT workflows, design a data warehouse, configure stream processing, or implement analytics engineering with dbt, Airflow, or Dagster.
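For the orchestration side, a minimal Airflow DAG wiring an extract step to a load step might look like the sketch below; the DAG id, schedule, and callables are illustrative, not part of the skill itself:

```python
# Minimal Airflow DAG sketch: one extract task feeding one load task.
# The dag_id, schedule, and callables are hypothetical placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    # Pull rows from a source system (stubbed out here).
    print("extracting...")


def load():
    # Write the extracted rows to the warehouse (stubbed out here).
    print("loading...")


with DAG(
    dag_id="example_elt",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",   # Airflow 2.4+; older versions use schedule_interval
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    load_task = PythonOperator(task_id="load", python_callable=load)

    extract_task >> load_task
```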
Airbyte integration. Manage data and records, and automate workflows. Use when the user wants to interact with Airbyte data.
Logstash integration. Manage data and records, and automate workflows. Use when the user wants to interact with Logstash data.
Design ETL workflows with data validation using tools like Pandas, Dask, or PySpark. Use when building robust data processing systems in Python.
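A minimal pandas sketch of the extract-validate-load shape this skill targets; the file paths and column names are illustrative:

```python
# Sketch of a small ETL step with inline validation using pandas.
# File paths and column names are hypothetical.
import pandas as pd

df = pd.read_csv("raw/events.csv")          # extract

# Validate: required columns present, no null keys, no negative amounts.
required = {"event_id", "user_id", "amount"}
missing = required - set(df.columns)
if missing:
    raise ValueError(f"missing columns: {missing}")
if df["event_id"].isna().any():
    raise ValueError("null event_id found")
bad_amounts = (df["amount"] < 0).sum()
if bad_amounts:
    raise ValueError(f"{bad_amounts} rows with negative amount")

# Transform and load.
df["amount"] = df["amount"].astype("float64")
df.to_parquet("clean/events.parquet", index=False)
```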
Expert data analysis and manipulation for customer support operations using pandas.
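For instance, a hedged sketch of the kind of analysis this covers, assuming a ticket export with `agent`, `created_at`, and `resolved_at` columns (hypothetical names):

```python
# Hypothetical support-ticket analysis: resolution time per agent.
import pandas as pd

tickets = pd.read_csv("tickets.csv", parse_dates=["created_at", "resolved_at"])

tickets["resolution_hours"] = (
    (tickets["resolved_at"] - tickets["created_at"]).dt.total_seconds() / 3600
)

summary = (
    tickets.groupby("agent")["resolution_hours"]
    .agg(["count", "mean", "median"])
    .sort_values("mean")
)
print(summary)
```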
Exactly-once processing semantics with distributed coordination for file-based data pipelines. Atomic file claiming, status tracking, and automatic retry with in-memory fallback.
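One common way to implement the atomic claiming this entry describes is an atomic rename into a per-worker "claimed" directory: `os.rename` on a single POSIX filesystem either fully succeeds or fails, so at most one worker wins each file. A sketch under that assumption; the directory layout and worker id are illustrative:

```python
# Sketch of atomic file claiming for exactly-once, file-based pipelines.
# A worker claims an input file by renaming it into its own directory;
# directory names and the worker id are hypothetical.
import os
from pathlib import Path

INBOX = Path("pipeline/inbox")
CLAIMED = Path("pipeline/claimed")
WORKER_ID = "worker-1"


def claim_next_file() -> Path | None:
    """Try to claim one pending file; return its new path or None."""
    target_dir = CLAIMED / WORKER_ID
    target_dir.mkdir(parents=True, exist_ok=True)
    for candidate in sorted(INBOX.glob("*.json")):
        target = target_dir / candidate.name
        try:
            os.rename(candidate, target)   # atomic: only one worker succeeds
            return target
        except FileNotFoundError:
            continue                        # another worker claimed it first
    return None


claimed = claim_next_file()
if claimed:
    print(f"processing {claimed}")
```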
Expert data engineer for ETL/ELT pipelines, streaming, data warehousing. Activate on: data pipeline, ETL, ELT, data warehouse, Spark, Kafka, Airflow, dbt, data modeling, star schema, streaming data, batch processing, data quality. NOT for: API design (use api-architect), ML training (use ML skills), dashboards (use design skills).
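To make the star-schema trigger concrete, a small pandas sketch of splitting a flat sales extract into one dimension and one fact table (all names hypothetical):

```python
# Toy star-schema build: derive a customer dimension and a sales fact
# table from a flat extract. Column and table names are hypothetical.
import pandas as pd

sales = pd.DataFrame({
    "order_id": [1, 2, 3],
    "customer_name": ["Ann", "Bob", "Ann"],
    "customer_region": ["EU", "US", "EU"],
    "amount": [120.0, 80.0, 45.5],
})

# Dimension: one row per customer, with a surrogate key.
dim_customer = (
    sales[["customer_name", "customer_region"]]
    .drop_duplicates()
    .reset_index(drop=True)
)
dim_customer["customer_key"] = dim_customer.index + 1

# Fact: measures plus foreign keys into the dimension.
fact_sales = sales.merge(dim_customer, on=["customer_name", "customer_region"])[
    ["order_id", "customer_key", "amount"]
]
print(dim_customer)
print(fact_sales)
```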
Stream Light Protocol account state via Laserstream gRPC. Covers token accounts, mint accounts, and compressible PDAs with hot/cold lifecycle tracking. Use when building custom data pipelines, aggregators, or indexers.
Interactive tutorial that teaches Snowflake Dynamic Tables hands-on. The agent guides users step-by-step through building data pipelines with automatic refresh, incremental processing, and CDC patterns. Use when the user wants to learn dynamic tables, build a DT pipeline, or understand DT vs streams/tasks/materialized views.
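As a flavor of what the tutorial builds toward, a hedged sketch of creating a dynamic table through the Snowflake Python connector; the connection settings, warehouse, and source query are placeholders, and the DDL follows Snowflake's documented CREATE DYNAMIC TABLE syntax:

```python
# Sketch: create a Snowflake Dynamic Table from Python.
# Connection parameters, warehouse, and the source table are hypothetical.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account",
    user="my_user",
    password="***",
    warehouse="MY_WH",
    database="ANALYTICS",
    schema="PUBLIC",
)

ddl = """
CREATE OR REPLACE DYNAMIC TABLE orders_daily
  TARGET_LAG = '5 minutes'
  WAREHOUSE = MY_WH
AS
  SELECT order_date, COUNT(*) AS orders, SUM(amount) AS revenue
  FROM raw_orders
  GROUP BY order_date
"""

cur = conn.cursor()
try:
    cur.execute(ddl)   # Snowflake refreshes the table automatically per TARGET_LAG
finally:
    cur.close()
    conn.close()
```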