Loading...
Loading...
Found 5 Skills
Pyspark Transformer - Auto-activating skill for Data Pipelines. Triggers on: pyspark transformer, pyspark transformer Part of the Data Pipelines skill category.
Design ETL workflows with data validation using tools like Pandas, Dask, or PySpark. Use when building robust data processing systems in Python.
Execute read-only T-SQL queries against Fabric Data Warehouse, Lakehouse SQL Endpoints, and Mirrored Databases via CLI. Default skill for any lakehouse data query (row counts, SELECT, filtering, aggregation) unless the user explicitly requests PySpark or Spark DataFrames. Use when the user wants to: (1) query warehouse/lakehouse data, (2) count rows or explore lakehouse tables, (3) discover schemas/columns, (4) generate T-SQL scripts, (5) monitor SQL performance, (6) export results to CSV/JSON. Triggers: "warehouse", "SQL query", "T-SQL", "query warehouse", "show warehouse tables", "show lakehouse tables", "query lakehouse", "lakehouse table", "how many rows", "count rows", "SQL endpoint", "describe warehouse schema", "generate T-SQL script", "warehouse performance", "export SQL data", "connect to warehouse", "lakehouse data", "explore lakehouse".
Expert-level Databricks platform, Apache Spark, Delta Lake, MLflow, notebooks, and cluster management
Apache Spark distributed computing. Use for big data processing.