Loading...
Loading...
Found 46 Skills
pytest, data validation, Great Expectations, and quality assurance for data systems
Optimize provider selection, routing, and credit usage across 150+ enrichment sources for company/contact intelligence.
Conduct Exploratory Data Analysis (EDA) using descriptive statistics, visualizations, and data quality checks. Use this skill when the user has a dataset and needs to understand its structure, find patterns, detect anomalies, or prepare data for further analysis — even if they say 'what does this data look like', 'find interesting patterns', 'clean this data', or 'summarize this dataset'.
Profile and explore datasets to understand their shape, quality, and patterns before analysis. Use when encountering a new dataset, assessing data quality, discovering column distributions, identifying nulls and outliers, or deciding which dimensions to analyze.
Bronze/Silver/Gold layer design patterns and templates for building scalable data lakehouse architectures. Includes incremental processing, data quality checks, and optimization strategies.
Profile datasets to understand schema, quality, and characteristics. Use when analyzing data files (CSV, JSON, Parquet), discovering dataset properties, assessing data quality, or when user mentions data profiling, schema detection, data analysis, or quality metrics. Provides basic and intermediate profiling including distributions, uniqueness, and pattern detection.
Design data pipelines covering ETL vs ELT architectures, data source integration, scheduling, quality checks, and warehouse design. Use this skill when the user needs to move data between systems, build a data warehouse, automate data processing, or improve data reliability — even if they say 'move data from X to Y', 'build an ETL pipeline', 'our data is a mess', or 'set up a data warehouse'.
Designs and builds ETL/ELT data pipelines. Takes data sources, destination, transformation requirements. Generates pipeline code (Python/SQL), scheduling config, error handling, monitoring setup, and data quality checks. Outputs data-pipeline-spec.md + implementation files.
Data engineering skill for building scalable data pipelines, ETL/ELT systems, and data infrastructure. Expertise in Python, SQL, Spark, Airflow, dbt, Kafka, and modern data stack. Includes data modeling, pipeline orchestration, data quality, and DataOps. Use when designing data architectures, building data pipelines, optimizing data workflows, implementing data governance, or troubleshooting data issues.
Analyze datasets to discover patterns, anomalies, and relationships. Use when exploring data files, generating statistical summaries, checking data quality, or creating visualizations. Supports CSV, Excel, JSON, Parquet, and more.
Use when implementing data governance frameworks, building data catalogs, establishing data lineage, defining data quality rules, or setting up data stewardship programs - covers metadata management, data quality, and complianceUse when ", " mentioned.
Database development and operations workflow covering SQL, NoSQL, database design, migrations, optimization, and data engineering.