Found 91 Skills
QA an analysis before sharing -- methodology, accuracy, and bias checks. Use when reviewing an analysis before a stakeholder presentation, spot-checking calculations and aggregation logic, verifying a SQL query's results look right, or assessing whether conclusions are actually supported by the data.
Run a comprehensive data quality assessment and produce a scorecard across 6 dimensions: completeness, uniqueness, consistency, timeliness, accuracy, validity. Use when the user asks about data quality, mentions data issues, wants to audit a table, is onboarding a new data source, or needs to validate pipeline output.
Side role: find and correct bad signals, earn leaderboard points per Publisher-approved correction (max 3/day)
Validate, format, and convert between JSON, YAML, and TOML. Parse and query structured data files. No API key required.
Python data validation using type hints and runtime type checking with Pydantic v2, whose Rust-powered core provides high-performance validation for FastAPI, Django, and configuration management.
Expert in data pipelines, ETL processes, and data infrastructure.
Data validation with quality scoring and quarantine for suspicious records. Validates incoming data without blocking the pipeline, enabling manual review of edge cases.
Expert in developing Streamlit data apps for Keboola deployment. Activates when building, modifying, or debugging Keboola data apps or Streamlit dashboards, adding filters, creating pages, or fixing data app issues. Validates data structures using Keboola MCP before writing code, tests implementations with Playwright browser automation, and follows SQL-first architecture patterns.
Chapter 2: Data collection quality standards and validation methods
Plan a migration onto MotherDuck. Use when moving from Snowflake, Redshift, PostgreSQL, dbt-heavy stacks, or lakehouse tooling and the key decisions are target pattern, cutover slices, validation, rollback, and native-versus-DuckLake posture.
Guidance for counting tokens in datasets, particularly from HuggingFace or similar sources. This skill should be used when tasks involve counting tokens in datasets, understanding dataset schemas, filtering by categories/domains, or working with tokenizers. It helps avoid common pitfalls like incomplete field identification and ambiguous terminology interpretation.
Probability, distributions, hypothesis testing, and statistical inference. Use for A/B testing, experimental design, or statistical validation.