Total: 50,520 skills. Data Processing: 2,561 skills.
Review Kafka schema changes (Avro, Protobuf, JSON Schema) for compatibility and evolution best practices using the Lenses MCP server. Detects breaking changes, missing defaults, schema drift and naming issues. Use when user says "review schema changes", "check schema compatibility", "will this schema break consumers" or asks about schema evolution. Do NOT use for creating new schemas from scratch or registering them in the cluster.
Audit the health of a PostHog project's data warehouse — find every broken or degraded pipeline item across sources, sync schemas, materialized views, batch exports, and transformations. Use when the user asks "what's broken in my warehouse?", "give me a health check", "audit my data pipeline", "why are some dashboards stale?", or wants a one-shot triage summary before deciding where to spend time. Produces a prioritized report of issues grouped by severity and type, with recommended next steps.
Deep dive on a PostHog user by email address. Analyze what they do, where they spend time, and what products they use.
Guide the user through connecting a new data warehouse source, such as Postgres, MySQL, Stripe, HubSpot, MongoDB, Salesforce, BigQuery, or Snowflake. Use when the user wants to "connect Stripe", "import data from Postgres", "add a new data source", "sync my warehouse tables", or wants to pick sync methods for each table. Walks through source-type discovery, credential validation, table discovery, per-table sync_type selection, and the final create call. Also covers picking a good prefix and what to do right after creation.
Guide to using LookML sets for grouping fields, controlling visibility, and managing drill paths.
All-in-one Assistant for Data Analysis and Office Productivity. Covers end-to-end workflows including data processing, analytical insights, report writing, PPT creation, and data visualization. Always approach tasks from an expert perspective and anticipate the user's next need. Ask the user to confirm whenever a requirement is unclear. Supported features: Excel data analysis, campaign data review, ROI calculation, data visualization, report generation, PPT creation, formula generation. Use this skill when users mention terms like "analyze data", "create report", "make PPT", "Excel", "campaign analysis", "ROI", "review", "weekly report", "monthly report", "data processing", "chart", "visualization", "presentation", "spreadsheet", "formula".
Extracts structured practitioner data from healthcare practice websites. Returns names, credentials, specialties, contact info, and education for every provider on a practice's site. Use when user asks to extract, pull, or list doctors, providers, or staff from practice websites. Triggers: "extract doctors from", "pull providers from", "who are the providers at", "build a provider database", "list all doctors at", "scrape the team page", "get practitioner data from". Accepts practice URLs (pasted, CSV, Google Sheet) or discovers practices via Google Maps when given specialty + location. Single sites or 100+ URLs. Do NOT use for filling data gaps — use healthcare-providers-enrich instead. Do NOT use for credential validation — use healthcare-providers-verify instead. Do NOT use for discovering practices — use market-finder or local-places instead. Do NOT use for general extraction — use nimble-web-expert instead.
Cluster vectors by similarity using npx ruvector k-means or density-based methods, with labeled group summaries.
Convert an Omni Analytics topic into a Databricks Metric View definition in Unity Catalog. Use this skill whenever someone wants to export Omni metrics to Databricks, create a Metric View from an Omni topic, harden BI metrics into Unity Catalog, or bridge Omni's semantic layer with Databricks AI/BI dashboards and Genie spaces.
Comprehensive PostGIS spatial table design reference covering geometry types, coordinate systems, spatial indexing, and performance patterns for location-based applications.
Resolve data lake and lakehouse asset references across Glue Data Catalog, S3, S3 Tables, and Redshift. Triggers on: find the table, where is our data, which table has, locate dataset, find data for, search catalog, what tables match, Redshift table, lakehouse table, data lake table, warehouse table, reverse lookup S3 path. Do NOT use for: full catalog audits (use exploring-data-catalog), running queries (use querying-data-lake), creating tables (use creating-data-lake-table).
Create managed Iceberg tables using Amazon S3 Tables (s3tables API namespace) with automatic compaction and snapshot management. Sets up table bucket, namespace, table, schema, Glue catalog registration, partitioning, and IAM access control. Triggers on: create table, data lake table, analytics table, structured data storage, S3 Tables, Iceberg, Athena table, partitioning strategy, access permissions. Do NOT use for: importing files (use ingesting-into-data-lake), vector storage (use storing-and-querying-vectors), querying existing tables (use querying-data-lake), or locating existing tables (use finding-data-lake-assets).