Search Results: data-pipeline

Found 107 Skills

nutmeg

Football data analytics — the single entry point. Use whenever the user mentions football data, xG, expected goals, match analysis, player stats, scouting, match reports, shot maps, passing networks, Premier League data, Champions League stats, scraping FBref/Understat/Transfermarkt, building football charts, or anything football analytics related. Routes to specialised sub-skills automatically. Also handles first-time setup and profile management.

Data Processing | membranedev/application-s...

astronomer

Astronomer integration. Manage data and records, and automate workflows. Use when the user wants to interact with Astronomer data.

Data Processing | microsoft/skills-for-fabr...

spark-authoring-cli

Develop Microsoft Fabric Spark/data engineering workflows with intelligent routing to specialized resources. Provides core workspace/lakehouse management and routes to: data engineering patterns, development workflow, or infrastructure orchestration. Use when the user wants to: (1) manage Fabric workspaces and resources, (2) develop notebooks and PySpark applications, (3) design data pipelines and orchestration, (4) provision infrastructure as code. Triggers: "develop notebook", "data engineering", "workspace setup", "pipeline design", "infrastructure provisioning", "Delta Lake patterns", "Spark development", "lakehouse configuration", "organize lakehouse tables", "create Livy session", "notebook deployment".

Data Processing | membranedev/application-s...

airbyte

Airbyte integration. Manage data and records, and automate workflows. Use when the user wants to interact with Airbyte data.

Data Processing | dadbodgeoff/drift

checkpoint-resume

Exactly-once processing semantics with distributed coordination for file-based data pipelines. Atomic file claiming, status tracking, and automatic retry with in-memory fallback.
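As a rough illustration of what "atomic file claiming" can look like, here is a minimal sketch. It is not taken from the checkpoint-resume skill itself: the `try_claim` helper and the `.claim` marker naming are hypothetical, and the sketch assumes a local filesystem where exclusive file creation is atomic.

```python
import os
import tempfile
from pathlib import Path


def try_claim(path: Path, worker_id: str) -> bool:
    """Return True if this worker created the claim marker for `path`."""
    # Hypothetical naming scheme: "<file name>.claim" next to the data file.
    marker = path.with_name(path.name + ".claim")
    try:
        # O_CREAT | O_EXCL makes creation fail if the marker already exists,
        # so at most one caller can succeed in creating it.
        fd = os.open(marker, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    # Record who holds the claim (a simple form of status tracking).
    with os.fdopen(fd, "w") as f:
        f.write(worker_id)
    return True


with tempfile.TemporaryDirectory() as tmp:
    data_file = Path(tmp) / "batch-001.csv"
    data_file.write_text("id,value\n1,10\n")

    first = try_claim(data_file, "worker-a")   # creates the marker
    second = try_claim(data_file, "worker-b")  # marker exists, claim refused
    owner = (Path(tmp) / "batch-001.csv.claim").read_text()
```

Claiming alone does not give exactly-once semantics; a real pipeline would also need status tracking and a way to release or expire claims left behind by crashed workers.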

Data Processing | vamseeachanta/workspace-h...

bsee-sodir-extraction

Extract and process energy data from BSEE (Gulf of Mexico) and SODIR (Norway) regulatory databases.

Data Processing | unopim/unopim

unopim-data-transfer

Import/export pipeline for UnoPim. Activates when configuring imports, exports, debugging job pipelines, or creating data transfer profiles; or when the user mentions import, export, CSV, Excel, job, queue, batch, or data transfer.

Automation | personizeai/personize-ski...

personize-no-code-pipelines

Generates importable n8n workflow JSON files that sync data between Personize and 400+ apps. Produces ready-to-import workflows for batch sync, webhook ingestion, per-record AI enrichment, and data export — no code required. Use this skill whenever the user wants no-code integrations, visual workflows, n8n automation, or to connect Personize to HubSpot, Salesforce, Google Sheets, Slack, Postgres, or any app without writing code. Also trigger when they mention 'workflow automation', 'scheduled sync without code', 'visual pipeline', or 'connect Personize to [app]' and don't want to write TypeScript.

Data Processing | aradotso/data-skills

apache-airflow-orchestration

Expert knowledge of Apache Airflow for building, scheduling, and monitoring data pipelines and workflows.

Data Processing | goldsky-io/goldsky-agent

mirror

Use this skill when the user asks about Goldsky Mirror pipelines — creating, deploying, operating, or troubleshooting Mirror. Triggers on: 'Mirror pipeline', 'goldsky pipeline apply', 'sync subgraph to database', 'mirror vs turbo', 'direct indexing', 'mirror pipeline YAML', 'mirror pipeline pause/stop/restart'. Also use this skill when the user wants to sync a Goldsky subgraph into a database or message queue — Mirror is the only pipeline product that supports subgraph sources. For new pipelines that don't need a subgraph source, the turbo-builder skill is usually a better fit. Do NOT trigger on 'goldsky turbo' commands or generic 'build a pipeline' requests without subgraph context — those belong to the turbo skills.

Data Processing | snakeo/claude-debug-and-r...

refactor:pandas

Refactor Pandas code to improve maintainability, readability, and performance. Identifies and fixes loops/.iterrows() that should be vectorized, overuse of .apply() where vectorized alternatives exist, chained indexing patterns, inplace=True usage, inefficient dtypes, missing method chaining opportunities, complex filters, merge operations without validation, and SettingWithCopyWarning patterns. Applies Pandas 2.0+ features including PyArrow backend, Copy-on-Write, vectorized operations, method chaining, .query()/.eval(), optimized dtypes, and pipeline patterns.
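To make the first item in that list concrete, the sketch below contrasts an `.iterrows()` loop with its vectorized equivalent. The DataFrame and its column names (`price`, `qty`) are invented for illustration and do not come from the skill.

```python
import pandas as pd

# Hypothetical sample data; column names are assumptions for this sketch.
df = pd.DataFrame({"price": [10.0, 20.0, 30.0], "qty": [1, 2, 3]})

# Before: row-by-row loop, one Python-level iteration per row.
totals_loop = []
for _, row in df.iterrows():
    totals_loop.append(row["price"] * row["qty"])

# After: a single vectorized column operation over the whole frame.
totals_vec = df["price"] * df["qty"]
```

Both forms compute the same per-row totals; the vectorized form avoids building a Series object for every row, which is usually where the loop version loses time on large frames.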
