Search Results: data-processing

Found 278 Skills

tenzir-docs

Answer questions using the Tenzir documentation. Use whenever the user asks about TQL syntax, pipeline operators, functions, data parsing or transformation, normalization, OCSF mapping, enrichment, lookup tables, contexts, packages, nodes, platform setup, deployment, configuration, integrations with tools like Splunk, Kafka, S3, Elasticsearch, or any other Tenzir feature. Also use when the user asks how to collect, route, filter, aggregate, or export security data with Tenzir, or needs help writing or debugging TQL pipelines, even if they don't mention 'Tenzir' explicitly but are clearly working in a Tenzir context.

Tools & Utilities | netresearch/data-tools-sk...

data-tools

Use when querying, transforming, or editing structured data (JSON, YAML, TOML, XML, CSV). Prefer these tools over grep/sed/awk on structured formats.

Data Processing | generaljerel/chalk-skills

synthesize-feedback

Synthesize customer feedback into thematic clusters. Use when the user asks to analyze feedback, review voice-of-customer (VoC) data, or understand customer sentiment.

Data Processing | withqwerty/nutmeg

nutmeg-wrangle

Transform, filter, reshape, join, and manipulate football data. Use when the user needs to clean data, merge datasets, convert between formats, handle missing values, work with large datasets, or do any data manipulation task on football data.

Data Processing | withqwerty/nutmeg

nutmeg

Football data analytics — the single entry point. Use whenever the user mentions football data, xG, expected goals, match analysis, player stats, scouting, match reports, shot maps, passing networks, Premier League data, Champions League stats, scraping FBref/Understat/Transfermarkt, building football charts, or anything football analytics related. Routes to specialised sub-skills automatically. Also handles first-time setup and profile management.

Automation | maxandersen/skills

prefer-jbang-automation

Use when about to use jq, curl, sed, awk, or bash for JSON/XML processing, API calls, data transformation, or file processing, before writing any bash commands for data manipulation.

Data Processing | yaooqinn/spark-history-cl...

spark-history-cli

Query a running Apache Spark History Server from Copilot CLI. Use this whenever the user wants to inspect SHS applications, jobs, stages, executors, SQL executions, environment details, or event logs, especially when they mention Spark History Server, SHS, event log history, benchmark runs, or application IDs.

1 script / Checked

Data Processing | jarmen423/skills

crawl4ai-openrouter

Use Crawl4AI for web crawling, markdown extraction, and LLM-powered structured extraction through OpenRouter. Use when the user mentions Crawl4AI, unclecode/crawl4ai, wants website data extracted with Crawl4AI, or needs an agent to crawl pages and turn them into structured JSON with OpenRouter-backed models.

1 script / Checked

Data Processing | clickhouse/agent-skills

chdb-datastore

Drop-in pandas replacement with ClickHouse performance. Use `import chdb.datastore as pd` (or `from datastore import DataStore`) and write standard pandas code — same API, 10-100x faster on large datasets. Supports 16+ data sources (MySQL, PostgreSQL, S3, MongoDB, ClickHouse, Iceberg, Delta Lake, etc.) and 10+ file formats (Parquet, CSV, JSON, Arrow, ORC, etc.) with cross-source joins. Use this skill when the user wants to analyze data with pandas-style syntax, speed up slow pandas code, query remote databases or cloud storage as DataFrames, or join data across different sources — even if they don't explicitly mention chdb or DataStore. Do NOT use for raw SQL queries, ClickHouse server administration, or non-Python languages.

1 script / Checked

Data Processing | tytodd/semantic-scholar-s...

trace-citations

Trace the citation neighborhood around one focal paper into foundations, descendants, bridges, weak edges, and optional second-hop links.

18 scripts / Attention

Data Processing | gemini-cli-extensions/dat...

gcp-spark

Develops and executes Spark code on Dataproc Clusters and Serverless. Reads and writes data using BigLake Iceberg catalogs, BigQuery and Spanner. Debugs execution failures.

Use when:
- Writing Spark ETL pipelines on GCP.
- Training or running inference with ML models with Spark on GCP.
- Managing Spark clusters, jobs, batches, and interactive sessions.

Don't use when:
- Writing generic Python scripts that don't use Spark.
- Performing simple SQL queries that can be done directly in BigQuery.

Document Processing | anthropics/financial-serv...

pitch-deck

Populates investment banking pitch deck templates with data from source files. Use when: user provides a PowerPoint template to fill in, user has source data (Excel/CSV) to populate into slides, user mentions populating or filling a pitch deck template, or user needs to transfer data into existing slide layouts. Not for creating presentations from scratch.
