Total 30,744 skills; Data Processing has 1,471 skills
Showing 12 of 1,471 skills
Use when validating data with Standard Schema-compatible schemas or handling ValidationError results.
ClickHouse database patterns, query optimization, analytics, and data engineering best practices for high-performance analytical workloads.
Use this skill when spreadsheet files are the primary input or output. This means the user wants to: open, read, edit, or repair existing .xlsx, .xlsm, .csv, or .tsv files (e.g., add columns, calculate formulas, format, create charts, clean messy data); create new spreadsheets from scratch or from other data sources; or convert between spreadsheet file formats. Trigger this especially when the user references a spreadsheet file by name or path—even casually (such as "the xlsx in my downloads")—and wants to process it or generate content from it. It's also used to clean or reorganize messy tabular data files (rows with incorrect formatting, misaligned headers, garbage data) into proper spreadsheets. The deliverable must be a spreadsheet file. Do not trigger this when the primary deliverable is a Word document, HTML report, standalone Python script, database pipeline, or Google Sheets API integration, even if tabular data is involved.
Systematic analysis of interview transcripts following the Gioia methodology. [WHAT] Analyzes qualitative interview transcripts through 1st-order coding (informant language), 2nd-order thematization (researcher conceptualization), and aggregate dimensions. Produces standardized documentation with indexed, export-ready quotes. [WHEN] Use when: gioia, interview analysis, transcript analysis, qualitative analysis, coding, thematic analysis, grounded theory, 1st-order, 2nd-order [LANGUAGE] Swedish (primary), English when needed [SOURCE] Gioia, D.A., Corley, K.G. & Hamilton, A.L. (2013). Seeking Qualitative Rigor in Inductive Research. Organizational Research Methods.
Screens and selects stocks based on customizable quantitative criteria, including value, growth, quality, and momentum factors.
Detect and mask PII (names, emails, phone numbers, SSNs, addresses) in text and CSV files. Supports multiple masking strategies, with an optional reversible tokenization mode.
Use this skill whenever the user wants to find trading opportunities, detect arbitrage, analyze a market, perform edge detection, find mispricing, do probability analysis, evaluate orderbook depth, find momentum signals, or assess Polymarket market quality. Triggers: "find opportunities", "detect arbitrage", "analyze market", "edge detection", "mispricing", "probability analysis", "orderbook analysis", "momentum scanner", "market inefficiency", "price gap", "volume surge", "trading edge", "market analysis".
Best practices for quick exploratory data analysis with minimal code, using the pandas .plot-like API of HoloViews hvPlot.
Track data lineage and provenance from source to consumption. Use when auditing data flows, debugging data quality issues, ensuring compliance (GDPR, SOX), or understanding data dependencies. Covers lineage tracking, impact analysis, data catalogs, and metadata management.
Explains core Apache Beam programming model concepts including PCollections, PTransforms, Pipelines, and Runners. Use when learning Beam fundamentals or explaining pipeline concepts.
Enrich a CSV with any data field using a waterfall pattern: try multiple providers in sequence, stop at the first successful match. Prevents paying for duplicate lookups and maximizes fill rates. Triggers: "enrich my lead list", "add [field] to my CSV", "waterfall enrichment", "try multiple providers to find [data]". Requires: Deepline CLI (https://code.deepline.com).
Comprehensive guide for interacting with the hydric Liquidity Pools Indexer (Envio/HyperIndex). Use this skill when you need to (1) Query real-time Liquidity Pool data like TVL, Volume, Fees, or Yields/APY, (2) Fetch cross-chain token metadata and prices, (3) Aggregate protocol data (Uniswap, etc.), (4) Retrieve historical time-series data for generic analytics.
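
The waterfall pattern named in the CSV enrichment entry above (try providers in sequence, stop at the first successful match) can be sketched roughly as follows. This is a minimal illustration only: it does not use the Deepline CLI, and `waterfall_enrich`, `provider_a`, `provider_b`, and `provider_c` are hypothetical names standing in for real lookups.

```python
from typing import Callable, Optional, Sequence

# Assumption: a provider takes a row and returns the value it found, or None on a miss.
Provider = Callable[[dict], Optional[str]]


def waterfall_enrich(row: dict, field: str, providers: Sequence[Provider]) -> dict:
    """Return a copy of `row` with `field` filled by the first provider that matches."""
    enriched = dict(row)
    if enriched.get(field):
        return enriched  # already filled, so no lookup is needed
    for provider in providers:
        value = provider(row)
        if value:
            enriched[field] = value
            break  # stop here so later providers are never called
    return enriched


# Hypothetical in-memory providers; `calls` records which ones were actually invoked.
calls = []


def provider_a(row: dict) -> Optional[str]:
    calls.append("a")
    return None  # simulate a miss


def provider_b(row: dict) -> Optional[str]:
    calls.append("b")
    return f"{row['name'].lower()}@example.com"  # simulate a hit


def provider_c(row: dict) -> Optional[str]:
    calls.append("c")
    return "unused@example.com"


result = waterfall_enrich({"name": "Ada"}, "email", [provider_a, provider_b, provider_c])
print(result)
print(calls)
```

Because the loop breaks on the first hit, `provider_c` is never invoked in this run, which is the property the entry credits with avoiding duplicate paid lookups.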