Total 50,473 skills, Data Processing has 2559 skills
Showing 12 of 2559 skills
Access UniProt for protein sequence and annotation retrieval. Use this skill when: (1) Looking up protein sequences by accession, (2) Finding functional annotations, (3) Getting domain boundaries, (4) Finding homologs and variants, (5) Cross-referencing to PDB structures. For structure retrieval, use pdb. For sequence design, use proteinmpnn.
Materialize documentation for SQL syntax, data ingestion, concepts, and best practices. Use when users ask about Materialize queries, sources, sinks, views, or clusters.
Pandas for time series analysis, OrcaFlex results processing, and marine engineering data workflows
Complete guide for Apache Airflow orchestration including DAGs, operators, sensors, XComs, task dependencies, dynamic workflows, and production deployment
Transform raw data into analytical assets using ETL/ELT patterns, SQL (dbt), Python (pandas/polars/PySpark), and orchestration (Airflow). Use when building data pipelines, implementing incremental models, migrating from pandas to polars, or orchestrating multi-step transformations with testing and quality checks.
Create data-driven charts with Vega-Lite (simple) and Vega (advanced). Best for bar, line, scatter, heatmap, area charts, and multi-series analytics. Use when you have numeric data arrays needing statistical visualization. Vega for radar charts and word clouds. NOT for process diagrams (use mermaid) or quick KPI cards (use infographic).
Apache Airflow workflow orchestration. Use for data pipelines.
Data acquisition for web scraping and data collection. Use when user needs "爬取数据/抓取网页/scrape data". Outputs structured JSON/CSV for analysis.
Strategic guidance for designing modern data platforms, covering storage paradigms (data lake, warehouse, lakehouse), modeling approaches (dimensional, normalized, data vault, wide tables), data mesh principles, and medallion architecture patterns. Use when architecting data platforms, choosing between centralized vs decentralized patterns, selecting table formats (Iceberg, Delta Lake), or designing data governance frameworks.
Comprehensive HPK (proprietary healthcare message format) parser and explainer. Supports 100+ message types across patient administration (ID, MV, CV), supply chain (PR, FO, MA, CO, LI, RO, FA), inventory (SO, IM), organizational structure (ST, UT), and financial operations (RD, DD). Uses @erp-pas/hpk-dictionary as source of truth. Validates structure, extracts fields, explains business context, maps to HL7 v2.5/IHE PAM, and troubleshoots integration issues.
Expert customer health scoring and analytics guidance. Use when designing health scores, building churn prediction models, analyzing usage metrics, identifying at-risk accounts, creating executive dashboards, or performing cohort analysis. Use for leading indicator development, customer data enrichment, risk escalation frameworks, and retention analytics.
Optimize provider selection, routing, and credit usage across 150+ enrichment sources for company/contact intelligence.