Total 50,476 skills; Data Processing has 2,559 skills
Showing 12 of 2,559 skills
Data discovery and analysis specialist focused on extracting actionable insights from complex datasets, identifying patterns and anomalies, and transforming raw data into strategic intelligence. Excels at multi-source data integration, advanced analytics, and data-driven decision support.
Use when running a dbt Fusion project with Astronomer Cosmos. Covers Cosmos 1.11+ configuration for Fusion on Snowflake/Databricks with ExecutionMode.LOCAL. Before implementing, verify dbt engine is Fusion (not Core), warehouse is supported, and local execution is acceptable. Does not cover dbt Core.
Diagnose ClickHouse merge performance, part backlog, and 'too many parts' errors. Use for merge issues and part management problems.
Analyze ClickHouse table structure, partitioning, ORDER BY keys, materialized views, and identify schema design anti-patterns. Use for table design issues and optimization.
Retrieve detailed revenue breakdown by product segment for public companies. Use when analyzing product mix, revenue concentration, segment contribution, or business line performance.
Analyze cryptocurrency market sentiment using Fear & Greed Index, news analysis, and market momentum. Use when gauging overall market mood, checking if markets are fearful or greedy, or analyzing sentiment for specific coins. Trigger with phrases like "analyze crypto sentiment", "check market mood", "is the market fearful", "sentiment for Bitcoin", or "Fear and Greed index".
Complete guide for Apache Spark data processing, including RDDs, DataFrames, Spark SQL, streaming, MLlib, and production deployment.
Retrieves protein structure data from RCSB PDB, PDBe, and AlphaFold with protein disambiguation, quality assessment, and comprehensive structural profiles. Creates detailed structure reports with experimental metadata, ligand information, and download links. Use when users need protein structures, 3D models, crystallography data, or mention PDB IDs (4-character codes like 1ABC) or UniProt accessions.
Retrieve paper metadata from arXiv using keyword queries and save results as JSONL (`papers/papers_raw.jsonl`). **Trigger**: arXiv, arxiv, paper search, metadata retrieval, literature search, paper retrieval, fetch metadata, offline import. **Use when**: you need an initial set of papers (Stage C1 of a survey/snapshot) sourced from arXiv, either by online search or by offline import of an export. **Skip if**: a usable `papers/papers_raw.jsonl` already exists, or the data source is not arXiv. **Network**: online search requires network access; offline `--input <export.*>` does not. **Guardrail**: metadata only; do not write long prose in `output/`.
Python package for working with DICOM files. It allows you to read, modify, and write DICOM data in a Pythonic way. Essential for medical imaging processing, clinical data extraction, and AI in radiology.
Expert-level research methodology, academic writing, statistical analysis, and scientific investigation.
Use when animating charts, graphs, dashboards, data transitions, or any information visualization work.