Total 50,503 skills, Data Processing has 2560 skills
Showing 12 of 2560 skills
Use when user wants to extract text from ebooks (EPUB, MOBI, PDF). Use for converting ebooks to plain text for analysis, processing, or reading. Handles all common ebook formats.
Manage Jupyter notebooks — create, execute cells, manage kernels via the container's Jupyter Server REST API.
Walk Claude through PyDESeq2-based differential expression, including ID mapping, DE testing, fold-change thresholding, and enrichment visualisation.
Turn bulk RNA-seq cohorts into synthetic single-cell datasets using omicverse's Bulk2Single workflow for cell fraction estimation, beta-VAE generation, and quality control comparisons against reference scRNA-seq.
Execute use when you need to work with query optimization. This skill provides query performance analysis with comprehensive guidance and automation. Trigger with phrases like "optimize queries", "analyze performance", or "improve query speed".
Performance attribution: Brinson (allocation/selection/interaction), factor-based attribution, fixed-income attribution.
Build interactive data applications and dashboards with pure Python - no frontend experience required
Discover genes associated with diseases and traits using GWAS data from the GWAS Catalog (500,000+ associations) and Open Targets Genetics (L2G predictions). Identifies genetic risk factors, prioritizes causal genes via locus-to-gene scoring, and assesses druggability. Use when asked to find genes associated with a disease or trait, discover genetic risk factors, translate GWAS signals to gene targets, or answer questions like "What genes are associated with type 2 diabetes?"
Analyze spatial transcriptomics data to map gene expression in tissue architecture. Supports 10x Visium, MERFISH, seqFISH, Slide-seq, and imaging-based platforms. Performs spatial clustering, domain identification, cell-cell proximity analysis, spatial gene expression patterns, tissue architecture mapping, and integration with single-cell data. Use when analyzing spatial transcriptomics datasets, studying tissue organization, identifying spatial expression patterns, mapping cell-cell interactions in tissue context, characterizing tumor microenvironment spatial structure, or integrating spatial and single-cell RNA-seq data for comprehensive tissue analysis.
Reading and writing data with Pandas from/to cloud storage (S3, GCS, Azure) using fsspec and PyArrow filesystems.
Exploratory Data Analysis (EDA): profiling, visualization, correlation analysis, and data quality checks. Use when understanding dataset structure, distributions, relationships, or preparing for feature engineering and modeling.
Use this skill for data pipeline work — ingestion with dlt, transformations with sqlmesh, analytics with DuckDB/MotherDuck, DataFrames with polars, notebooks with marimo, and project management with uv.