Total: 50,402 skills. Data Processing: 2,557 skills.
Use when you want to retrieve quantitative RNA expression data and variant eQTL information from the GTEx (Genotype-Tissue Expression) Project across 54 non-diseased tissue sites.
Query the JASPAR database for Transcription Factor (TF) binding profiles. Use when retrieving Position Frequency Matrices (PFMs) or Position Weight Matrices (PWMs) for specific TFs, resolving gene symbols to JASPAR Matrix IDs, or getting TF metadata. Supports multiple output formats (MEME, TRANSFAC, PFM, JASPAR, YAML).
Query the STRING database for protein-protein interactions (PPIs), functional enrichment, and homology. Use when the user asks about interactions between specific proteins, interaction evidence, confidence scores, protein interaction partners, or pathway enrichments.
Fetch evolutionary conservation scores (phyloP, phastCons) and transcription factor binding sites (TFBS) from the UCSC Genome Browser. Use when analyzing whether genomic variants or regions are evolutionarily conserved, functionally important, or bound by TF regulators across major projects (ENCODE, JASPAR, ReMap).
Guides advanced long-term actuarial mathematics (SOA ALTAM)—survival models, life insurance and annuity APVs, premiums and reserves (equivalence principle, Thiele), multiple decrement and Markov states, yield-curve discounting, mortality improvement, longevity risk, profit testing, and mortality graduation. Tool-agnostic, concept-first. Use when the user mentions advanced long-term actuarial mathematics, ALTAM, survival model, life insurance reserve, annuity valuation, equivalence principle, Thiele equation, multiple decrement, force of mortality, longevity risk, mortality improvement, actuarial present value, or net premium reserve—not ASTAM/P&C (advanced-short-term-actuarial-mathematics), workpapers only (actuarial-analyst), appointed actuary (appointed-chief-actuary), assumption governance (assumption-setting), ALM detail (asset-liability-management), or exam-only deliverables.
Design data architecture at enterprise and solution levels. Cover data mesh, lakehouse, governance, domain-driven design, conceptual/logical/physical data modeling, platform selection, and compliance frameworks. Produce ADRs, data model diagrams, platform comparison matrices, and governance policy templates. Triggers on "design data platform", "choose data warehouse", "data mesh", "lakehouse architecture", "data governance", "data modeling", "platform selection", "data architecture decision", "compliance framework", or "data strategy". For applied AI solution architecture (RAG data plane, embeddings, vector stores in commercial or enterprise products), use applied-ai-architect-commercial-enterprise. For dbt analytics layers and mart delivery, use analytics-data-engineer—not data-architect.
Manage data programs, governance operations, and data reliability. Cover data roadmaps, stakeholder coordination, metadata stewardship, lifecycle management, monitoring, incident response, capacity planning, and SLA frameworks. Triggers on "manage data team", "data roadmap", "governance review", "data incident", "SLA framework", "data ops", "stewardship", "data product delivery", or "data KPIs". For product management of human annotation/labeling platforms, use product-management-human-data-platform.
HK Stock Dividend Tracker. Monitor dividend policies, dividend history, dividend yields, and other metrics of Hong Kong-listed companies. Use for income investing and dividend strategy analysis.
Answer questions about spatial data using DuckDB. Use when the user mentions locations, coordinates, lat/lng, distances, maps, addresses, "near", "within", "closest", geographic names, or spatial file formats (GeoJSON, Shapefile, GeoPackage, GPX, GeoParquet). Also triggers when the user wants to find places, buildings, or roads — Overture Maps provides free global data on S3 with zero API keys. Handles spatial joins, distance calculations, containment checks, density analysis, and format conversions for geographic data.
Run technical analysis on crypto market data with 130+ indicators using pandas-ta.
Use when writing or modifying Python code that imports `genoray` to read genotypes/dosages from VCF, PGEN, or SparseVar (`.svar`) files. Covers the public API surface, mode constants, range queries, chunking, filtering, and the SparseVar workflow. Skip for unrelated bioinformatics work.
Run a Bayesian A/B test on conversion data using PyMC. Use when the user wants to compare two variants (landing pages, emails, pricing, UI changes) and decide which to ship using posterior probabilities and expected loss instead of p-values. Covers Beta-Binomial model, ROPE, expected loss, sample-size guidance, and ArviZ diagnostics.
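The Beta-Binomial comparison named in the last entry can be illustrated with a minimal sketch. This is not the skill's own code and does not use PyMC: it assumes a uniform Beta(1, 1) prior on each conversion rate, so the posterior is available in closed form and can be sampled with Python's standard `random.betavariate`. The function name `beta_binomial_ab` and the conversion counts are hypothetical, chosen only for illustration.

```python
import random


def beta_binomial_ab(conv_a, n_a, conv_b, n_b, draws=20000, seed=0):
    """Summarise a conjugate Beta-Binomial A/B comparison by Monte Carlo.

    Assumption: a uniform Beta(1, 1) prior on each variant's conversion
    rate, so each posterior is Beta(1 + conversions, 1 + non-conversions).
    """
    rng = random.Random(seed)
    wins_b = 0
    loss_a = 0.0
    loss_b = 0.0
    for _ in range(draws):
        # Draw one plausible conversion rate per variant from its posterior.
        p_a = rng.betavariate(1 + conv_a, 1 + n_a - conv_a)
        p_b = rng.betavariate(1 + conv_b, 1 + n_b - conv_b)
        if p_b > p_a:
            wins_b += 1
        # Expected loss: conversion rate given up by shipping the worse variant.
        loss_a += max(p_b - p_a, 0.0)
        loss_b += max(p_a - p_b, 0.0)
    return {
        "prob_b_better": wins_b / draws,
        "expected_loss_a": loss_a / draws,
        "expected_loss_b": loss_b / draws,
    }


# Hypothetical data: 120/1000 conversions for A, 150/1000 for B.
result = beta_binomial_ab(120, 1000, 150, 1000)
print(result)
```

A decision rule in this style ships the variant whose expected loss falls below a tolerance chosen in advance, rather than relying on a p-value. A full PyMC model would add the ROPE and ArviZ diagnostics that the entry mentions.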