Total 50,510 skills, Data Processing has 2,560 skills
Showing 12 of 2,560 skills
Guidance for data resharding tasks that involve reorganizing files across directory structures with constraints on file sizes and directory contents. This skill applies when redistributing datasets, splitting large files, or reorganizing data into shards while maintaining constraints like maximum files per directory or maximum file sizes. Use when tasks involve resharding, data partitioning, or directory-constrained file reorganization.
Expert-level Databricks platform, Apache Spark, Delta Lake, MLflow, notebooks, and cluster management
Use when the request mentions "PyMC", "Bayesian", "MCMC", or "probabilistic programming", or asks about "Bayesian regression", "hierarchical model", "NUTS sampler", "posterior distribution", "prior predictive", "credible intervals", or "uncertainty quantification".
Use when building revenue analytics on HubSpot — SQL warehouse queries, API enrichment pipelines, lead scoring models, pipeline forecasting, competitive intelligence. Triggers on "hubspot analytics", "revops dashboard", "lead scoring", "pipeline forecast", "ICP analysis", "hubspot SQL".
Observability and monitoring for data pipelines using OpenTelemetry (traces) and Prometheus (metrics). Covers instrumentation, dashboards, and alerting.
Use public market data to check whether interest rate volatility (the MOVE index) stays calm through interest rate events (such as JGB yield changes) and whether it leads the VIX and credit spreads lower.
Database specialist covering PostgreSQL, MongoDB, Redis, and advanced data patterns for modern applications
Generate custom reports, query reports, and script reports for Frappe applications. Use when creating data analysis and reporting features.
Guide Claude through omicverse's single-cell clustering workflow, covering preprocessing, QC, multimethod clustering, topic modeling, cNMF, and cross-batch integration as demonstrated in t_cluster.ipynb and t_single_batch.ipynb.
Walk through omicverse's single-cell preprocessing tutorials to QC PBMC3k data, normalise counts, detect HVGs, and run PCA/embedding pipelines on CPU, CPU–GPU mixed, or GPU stacks.
Extract and process energy data from BSEE (Gulf of Mexico) and SODIR (Norway) regulatory databases
Convert data between formats (JSON, XML, CSV, YAML, TOML). Use when transforming data structures or migrating between data formats.