Search Results: dataset

Found 335 Skills

Backend Development | aj-geddes/useful-ai-promp...

background-job-processing

Implement background job processing systems with task queues, workers, scheduling, and retry mechanisms. Use when handling long-running tasks, sending emails, generating reports, and processing large datasets asynchronously.
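As a rough illustration of the queue, worker, and retry pattern this entry describes, here is a minimal standard-library sketch. It is not taken from the skill itself: `process_jobs` and its `max_retries` parameter are hypothetical names, and a production system would normally use a dedicated task queue and add a delay between retries.

```python
import queue
import threading


def process_jobs(job_list, max_retries=3):
    """Run (func, args) jobs on one worker thread, retrying failures.

    Hypothetical helper for illustration. Returns a list of
    ("ok", value) or ("failed", error_text) tuples.
    """
    jobs = queue.Queue()
    results = []

    def worker():
        while True:
            job = jobs.get()
            if job is None:  # sentinel: no more work
                jobs.task_done()
                return
            func, args, attempt = job
            try:
                results.append(("ok", func(*args)))
            except Exception as exc:
                if attempt < max_retries:
                    # Re-enqueue before task_done() so join() keeps waiting.
                    jobs.put((func, args, attempt + 1))
                else:
                    results.append(("failed", repr(exc)))
            finally:
                jobs.task_done()

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    for func, args in job_list:
        jobs.put((func, args, 0))
    jobs.join()     # wait until every job, including retries, has finished
    jobs.put(None)  # then tell the worker to stop
    thread.join()
    return results
```

A job that fails twice and then succeeds is reported once as `("ok", ...)`; a job that keeps failing is reported as `("failed", ...)` after `max_retries` retries.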

AI & Machine Learning | tondevrel/scientific-agen...

xgboost-lightgbm

Industry-standard gradient boosting libraries for tabular data and structured datasets. XGBoost and LightGBM excel at classification and regression tasks on tables, CSVs, and databases. Use when working with tabular machine learning, gradient boosting trees, Kaggle competitions, feature importance analysis, hyperparameter tuning, or when you need state-of-the-art performance on structured data.

Frontend Development | antvis/l7

antv-l7

Comprehensive guide for the AntV L7 geospatial visualization library. Use when users need to: (1) create interactive maps with WebGL rendering; (2) visualize geographic data (points, lines, polygons, heatmaps); (3) build location-based data dashboards; (4) add map layers, interactions, or animations; (5) process and display GeoJSON, CSV, or other spatial data; (6) integrate maps with AMap (GaodeMap), Mapbox, Maplibre, or the standalone L7 Map; (7) optimize performance for large-scale geographic datasets.

AI & Machine Learning | orq-ai/assistant-plugins

run-experiment

Create and run orq.ai experiments — compare configurations against datasets using evaluators, analyze results, and generate prioritized action plans. Use when evaluating LLM agents, deployments, conversations, or RAG pipelines end-to-end. Do NOT use without a dataset and evaluators. Do NOT use for cross-framework comparisons with external agents (use compare-agents).

AI & Machine Learning | orq-ai/assistant-plugins

analyze-trace-failures

Read production traces, identify what's failing, and build failure taxonomies using open coding and axial coding methodology. Use when debugging agent or pipeline quality, investigating "why are my outputs bad?", or before building any evaluator — error analysis must come first. Do NOT use when you already have identified failure modes and need evaluators (use build-evaluator) or datasets (use generate-synthetic-dataset).

DevOps & Cloud Services | getcompanion-ai/feynman

runpod-compute

Provision and manage GPU pods on RunPod for long-running experiments. Use when the user needs persistent GPU compute with SSH access, large datasets, or multi-step experiments.

Data Processing | davila7/claude-code-templ...

vaex

Use this skill for processing and analyzing large tabular datasets (billions of rows) that exceed available RAM. Vaex excels at out-of-core DataFrame operations, lazy evaluation, fast aggregations, efficient visualization of big data, and machine learning on large datasets. Apply when users need to work with large CSV/HDF5/Arrow/Parquet files, perform fast statistics on massive datasets, create visualizations of big data, or build ML pipelines that don't fit in memory.

Data Processing | anthropics/knowledge-work...

nextflow-development

Run nf-core bioinformatics pipelines (rnaseq, sarek, atacseq) on sequencing data. Use when analyzing RNA-seq, WGS/WES, or ATAC-seq data—either local FASTQs or public datasets from GEO/SRA. Triggers on nf-core, Nextflow, FASTQ analysis, variant calling, gene expression, differential expression, GEO reanalysis, GSE/GSM/SRR accessions, or samplesheet creation.

10 scripts/Checked

AI & Machine Learning | yonatangross/orchestkit

golden-dataset-validation

Use when validating golden dataset quality. Runs schema checks, duplicate detection, and coverage analysis to ensure dataset integrity for AI evaluation.
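The three checks named in this entry (schema, duplicates, coverage) could look roughly like the sketch below. The record fields in `REQUIRED_FIELDS`, the function name, and the idea of measuring coverage by a `category` field are all assumptions made for illustration, not the skill's actual schema or implementation.

```python
from collections import Counter

# Hypothetical schema: field name -> expected Python type.
REQUIRED_FIELDS = {"id": str, "query": str, "expected": str, "category": str}


def validate_golden_dataset(records, required_categories):
    """Return schema errors, duplicate queries, and uncovered categories."""
    schema_errors = []
    for index, record in enumerate(records):
        for field, field_type in REQUIRED_FIELDS.items():
            if field not in record:
                schema_errors.append((index, f"missing {field}"))
            elif not isinstance(record[field], field_type):
                schema_errors.append((index, f"{field} is not {field_type.__name__}"))

    # Duplicate detection on a normalized query (trimmed, lowercased).
    counts = Counter(
        str(record.get("query", "")).strip().lower() for record in records
    )
    duplicates = sorted(q for q, n in counts.items() if q and n > 1)

    # Coverage: which required categories have no record at all.
    covered = {record.get("category") for record in records}
    missing_categories = sorted(set(required_categories) - covered)

    return {
        "schema_errors": schema_errors,
        "duplicates": duplicates,
        "missing_categories": missing_categories,
    }
```

A dataset passes when all three lists in the returned report are empty.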

Data Processing | dkyazzentwatwa/chatgpt-sk...

dataset-comparer

Compare two datasets to find differences: added or removed rows and changed values. Use for data validation, ETL verification, or tracking changes.

1 scripts/Checked
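For the dataset-comparer entry above, a key-based comparison of this kind can be sketched in a few lines. This assumes each row is a dict and that one column uniquely identifies a row; `compare_datasets` and its `key` parameter are illustrative names, not the skill's bundled script.

```python
def compare_datasets(old_rows, new_rows, key):
    """Diff two lists of dict rows, matching rows on the `key` column."""
    old = {row[key]: row for row in old_rows}
    new = {row[key]: row for row in new_rows}

    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())

    changed = {}
    for row_key in sorted(old.keys() & new.keys()):
        columns = sorted(set(old[row_key]) | set(new[row_key]))
        # Map each differing column to its (old value, new value) pair.
        diffs = {
            col: (old[row_key].get(col), new[row_key].get(col))
            for col in columns
            if old[row_key].get(col) != new[row_key].get(col)
        }
        if diffs:
            changed[row_key] = diffs

    return {"added": added, "removed": removed, "changed": changed}
```

Calling `compare_datasets(before, after, key="id")` returns the keys of added and removed rows plus, for each row present in both, the columns whose values differ.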

AI & Machine Learning | lubu-labs/langchain-agent...

langgraph-testing-evaluation

Use this skill when you need to test or evaluate LangGraph/LangChain agents: writing unit or integration tests, generating test scaffolds, mocking LLM/tool behavior, running trajectory evaluation (match or LLM-as-judge), running LangSmith dataset evaluations, and comparing two agent versions with A/B-style offline analysis. Use it for Python and JavaScript/TypeScript workflows, evaluator design, experiment setup, regression gates, and debugging flaky/incorrect evaluation results.

11 scripts/Attention

AI & Machine Learning | sundial-org/skills

tinker-training-cost

Calculate training costs for Tinker fine-tuning jobs. Use when estimating costs for Tinker LLM training, counting tokens in datasets, or comparing Tinker model training prices. Tokenizes datasets using the correct model tokenizer and provides accurate cost estimates.
