Search Results: dataset

Found 328 Skills

Data Processing | microsoft/skills-for-fabr...

powerbi-authoring-cli

Create, manage, and deploy Power BI semantic models inside Microsoft Fabric workspaces via `az rest` CLI against Fabric and Power BI REST APIs. Use when the user wants to: (1) create a semantic model from TMDL definition files, (2) retrieve or download semantic model definitions, (3) update a semantic model definition with modified TMDL, (4) trigger or manage dataset refresh operations, (5) configure data sources, parameters, or permissions, (6) deploy semantic models between pipeline stages. Covers Fabric Items API (CRUD) and Power BI Datasets API (refresh, data sources, permissions). For read-only DAX queries, use `powerbi-consumption-cli`. For fine-grained modeling changes, route to `powerbi-modeling-mcp`. Triggers: "create semantic model", "upload TMDL", "download semantic model TMDL", "refresh dataset", "semantic model deployment pipeline", "dataset permissions", "list dataset users", "semantic model authoring".

Data Processing | gemini-cli-extensions/big...

bigquery-data

Use these skills when you need to handle large-scale data exploration and dataset management. Use when users need to find data assets or run SQL at scale. Provides metadata discovery and query execution across the data warehouse.

6 scripts/Attention

Data Processing | daemon-blockint-tech/agen...

data-scrubbing

Guides cleaning and standardizing tabular datasets before analysis, modeling, or reporting—profiling, quality rules, missing values, duplicates, outliers, type coercion, encoding fixes, record linkage, deduplication, high-level PII handling (not legal advice), actuarial/insurance field scrubbing, reproducible scrub pipelines, validation checks, and sign-off. Distinct from warehouse ETL or statistical modeling. Use when the user asks for "data scrubbing", "clean this dataset", "scrub the data", "data cleaning", "dedupe records", "handle missing values", "outlier treatment", "standardize columns", "data quality rules", "profile this table", or "prepare data for modeling". Not warehouse pipelines (data-warehouse-engineer), ML modeling (data-scientist, actuary), privacy programs (compliance-engineer), FinOps only (finops-analyst), or assumption governance (assumption-setting).

Data Processing | wentorai/research-plugins

scraping-skills

6 web scraping & data collection skills. Trigger: collecting web data, finding datasets, API access for research. Design: ethical scraping methods with rate limiting and data quality checks.

AI & Machine Learning | awslabs/agent-plugins

dataset-evaluation

Validates dataset formatting and quality for SageMaker model fine-tuning (SFT, DPO, or RLVR). Use when the user says "is my dataset okay", "evaluate my data", "check my training data", "I have my own data", or before starting any fine-tuning job. Detects file format, checks schema compliance against the selected model and technique, and reports whether the data is ready for training or evaluation.

1 script/Checked

Data Processing | voxel51/fiftyone-skills

fiftyone-embeddings-visualization

Visualizes datasets in 2D using embeddings with UMAP or t-SNE dimensionality reduction. Use when exploring dataset structure, finding clusters, identifying outliers, or understanding data distribution.

DevOps & Cloud Services | aliyun/alibabacloud-aiops...

alibabacloud-cms-dataset

Alibaba Cloud CMS Dataset lifecycle management and querying skill. Covers listing, inspecting, creating, updating, and deleting datasets, and executing dataset queries via the aliyun CLI (CMS API version 2024-03-30). Triggers: "CMS dataset", "数据集" (dataset), "创建数据集" (create dataset), "查询数据集" (query dataset), "dataset 查询" (dataset query), "ExecuteQuery", "CreateDataset", "GetDataset", "ListDatasets", "UpdateDataset", "DeleteDataset".

AI & Machine Learning | aradotso/mcp-skills

datagouv-mcp-server

Use the data.gouv.fr MCP server to search, explore, and analyze French Open Data datasets through AI chatbots.

AI & Machine Learning | nousresearch/hermes-agent

huggingface-hub

Hugging Face `hf` CLI: search, download, and upload models and datasets.

AI & Machine Learning | lllllllama/ai-paper-repro...

paper-context-resolver

Optional sub-skill for README-first AI repo reproduction. Use only when README and repository files leave a narrow reproduction-critical gap and the task is to resolve a specific paper detail such as dataset split, preprocessing, evaluation protocol, checkpoint mapping, or runtime assumption from primary paper sources while recording conflicts. Do not use for general paper summary, repo scanning, environment setup, command execution, title-only paper lookup, or replacing README guidance by default.

140.1k

AI & Machine Learning | lllllllama/ai-paper-repro...

env-and-assets-bootstrap

Sub-skill for environment and asset preparation in README-first AI repo reproduction. Use when the task is specifically to prepare a conservative conda-first environment, checkpoint and dataset path assumptions, cache location hints, and setup notes before any run on a README-documented repository. Do not use for repo scanning, full orchestration, paper interpretation, final run reporting, or generic environment setup that is not tied to a specific reproduction target.

139.7k

2 scripts/Attention

AI & Machine Learning | lllllllama/rigorpilot-ski...

ai-research-explore

Rigor Explore compatible skill slug for meaningful and potentially novel deep learning research candidates. Use when the researcher has chosen the task family, dataset, benchmark, evaluation method, provided SOTA references, and wants candidate-only exploration on top of `current_research` with auditable repo understanding, idea gating, fair comparison, and governed experiments written to `explore_outputs/`. Do not use for README-first trusted reproduction, open-ended direction finding, narrow code-only or run-only exploration, passive repo analysis, verified novelty claims, or implicit experimentation.

38.6k

26 scripts/Attention