Found 22 Skills
Compare a paper's claims against its public codebase. Use when the user asks to audit a paper, check code-claim consistency, verify reproducibility of a specific paper, or find mismatches between a paper and its implementation.
Prepare a research artifact package for conference artifact evaluation, reproducibility review, badges, supplementary material, or post-acceptance artifact release. Use this skill whenever the user needs install instructions, reviewer-facing reproduction commands, Docker or environment checks, data/checkpoint packaging, hardware/runtime estimates, anonymized or public artifact metadata, artifact evaluation forms, or a claim-to-artifact reproducibility audit for ML/AI venues.
Use when designing or auditing computer science experiments, evaluation plans, baselines, metrics, ablations, datasets, statistical tests, benchmarks, validity threats, or reproducibility claims.
Use when creating, repairing, refactoring, validating, or documenting an academic research repository structure, including wiki, sources, SOTA, outputs, agent docs, tests, and reproducibility folders.
Use when preparing academic artifacts, reproducibility packages, artifact evaluation submissions, open science materials, code/data release, model cards, dataset cards, or replication bundles.
Systematic peer review toolkit. Evaluate methodology, statistics, design, reproducibility, ethics, figure integrity, and reporting standards for manuscript and grant review across disciplines.
QA an analysis before sharing with stakeholders: methodology checks, accuracy verification, and bias detection. Use when reviewing an analysis for errors, checking for survivorship bias, validating aggregation logic, or preparing documentation for reproducibility.
Paper reviewer that evaluates machine learning research projects following official ICML reviewer guidelines. Provides comprehensive reviews with actionable feedback across all key dimensions: claims/evidence, relation to prior work, originality, significance, clarity, and reproducibility. Also provides formative feedback on incomplete drafts, proposals, and research code repositories. MANDATORY TRIGGERS: review paper, ICML review, paper review, evaluate paper, research paper feedback, ML paper review, conference review, academic review, paper critique, NeurIPS review, ICLR review, project proposal, research proposal, paper draft, early feedback, incomplete paper, work in progress, WIP review, review repo, review codebase, research project review
End-to-end data science and ML engineering workflows: problem framing, data/EDA, feature engineering (feature stores), modelling, evaluation/reporting, plus SQL transformations with SQLMesh. Use for dataset exploration, feature design, model selection, metrics and slice analysis, model cards/eval reports, experiment reproducibility, and production handoff (monitoring and retraining).
Make every number in the final PDF traceable to the exact code line that produced it. Uses \hypertarget/\hyperlink LaTeX commands and \num{formula} evaluated at compile time. Use for reproducibility and data integrity verification.
This skill should be used when working with LaminDB, an open-source data framework for biology that makes data queryable, traceable, reproducible, and FAIR. Use when managing biological datasets (scRNA-seq, spatial, flow cytometry, etc.), tracking computational workflows, curating and validating data with biological ontologies, building data lakehouses, or ensuring data lineage and reproducibility in biological research. Covers data management, annotation, ontologies (genes, cell types, diseases, tissues), schema validation, integrations with workflow managers (Nextflow, Snakemake) and MLOps platforms (W&B, MLflow), and deployment strategies.
Audit a CS or AI research project for reproducibility across environment, data, code, configuration, logging, and documentation. Use this skill whenever the user wants to make experiments reproducible, prepare code for collaborators, debug environment drift, write a README, package a project for paper release, or ensure they can rerun results months later.