Search Results: llm

Found 1,564 Skills

narev-update-llm-pricing

Update LLM prices in the repo. Use this skill to snapshot live LLM pricing into a checked-in file so billing or cost math can run offline with deterministic rates. Use for any language or stack (TypeScript, Python, Go, JSON registries, etc.), not only TypeScript. Use when the user wants pinned prices, wants to remove a runtime dependency on the Narev API, wants to refresh a committed pricing file, or mentions "snapshot pricing", "freeze prices", "pin model rates", "regenerate pricing file", "update pricing in the repo", or "sync token pricing from Narev".

AI & Machine Learning · datadog-labs/agent-skills

llm-obs-eval-bootstrap

Bootstrap evaluators from production traces — emit SDK code, a framework-agnostic JSON spec, or publish online LLM-judge evaluators directly to Datadog. Use when user says "bootstrap evaluators", "generate evaluators", "create evals from traces", "eval bootstrap", "write evaluators", "build eval suite", "publish evaluators", or wants to generate BaseEvaluator/LLMJudge code or online judge configs from production LLM trace data. Works with ml_app and optional RCA report or failure hypothesis.

AI & Machine Learning · llmquant/skills

llmquant-equities

Router skill for LLMQuant equities workflows. Use when the user needs stock analysis, equity comparison, research memos, merger-arb memos, or sell/take-profit work.

AI & Machine Learning · llmquant/skills

llmquant-strategies

Router skill for LLMQuant hedge-fund and PM strategy workflows. Use when the user needs equity long/short, long-biased, event-driven, macro, quant, or multi-strategy playbooks.

AI & Machine Learning · llmquant/skills

llmquant-portfolio

Router skill for LLMQuant portfolio workflows. Use when the user needs company profiles, thesis tracking, theme research, watchlist monitoring, or alert management.

Code Quality · existential-birds/beagle

llm-judge

LLM-as-judge methodology for comparing code implementations across repositories. Scores implementations on functionality, security, test quality, overengineering, and dead code using weighted rubrics. Used by /beagle:llm-judge command.

AI & Machine Learning · itsmostafa/llm-engineerin...

context-engineering

Strategies for managing LLM context windows effectively in AI agents. Use when building agents that handle long conversations, multi-step tasks, tool orchestration, or need to maintain coherence across extended interactions.

AI & Machine Learning · itsmostafa/llm-engineerin...

agents

Patterns and architectures for building AI agents and workflows with LLMs. Use when designing systems that involve tool use, multi-step reasoning, autonomous decision-making, or orchestration of LLM-driven tasks.

AI & Machine Learning · eyadsibai/ltk

llm-inference

Use when the user mentions "LLM inference", "serving LLM", "vLLM", "llama.cpp", "GGUF", "text generation", "model serving", "inference optimization", "KV cache", "continuous batching", "speculative decoding", "local LLM", or "CPU inference".

AI & Machine Learning · shimo4228/claude-code-lea...

cost-aware-llm-pipeline

Use when building an LLM-powered app that needs cost control via model routing, budget tracking, retry, and prompt caching.

AI & Machine Learning · ancoleman/ai-design-compo...

evaluating-llms

Evaluate LLM systems using automated metrics, LLM-as-judge, and benchmarks. Use when testing prompt quality, validating RAG pipelines, measuring safety (hallucinations, bias), or comparing models for production deployment.

AI & Machine Learning · vuralserhat86/antigravity...

llm_evaluation

Implement comprehensive evaluation strategies for LLM applications using automated metrics, human feedback, and benchmarking. Use when testing LLM performance, measuring AI application quality, or establishing evaluation frameworks.
