Found 777 Skills
Build evaluation frameworks for agent systems. Use when testing agent performance, validating context engineering choices, or measuring improvements over time.
Verify claims in generated output against sources. Use as a separate pass AFTER content generation to catch hallucinations. Critical constraint: verification cannot be reliably combined with generation in a single pass.
Working memory management, context prioritization, and knowledge retention patterns for AI agents. Use when you need to maintain relevant context and avoid information loss during long tasks.
AI trustworthiness testing using OWASP AI Testing Guide v1. Execute 44 test cases across 4 layers (Application, Model, Infrastructure, Data) with practical payloads and remediation.
Tools and frameworks for AI red teaming, including PyRIT, garak, Counterfit, and custom attack automation.
Ralph Wiggum persistence loop with intelligent multi-model routing (Gemini, Codex, Claude, Council).
Build AI-powered Ruby applications with RubyLLM. Covers the full lifecycle: chat, tools, streaming, Rails integration, embeddings, and production deployment. Supports all providers (OpenAI, Anthropic, Gemini, etc.) with one unified API.
State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX. Provides thousands of pretrained models to perform tasks on different modalities such as text, vision, and audio. The industry standard for Large Language Models (LLMs) and foundation models in science.
Create or update Langfuse model pricing. Use when setting up new models, updating pricing, or configuring model costs.
List Langfuse sessions. Use when checking user sessions, analyzing conversation flows, or monitoring session activity.
Refine prompts for Claude models (Opus, Sonnet, Haiku) using Anthropic's best practices. Use when preparing complex tasks for Claude.
Use when the user wants to find a note to publish as a blog post. Triggers on 「选一篇笔记发博客」 (pick a note to publish as a blog post), 「note to blog」, 「写博客」 (write a blog post), and 「博客选题」 (blog topic selection). Scans Obsidian notes via a Python script, evaluates blog-readiness, and supports batch selection with a fast/deep dual track and parallel Agent dispatch.