Loading...
Loading...
Found 7 Skills
Caching strategies for LLM prompts including Anthropic prompt caching, response caching, and CAG (Cache Augmented Generation) Use when: prompt caching, cache prompt, response cache, cag, cache augmented.
Prompt caching for Claude API to reduce latency by up to 85% and costs by up to 90%. Activate for cache_control, ephemeral caching, cache breakpoints, and performance optimization.
Cost optimization patterns for LLM API usage — model routing by task complexity, budget tracking, retry logic, and prompt caching.
Build with Claude Messages API using structured outputs for guaranteed JSON schema validation. Covers prompt caching (90% savings), streaming SSE, tool use, and model deprecations. Prevents 16 documented errors. Use when: building chatbots/agents, troubleshooting rate_limit_error, prompt caching issues, streaming SSE parsing errors, MCP timeout issues, or structured output hallucinations.
AI session compression techniques for managing multi-turn conversations efficiently through summarization, embedding-based retrieval, and intelligent context management.
This skill provides comprehensive knowledge for working with the Anthropic Messages API (Claude API). It should be used when integrating Claude models into applications, implementing streaming responses, enabling prompt caching for cost savings, adding tool use (function calling), processing images with vision capabilities, or using extended thinking mode. Use when building chatbots, AI assistants, content generation tools, or any application requiring Claude's language understanding. Covers both server-side implementations (Node.js, Cloudflare Workers, Next.js) and direct API access. Keywords: claude api, anthropic api, messages api, @anthropic-ai/sdk, claude streaming, prompt caching, tool use, vision, extended thinking, claude 3.5 sonnet, claude 3.7 sonnet, claude sonnet 4, function calling, SSE, rate limits, 429 errors
Use when building an LLM-powered app that needs cost control via model routing, budget tracking, retry, and prompt caching.