Found 125 Skills
Benchmark vLLM or OpenAI-compatible serving endpoints using vllm bench serve. Supports multiple datasets (random, sharegpt, sonnet, HF), backends (openai, openai-chat, vllm-pooling, embeddings), throughput/latency testing with request-rate control, and result saving. Use when benchmarking LLM serving performance, measuring TTFT/TPOT, or load testing inference APIs.
Build search applications and query log analytics data with OpenSearch. Use this skill when the user mentions OpenSearch, search app, index setup, search architecture, semantic search, vector search, hybrid search, BM25, dense vector, sparse vector, agentic search, RAG, embeddings, KNN, PDF ingestion, document processing, or any related search topic. Also use for log analytics and observability — when the user wants to set up log ingestion, query logs with PPL, analyze error patterns, set up index lifecycle policies, investigate traces, or check stack health. Activate even if the user says log analysis, Fluent Bit, Fluentd, Logstash, syslog, traceId, OpenTelemetry, or log analytics without mentioning OpenSearch.
Design data architecture at enterprise and solution levels. Cover data mesh, lakehouse, governance, domain-driven design, conceptual/logical/physical data modeling, platform selection, and compliance frameworks. Produce ADRs, data model diagrams, platform comparison matrices, and governance policy templates. Triggers on "design data platform", "choose data warehouse", "data mesh", "lakehouse architecture", "data governance", "data modeling", "platform selection", "data architecture decision", "compliance framework", or "data strategy". For applied AI solution architecture (RAG data plane, embeddings, vector stores in commercial or enterprise products), use applied-ai-architect-commercial-enterprise. For dbt analytics layers and mart delivery, use analytics-data-engineer—not data-architect.
Sets up vector databases for semantic search including Pinecone, Chroma, pgvector, and Qdrant with embedding generation and similarity search. Use when users request "vector database", "semantic search", "embeddings storage", "Pinecone setup", or "similarity search".
Access Telnyx LLM inference APIs, embeddings, and AI analytics for call insights and summaries. This skill provides REST API (curl) examples.
Command-line interface for Ollama - Local LLM inference and model management via Ollama REST API. Designed for AI agents and power users who need to manage models, generate text, chat, and create embeddings without a GUI.
Deploy, operate, and integrate the VSS 3.2 GA RT-Embed Video Embedding microservice. Covers Docker Compose bring-up, GPU and storage prerequisites, the `/v1` REST API (file uploads, text and video embeddings, live RTSP streams, health and metrics), Redis/Kafka/OTel integration, common failure modes, and teardown.
Use when text embeddings are needed from Alibaba Cloud Model Studio models for semantic search, retrieval-augmented generation, clustering, or offline vectorization pipelines.
Redis vector search guidance covering HNSW vs FLAT algorithm choice, vector index configuration (dims, distance metric, datatype), filtered hybrid search combining vector similarity with TAG or NUMERIC filters, and the RAG retrieval pattern with RedisVL. Use when defining a VECTOR field in FT.CREATE, integrating embeddings (OpenAI, Cohere, sentence-transformers), tuning HNSW parameters (M, EF_CONSTRUCTION, EF_RUNTIME), building a retrieval-augmented generation pipeline, or filtering vector results by attribute.
Configure GOB local file storage for GrepAI. Use this skill for simple, single-machine setups.
Skills for Upstash Vector features, TypeScript/JavaScript SDK usage, and integrations. Use when users ask how to work with Vector, its TS SDK, features, or supported frameworks.
Store a learning, pattern, or decision in the memory system for future recall.