Loading...
Loading...
Found 113 Skills
Store and query vector embeddings using Amazon S3 Vectors, a cost-effective long-term vector storage service with its own API namespace (s3vectors). Triggers on: create S3 vector bucket, vector index, store embeddings, semantic search, RAG vector storage, similarity search, vector database, migrate from other vector databases. Do NOT use for: querying tabular data (use querying-data-lake), S3 object storage, or hundreds/thousands of sustained QPS (use OpenSearch).
Build RAG systems and semantic search with Gemini embeddings (gemini-embedding-001). 768-3072 dimension vectors, 8 task types, Cloudflare Vectorize integration. Prevents 13 documented errors. Use when: vector search, RAG systems, semantic search, document clustering. Troubleshoot: dimension mismatch, normalization required, batch ordering bug, memory limits, wrong task type, rate limits (100 RPM).
Build document Q&A with Gemini File Search - fully managed RAG with automatic chunking, embeddings, and citations. Upload 100+ file formats, query with natural language. Use when: document Q&A, searchable knowledge bases, semantic search. Troubleshoot: document immutability, storage quota (3x), chunking config, metadata limits (20 max), polling timeouts, displayName dropped (Blob uploads), grounding lost (JSON mode), tool conflicts (googleSearch + fileSearch).
Optimizing vector embeddings for RAG systems through model selection, chunking strategies, caching, and performance tuning. Use when building semantic search, RAG pipelines, or document retrieval systems that require cost-effective, high-quality embeddings.
Analyze AI/ML technical content (papers, articles, blog posts) and extract actionable insights filtered through enterprise AI engineering lens. Use when user provides URL/document for AI/ML content analysis, asks to "review this paper", or mentions technical content in domains like RAG, embeddings, fine-tuning, prompt engineering, LLM deployment.
RAG-specific best practices for LlamaIndex, ChromaDB, and Celery workers. Covers ingestion, retrieval, embeddings, and performance.
Expert in building Retrieval-Augmented Generation systems. Masters embedding models, vector databases, chunking strategies, and retrieval optimization for LLM applications. Use when: building RAG, vector search, embeddings, semantic search, document retrieval.
Vector embeddings configuration and semantic search
Semantic and multi-modal search across documents using LanceDB vector embeddings. Use when searching knowledge bases, finding information semantically, ingesting documents for RAG, or performing vector similarity search. Triggers on "search documents", "semantic search", "find in knowledge base", "vector search", "index documents", "LanceDB", or RAG/embedding operations.
Build search applications and query log analytics data with OpenSearch. Use this skill when the user mentions OpenSearch, search app, index setup, search architecture, semantic search, vector search, hybrid search, BM25, dense vector, sparse vector, agentic search, RAG, embeddings, KNN, PDF ingestion, document processing, or any related search topic. Also use for log analytics and observability — when the user wants to set up log ingestion, query logs with PPL, analyze error patterns, set up index lifecycle policies, investigate traces, or check stack health. Activate even if the user says log analysis, Fluent Bit, Fluentd, Logstash, syslog, traceId, OpenTelemetry, or log analytics without mentioning OpenSearch.
Vector search best practices for Azure DocumentDB using `cosmosSearch` — choosing between DiskANN / HNSW / IVF, creating indexes, tuning `lBuild` / `lSearch` / `maxDegree`, Product Quantization (up to 16,000 dims), half-precision (fp16) indexing, and normalizing embeddings for cosine similarity. Use when building RAG / semantic-search applications, creating a vector index, tuning recall/latency, or reducing vector-index memory footprint.
This skill should be used when working with genomic interval data (BED files) for machine learning tasks. Use for training region embeddings (Region2Vec, BEDspace), single-cell ATAC-seq analysis (scEmbed), building consensus peaks (universes), or any ML-based analysis of genomic regions. Applies to BED file collections, scATAC-seq data, chromatin accessibility datasets, and region-based genomic feature learning.