Found 149 Skills
Detect anomalies in data using statistical and ML methods. Z-score, IQR, Isolation Forest, and time-series anomalies.
Comprehensive skill for Microsoft GraphRAG, a modular graph-based RAG system for reasoning over private datasets.
Parse, search, analyze, and ingest LinkedIn GDPR data exports. This skill should be used when working with LinkedIn data — searching messages, analyzing connections, exporting to Markdown, or ingesting into RLAMA for semantic search. Requires a LinkedIn GDPR data export ZIP file.
Guide for using Nushell for structured data pipelines and scripting. Use when writing shell scripts, processing structured data, or working with cross-platform automation.
Patterns for building robust, reproducible genomics analysis pipelines. Covers workflow managers, NGS data processing, variant calling, RNA-seq, and common bioinformatics pitfalls.
This skill should be used when building data processing pipelines with CocoIndex v1, a Python library for incremental data transformation. Use when the task involves processing files/data into databases, creating vector embeddings, building knowledge graphs, ETL workflows, or any data pipeline requiring automatic change detection and incremental updates. CocoIndex v1 is Python-native (supports any Python types), has no DSL, and is currently in pre-release (version 1.0.0a1 or later).
Comprehensive guide to Spark Structured Streaming for production workloads. Use when building streaming pipelines, implementing real-time data processing, handling stateful operations, or optimizing streaming performance.
Split Excel workbooks into separate Excel files by worksheet, with each worksheet generating an individual file. Application scenarios: (1) Split multi-worksheet Excel files into separate files, (2) Extract specific worksheets as independent files, (3) Distribute worksheets from merged workbooks, (4) Create copies of worksheets for separate processing or distribution.
Explains core Apache Beam programming model concepts including PCollections, PTransforms, Pipelines, and Runners. Use when learning Beam fundamentals or explaining pipeline concepts.
Enrich a CSV with any data field using a waterfall pattern: try multiple providers in sequence, stop at the first successful match. Prevents paying for duplicate lookups and maximizes fill rates. Triggers: "enrich my lead list", "add [field] to my CSV", "waterfall enrichment", "try multiple providers to find [data]". Requires: Deepline CLI (https://code.deepline.com).
Create a custom technical indicator using Numba JIT + NumPy. Generates production-grade, O(n) optimized indicator functions with charting and benchmarking.
Use this skill for ANY question about creating test or evaluation datasets for LangChain agents. Covers generating datasets from traces (final_response, single_step, trajectory, RAG types), uploading to LangSmith, and managing evaluation data.