Found 91 Skills
Build end-to-end ETL pipelines and analytics dashboards using the Harvard Art Museums API with Python, SQL, and Streamlit
End-to-end data engineering pipeline with Harvard Art Museums API, ETL processing, SQL analytics, and Streamlit visualization
Expert data engineer for ETL/ELT pipelines, streaming, data warehousing. Activate on: data pipeline, ETL, ELT, data warehouse, Spark, Kafka, Airflow, dbt, data modeling, star schema, streaming data, batch processing, data quality. NOT for: API design (use api-architect), ML training (use ML skills), dashboards (use design skills).
Data engineering patterns for ETL pipelines, data warehousing, Apache Spark, and data quality validation
Converts legacy SQL to modular dbt models. Use when migrating SQL to dbt for: (1) converting stored procedures, views, or raw SQL files to dbt models; (2) tasks that mention "migrate", "convert", "legacy SQL", "transform to dbt", or "modernize"; (3) breaking monolithic queries into modular layers (discovers project conventions first); (4) porting existing data pipelines or ETL to dbt patterns. Checks for existing models/sources, then builds and validates layer by layer.
Customer.io platform help — customer engagement & marketing automation for behavior-based multi-channel messaging. Journeys (visual workflow builder with branching, delays, wait-untils), Campaigns (segment/event/date-triggered), Transactional Messages (API-triggered email, push, SMS), Segmentation (data-driven auto-updating and manual/static), Multi-channel (email, SMS via Twilio, push iOS/Android/web, in-app, WhatsApp), Data Pipelines (primary ingestion API, reverse ETL), Custom Objects, Ad Audience Sync (Google, Facebook, Instagram, YouTube), Design Studio (drag-and-drop email editor), A/B & cohort testing, Broadcasts (one-time/scheduled/API-triggered), Webhooks in workflows, and Analytics with AI-powered insights. Use when asking 'how do I do X in Customer.io', building behavior-triggered automation, setting up transactional messaging via Customer.io, configuring segments or journeys, integrating Customer.io Data Pipelines, or working with the Track/App/Transactional APIs. Do NOT use for general email marketing strategy (use /sales-email-marketing), cross-platform email deliverability (use /sales-deliverability), or email open/click tracking strategy (use /sales-email-tracking).
Create business process and integration diagrams using PlantUML syntax with BPMN, EIP, and Lean Mapping stencil icons. Best for workflow automation, approval processes, message-based integration patterns, ETL pipelines, and value stream mapping. NOT for simple flowcharts (use mermaid) or UML activity diagrams (use uml skill).
Builds and deploys data processing and ML training pipelines using TrueFoundry Workflows (built on Flyte). Use when creating DAGs, orchestrating multi-step tasks, scheduling ETL pipelines, or running ML training workflows.
Guides technology selection and implementation of AI and ML features in .NET 8+ applications using ML.NET, Microsoft.Extensions.AI (MEAI), Microsoft Agent Framework (MAF), GitHub Copilot SDK, ONNX Runtime, and OllamaSharp. Covers the full spectrum from classic ML through modern LLM orchestration to local inference. Use when adding classification, regression, clustering, anomaly detection, recommendation, LLM integration (text generation, summarization, reasoning), RAG pipelines with vector search, agentic workflows with tool calling, Copilot extensions, or custom model inference via ONNX Runtime to a .NET project. DO NOT USE FOR projects targeting .NET Framework (requires .NET 8+), tasks that are pure data engineering or ETL with no ML/AI component, or projects that need a custom deep learning training loop (use Python with PyTorch/TensorFlow, then export to ONNX for .NET inference).
Scalable data processing for ML workloads. Streaming execution across CPU/GPU, supports Parquet/CSV/JSON/images. Integrates with Ray Train, PyTorch, TensorFlow. Scales from single machine to 100s of nodes. Use for batch inference, data preprocessing, multi-modal data loading, or distributed ETL pipelines.
Guides hands-on actuarial analyst work for insurance, reinsurance, and pension—reserving and loss development (IBNR, triangles, chain-ladder diagnostics), pricing and rate indication support (experience, trend, credibility, basic GLM at spec level), data validation and model I/O review, reporting packs and workpapers, assumption application under actuary direction, and statutory tie-outs at analyst depth. Use when the user mentions actuarial analyst, loss development, IBNR, reserve analysis, rate indication, pricing support, actuarial workpaper, triangle analysis, credibility, experience study, actuarial reporting, or reserve roll-forward—not actuary sign-off (actuary), consulting engagements (actuarial-consulting), assumption governance (assumption-setting), ALM strategy (asset-liability-management), P&C legal depth (property-casualty-insurance), charts only (data-visualization), or ETL-only pipelines (data-scrubbing).
Workload-aware architecture design for Apache Doris. MUST USE when designing data architectures, choosing between data models, planning ingestion strategies, sizing clusters, or translating business requirements into Apache Doris system designs. Complements doris-best-practices with decision frameworks and sizing-first workflow. Use when user describes a workload involving: IoT, sensor data, telemetry, real-time analytics, dashboard, log analysis, log search, CDC sync, time-series, device monitoring, point query service, ad-hoc analytics, lakehouse federation, ETL/ELT pipeline, report analytics, clickstream, user behavior, observability, metrics, fleet tracking, or any OLAP workload requiring table design from scratch. Also triggers on prompts like: "design a table for...", "how should I store...", "build an architecture for...", "we have X devices sending data every Y seconds", "recommend a cluster size for...", "what data model should I use for...", "we need to ingest X GB/day", "migrate from MySQL/PostgreSQL to Apache Doris". Also use for legacy analytics/search/serving stack consolidation prompts even when Apache Doris is not named explicitly, including replacing or migrating from Impala, Kudu, Elasticsearch/ES, Greenplum, Presto, HBase, Hive, Hadoop, Redis, or Lambda-style multi-engine data platforms.