Total 50,510 skills, Data Processing has 2560 skills
Showing 12 of 2560 skills
Track crypto holdings across exchanges. Calculate P&L, asset allocation, and generate performance reports.
Obtain securities and financial information such as A-share market quotes, historical K-lines, financial data, macroeconomic data; Use this when users mention Tushare or need to obtain A-share data
Complete FFmpeg + OpenCV + Python integration guide for video processing pipelines. PROACTIVELY activate for: (1) FFmpeg to OpenCV frame handoff, (2) cv2.VideoCapture vs ffmpeg subprocess, (3) BGR/RGB color format conversion gotchas, (4) Frame dimension order img[y,x] vs img[x,y], (5) ffmpegcv GPU-accelerated video I/O, (6) VidGear multi-threaded streaming, (7) Decord batch video loading for ML, (8) PyAV frame-level processing, (9) Audio stream preservation with video filters, (10) Memory-efficient frame generators, (11) OpenCV + FFmpeg + Modal parallel processing, (12) Pipe frames between FFmpeg and OpenCV. Provides: Color format conversion patterns, coordinate system gotchas, library selection guide, memory management, subprocess pipe patterns, GPU-accelerated alternatives to cv2.VideoCapture. Ensures: Correct integration between FFmpeg and OpenCV without color/coordinate bugs. See also: ffmpeg-python-integration-reference for type-safe parameter mappings.
6 web scraping & data collection skills. Trigger: collecting web data, finding datasets, API access for research. Design: ethical scraping methods with rate limiting and data quality checks.
Used for extracting selected metadata from one DICOM file and flagging standard-tag PHI presence. Not for anonymization or clinical use.
Workload-aware architecture design for Apache Doris. MUST USE when designing data architectures, choosing between data models, planning ingestion strategies, sizing clusters, or translating business requirements into Apache Doris system designs. Complements doris-best-practices with decision frameworks and sizing-first workflow. Use when user describes a workload involving: IoT, sensor data, telemetry, real-time analytics, dashboard, log analysis, log search, CDC sync, time-series, device monitoring, point query service, ad-hoc analytics, lakehouse federation, ETL/ELT pipeline, report analytics, clickstream, user behavior, observability, metrics, fleet tracking, or any OLAP workload requiring table design from scratch. Also triggers on prompts like: "design a table for...", "how should I store...", "build an architecture for...", "we have X devices sending data every Y seconds", "recommend a cluster size for...", "what data model should I use for...", "we need to ingest X GB/day", "migrate from MySQL/PostgreSQL to Apache Doris". Also use for legacy analytics/search/serving stack consolidation prompts even when Apache Doris is not named explicitly, including replacing or migrating from Impala, Kudu, Elasticsearch/ES, Greenplum, Presto, HBase, Hive, Hadoop, Redis, or Lambda-style multi-engine data platforms.
Use this skill when working with Brain Imaging Data Structure (BIDS) datasets: organizing neuroscience and biomedical data (MRI, EEG, MEG, iEEG, PET, microscopy, NIRS, motion capture, EMG, MR spectroscopy, behavioral), querying BIDS layouts, validating compliance, converting DICOM to BIDS, writing metadata sidecars, or creating BIDS derivatives.
Query the Precision Medicine Knowledge Graph (PrimeKG) for multiscale biological data including genes, drugs, diseases, phenotypes, and more.
Bayesian modeling with PyMC. Build hierarchical models, MCMC (NUTS), variational inference, LOO/WAIC comparison, posterior checks, for probabilistic programming and inference.
Build and analyze phylogenetic trees using MAFFT (multiple alignment), IQ-TREE 2 (maximum likelihood), and FastTree (fast NJ/ML). Visualize with ETE3 or FigTree. For evolutionary analysis, microbial genomics, viral phylodynamics, protein family analysis, and molecular clock studies.
Use when defining field mappings for data streams, populating ecs.yml with ECS field references, selecting ECS categorization values, choosing custom field types, or troubleshooting mapping validation failures.
Run OpenMMDL molecular dynamics workflows via the FastFold Workflows API (`openmmdl_v1`) from local topology + optional ligand files, prepare draft scripts, execute drafts, wait for completion, fetch artifacts/metrics, and extract trajectory frames. Use when users ask for OpenMMDL, protein-ligand MD, OpenMMDL script preparation, or `/openmmdl/results/<workflow_id>` reruns.