Total 30,738 skills, Data Processing has 1471 skills
Showing 12 of 1471 skills
Search and scrape public web content with headless Chrome and DuckDuckGo using safe practices.
Salesforce Data Cloud Prepare phase. TRIGGER when: user creates or manages Data Cloud data streams, DLOs, transforms, or Document AI configurations, or asks about ingestion into Data Cloud. DO NOT TRIGGER when: the task is connection setup only (use sf-datacloud-connect), DMOs and identity resolution (use sf-datacloud-harmonize), or query/search work (use sf-datacloud-retrieve).
This skill should be used when the user asks for 'TRX price', 'TRON token price', 'price chart on TRON', 'K-line data for USDT/TRX', 'TRON trade history', 'TRON whale activity', 'large transfers on TRON', 'smart money on TRON', 'TRON DEX volume', or mentions checking real-time prices, candlestick data, trading volume, whale monitoring, or smart money signals on the TRON network. For token search and metadata, use tron-token. For swap execution, use tron-swap.
Twitter/X data via the 6551 API. Supports user profiles, tweet search, user tweets, follower events, deleted tweets, and KOL followers.
Build and deploy Streamlit apps natively in Snowflake. Covers snowflake.yml scaffolding, Snowpark sessions, multi-page structure, Marketplace publishing as Native Apps, and caller's rights connections (v1.53.0+). Use when building data apps on Snowflake, deploying SiS, fixing package channel errors, authentication issues, cache key bugs, or path resolution errors.
Evaluate research rigor. Assess methodology, experimental design, statistical validity, biases, confounding, evidence quality (GRADE, Cochrane ROB), for critical analysis of scientific claims.
Statistical visualization. Scatter, box, violin, heatmaps, pair plots, regression, correlation matrices, KDE, faceted plots, for exploratory analysis and publication figures.
SQLite file format, B-trees, pages, cells, overflow, freelist that is used in tursodb
GPU-accelerated data curation for LLM training. Supports text/image/video/audio. Features fuzzy deduplication (16× faster), quality filtering (30+ heuristics), semantic deduplication, PII redaction, NSFW detection. Scales across GPUs with RAPIDS. Use for preparing high-quality training datasets, cleaning web data, or deduplicating large corpora.
Fast DataFrame library (Apache Arrow). Select, filter, group_by, joins, lazy evaluation, CSV/Parquet I/O, expression API, for high-performance data analysis workflows.
Direct REST API access to PubMed. Advanced Boolean/MeSH queries, E-utilities API, batch processing, citation management. For Python workflows, prefer biopython (Bio.Entrez). Use this for direct HTTP/REST work or custom API implementations.
Statistical modeling toolkit. OLS, GLM, logistic, ARIMA, time series, hypothesis tests, diagnostics, AIC/BIC, for rigorous statistical inference and econometric analysis.