50,402 skills in total; Data Processing has 2,557 skills
Showing 12 of 2,557 skills
Generate styled word clouds from text with custom shapes, colors, fonts, and stopword filtering. Supports PNG/SVG export and frequency dictionaries.
Comprehensive data quality patterns using Great Expectations, DLT expectations, and custom validators for ensuring data reliability and trust.
Process Excel files: read, analyze, compute statistics on, and export .xlsx data.
Read, write, and manipulate Excel files (.xlsx, .xls). Create spreadsheets, read data, and export to various formats.
Build apps on the Databricks Apps platform. Use when asked to create dashboards, data apps, analytics tools, or visualizations. Invoke BEFORE starting implementation.
Diagnose ClickHouse issues by analyzing system.part_log (part creation, merges, mutations, downloads, removals, moves). Use for too many parts / micro-batch inserts, merge backlog or slow merges, mutation storms (ALTER DELETE/UPDATE), unusual replication DownloadPart churn, unexpected RemovePart spikes, or ZooKeeper/Keeper znode growth correlated with part activity.
Extract tables from PDFs and images to CSV or Excel. Supports scanned documents with OCR, multi-page PDFs, and complex table structures.
Manages MongoDB Atlas Stream Processing (ASP) workflows. Handles workspace provisioning, data source/sink connections, processor lifecycle operations, debugging diagnostics, and tier sizing. Supports Kafka, Atlas clusters, S3, HTTPS, and Lambda integrations for streaming data workloads and event processing. NOT for general MongoDB queries or Atlas cluster management. Requires MongoDB MCP Server with Atlas API credentials.
Access PUDL table data plus table/column/source metadata in Jupyter or Marimo notebooks for debugging and visualization. Use when users ask what a table contains, how to read it, or how columns are defined.
Quantitative portfolio construction and optimization: Markowitz (scipy.optimize + Monte Carlo), Black-Litterman (CAPM prior, absolute/relative views, Bayesian posterior), HRP/HERC/NCO (hierarchical clustering, risk parity, NCO with constraints). All flat numpy + scipy, without Riskfolio-Lib or PyPortfolioOpt.
Alpaca Market Data API: stocks, crypto, options. Historical and real-time data for 5000+ stocks.
Reconcile Venmo business transactions and separate personal from business.