Total 50,505 skills; Data Processing has 2,560 skills
MANDATORY when working with geographic data, spatial queries, geometry operations, or location-based features - enforces PostGIS 3.6.1 best practices including ST_CoverageClean, SFCGAL 3D functions, and bigint topology
Tinybird TypeScript SDK for defining datasources, pipes, and queries with full type inference. Use when working with @tinybirdco/sdk, TypeScript Tinybird projects, or type-safe data ingestion and queries.
Configure Databricks across development, staging, and production environments. Use when setting up multi-environment deployments, configuring per-environment secrets, or implementing environment-specific Databricks configurations. Trigger with phrases like "databricks environments", "databricks staging", "databricks dev prod", "databricks environment setup", "databricks config by env".
Ingest and transform large data files (CSV/JSON) into Elasticsearch indices. Stream-based processing for files up to 30GB, cross-version migration (ES 8.x ↔ 9.x), custom JavaScript transformations, and reindexing with transforms. Use when you need to load data into Elasticsearch, migrate indices, or transform data during ingestion.
Use when "NetworkX", "graph analysis", "network analysis", "graph algorithms", "shortest path", "centrality", "PageRank", "community detection", "social network", "knowledge graph"
Generate charts and visualizations from data using various charting libraries and formats.
Google Optimization Tools. An open-source software suite for optimization, specialized in vehicle routing, flows, integer and linear programming, and constraint programming. Features the world-class CP-SAT solver. Use for vehicle routing problems (VRP), scheduling, bin packing, knapsack problems, linear programming (LP), integer programming (MIP), network flows, constraint programming, combinatorial optimization, resource allocation, shift scheduling, job-shop scheduling, and discrete optimization problems.
Use when "data visualization", "plotting", "charts", "matplotlib", "plotly", "seaborn", "graphs", "figures", "heatmap", "scatter plot", "bar chart", "interactive plots"
The practice of collecting, analyzing, and acting on data to drive product decisions. Great analytics isn't about dashboards; it's about insights that lead to action. Every metric should answer a question that changes behavior. This skill covers event tracking, metrics design, dashboards, user behavior analysis, and data-driven decision making. The best analytics teams measure what matters, not what's easy to measure. Use when "analytics, metrics, tracking, dashboard, funnel, cohort, retention, events, KPI, measure, data, insights, conversion, engagement" is mentioned.
Native Arrow filesystem integration with PyArrow. Optimized for Parquet workflows, zero-copy data transfer, predicate pushdown, and column pruning. Covers S3, GCS, and HDFS with PyArrow datasets.
Delta Lake integration with cloud storage (S3, GCS, Azure). Covers storage_options, PyArrow filesystem, time travel, and partitioned writes.
Elasticsearch cluster management