Found 20 Skills
Fast in-memory DataFrame library for datasets that fit in RAM. Use when pandas is too slow but data still fits in memory. Lazy evaluation, parallel execution, Apache Arrow backend. Best for 1-100GB datasets, ETL pipelines, faster pandas replacement. For larger-than-RAM data use dask or vaex.
Fast DataFrame library (Apache Arrow). Select, filter, group_by, joins, lazy evaluation, CSV/Parquet I/O, expression API, for high-performance data analysis workflows.
Use when the user mentions "Polars", "fast dataframe", "lazy evaluation", or "Arrow backend", or asks about "pandas alternative", "parallel dataframe", "large CSV processing", "ETL pipeline", or "expression API".
Polars fast DataFrame library. Use for fast data processing.
Search 2.4M+ full-text PubMed Central Open Access papers for literature reviews, trends, and data extraction.
Patterns for efficient ML data pipelines using Polars, Arrow, and ClickHouse. TRIGGERS - data pipeline, polars vs pandas, arrow format, clickhouse ml, efficient loading, zero-copy, memory optimization.
Transform raw data into analytical assets using ETL/ELT patterns, SQL (dbt), Python (pandas/polars/PySpark), and orchestration (Airflow). Use when building data pipelines, implementing incremental models, migrating from pandas to polars, or orchestrating multi-step transformations with testing and quality checks.
Fast in-process analytical database for SQL queries on DataFrames, CSV, Parquet, JSON files, and more. Use when user wants to perform SQL analytics on data files or Python DataFrames (pandas, Polars), run complex aggregations, joins, or window functions, or query external data sources without loading into memory. Best for analytical workloads, OLAP queries, and data exploration.
Extract and analyze Agentforce session tracing data from Salesforce Data 360. Supports high-volume extraction (1-10M records/day), Polars-based analysis, and debugging workflows for agent sessions.
Use this skill whenever the user wants to work with survey data using the `survy` Python library. Triggers include: loading or reading survey CSV/Excel/JSON/SPSS files, handling multiselect (multi-choice) questions, computing frequency tables or crosstabs, exporting survey data to SPSS (.sav) or other formats, updating variable labels or value indices, transforming survey data between wide/compact formats, filtering respondents, replacing values, adding/dropping/sorting variables, or any task involving survy's API (read_csv, read_excel, read_json, read_polars, read_spss, crosstab, survey["Q1"], to_spss, to_csv, to_excel, to_json, etc.). Also trigger when the user says things like "analyze my survey", "process questionnaire data", "build a survey analysis script", or "help me with survy". Always read this skill before writing any survy code — it contains the correct API, patterns, and gotchas.
High-performance data analysis using Polars - load, transform, aggregate, visualize and export tabular data. Use for CSV/JSON/Parquet processing, statistical analysis, time series, and creating charts.