Loading...
Loading...
Found 103 Skills
Parallel/distributed computing. Scale pandas/NumPy beyond memory, parallel DataFrames/Arrays, multi-file processing, task graphs, for larger-than-RAM datasets and parallel workflows.
Analyze CSV files, generate summary statistics, and create visualizations using Python and pandas. Use when the user uploads, attaches, or references a CSV file, asks to summarize or analyze tabular data, requests insights from CSV data, or wants to understand data structure and quality.
Use when "Dask", "parallel computing", "distributed computing", "larger than memory", or asking about "parallel pandas", "parallel numpy", "out-of-core", "multi-file processing", "cluster computing", "lazy evaluation dataframe"
Geospatial Analysis provides workflows for satellite imagery processing, GIS operations with GeoPandas, spatial statistics, and Earth observation data analysis.
Panel data analysis with Python using linearmodels and pandas.
A Python data visualization library based on matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics. Great for exploring relationships between variables and visualizing distributions. Use for statistical data visualization, exploratory data analysis (EDA), relationship plots, distribution plots, categorical comparisons, regression visualization, heatmaps, cluster maps, and creating publication-quality statistical graphics from Pandas DataFrames.
Use when tasks involve creating, editing, analyzing, or formatting spreadsheets (`.xlsx`, `.csv`, `.tsv`) using Python (`openpyxl`, `pandas`), especially when formulas, references, and formatting need to be preserved and verified. Originally from OpenAI's curated skills catalog.
Parses API Gateway access logs (AWS API Gateway, Kong, Nginx) to detect BOLA/IDOR attacks, rate limit bypass, credential scanning, and injection attempts. Uses pandas for statistical analysis of request patterns and anomaly detection. Use when investigating API abuse or building API-specific threat detection rules.
Fast in-process analytical database for SQL queries on DataFrames, CSV, Parquet, JSON files, and more. Use when user wants to perform SQL analytics on data files or Python DataFrames (pandas, Polars), run complex aggregations, joins, or window functions, or query external data sources without loading into memory. Best for analytical workloads, OLAP queries, and data exploration.
Smart Excel/CSV file parsing with intelligent routing based on file complexity analysis. Analyzes file structure (merged cells, row count, table layout) using lightweight metadata scanning, then recommends optimal processing strategy - either high-speed Pandas mode for standard tables or semantic HTML mode for complex reports. Use when processing Excel/CSV files with unknown or varying structure where optimization between speed and accuracy is needed.
Best practices for doing quick exploratory data analysis with minimal code and a Pandas .plot like API using HoloViews hvPlot.
Use when tasks involve creating, editing, analyzing, or formatting spreadsheets (`.xlsx`, `.csv`, `.tsv`) using Python (`openpyxl`, `pandas`), especially when formulas, references, and formatting need to be preserved and verified.