Found 16 Skills
Clean and transform messy data in Stata with reproducible workflows
Use this skill any time a spreadsheet file is the primary input or output. This means any task where the user wants to: open, read, edit, or fix an existing .xlsx, .xlsm, .csv, or .tsv file (e.g., adding columns, computing formulas, formatting, charting, cleaning messy data); create a new spreadsheet from scratch or from other data sources; or convert between tabular file formats. Trigger especially when the user references a spreadsheet file by name or path — even casually (like "the xlsx in my downloads") — and wants something done to it or produced from it. Also trigger for cleaning or restructuring messy tabular data files (malformed rows, misplaced headers, junk data) into proper spreadsheets. The deliverable must be a spreadsheet file. Do NOT trigger when the primary deliverable is a Word document, HTML report, standalone Python script, database pipeline, or Google Sheets API integration, even if tabular data is involved.
WPS Spreadsheet Intelligent Assistant: Control Excel via natural language to solve pain points such as formula writing, data cleaning, chart creation, etc.
Use when working with pandas DataFrames, data cleaning, aggregation, merging, or time series analysis. Invoke for data manipulation, missing value handling, groupby operations, or performance optimization.
Credit risk data cleaning and variable screening pipeline for pre-loan modeling. Use when working with raw credit data that needs quality assessment, missing value analysis, or variable selection before modeling. It covers data loading and formatting, abnormal period filtering, missing rate calculation, high-missing variable removal, low-IV variable filtering, high-PSI variable removal, Null Importance denoising, high-correlation variable removal, and cleaning report generation. Applicable scenarios are credit risk data cleaning, variable screening, and pre-loan modeling preprocessing.
Expert in high-performance CSV processing, parsing, and data cleaning using Python, DuckDB, and command-line tools. Use when working with CSV files, cleaning data, transforming datasets, or processing large tabular data files.
Normalize messy creator campaign metrics from multiple sources into a single clean table with standardized field names ready to merge into your master tracker. This skill should be used when cleaning up influencer metrics, standardizing campaign data from multiple platforms, normalizing creator performance numbers, merging metrics from Instagram and TikTok and YouTube into one sheet, formatting messy analytics exports, preparing campaign data for a master spreadsheet, converting raw platform stats into a consistent format, combining metrics from different reporting tools, deduplicating creator data from multiple sources, fixing inconsistent column names across exports, or cleaning up a metrics dump before reporting. For calculating engagement rates, see engagement-rate-calculator-benchmarker. For full campaign reports, see campaign-roi-calculator. For parsing a single Story screenshot, see story-metrics-screenshot-parser.
Coaches users to transform messy data into clean, analysis-ready formats using Power Query UI. Diagnoses data problems, visualizes goals, and guides step-by-step transformations.
This skill should be used when the user asks to "use pandas", "analyze data with pandas", "work with DataFrames", "clean data with pandas", or needs guidance on pandas best practices, data manipulation, performance optimization, or common pandas patterns.
Parse, transform, and analyze CSV files with advanced data manipulation capabilities.
Use when asked to parse, normalize, standardize, or convert dates from various formats to consistent ISO 8601 or custom formats.
Pandas data manipulation with DataFrames. Use for data analysis.