Total 50,522 skills; Data Processing has 2,561 skills
Apply meta-analysis to synthesize effect sizes across multiple studies, assess heterogeneity, and evaluate publication bias. Use this skill when the user needs to combine findings from prior research, compare fixed-effect vs random-effects models, compute pooled effect sizes, or when they ask 'what does the overall evidence say', 'how do I combine results across studies', or 'is there publication bias'.
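As a rough illustration of the pooling step described above, the sketch below computes an inverse-variance fixed-effect estimate, Cochran's Q, I², and a DerSimonian-Laird random-effects estimate. The function name `pool_effects` and its dictionary keys are hypothetical, chosen for this example only. It assumes each study supplies an effect size and its sampling variance.

```python
import math

def pool_effects(effects, variances):
    """Pool study effect sizes (hypothetical helper, not from any library).

    effects   -- per-study effect sizes
    variances -- per-study sampling variances (all > 0)
    """
    w = [1.0 / v for v in variances]                 # fixed-effect weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw

    # Cochran's Q measures dispersion of study effects around the fixed estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1

    # DerSimonian-Laird estimate of between-study variance, floored at zero.
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # I^2: share of total variation attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0

    w_re = [1.0 / (v + tau2) for v in variances]     # random-effects weights
    random = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)

    return {
        "fixed": fixed,
        "se_fixed": math.sqrt(1.0 / sw),
        "random": random,
        "se_random": math.sqrt(1.0 / sum(w_re)),
        "q": q,
        "tau2": tau2,
        "i2": i2,
    }
```

When `tau2` comes out as zero the two estimates coincide; a large `i2` suggests the random-effects estimate is the more defensible summary.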
Conduct cohort analysis to track user behavior over time, build retention matrices, and compare cohort performance. Use this skill when the user needs to measure retention, understand how user behavior changes after acquisition, compare product versions' impact on engagement, or predict LTV — even if they say 'what's our retention rate', 'are newer users behaving differently', 'build a retention table', or 'how long do customers stick around'.
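A retention matrix of the kind mentioned above can be sketched with the standard library alone. The example assumes activity is given as `(user_id, period)` pairs with integer periods (for instance month numbers) and that a user's cohort is the first period in which they appear. `retention_matrix` is a hypothetical name used only here.

```python
from collections import defaultdict

def retention_matrix(events):
    """Map cohort -> {periods since acquisition -> share of cohort active}.

    events -- iterable of (user_id, period) pairs; period is an integer.
    """
    events = list(events)

    # A user's cohort is the earliest period in which they were active.
    first_seen = {}
    for user, period in events:
        first_seen[user] = min(period, first_seen.get(user, period))

    # Collect the distinct users active at each (cohort, offset) cell.
    active = defaultdict(set)
    for user, period in events:
        cohort = first_seen[user]
        active[(cohort, period - cohort)].add(user)

    matrix = defaultdict(dict)
    for (cohort, offset), users in sorted(active.items()):
        cohort_size = len(active[(cohort, 0)])   # everyone is active at offset 0
        matrix[cohort][offset] = len(users) / cohort_size
    return dict(matrix)
```

Reading across a row shows how one cohort decays over time; reading down a column compares cohorts at the same age.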
Implement the PageRank algorithm to compute web page importance scores using the random surfer model. Use this skill when the user needs to rank pages by link authority, build a simplified search ranking system, or understand how link structure determines page importance, even if they say 'which pages are most important', 'link analysis', or 'page authority score'.
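The random surfer model can be illustrated with a short power-iteration sketch. It assumes the graph is a dictionary mapping each page to the list of pages it links to, uses the conventional damping factor of 0.85, and spreads the rank of pages with no outgoing links evenly over all pages. The function name and parameters are illustrative, not taken from any particular library.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Return {page: score}; scores sum to 1.

    links -- dict mapping page -> list of pages it links to.
    """
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}

    for _ in range(max_iter):
        # Rank held by dangling pages (no out-links) is redistributed uniformly.
        dangling = sum(rank[u] for u in nodes if not links.get(u))
        new = {u: (1.0 - damping) / n + damping * dangling / n for u in nodes}

        # Each page passes an equal share of its rank along every out-link.
        for u, targets in links.items():
            if targets:
                share = damping * rank[u] / len(targets)
                for t in targets:
                    new[t] += share

        # Stop once the total change falls below the tolerance.
        if sum(abs(new[u] - rank[u]) for u in nodes) < tol:
            return new
        rank = new
    return rank
```

For example, `pagerank({"a": ["c"], "b": ["c"], "c": ["a"]})` ranks `c` highest because two pages link to it, and `b` lowest because nothing links to it.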
Implement the Gale-Shapley stable matching algorithm for two-sided matching problems. Use this skill when the user needs to match candidates to positions, assign students to schools, or solve any two-sided preference matching, even if they say 'optimal job matching', 'stable assignment', or 'candidate-position pairing'.
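A minimal sketch of the deferred-acceptance procedure follows. It assumes one-to-one matching and that every receiver ranks every proposer who might propose to them; the names `gale_shapley`, `proposer_prefs`, and `receiver_prefs` are illustrative only.

```python
def gale_shapley(proposer_prefs, receiver_prefs):
    """Return a stable matching as {proposer: receiver}.

    proposer_prefs -- dict: proposer -> receivers, most preferred first
    receiver_prefs -- dict: receiver -> proposers, most preferred first
    """
    # Lower rank index means more preferred.
    rank = {r: {p: i for i, p in enumerate(prefs)}
            for r, prefs in receiver_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}   # next receiver to try
    engaged = {}                                   # receiver -> proposer
    free = list(proposer_prefs)

    while free:
        p = free.pop()
        if next_choice[p] >= len(proposer_prefs[p]):
            continue                               # list exhausted: stays unmatched
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1

        current = engaged.get(r)
        if current is None:
            engaged[r] = p                         # receiver was free: accept
        elif rank[r][p] < rank[r][current]:
            engaged[r] = p                         # receiver trades up
            free.append(current)
        else:
            free.append(p)                         # rejected: try next choice later

    return {p: r for r, p in engaged.items()}
```

The result is optimal for the proposing side, so which side proposes (candidates or positions, students or schools) is a design decision worth making explicitly.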
Implement the Elo rating system to rank items or players from pairwise comparison outcomes. Use this skill when the user needs to rank items from head-to-head matchups, build a competitive rating system, or evaluate relative quality from comparison data, even if they say 'player rating', 'ranking from comparisons', or 'competitive scoring system'.
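The core Elo update is small enough to sketch directly. The example assumes the conventional logistic curve with a 400-point scale and a K-factor of 32; both are common defaults rather than requirements, and the function names are illustrative.

```python
def expected_score(rating_a, rating_b):
    """Probability-like expected score of A against B (between 0 and 1)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def update_elo(rating_a, rating_b, score_a, k=32):
    """Return updated (rating_a, rating_b) after one comparison.

    score_a -- 1.0 if A won, 0.5 for a draw, 0.0 if A lost.
    """
    expected_a = expected_score(rating_a, rating_b)
    delta = k * (score_a - expected_a)   # surprise, scaled by the K-factor
    # The update is zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta
```

With two players at 1500, a win for A gives `update_elo(1500, 1500, 1.0)` equal to `(1516.0, 1484.0)`. An upset against a much stronger opponent moves both ratings further.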
Evaluate source credibility using primary/secondary classification, internal/external criticism, triangulation, and misinformation detection. Use this skill when the user needs to assess whether information is trustworthy, evaluate research sources, fact-check claims, or detect misinformation — even if they say 'can I trust this source', 'is this real', 'how reliable is this data', or 'fact-check this for me'.
Apply rigorous survey design principles including construct operationalization, Likert scale development, reliability and validity assessment, and common method variance control. Use this skill when the user designs questionnaires, develops measurement items, needs to evaluate Cronbach's alpha or AVE, or when they ask 'how do I operationalize this construct', 'is my scale reliable', or 'how do I control for CMV'.
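For the reliability check mentioned above, Cronbach's alpha can be computed from item variances and the variance of the summed scale. The sketch assumes complete responses (no missing values) and that all items are scored in the same direction; `cronbach_alpha` is a hypothetical helper name.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha for a multi-item scale.

    items -- list of item columns; each column holds one score per respondent,
             in the same respondent order.
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")

    # Total scale score per respondent.
    totals = [sum(scores) for scores in zip(*items)]

    # alpha = k/(k-1) * (1 - sum of item variances / variance of totals)
    item_variance_sum = sum(variance(column) for column in items)
    return k / (k - 1) * (1.0 - item_variance_sum / variance(totals))
```

Two perfectly correlated items give an alpha of 1, and two uncorrelated items give an alpha near 0; values around 0.7 or higher are often treated as acceptable, though the threshold depends on the field.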
Optimize SQL query performance using EXPLAIN analysis, indexing strategies, and common anti-pattern fixes. Use this skill when the user needs to speed up slow queries, design indexes, fix N+1 problems, or optimize database performance — even if they say 'this query is slow', 'optimize our database', 'which indexes do we need', or 'our dashboard takes 30 seconds to load'.
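To make the EXPLAIN step concrete, the sketch below uses SQLite (an assumption made so the example runs with Python's built-in `sqlite3` module; other databases have their own EXPLAIN syntax and output). It compares the query plan for a filter on `customer_id` before and after adding an index. The table, column, and index names are hypothetical.

```python
import sqlite3

def explain(conn, sql):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

def plan_before_and_after_index():
    """Show how an index changes the plan for a simple equality filter."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
    )
    conn.executemany(
        "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
        [(i % 100, float(i)) for i in range(1000)],
    )
    query = "SELECT total FROM orders WHERE customer_id = 42"

    before = explain(conn, query)   # full table scan without an index
    conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
    after = explain(conn, query)    # index lookup once the index exists

    conn.close()
    return before, after
```

The `before` plan reports a scan of `orders`, while the `after` plan reports a search using `idx_orders_customer`. The exact wording of the plan text varies between SQLite versions.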
Apply Bayesian averaging to rank items by combining observed ratings with prior expectations. Use this skill when the user needs to rank items with varying review counts, build a 'top rated' list that handles low-sample items fairly, or implement IMDb-style weighted rating, even if they say 'weighted average rating', 'IMDB formula', or 'ranking with prior'.
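The weighted-rating idea reduces to one formula: treat the prior as a fixed number of "phantom" votes at the global mean. The sketch below assumes the prior mean is the average rating across all items and that the prior weight is a tuning parameter chosen by the user; the names are illustrative.

```python
def bayesian_average(item_mean, item_count, prior_mean, prior_weight):
    """Shrink an item's mean rating toward the prior mean.

    item_mean    -- observed average rating of the item
    item_count   -- number of ratings the item has received
    prior_mean   -- rating assumed before seeing any data (e.g. global mean)
    prior_weight -- how many ratings the prior is worth (> 0)
    """
    return (item_count * item_mean + prior_weight * prior_mean) / (
        item_count + prior_weight
    )
```

With a prior mean of 3.5 worth 10 ratings, an item rated 5.0 from 2 reviews scores 3.75, while an item rated 4.5 from 200 reviews scores about 4.45 and ranks higher.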
Apply Hierarchical Linear Modeling (HLM) to analyze nested data structures with random intercepts and slopes, accounting for intra-class correlation and cross-level interactions. Use this skill when the user has students nested in schools, employees in firms, or repeated measures in individuals, needs to partition variance across levels, or when they ask 'how do I handle nested data', 'what is ICC', or 'do group-level factors moderate individual-level relationships'.
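As a first look at the ICC question, the sketch below estimates ICC(1) from a one-way ANOVA decomposition. This is a simpler stand-in for fitting a full multilevel model, not an HLM implementation. It assumes balanced groups (every group has the same number of observations) and at least some variation in the data; `icc1` is a hypothetical helper name.

```python
def icc1(groups):
    """ICC(1): share of variance attributable to group membership.

    groups -- list of groups, each a list of numeric observations of equal length.
    """
    k = len(groups[0])                      # observations per group
    if k < 2 or any(len(g) != k for g in groups):
        raise ValueError("this sketch assumes balanced groups with at least 2 observations")
    n_groups = len(groups)
    if n_groups < 2:
        raise ValueError("need at least two groups")

    grand_mean = sum(sum(g) for g in groups) / (n_groups * k)
    group_means = [sum(g) / k for g in groups]

    # Mean squares between and within groups.
    msb = k * sum((m - grand_mean) ** 2 for m in group_means) / (n_groups - 1)
    msw = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    ) / (n_groups * (k - 1))

    return (msb - msw) / (msb + (k - 1) * msw)
```

An ICC near 1 means observations within a group are nearly identical, which is the case where ignoring the nesting is most misleading. Values near 0 or below suggest little group-level variance.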
Combine multiple forecasting models into ensemble predictions for improved accuracy. Use this skill when the user needs to improve forecast reliability, combine ARIMA/Prophet/ETS outputs, or build a robust forecasting pipeline — even if they say 'combine forecasts', 'model averaging', or 'which forecast should I trust'.
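One simple way to combine forecasts is a weighted average with weights inversely proportional to each model's past error. The sketch assumes each model's forecast is already available as a list of numbers over the same horizon and that a positive validation error (for example MAE) has been measured for each model. The function names and the dictionary-based interface are hypothetical.

```python
def inverse_error_weights(errors):
    """Turn per-model validation errors into weights that sum to 1.

    errors -- dict: model name -> positive error (e.g. MAE on a holdout set)
    """
    inverse = {name: 1.0 / err for name, err in errors.items()}
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}

def combine_forecasts(forecasts, weights):
    """Weighted average of forecasts, step by step over the horizon.

    forecasts -- dict: model name -> list of predictions (equal lengths)
    weights   -- dict: model name -> weight (should sum to 1)
    """
    horizon = len(next(iter(forecasts.values())))
    if any(len(values) != horizon for values in forecasts.values()):
        raise ValueError("all forecasts must cover the same horizon")
    return [
        sum(weights[name] * values[t] for name, values in forecasts.items())
        for t in range(horizon)
    ]
```

Passing equal weights gives a plain average, which is a reasonable baseline to beat before trying error-based weighting.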
Drop-in pandas replacement with ClickHouse performance. Use `import chdb.datastore as pd` (or `from datastore import DataStore`) and write standard pandas code — same API, 10-100x faster on large datasets. Supports 16+ data sources (MySQL, PostgreSQL, S3, MongoDB, ClickHouse, Iceberg, Delta Lake, etc.) and 10+ file formats (Parquet, CSV, JSON, Arrow, ORC, etc.) with cross-source joins. Use this skill when the user wants to analyze data with pandas-style syntax, speed up slow pandas code, query remote databases or cloud storage as DataFrames, or join data across different sources — even if they don't explicitly mention chdb or DataStore. Do NOT use for raw SQL queries, ClickHouse server administration, or non-Python languages.