Statistical Analysis

Apply statistical methods to understand data and validate findings.

Quick Start

```python
from scipy import stats
import numpy as np

# Descriptive statistics
data = np.array([1, 2, 3, 4, 5])
print(f"Mean: {np.mean(data)}")
print(f"Std: {np.std(data)}")

# Hypothesis testing
group1 = [23, 25, 27, 29, 31]
group2 = [20, 22, 24, 26, 28]
t_stat, p_value = stats.ttest_ind(group1, group2)
print(f"P-value: {p_value}")
```
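One detail worth checking in the descriptive statistics is which standard deviation is being reported. The sketch below, using the same five-point array, contrasts NumPy's default population formula with the sample estimate; the variable names `pop_std`, `sample_std` and `summary` are illustrative only.

```python
import numpy as np
from scipy import stats

data = np.array([1, 2, 3, 4, 5])

# np.std defaults to the population formula (divides by n)
pop_std = np.std(data)
# ddof=1 gives the sample estimate (divides by n - 1)
sample_std = np.std(data, ddof=1)

# stats.describe reports the sample variance (ddof=1) by default
summary = stats.describe(data)

print(f"Population std: {pop_std:.4f}")
print(f"Sample std:     {sample_std:.4f}")
print(f"Sample var:     {summary.variance:.4f}")
```

For small samples the two values differ noticeably, so it helps to state which one a report uses.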

Core Tests

T-Test (Compare Means)

```python
# One-sample: Compare to population mean
stats.ttest_1samp(data, 100)

# Two-sample: Compare two groups
stats.ttest_ind(group1, group2)

# Paired: Before/after comparison
stats.ttest_rel(before, after)
```
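The paired variant is easiest to understand as a one-sample test on the per-subject differences. The following sketch illustrates that equivalence; the `before` and `after` values are hypothetical measurements invented for the example.

```python
import numpy as np
from scipy import stats

# Hypothetical measurements for the same five subjects
before = np.array([82.0, 75.0, 90.0, 68.0, 77.0])
after = np.array([85.0, 79.0, 91.0, 74.0, 80.0])

# Paired test: works on the per-subject differences
t_paired, p_paired = stats.ttest_rel(before, after)

# Equivalent formulation: one-sample test of the differences against 0
t_diff, p_diff = stats.ttest_1samp(before - after, 0.0)

print(f"Paired:     t = {t_paired:.3f}, p = {p_paired:.4f}")
print(f"Difference: t = {t_diff:.3f}, p = {p_diff:.4f}")
```

Both calls should report the same statistic and p-value, which is a quick sanity check that the data really are paired in the intended order.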

Chi-Square (Categorical Data)

```python
from scipy.stats import chi2_contingency

observed = np.array([[10, 20], [15, 25]])
chi2, p_value, dof, expected = chi2_contingency(observed)
```
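The `expected` array returned by the test is useful on its own. As a rough sketch, the code below recomputes the expected counts by hand and checks the common rule of thumb that each should be at least 5; the names `manual_expected` and `min_expected` are illustrative.

```python
import numpy as np
from scipy.stats import chi2_contingency

observed = np.array([[10, 20], [15, 25]])
chi2, p_value, dof, expected = chi2_contingency(observed)

# Expected counts under independence: row_total * col_total / grand_total
row_totals = observed.sum(axis=1, keepdims=True)
col_totals = observed.sum(axis=0, keepdims=True)
manual_expected = row_totals * col_totals / observed.sum()

# Common rule of thumb: every expected count should be at least 5
min_expected = expected.min()
print(f"dof = {dof}, p = {p_value:.4f}, smallest expected count = {min_expected:.2f}")
```

Note that for a 2x2 table `chi2_contingency` applies Yates' continuity correction by default (`correction=True`), which changes the statistic but not the expected counts.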

ANOVA (Multiple Groups)

```python
f_stat, p_value = stats.f_oneway(group1, group2, group3)
```
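A runnable version needs three groups. The sketch below assumes a hypothetical `group3` alongside the two Quick Start groups, and also computes eta squared by hand as one possible effect size to report next to the F statistic.

```python
import numpy as np
from scipy import stats

# Hypothetical scores for three independent groups
group1 = np.array([23, 25, 27, 29, 31])
group2 = np.array([20, 22, 24, 26, 28])
group3 = np.array([30, 32, 34, 36, 38])

f_stat, p_value = stats.f_oneway(group1, group2, group3)

# Eta squared: share of total variance explained by group membership
groups = (group1, group2, group3)
all_values = np.concatenate(groups)
grand_mean = all_values.mean()
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ss_total = ((all_values - grand_mean) ** 2).sum()
eta_squared = ss_between / ss_total

print(f"F = {f_stat:.3f}, p = {p_value:.4f}, eta^2 = {eta_squared:.3f}")
```

A significant F only says that at least one group mean differs; it does not say which one.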

Confidence Intervals

```python
import numpy as np
from scipy import stats

confidence_level = 0.95
mean = np.mean(data)
se = stats.sem(data)  # standard error of the mean (uses ddof=1)
ci = stats.t.interval(confidence_level, len(data) - 1, loc=mean, scale=se)

print(f"95% CI: [{ci[0]:.2f}, {ci[1]:.2f}]")
```
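To see what the interval helper is doing, the same bounds can be built from the textbook formula, mean ± t* × SE. This is a minimal sketch on the Quick Start array; `t_crit`, `manual_ci` and `library_ci` are names chosen for the example.

```python
import numpy as np
from scipy import stats

data = np.array([1, 2, 3, 4, 5])
confidence_level = 0.95

n = len(data)
mean = np.mean(data)
se = stats.sem(data)  # sample std (ddof=1) divided by sqrt(n)

# Critical t value for a two-sided interval with n - 1 degrees of freedom
t_crit = stats.t.ppf((1 + confidence_level) / 2, df=n - 1)
manual_ci = (mean - t_crit * se, mean + t_crit * se)

# Same interval from the library helper
library_ci = stats.t.interval(confidence_level, n - 1, loc=mean, scale=se)

print(f"Manual:  [{manual_ci[0]:.2f}, {manual_ci[1]:.2f}]")
print(f"Library: [{library_ci[0]:.2f}, {library_ci[1]:.2f}]")
```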

Correlation

```python
# Pearson (linear)
r, p_value = stats.pearsonr(x, y)

# Spearman (rank-based)
rho, p_value = stats.spearmanr(x, y)
```
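The difference between the two coefficients shows up on data that are monotonic but not linear. The sketch below uses a made-up exponential relationship to illustrate this; `x` and `y` are synthetic.

```python
import numpy as np
from scipy import stats

# Synthetic data: y always increases with x, but not along a straight line
x = np.linspace(0, 5, 20)
y = np.exp(x)

r, p_r = stats.pearsonr(x, y)
rho, p_rho = stats.spearmanr(x, y)

print(f"Pearson r:    {r:.3f}")
print(f"Spearman rho: {rho:.3f}")
```

Spearman reports a perfect monotonic association here, while Pearson is pulled below 1 by the curvature.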

Distributions

```python
# Normal
x = np.linspace(-3, 3, 100)
pdf = stats.norm.pdf(x, loc=0, scale=1)

# Sampling
samples = np.random.normal(0, 1, 1000)

# Test normality
stat, p_value = stats.shapiro(data)
```
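Beyond the density, the distribution objects also expose the cumulative distribution (`cdf`) and its inverse (`ppf`). The following sketch shows both, plus a seeded draw so the normality check is reproducible; the seed value 42 is arbitrary.

```python
import numpy as np
from scipy import stats

# Critical value: the point below which 97.5% of a standard normal lies
z_crit = stats.norm.ppf(0.975)

# Probability mass within one standard deviation of the mean
within_one_sd = stats.norm.cdf(1) - stats.norm.cdf(-1)

# Seeded generator so the draw is reproducible
rng = np.random.default_rng(42)
samples = rng.normal(loc=0, scale=1, size=1000)
stat, p_value = stats.shapiro(samples)

print(f"z for a two-sided 95% interval: {z_crit:.3f}")
print(f"P(|Z| < 1): {within_one_sd:.3f}")
print(f"Shapiro-Wilk p-value: {p_value:.3f}")
```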

A/B Testing Framework

```python
import numpy as np
from scipy import stats

def ab_test(control, treatment, alpha=0.05):
    """
    Run an A/B test with an independent two-sample t-test.

    Returns a dict with:
        significant (bool): whether p_value < alpha
        p_value (float): two-sided p-value
        improvement (str): relative change in the mean vs. control, in percent
    """
    t_stat, p_value = stats.ttest_ind(control, treatment)

    significant = p_value < alpha
    improvement = (np.mean(treatment) - np.mean(control)) / np.mean(control) * 100

    return {
        'significant': significant,
        'p_value': p_value,
        'improvement': f"{improvement:.2f}%"
    }
```
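A short usage sketch may help show what the returned dictionary looks like. The function is repeated so the block runs on its own, and the `control` and `treatment` arrays are hypothetical per-user order values, not real data.

```python
import numpy as np
from scipy import stats

def ab_test(control, treatment, alpha=0.05):
    """Compare two groups with an independent two-sample t-test."""
    t_stat, p_value = stats.ttest_ind(control, treatment)
    improvement = (np.mean(treatment) - np.mean(control)) / np.mean(control) * 100
    return {
        'significant': p_value < alpha,
        'p_value': p_value,
        'improvement': f"{improvement:.2f}%",
    }

# Hypothetical per-user order values
control = np.array([20.0, 22.0, 19.0, 21.0, 23.0, 20.0, 22.0, 21.0])
treatment = np.array([24.0, 26.0, 23.0, 25.0, 27.0, 24.0, 26.0, 25.0])

result = ab_test(control, treatment)
print(result)
```

The relative improvement is undefined when the control mean is zero, so a production version would likely need a guard for that case.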

Interpretation

P-value < 0.05: Reject the null hypothesis (statistically significant at the conventional 5% level)

P-value >= 0.05: Fail to reject the null hypothesis (not significant; this is not evidence that the null is true)

Common Pitfalls

Multiple testing without correction
Small sample sizes
Ignoring assumptions (normality, independence)
Confusing correlation with causation
p-hacking (searching for significance)

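The first pitfall can be made concrete with a little arithmetic. Assuming the tests are independent and each uses the same alpha, the chance of at least one false positive across m tests is 1 - (1 - alpha)^m. The helper name `familywise_error_rate` below is illustrative.

```python
def familywise_error_rate(num_tests, alpha=0.05):
    """Chance of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** num_tests

for m in (1, 5, 20):
    print(f"{m:>2} tests: FWER = {familywise_error_rate(m):.3f}")
```

With 20 uncorrected tests the chance of a spurious "significant" result is roughly 64%, which is why a correction is usually applied.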

Troubleshooting

Common Issues

**Problem: Non-normal data for t-test**
```python
# Check normality first
stat, p = stats.shapiro(data)
if p < 0.05:
    # Use non-parametric alternative
    stat, p = stats.mannwhitneyu(group1, group2)  # Instead of ttest_ind
```
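In practice the normality check is usually run on each group being compared. The sketch below shows one way to wire that up; the two skewed samples and the helper `looks_normal` are hypothetical.

```python
import numpy as np
from scipy import stats

# Hypothetical right-skewed samples (e.g. response times in seconds)
group1 = np.array([1.2, 1.5, 1.1, 1.8, 1.3, 9.5, 1.4, 1.6])
group2 = np.array([2.1, 2.4, 2.0, 2.8, 2.2, 12.0, 2.3, 2.6])

def looks_normal(sample, alpha=0.05):
    """Return True when Shapiro-Wilk does not reject normality."""
    _, p = stats.shapiro(sample)
    return p >= alpha

if looks_normal(group1) and looks_normal(group2):
    test_name = "t-test"
    stat, p = stats.ttest_ind(group1, group2)
else:
    test_name = "Mann-Whitney U"
    stat, p = stats.mannwhitneyu(group1, group2, alternative="two-sided")

print(f"{test_name}: statistic = {stat:.3f}, p = {p:.4f}")
```

Keep in mind that Mann-Whitney U compares distributions by rank rather than comparing means, so the conclusion is worded differently from a t-test.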


**Problem: Multiple comparisons inflating false positives**
```python
from statsmodels.stats.multitest import multipletests

# Apply Bonferroni correction
p_values = [0.01, 0.03, 0.04, 0.02, 0.06]
rejected, p_adjusted, _, _ = multipletests(p_values, method='bonferroni')
```
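The Bonferroni adjustment itself is simple enough to write out by hand, which can be handy when statsmodels is not installed. This is a NumPy-only sketch with hypothetical p-values, not a replacement for the library call.

```python
import numpy as np

# Hypothetical raw p-values from five separate tests
p_values = np.array([0.001, 0.012, 0.03, 0.04, 0.2])
alpha = 0.05

# Bonferroni: multiply each p-value by the number of tests, cap at 1
p_adjusted = np.minimum(p_values * len(p_values), 1.0)
rejected = p_adjusted < alpha

print(f"Significant before correction: {(p_values < alpha).sum()}")
print(f"Significant after correction:  {rejected.sum()}")
```

Four of the five raw p-values fall below 0.05, but only one survives the correction.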


**Problem: Underpowered study (sample too small)**
```python
from statsmodels.stats.power import TTestIndPower

# Calculate required sample size
power_analysis = TTestIndPower()
sample_size = power_analysis.solve_power(
    effect_size=0.5,  # Medium effect (Cohen's d)
    power=0.8,        # 80% power
    alpha=0.05        # 5% significance
)
print(f"Required n per group: {sample_size:.0f}")
```
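For a quick back-of-the-envelope figure, the normal-approximation formula n ≈ 2 × ((z_{1-α/2} + z_{power}) / d)² can be used. Note that this is a different, rougher technique than the t-based solver in statsmodels and tends to land slightly below it; `approx_n_per_group` is a hypothetical helper name.

```python
import math
from scipy import stats

def approx_n_per_group(effect_size, alpha=0.05, power=0.8):
    """Normal-approximation sample size for a two-sided, two-sample test."""
    z_alpha = stats.norm.ppf(1 - alpha / 2)
    z_power = stats.norm.ppf(power)
    return 2 * ((z_alpha + z_power) / effect_size) ** 2

n = approx_n_per_group(0.5)
print(f"Approximate n per group: {math.ceil(n)}")
```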


**Problem: Heterogeneous variances**
```python
# Check with Levene's test
stat, p = stats.levene(group1, group2)
if p < 0.05:
    # Use Welch's t-test; scipy defaults to equal_var=True (Student's t-test),
    # so equal_var=False must be passed explicitly
    t, p = stats.ttest_ind(group1, group2, equal_var=False)
```
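The effect of the `equal_var` switch is easier to see side by side. The sketch below runs both versions on two hypothetical groups with clearly different spread; the data are invented for illustration.

```python
import numpy as np
from scipy import stats

# Hypothetical groups with visibly different spread
group1 = np.array([23, 25, 27, 29, 31])
group2 = np.array([10, 18, 24, 30, 38])

_, p_levene = stats.levene(group1, group2)

# equal_var=True (the default) is Student's t-test; False selects Welch's
student = stats.ttest_ind(group1, group2)
welch = stats.ttest_ind(group1, group2, equal_var=False)

print(f"Levene p = {p_levene:.4f}")
print(f"Student: t = {student.statistic:.3f}, p = {student.pvalue:.4f}")
print(f"Welch:   t = {welch.statistic:.3f}, p = {welch.pvalue:.4f}")
```

With equal group sizes the two t statistics coincide and only the degrees of freedom differ, so Welch's p-value is the more conservative of the two.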


**Problem: Outliers affecting results**
```python
from scipy.stats import zscore

# Detect outliers (|z| > 3)
z_scores = np.abs(zscore(data))
clean_data = data[z_scores < 3]

# Or use robust statistics
median = np.median(data)
mad = np.median(np.abs(data - median))  # Median Absolute Deviation
```
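The z-score rule has a blind spot on small samples: the outlier inflates the standard deviation it is measured against. As a sketch on a hypothetical seven-point array, the IQR rule catches a value that the |z| > 3 rule cannot.

```python
import numpy as np
from scipy.stats import zscore

# Hypothetical sample with one obvious outlier
data = np.array([10, 12, 11, 13, 12, 11, 95])

# z-score rule: with only 7 points no value can reach |z| > 3
z_flags = np.abs(zscore(data)) > 3

# IQR rule: flag points beyond 1.5 * IQR from the quartiles
q1, q3 = np.percentile(data, [25, 75])
iqr = q3 - q1
iqr_flags = (data < q1 - 1.5 * iqr) | (data > q3 + 1.5 * iqr)

# Scaled MAD: comparable to a standard deviation for normal data
median = np.median(data)
mad = np.median(np.abs(data - median))
robust_sd = 1.4826 * mad

print(f"Flagged by z-score: {data[z_flags]}")
print(f"Flagged by IQR:     {data[iqr_flags]}")
print(f"Robust SD estimate: {robust_sd:.2f}")
```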

Debug Checklist

Check sample size adequacy (power analysis)
Test normality assumption (Shapiro-Wilk)
Test homogeneity of variance (Levene's)
Check for outliers (z-scores, IQR)
Apply multiple testing correction if needed
Report effect sizes, not just p-values

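For the last checklist item, Cohen's d is one commonly reported effect size for a two-group comparison. The helper `cohens_d` below is a minimal sketch using the pooled sample standard deviation, applied to the Quick Start groups.

```python
import numpy as np

def cohens_d(a, b):
    """Standardized mean difference using the pooled sample std."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n_a, n_b = len(a), len(b)
    pooled_var = ((n_a - 1) * a.var(ddof=1) + (n_b - 1) * b.var(ddof=1)) / (n_a + n_b - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)

group1 = [23, 25, 27, 29, 31]
group2 = [20, 22, 24, 26, 28]
d = cohens_d(group1, group2)
print(f"Cohen's d: {d:.2f}")
```

Reporting d alongside the p-value tells the reader how large the difference is, not just whether it is detectable.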
