IRON LAW: Never Rank by Simple Average When Sample Sizes Differ
A 5.0 average from 1 review is NOT better than 4.8 from 1000 reviews.
Wilson Score lower bound accounts for sample uncertainty:
Items with few ratings get a LOWER bound, properly reflecting our
uncertainty about their true quality.

{
"rankings": [{"item": "Product_A", "wilson_lower": 0.935, "positive": 950, "total": 1000, "proportion": 0.95}],
"metadata": {"confidence": 0.95, "z": 1.96, "items_ranked": 500}
}

| Input | Expected | Why |
|---|---|---|
| 0 reviews | Cannot rank | n=0, undefined. Exclude or assign minimum |
| 0 positive, 100 total | Very low score | Genuinely bad item, high confidence |
| 1M positive, 1M total | Lower bound ≈ 1.0 | Massive sample, high confidence in 100% |
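
The edge cases in the table above can be checked with a small sketch of the Wilson lower bound. This is an illustration under stated assumptions, not the contents of `scripts/wilson_score.py`: the function names `wilson_lower_bound` and `rank_items` are hypothetical, ratings are assumed to be reduced to positive/total counts, and items with zero reviews are simply excluded.

```python
import math
from typing import Dict, List, Optional, Tuple


def wilson_lower_bound(positive: int, total: int, z: float = 1.96) -> Optional[float]:
    """Lower bound of the Wilson score interval for a binomial proportion.

    Returns None when total == 0, because the proportion is undefined.
    (Hypothetical helper; z = 1.96 corresponds to 95% confidence.)
    """
    if positive < 0 or total < 0 or positive > total:
        raise ValueError("require 0 <= positive <= total")
    if total == 0:
        return None  # n = 0: the caller excludes the item or assigns a minimum
    p = positive / total
    z2 = z * z
    denom = 1 + z2 / total
    centre = p + z2 / (2 * total)
    margin = z * math.sqrt(p * (1 - p) / total + z2 / (4 * total * total))
    # Clamp at 0 to absorb floating-point error when positive == 0.
    return max(0.0, (centre - margin) / denom)


def rank_items(counts: Dict[str, Tuple[int, int]], z: float = 1.96) -> List[dict]:
    """Rank items by Wilson lower bound, highest first; skip unrated items."""
    rows = []
    for item, (positive, total) in counts.items():
        lower = wilson_lower_bound(positive, total, z)
        if lower is None:
            continue  # assumption: items with no reviews are left unranked
        rows.append({
            "item": item,
            "wilson_lower": round(lower, 4),
            "positive": positive,
            "total": total,
            "proportion": positive / total,
        })
    return sorted(rows, key=lambda r: r["wilson_lower"], reverse=True)


if __name__ == "__main__":
    sample = {
        "Product_A": (950, 1000),
        "one_perfect_review": (1, 1),
        "all_negative": (0, 100),
        "unrated": (0, 0),
    }
    for row in rank_items(sample):
        print(row)
```

Sorting on the lower bound rather than the raw proportion is what lets 950 positives out of 1000 outrank a single perfect review: the single review has the higher proportion but a much wider interval, so its lower bound is far smaller.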

| Script | Description | Usage |
|---|---|---|
| `scripts/wilson_score.py` | Compute Wilson score interval and rank items | `python scripts/wilson_score.py --verify` |

- references/bayesian-average.md
- references/reddit-ranking.md