data-visualization

Data Visualization Skill

Chart selection guidance, Python visualization code patterns, design principles, and accessibility considerations for creating effective data visualizations.

Chart Selection Guide

Choose by Data Relationship

What You're Showing	Best Chart	Alternatives
Trend over time	Line chart	Area chart (if showing cumulative or composition)
Comparison across categories	Vertical bar chart	Horizontal bar (many categories), lollipop chart
Ranking	Horizontal bar chart	Dot plot, slope chart (comparing two periods)
Part-to-whole composition	Stacked bar chart	Treemap (hierarchical), waffle chart
Composition over time	Stacked area chart	100% stacked bar (for proportion focus)
Distribution	Histogram	Box plot (comparing groups), violin plot, strip plot
Correlation (2 variables)	Scatter plot	Bubble chart (add 3rd variable as size)
Correlation (many variables)	Heatmap (correlation matrix)	Pair plot
Geographic patterns	Choropleth map	Bubble map, hex map
Flow / process	Sankey diagram	Funnel chart (sequential stages)
Relationship network	Network graph	Chord diagram
Performance vs. target	Bullet chart	Gauge (single KPI only)
Multiple KPIs at once	Small multiples	Dashboard with separate charts

When NOT to Use Certain Charts

Pie charts: Avoid unless <6 categories and exact proportions matter less than rough comparison. Humans are bad at comparing angles. Use bar charts instead.
3D charts: Never. They distort perception and add no information.
Dual-axis charts: Use cautiously. They can mislead by implying correlation. Clearly label both axes if used.
Stacked bar (many categories): Hard to compare middle segments. Use small multiples or grouped bars instead.
Donut charts: Slightly better than pie charts but same fundamental issues. Use for single KPI display at most.
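
The pie-chart advice above can be sketched as follows: the same part-to-whole data drawn as a sorted horizontal bar chart with share labels. This is a minimal sketch, not a prescribed method. The helper name `share_bar_chart` and the category names and values are hypothetical illustration data, and the `Agg` backend line is only there so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt


def share_bar_chart(labels, values):
    """Draw part-to-whole data as sorted horizontal bars instead of a pie."""
    total = sum(values)
    # Convert to percentage shares and sort ascending so the largest bar ends up on top
    shares = sorted(
        ((label, 100 * value / total) for label, value in zip(labels, values)),
        key=lambda pair: pair[1],
    )
    names = [name for name, _ in shares]
    pcts = [pct for _, pct in shares]

    fig, ax = plt.subplots(figsize=(8, 4))
    bars = ax.barh(names, pcts, color="#4C72B0")
    # Label each bar with its share; bar lengths are easier to compare than angles
    for bar, pct in zip(bars, pcts):
        ax.text(pct + 0.5, bar.get_y() + bar.get_height() / 2,
                f"{pct:.1f}%", va="center", ha="left")
    ax.set_xlabel("Share of total (%)")
    ax.set_xlim(0, max(pcts) * 1.15)  # leave room for the labels
    ax.spines["top"].set_visible(False)
    ax.spines["right"].set_visible(False)
    return fig, ax, pcts


# Illustrative data only
fig, ax, pcts = share_bar_chart(["A", "B", "C", "D"], [40, 25, 20, 15])
```

Sorting before plotting also gives the ranking for free, which a pie chart cannot show clearly.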

Python Visualization Code Patterns

Setup and Style

python

import matplotlib.pyplot as plt
import matplotlib.ticker as mticker
import seaborn as sns
import pandas as pd
import numpy as np

# Professional style setup
# (the 'seaborn-v0_8-*' style names require Matplotlib 3.6 or later)
plt.style.use('seaborn-v0_8-whitegrid')
plt.rcParams.update({
    'figure.figsize': (10, 6),
    'figure.dpi': 150,
    'font.size': 11,
    'axes.titlesize': 14,
    'axes.titleweight': 'bold',
    'axes.labelsize': 11,
    'xtick.labelsize': 10,
    'ytick.labelsize': 10,
    'legend.fontsize': 10,
    'figure.titlesize': 16,
})

# Default palettes. The categorical hex codes below are the first six colors of
# seaborn's "deep" palette; for colorblind-safe categorical colors, prefer
# sns.color_palette("colorblind").
PALETTE_CATEGORICAL = ['#4C72B0', '#DD8452', '#55A868', '#C44E52', '#8172B3', '#937860']
PALETTE_SEQUENTIAL = 'YlOrRd'
PALETTE_DIVERGING = 'RdBu_r'


Line Chart (Time Series)

python

fig, ax = plt.subplots(figsize=(10, 6))

for label, group in df.groupby('category'):
    ax.plot(group['date'], group['value'], label=label, linewidth=2)

ax.set_title('Metric Trend by Category', fontweight='bold')
ax.set_xlabel('Date')
ax.set_ylabel('Value')
ax.legend(loc='upper left', frameon=True)
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)

# Format dates on x-axis
fig.autofmt_xdate()

plt.tight_layout()
plt.savefig('trend_chart.png', dpi=150, bbox_inches='tight')


Bar Chart (Comparison)

python

fig, ax = plt.subplots(figsize=(10, 6))

# Sort by value for easy reading
df_sorted = df.sort_values('metric', ascending=True)

bars = ax.barh(df_sorted['category'], df_sorted['metric'], color=PALETTE_CATEGORICAL[0])

# Add value labels
for bar in bars:
    width = bar.get_width()
    ax.text(width + 0.5, bar.get_y() + bar.get_height()/2,
            f'{width:,.0f}', ha='left', va='center', fontsize=10)

ax.set_title('Metric by Category (Ranked)', fontweight='bold')
ax.set_xlabel('Metric Value')
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)

plt.tight_layout()
plt.savefig('bar_chart.png', dpi=150, bbox_inches='tight')


Histogram (Distribution)

python

fig, ax = plt.subplots(figsize=(10, 6))

ax.hist(df['value'], bins=30, color=PALETTE_CATEGORICAL[0], edgecolor='white', alpha=0.8)

# Add mean and median lines
mean_val = df['value'].mean()
median_val = df['value'].median()
ax.axvline(mean_val, color='red', linestyle='--', linewidth=1.5, label=f'Mean: {mean_val:,.1f}')
ax.axvline(median_val, color='green', linestyle='--', linewidth=1.5, label=f'Median: {median_val:,.1f}')

ax.set_title('Distribution of Values', fontweight='bold')
ax.set_xlabel('Value')
ax.set_ylabel('Frequency')
ax.legend()
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)

plt.tight_layout()
plt.savefig('histogram.png', dpi=150, bbox_inches='tight')


Heatmap

python

fig, ax = plt.subplots(figsize=(10, 8))

# Pivot data for heatmap format
pivot = df.pivot_table(index='row_dim', columns='col_dim', values='metric', aggfunc='sum')

sns.heatmap(pivot, annot=True, fmt=',.0f', cmap='YlOrRd',
            linewidths=0.5, ax=ax, cbar_kws={'label': 'Metric Value'})

ax.set_title('Metric by Row Dimension and Column Dimension', fontweight='bold')
ax.set_xlabel('Column Dimension')
ax.set_ylabel('Row Dimension')

plt.tight_layout()
plt.savefig('heatmap.png', dpi=150, bbox_inches='tight')

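The chart selection table recommends a heatmap of the correlation matrix for many-variable correlation, which differs from the pivot heatmap shown here. A minimal sketch of that variant follows, assuming purely numeric columns. The data frame `df_demo` and its column names are synthetic and hypothetical. Because correlations have a meaningful center at zero, the sketch uses the diverging `RdBu_r` colormap pinned to the range -1 to 1.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic data for illustration only: metric_b tracks metric_a, metric_c opposes it
rng = np.random.default_rng(42)
base = rng.normal(size=200)
df_demo = pd.DataFrame({
    "metric_a": base,
    "metric_b": base * 0.8 + rng.normal(scale=0.5, size=200),
    "metric_c": -base + rng.normal(scale=1.0, size=200),
})

# Pairwise Pearson correlations between all numeric columns
corr = df_demo.corr()

fig, ax = plt.subplots(figsize=(6, 5))
sns.heatmap(
    corr,
    annot=True, fmt=".2f",
    cmap="RdBu_r",             # diverging palette: data has a meaningful midpoint
    vmin=-1, vmax=1, center=0,  # fix the scale so colors are comparable across charts
    square=True, linewidths=0.5, ax=ax,
    cbar_kws={"label": "Pearson correlation"},
)
ax.set_title("Correlation between metrics", fontweight="bold")
plt.tight_layout()
```

Fixing `vmin` and `vmax` keeps the color scale honest: without them, the colormap stretches to whatever range the data happens to cover.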

Small Multiples

python

categories = df['category'].unique()
n_cats = len(categories)
n_cols = min(3, n_cats)
n_rows = (n_cats + n_cols - 1) // n_cols

fig, axes = plt.subplots(n_rows, n_cols, figsize=(5*n_cols, 4*n_rows), sharex=True, sharey=True)
axes = axes.flatten() if n_cats > 1 else [axes]

for i, cat in enumerate(categories):
    ax = axes[i]
    subset = df[df['category'] == cat]
    ax.plot(subset['date'], subset['value'], color=PALETTE_CATEGORICAL[i % len(PALETTE_CATEGORICAL)])
    ax.set_title(cat, fontsize=12)
    ax.spines['top'].set_visible(False)
    ax.spines['right'].set_visible(False)

# Hide empty subplots
for j in range(i + 1, len(axes)):
    axes[j].set_visible(False)

fig.suptitle('Trends by Category', fontsize=14, fontweight='bold', y=1.02)
plt.tight_layout()
plt.savefig('small_multiples.png', dpi=150, bbox_inches='tight')


Number Formatting Helpers

python

def format_number(val, format_type='number'):
    """Format numbers for chart labels."""
    if format_type == 'currency':
        if abs(val) >= 1e9:
            return f'${val/1e9:.1f}B'
        elif abs(val) >= 1e6:
            return f'${val/1e6:.1f}M'
        elif abs(val) >= 1e3:
            return f'${val/1e3:.1f}K'
        else:
            return f'${val:,.0f}'
    elif format_type == 'percent':
        return f'{val:.1f}%'
    elif format_type == 'number':
        if abs(val) >= 1e9:
            return f'{val/1e9:.1f}B'
        elif abs(val) >= 1e6:
            return f'{val/1e6:.1f}M'
        elif abs(val) >= 1e3:
            return f'{val/1e3:.1f}K'
        else:
            return f'{val:,.0f}'
    return str(val)

# Usage with axis formatter
ax.yaxis.set_major_formatter(mticker.FuncFormatter(lambda x, p: format_number(x, 'currency')))

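One detail worth noting: with a negative input, the currency branch of `format_number` above places the sign after the symbol (for example `$-1.5M`). The following is a hedged sketch of a sign-aware variant. `format_compact` and its `prefix` parameter are hypothetical names introduced here for illustration, not part of the original helper.

```python
def format_compact(val, prefix=""):
    """Abbreviate large magnitudes, keeping the sign ahead of the prefix."""
    sign = "-" if val < 0 else ""
    mag = abs(val)
    # Check thresholds from largest to smallest and stop at the first match
    for threshold, suffix in ((1e9, "B"), (1e6, "M"), (1e3, "K")):
        if mag >= threshold:
            return f"{sign}{prefix}{mag / threshold:.1f}{suffix}"
    # Below one thousand: plain number with thousands separators
    return f"{sign}{prefix}{mag:,.0f}"


print(format_compact(-1_500_000, "$"))
print(format_compact(2500))
```

It can be passed to `mticker.FuncFormatter` in the same way as `format_number`, for example `lambda x, p: format_compact(x, '$')`.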

Interactive Charts with Plotly

python

import plotly.express as px
import plotly.graph_objects as go

# Simple interactive line chart
fig = px.line(
    df, x='date', y='value', color='category',
    title='Interactive Metric Trend',
    labels={'value': 'Metric Value', 'date': 'Date'},
)
fig.update_layout(hovermode='x unified')
fig.write_html('interactive_chart.html')
fig.show()

# Interactive scatter with hover data
fig = px.scatter(
    df, x='metric_a', y='metric_b', color='category',
    size='size_metric', hover_data=['name', 'detail_field'],
    title='Correlation Analysis',
)
fig.show()


Design Principles

Color

Use color purposefully: Color should encode data, not decorate
Highlight the story: Use a bright accent color for the key insight; grey everything else
Sequential data: Use a single-hue gradient (light to dark) for ordered values
Diverging data: Use a two-hue gradient with neutral midpoint for data with a meaningful center
Categorical data: Use distinct hues, maximum 6-8 before it gets confusing
Avoid red/green only: Roughly 8% of men have red-green color vision deficiency. Use blue/orange as the primary pair
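
The "highlight the story" principle above can be sketched as a bar chart where one category gets an accent color and the rest are greyed out. This is one possible approach under stated assumptions: the helper `highlight_colors`, the two hex colors, and the region data are all hypothetical choices for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

ACCENT = "#DD8452"  # orange accent for the key insight (assumed choice)
MUTED = "#BBBBBB"   # neutral grey for context bars (assumed choice)


def highlight_colors(categories, key, accent=ACCENT, muted=MUTED):
    """Return one color per category: accent for `key`, grey for the rest."""
    return [accent if c == key else muted for c in categories]


# Illustrative data only
categories = ["North", "South", "East", "West"]
values = [120, 95, 180, 110]
colors = highlight_colors(categories, "East")

fig, ax = plt.subplots(figsize=(8, 4))
ax.bar(categories, values, color=colors)
ax.set_title("East leads all regions", loc="left", fontweight="bold")
ax.set_ylabel("Units sold")
ax.set_ylim(bottom=0)  # bar charts start at zero
ax.spines["top"].set_visible(False)
ax.spines["right"].set_visible(False)
```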

Typography

Title states the insight: "Revenue grew 23% YoY" beats "Revenue by Month"
Subtitle adds context: Date range, filters applied, data source
Axis labels are readable: Never rotated 90 degrees if avoidable. Shorten or wrap instead
Data labels add precision: Use on key points, not every single bar
Annotation highlights: Call out specific points with text annotations
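
As a rough illustration of the typography points above, the sketch below puts the insight in the title, the context in a smaller subtitle, and annotates only the key data point. The revenue figures are invented index values, and the layout (a figure-level title with the axes title acting as subtitle) is just one way to get the effect in Matplotlib.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Illustrative data only
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
revenue = [100, 104, 109, 113, 118, 123]

fig, ax = plt.subplots(figsize=(10, 4))
x = list(range(len(months)))
ax.plot(x, revenue, linewidth=2, color="#4C72B0")
ax.set_xticks(x)
ax.set_xticklabels(months)  # short horizontal labels, so no rotation is needed

# Title states the insight; the smaller grey subtitle carries the context
growth = (revenue[-1] / revenue[0] - 1) * 100
fig.suptitle(f"Revenue grew {growth:.0f}% from January to June",
             x=0.125, ha="left", fontweight="bold")
ax.set_title("Monthly revenue index, illustrative data",
             loc="left", fontsize=10, color="grey")

# Annotate only the key point rather than labeling every value
last = len(revenue) - 1
ax.annotate(f"Jun: {revenue[last]}",
            xy=(last, revenue[last]),
            xytext=(last - 1.5, revenue[last]),
            arrowprops={"arrowstyle": "->"})
ax.set_ylabel("Revenue (index, Jan = 100)")
ax.spines["top"].set_visible(False)
ax.spines["right"].set_visible(False)
```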

Layout

Reduce chart junk: Remove gridlines, borders, backgrounds that don't carry information
Sort meaningfully: Categories sorted by value (not alphabetically) unless there's a natural order (months, stages)
Appropriate aspect ratio: Time series wider than tall (3:1 to 2:1); comparisons can be squarer
White space is good: Don't cram charts together. Give each visualization room to breathe

Accuracy

Bar charts start at zero: Always. An axis that runs from 95 to 100 makes a 5% difference look far larger than it is
Line charts can have non-zero baselines: When the range of variation is meaningful
Consistent scales across panels: When comparing multiple charts, use the same axis range
Show uncertainty: Error bars, confidence intervals, or ranges when data is uncertain
Label your axes: Never make the reader guess what the numbers mean
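
Two of the accuracy points above, a zero baseline for bars and visible uncertainty, can be combined in one small sketch. The samples are randomly generated for illustration, and the 95% interval uses a normal approximation (1.96 times the standard error), which is an assumption that may not suit small or skewed samples.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Synthetic samples for illustration only
rng = np.random.default_rng(0)
samples = {
    "Control": rng.normal(95, 8, 40),
    "Variant": rng.normal(100, 8, 40),
}

labels = list(samples)
means = np.array([samples[k].mean() for k in labels])
# Half-width of a 95% interval via normal approximation: 1.96 * standard error
half_widths = np.array([
    1.96 * samples[k].std(ddof=1) / np.sqrt(len(samples[k])) for k in labels
])

fig, ax = plt.subplots(figsize=(6, 4))
ax.bar(labels, means, yerr=half_widths, capsize=6, color="#4C72B0")
ax.set_ylim(bottom=0)  # bars must start at zero to keep lengths proportional
ax.set_ylabel("Mean score")
ax.set_title("Mean score by group (error bars: 95% CI)", loc="left")
ax.spines["top"].set_visible(False)
ax.spines["right"].set_visible(False)
```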

Accessibility Considerations

Color Blindness

Never rely on color alone to distinguish data series
Add pattern fills, different line styles (solid, dashed, dotted), or direct labels
Test with a colorblind simulator (e.g., Coblis, Sim Daltonism)
Use the colorblind-friendly palette:
```
sns.color_palette("colorblind")
```
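
The points above can be combined in a short sketch: each series gets the colorblind palette plus a distinct line style and marker, and is labeled directly at its end so the reader never has to match colors against a legend. The series names and values are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import seaborn as sns

# Illustrative data only
x = [1, 2, 3, 4, 5]
series = {
    "Plan A": [10, 12, 15, 19, 24],
    "Plan B": [8, 11, 13, 14, 16],
    "Plan C": [12, 11, 10, 10, 9],
}

line_styles = ["-", "--", ":"]   # solid, dashed, dotted
markers = ["o", "s", "^"]        # circle, square, triangle
colors = sns.color_palette("colorblind", n_colors=len(series))

fig, ax = plt.subplots(figsize=(8, 4))
for (name, y), ls, mk, color in zip(series.items(), line_styles, markers, colors):
    # Redundant encoding: color, line style and marker all identify the series
    ax.plot(x, y, linestyle=ls, marker=mk, color=color, linewidth=2)
    # Direct label at the end of the line instead of a color-keyed legend
    ax.annotate(name, xy=(x[-1], y[-1]), xytext=(6, 0),
                textcoords="offset points", va="center", color=color)

ax.set_xlim(x[0], x[-1] + 0.8)  # leave room on the right for the labels
ax.set_xlabel("Period")
ax.set_ylabel("Value")
ax.spines["top"].set_visible(False)
ax.spines["right"].set_visible(False)
```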
