seaborn

Seaborn Statistical Visualization


Overview

Seaborn is a Python visualization library for creating publication-quality statistical graphics. Use this skill for dataset-oriented plotting, multivariate analysis, automatic statistical estimation, and complex multi-panel figures with minimal code.

Design Philosophy

Seaborn follows these core principles:
  1. Dataset-oriented: Work directly with DataFrames and named variables rather than abstract coordinates
  2. Semantic mapping: Automatically translate data values into visual properties (colors, sizes, styles)
  3. Statistical awareness: Built-in aggregation, error estimation, and confidence intervals
  4. Aesthetic defaults: Publication-ready themes and color palettes out of the box
  5. Matplotlib integration: Full compatibility with matplotlib customization when needed

Quick Start

python
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd

# Load example dataset
df = sns.load_dataset('tips')

# Create a simple visualization
sns.scatterplot(data=df, x='total_bill', y='tip', hue='day')
plt.show()

Core Plotting Interfaces

Function Interface (Traditional)

The function interface provides specialized plotting functions organized by visualization type. Each category has axes-level functions (which draw onto a single Axes) and figure-level functions (which manage an entire figure, including faceting).
When to use:
  • Quick exploratory analysis
  • Single-purpose visualizations
  • When you need a specific plot type

Objects Interface (Modern)

The seaborn.objects interface provides a declarative, composable API similar to ggplot2. Build visualizations by chaining methods to specify data mappings, marks, transformations, and scales.
When to use:
  • Complex layered visualizations
  • When you need fine-grained control over transformations
  • Building custom plot types
  • Programmatic plot generation
python
from seaborn import objects as so

# Declarative syntax
(
    so.Plot(data=df, x='total_bill', y='tip')
    .add(so.Dot(), color='day')
    .add(so.Line(), so.PolyFit())
)

Plotting Functions by Category

Relational Plots (Relationships Between Variables)

Use for: Exploring how two or more variables relate to each other
  • scatterplot() - Display individual observations as points
  • lineplot() - Show trends and changes (automatically aggregates and computes CI)
  • relplot() - Figure-level interface with automatic faceting
Key parameters:
  • x, y - Primary variables
  • hue - Color encoding for an additional categorical or continuous variable
  • size - Point/line size encoding
  • style - Marker/line style encoding
  • col, row - Facet into multiple subplots (figure-level only)
python
# Scatter with multiple semantic mappings
sns.scatterplot(data=df, x='total_bill', y='tip', hue='time', size='size', style='sex')

# Line plot with confidence intervals (`timeseries` is a placeholder long-form DataFrame)
sns.lineplot(data=timeseries, x='date', y='value', hue='category')

# Faceted relational plot
sns.relplot(data=df, x='total_bill', y='tip', col='time', row='sex', hue='smoker', kind='scatter')

Distribution Plots (Single and Bivariate Distributions)

Use for: Understanding data spread, shape, and probability density
  • histplot() - Bar-based frequency distributions with flexible binning
  • kdeplot() - Smooth density estimates using Gaussian kernels
  • ecdfplot() - Empirical cumulative distribution (no parameters to tune)
  • rugplot() - Individual observation tick marks
  • displot() - Figure-level interface for univariate and bivariate distributions
  • jointplot() - Bivariate plot with marginal distributions
  • pairplot() - Matrix of pairwise relationships across a dataset
Key parameters:
  • x, y - Variables (y optional for univariate)
  • hue - Separate distributions by category
  • stat - Normalization: "count", "frequency", "probability", "percent", "density"
  • bins / binwidth - Histogram binning control
  • bw_adjust - KDE bandwidth multiplier (higher = smoother)
  • fill - Fill area under curve
  • multiple - How to handle hue: "layer", "stack", "dodge", "fill"
python
# Histogram with density normalization
sns.histplot(data=df, x='total_bill', hue='time', stat='density', multiple='stack')

# Bivariate KDE with filled contours
sns.kdeplot(data=df, x='total_bill', y='tip', fill=True, levels=5, thresh=0.1)

# Joint plot with marginals
sns.jointplot(data=df, x='total_bill', y='tip', kind='scatter', hue='time')

# Pairwise relationships (uses the iris dataset, which has a 'species' column; tips does not)
iris = sns.load_dataset('iris')
sns.pairplot(data=iris, hue='species', corner=True)
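To make the `stat` options concrete, here is a small pure-Python sketch of how the same bin counts could be rescaled under each normalization. It illustrates the definitions only and is not seaborn's internal code; the sample values and bin edges are made up.

```python
# Hypothetical sample and bin edges (equal width of 10)
values = [3, 7, 12, 14, 18, 21, 25, 26, 27, 38]
edges = [0, 10, 20, 30, 40]

bins = list(zip(edges[:-1], edges[1:]))
counts = [sum(lo <= v < hi for v in values) for lo, hi in bins]
widths = [hi - lo for lo, hi in bins]
n = len(values)

stats = {
    "count": counts,                                           # raw observations per bin
    "frequency": [c / w for c, w in zip(counts, widths)],      # count divided by bin width
    "probability": [c / n for c in counts],                    # bar heights sum to 1
    "percent": [100 * c / n for c in counts],                  # bar heights sum to 100
    "density": [c / (n * w) for c, w in zip(counts, widths)],  # total bar area is 1
}
```

With unequal bin widths, only "density" keeps bars comparable by area, which is why it pairs naturally with an overlaid KDE curve.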

Categorical Plots (Comparisons Across Categories)

Use for: Comparing distributions or statistics across discrete categories
Categorical scatterplots:
  • stripplot() - Points with jitter to show all observations
  • swarmplot() - Non-overlapping points (beeswarm algorithm)
Distribution comparisons:
  • boxplot() - Quartiles and outliers
  • violinplot() - KDE + quartile information
  • boxenplot() - Enhanced boxplot for larger datasets
Statistical estimates:
  • barplot() - Mean/aggregate with confidence intervals
  • pointplot() - Point estimates with connecting lines
  • countplot() - Count of observations per category
Figure-level:
  • catplot() - Faceted categorical plots (set kind parameter)
Key parameters:
  • x, y - Variables (one typically categorical)
  • hue - Additional categorical grouping
  • order, hue_order - Control category ordering
  • dodge - Separate hue levels side-by-side
  • orient - "v" (vertical) or "h" (horizontal)
  • kind - Plot type for catplot: "strip", "swarm", "box", "violin", "boxen", "bar", "point", "count"
python
# Swarm plot showing all points
sns.swarmplot(data=df, x='day', y='total_bill', hue='sex')

# Violin plot with split for comparison
sns.violinplot(data=df, x='day', y='total_bill', hue='sex', split=True)

# Bar plot with error bars
sns.barplot(data=df, x='day', y='total_bill', hue='sex', estimator='mean', errorbar='ci')

# Faceted categorical plot
sns.catplot(data=df, x='day', y='total_bill', col='time', kind='box')

Regression Plots (Linear Relationships)

Use for: Visualizing linear regressions and residuals
  • regplot() - Axes-level regression plot with scatter + fit line
  • lmplot() - Figure-level with faceting support
  • residplot() - Residual plot for assessing model fit
Key parameters:
  • x, y - Variables to regress
  • order - Polynomial regression order
  • logistic - Fit logistic regression
  • robust - Use robust regression (less sensitive to outliers)
  • ci - Size of the confidence interval in percent (default 95)
  • scatter_kws, line_kws - Customize scatter and line properties
python
# Simple linear regression
sns.regplot(data=df, x='total_bill', y='tip')

# Polynomial regression with faceting
sns.lmplot(data=df, x='total_bill', y='tip', col='time', order=2, ci=95)

# Check residuals
sns.residplot(data=df, x='total_bill', y='tip')

Matrix Plots (Rectangular Data)

Use for: Visualizing matrices, correlations, and grid-structured data
  • heatmap() - Color-encoded matrix with annotations
  • clustermap() - Hierarchically-clustered heatmap
Key parameters:
  • data - 2D rectangular dataset (DataFrame or array)
  • annot - Display values in cells
  • fmt - Format string for annotations (e.g., ".2f")
  • cmap - Colormap name
  • center - Value at colormap center (for diverging colormaps)
  • vmin, vmax - Color scale limits
  • square - Force square cells
  • linewidths - Width of the lines separating cells
python
# Correlation heatmap (numeric columns only; tips also contains categorical columns)
corr = df.corr(numeric_only=True)
sns.heatmap(corr, annot=True, fmt='.2f', cmap='coolwarm', center=0, square=True)

# Clustered heatmap (`data` is a placeholder for a numeric 2D DataFrame or array)
sns.clustermap(data, cmap='viridis', standard_scale=1, figsize=(10, 10))
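Correlation heatmaps are a common place to trip over non-numeric columns. The sketch below, which uses a made-up DataFrame and a headless matplotlib backend, selects the numeric columns before correlating and pins the color scale to the full correlation range so the diverging palette is centered on zero.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import pandas as pd
import seaborn as sns

# Synthetic frame with one non-numeric column (values are illustrative)
frame = pd.DataFrame({
    "bill": [10.0, 20.0, 30.0, 40.0],
    "tip": [1.0, 2.5, 2.9, 4.2],
    "day": ["Sat", "Sun", "Sat", "Sun"],
})

corr = frame.select_dtypes("number").corr()  # drop text columns before correlating
ax = sns.heatmap(corr, annot=True, fmt=".2f", cmap="vlag", vmin=-1, vmax=1, center=0)
```

Fixing `vmin` and `vmax` keeps colors comparable between heatmaps drawn from different datasets.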

Multi-Plot Grids

Seaborn provides grid objects for creating complex multi-panel figures:

FacetGrid

Create subplots based on categorical variables. Most often used indirectly through figure-level functions (relplot, displot, catplot), but can be used directly for custom plots.
python
g = sns.FacetGrid(df, col='time', row='sex', hue='smoker')
g.map(sns.scatterplot, 'total_bill', 'tip')
g.add_legend()

PairGrid

Show pairwise relationships between all variables in a dataset.
python
# Uses the iris dataset, which has a 'species' column
iris = sns.load_dataset('iris')
g = sns.PairGrid(iris, hue='species')
g.map_upper(sns.scatterplot)
g.map_lower(sns.kdeplot)
g.map_diag(sns.histplot)
g.add_legend()

JointGrid

Combine a bivariate plot with marginal distributions.
python
g = sns.JointGrid(data=df, x='total_bill', y='tip')
g.plot_joint(sns.scatterplot)
g.plot_marginals(sns.histplot)

Figure-Level vs Axes-Level Functions

Understanding this distinction is crucial for effective seaborn usage:

Axes-Level Functions

  • Plot to a single matplotlib Axes object
  • Integrate easily into complex matplotlib figures
  • Accept ax= parameter for precise placement
  • Return Axes object
  • Examples: scatterplot, histplot, boxplot, regplot, heatmap
When to use:
  • Building custom multi-plot layouts
  • Combining different plot types
  • Need matplotlib-level control
  • Integrating with existing matplotlib code
python
fig, axes = plt.subplots(2, 2, figsize=(10, 10))
sns.scatterplot(data=df, x='x', y='y', ax=axes[0, 0])
sns.histplot(data=df, x='x', ax=axes[0, 1])
sns.boxplot(data=df, x='cat', y='y', ax=axes[1, 0])
sns.kdeplot(data=df, x='x', y='y', ax=axes[1, 1])
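The point that axes-level functions return the `Axes` they drew on can be checked directly. The following sketch uses a small made-up DataFrame and a headless backend. It shows that the returned object is the same `Axes` that was passed in, so ordinary matplotlib calls can follow.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Illustrative data only
points = pd.DataFrame({"x": [1, 2, 3, 4, 5, 6],
                       "y": [2, 1, 4, 3, 6, 5],
                       "cat": ["a", "b", "a", "b", "a", "b"]})

fig, (left, right) = plt.subplots(1, 2, figsize=(8, 4))
returned_left = sns.scatterplot(data=points, x="x", y="y", ax=left)
returned_right = sns.boxplot(data=points, x="cat", y="y", ax=right)
right.set_title("y by cat")  # plain matplotlib call on the same Axes
fig.tight_layout()
```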

Figure-Level Functions

  • Manage entire figure including all subplots
  • Built-in faceting via col and row parameters
  • Return FacetGrid, JointGrid, or PairGrid objects
  • Use height and aspect for sizing (per subplot)
  • Cannot be placed in an existing figure
  • Examples: relplot, displot, catplot, lmplot, jointplot, pairplot
When to use:
  • Faceted visualizations (small multiples)
  • Quick exploratory analysis
  • Consistent multi-panel layouts
  • Don't need to combine with other plot types
python
# Automatic faceting
sns.relplot(data=df, x='x', y='y', col='category', row='group', hue='type', height=3, aspect=1.2)
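A short sketch of how faceting and per-subplot sizing interact, using made-up data with two groups and three categories. Since `height` and `aspect` describe each facet rather than the whole figure, the overall figure grows with the number of rows and columns.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import pandas as pd
import seaborn as sns

# 2 groups x 3 categories, two points per cell (illustrative values)
rows = [(g, c, x, x * (i + 1))
        for g in ["g1", "g2"]
        for i, c in enumerate(["a", "b", "c"])
        for x in (1.0, 2.0)]
grid_df = pd.DataFrame(rows, columns=["group", "category", "x", "y"])

g = sns.relplot(data=grid_df, x="x", y="y", row="group", col="category",
                height=2, aspect=1.5)
n_rows, n_cols = g.axes.shape  # one Axes per facet, arranged as rows x columns
```

Individual facets remain ordinary matplotlib `Axes` and can be reached through `g.axes` for further tweaks.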

Data Structure Requirements

Long-Form Data (Preferred)

Each variable is a column, each observation is a row. This "tidy" format provides maximum flexibility:

Long-form structure:
   subject  condition  measurement
0        1    control         10.5
1        1  treatment         12.3
2        2    control          9.8
3        2  treatment         13.1

**Advantages:**
- Works with all seaborn functions
- Easy to remap variables to visual properties
- Supports arbitrary complexity
- Natural for DataFrame operations

Wide-Form Data

Variables are spread across columns. Useful for simple rectangular data:

Wide-form structure:
   control  treatment
0     10.5       12.3
1      9.8       13.1

**Use cases:**
- Simple time series
- Correlation matrices
- Heatmaps
- Quick plots of array data

**Converting wide to long:**
```python
df_long = df.melt(var_name='condition', value_name='measurement')
```
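When the wide table also carries an identifier column, `melt` needs to be told to keep it. A minimal sketch, assuming a hypothetical `subject` column alongside the two condition columns:

```python
import pandas as pd

# Wide form: one row per subject, one column per condition (illustrative values)
wide = pd.DataFrame({"subject": [1, 2],
                     "control": [10.5, 9.8],
                     "treatment": [12.3, 13.1]})

# id_vars keeps the subject column; the remaining columns are stacked into rows
long = wide.melt(id_vars="subject", var_name="condition", value_name="measurement")
long = long.sort_values(["subject", "condition"]).reset_index(drop=True)
```

The result has one row per subject and condition, matching the long-form layout described earlier.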

Color Palettes

Seaborn provides carefully designed color palettes for different data types:

Qualitative Palettes (Categorical Data)

Distinguish categories through hue variation:
  • "deep" - Default, vivid colors
  • "muted" - Softer, less saturated
  • "pastel" - Light, desaturated
  • "bright" - Highly saturated
  • "dark" - Dark values
  • "colorblind" - Safe for color vision deficiency
python
sns.set_palette("colorblind")
sns.color_palette("Set2")

Sequential Palettes (Ordered Data)

Show progression from low to high values:
  • "rocket", "mako" - Wide luminance range (good for heatmaps)
  • "flare", "crest" - Restricted luminance (good for points/lines)
  • "viridis", "magma", "plasma" - Matplotlib perceptually uniform
python
sns.heatmap(data, cmap='rocket')
sns.kdeplot(data=df, x='x', y='y', cmap='mako', fill=True)

Diverging Palettes (Centered Data)

Emphasize deviations from a midpoint:
  • "vlag" - Blue to red
  • "icefire" - Blue to orange
  • "coolwarm" - Cool to warm
  • "Spectral" - Rainbow diverging
python
sns.heatmap(correlation_matrix, cmap='vlag', center=0)

Custom Palettes

python
# Create custom palette
custom = sns.color_palette("husl", 8)

# Light to dark gradient
palette = sns.light_palette("seagreen", as_cmap=True)

# Diverging palette from hues
palette = sns.diverging_palette(250, 10, as_cmap=True)
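It can help to see what these palette helpers return. In the sketch below (variable names are arbitrary), a discrete palette behaves like a list of RGB tuples, while `as_cmap=True` yields a matplotlib colormap that can be called on values between 0 and 1.

```python
import seaborn as sns

categorical = sns.color_palette("husl", 8)  # list-like of 8 (r, g, b) tuples
hex_codes = categorical.as_hex()            # the same colors as '#rrggbb' strings
gradient = sns.light_palette("seagreen", as_cmap=True)  # continuous colormap

mid_rgba = gradient(0.5)  # colormaps map a value in [0, 1] to an RGBA tuple
```

Discrete palettes suit the `palette=` argument for categorical `hue` variables, and colormap objects suit `cmap=` in functions such as `heatmap`.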

Theming and Aesthetics

Set Theme

set_theme() controls overall appearance:
python
# Set complete theme
sns.set_theme(style='whitegrid', palette='pastel', font='sans-serif')

# Return to seaborn's default theme
sns.set_theme()

# Restore matplotlib's own defaults
sns.reset_defaults()

Styles

Control background and grid appearance:
  • "darkgrid" - Gray background with white grid (default)
  • "whitegrid" - White background with gray grid
  • "dark" - Gray background, no grid
  • "white" - White background, no grid
  • "ticks" - White background with axis ticks
python
sns.set_style("whitegrid")

# Remove the top and right spines; offset and trim the remaining ones
sns.despine(left=False, bottom=False, offset=10, trim=True)

# Temporary style
with sns.axes_style("white"):
    sns.scatterplot(data=df, x='x', y='y')

Contexts

Scale elements for different use cases:
  • "paper" - Smallest
  • "notebook" - Slightly larger (default)
  • "talk" - Presentation slides
  • "poster" - Large format
python
sns.set_context("talk", font_scale=1.2)

# Temporary context
with sns.plotting_context("poster"):
    sns.barplot(data=df, x='category', y='value')
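The effect of a context can be inspected without drawing anything, because `plotting_context()` returns the rc parameters a context would set. The sketch below compares the `font.size` entry across the four contexts; the exact numbers may vary by seaborn version, so only their ordering is relied on here.

```python
import seaborn as sns

# plotting_context() returns the parameters for a context without applying them
sizes = {name: sns.plotting_context(name)["font.size"]
         for name in ["paper", "notebook", "talk", "poster"]}

# font_scale multiplies the font-related entries on top of the chosen context
scaled = sns.plotting_context("talk", font_scale=1.2)["font.size"]
```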

Best Practices

1. Data Preparation

Always use well-structured DataFrames with meaningful column names:
python
# Good: Named columns in DataFrame
df = pd.DataFrame({'bill': bills, 'tip': tips, 'day': days})
sns.scatterplot(data=df, x='bill', y='tip', hue='day')

# Avoid: Unnamed arrays
sns.scatterplot(x=x_array, y=y_array)  # Loses axis labels

2. Choose the Right Plot Type

Continuous x, continuous y: scatterplot, lineplot, kdeplot, regplot
One categorical and one continuous variable: violinplot, boxplot, stripplot, swarmplot
One continuous variable: histplot, kdeplot, ecdfplot
Correlations/matrices: heatmap, clustermap
Pairwise relationships: pairplot, jointplot

3. Use Figure-Level Functions for Faceting

python
# Preferred: let a figure-level function handle the faceting
sns.relplot(data=df, x='x', y='y', col='category', col_wrap=3)

# Avoid: creating subplots manually for simple faceting

4. Leverage Semantic Mappings

Use hue, size, and style to encode additional dimensions:
python
sns.scatterplot(data=df, x='x', y='y',
                hue='category',      # Color by category
                size='importance',    # Size by continuous variable
                style='type')         # Marker style by type

5. Control Statistical Estimation

Many functions compute statistics automatically. Understand and customize:
python
# lineplot computes the mean and a 95% CI by default
sns.lineplot(data=df, x='time', y='value', errorbar='sd')  # Use standard deviation instead

# barplot computes the mean by default
sns.barplot(data=df, x='category', y='value',
            estimator='median',    # Use median instead
            errorbar=('ci', 95))   # Bootstrapped CI
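To give a feel for what a bootstrapped confidence interval involves, here is a standard-library sketch of a percentile bootstrap. It is an illustration of the general technique, not seaborn's implementation. The helper name, sample values, resample count, and seed are all assumptions made for the example.

```python
import random
import statistics

def bootstrap_ci(values, estimator=statistics.mean, level=95, n_boot=1000, seed=0):
    """Percentile bootstrap interval for `estimator` (illustrative helper)."""
    rng = random.Random(seed)
    estimates = sorted(
        estimator(rng.choices(values, k=len(values)))  # resample with replacement
        for _ in range(n_boot)
    )
    tail = (100 - level) / 2 / 100  # e.g. 0.025 on each side for a 95% interval
    lo = estimates[int(tail * n_boot)]
    hi = estimates[int((1 - tail) * n_boot) - 1]
    return lo, hi

sample = [9.8, 10.5, 11.2, 12.3, 12.9, 13.1, 14.0, 15.4]
low, high = bootstrap_ci(sample)
median_low, median_high = bootstrap_ci(sample, estimator=statistics.median)
```

Swapping the estimator changes both the plotted value and its interval, which is why `estimator` and `errorbar` are best chosen together.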

6. Combine with Matplotlib

Seaborn integrates seamlessly with matplotlib for fine-tuning:
python
ax = sns.scatterplot(data=df, x='x', y='y')
ax.set(xlabel='Custom X Label', ylabel='Custom Y Label',
       title='Custom Title')
ax.axhline(y=0, color='r', linestyle='--')
plt.tight_layout()

7. Save High-Quality Figures

python
g = sns.relplot(data=df, x='x', y='y', col='group')  # returns a FacetGrid
g.savefig('figure.png', dpi=300, bbox_inches='tight')
g.savefig('figure.pdf')  # Vector format for publications

Common Patterns

Exploratory Data Analysis

python
# Quick overview of all relationships
sns.pairplot(data=df, hue='target', corner=True)

# Distribution exploration
sns.displot(data=df, x='variable', hue='group', kind='kde', fill=True, col='category')

# Correlation analysis (numeric columns only)
corr = df.corr(numeric_only=True)
sns.heatmap(corr, annot=True, cmap='coolwarm', center=0)

Publication-Quality Figures

python
sns.set_theme(style='ticks', context='paper', font_scale=1.1)

g = sns.catplot(data=df, x='treatment', y='response',
                col='cell_line', kind='box', height=3, aspect=1.2)
g.set_axis_labels('Treatment Condition', 'Response (μM)')
g.set_titles('{col_name}')
sns.despine(trim=True)

g.savefig('figure.pdf', dpi=300, bbox_inches='tight')

Complex Multi-Panel Figures

python
# Using matplotlib subplots with seaborn
fig, axes = plt.subplots(2, 2, figsize=(12, 10))
sns.scatterplot(data=df, x='x1', y='y', hue='group', ax=axes[0, 0])
sns.histplot(data=df, x='x1', hue='group', ax=axes[0, 1])
sns.violinplot(data=df, x='group', y='y', ax=axes[1, 0])
sns.heatmap(df.pivot_table(values='y', index='x1', columns='x2'),
            ax=axes[1, 1], cmap='viridis')
plt.tight_layout()

Time Series with Confidence Bands

python
# lineplot aggregates repeated x values; errorbar='sd' draws a
# standard-deviation band instead of the default 95% CI
sns.lineplot(data=timeseries, x='date', y='measurement', hue='sensor', style='location', errorbar='sd')

# For more control
g = sns.relplot(data=timeseries, x='date', y='measurement',
                col='location', hue='sensor', kind='line',
                height=4, aspect=1.5, errorbar=('ci', 95))
g.set_axis_labels('Date', 'Measurement (units)')

Troubleshooting

Issue: Legend Outside Plot Area

Figure-level functions place the legend outside the plot area by default. To reposition it:
python
g = sns.relplot(data=df, x='x', y='y', hue='category')
sns.move_legend(g, 'center right', bbox_to_anchor=(0.9, 0.5))  # Adjust position

Issue: Overlapping Labels

python
plt.xticks(rotation=45, ha='right')
plt.tight_layout()

Issue: Figure Too Small

For figure-level functions:
python
sns.relplot(data=df, x='x', y='y', height=6, aspect=1.5)
For axes-level functions:
python
fig, ax = plt.subplots(figsize=(10, 6))
sns.scatterplot(data=df, x='x', y='y', ax=ax)

Issue: Colors Not Distinct Enough

python
# Use a different palette
sns.set_palette("bright")

# Or generate one color per category
palette = sns.color_palette("husl", n_colors=df['category'].nunique())
sns.scatterplot(data=df, x='x', y='y', hue='category', palette=palette)

Issue: KDE Too Smooth or Jagged

python
# Adjust bandwidth
sns.kdeplot(data=df, x='x', bw_adjust=0.5)  # Less smooth
sns.kdeplot(data=df, x='x', bw_adjust=2)    # More smooth
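The effect of a bandwidth multiplier can be sketched without any plotting. The pure-Python example below is a simplified stand-in, not seaborn's code. It assumes a Gaussian kernel with a Scott's-rule bandwidth scaled by `bw_adjust`, and the data points are made up. A small multiplier keeps the two clusters visible as separate peaks, while a large one merges them.

```python
import math
import statistics

def gaussian_kde(data, grid, bw_adjust=1.0):
    """Gaussian KDE with a Scott's-rule bandwidth scaled by bw_adjust (illustrative)."""
    n = len(data)
    h = bw_adjust * statistics.stdev(data) * n ** (-1 / 5)  # Scott's rule in 1D
    norm = 1 / (n * h * math.sqrt(2 * math.pi))
    return [norm * sum(math.exp(-0.5 * ((x - d) / h) ** 2) for d in data)
            for x in grid]

def count_peaks(density):
    """Number of strict local maxima along the grid."""
    return sum(density[i - 1] < density[i] > density[i + 1]
               for i in range(1, len(density) - 1))

data = [-2.1, -2.0, -1.9, 1.9, 2.0, 2.1]   # two tight clusters
grid = [i / 20 for i in range(-160, 161)]  # -8.0 to 8.0 in steps of 0.05

peaks_narrow = count_peaks(gaussian_kde(data, grid, bw_adjust=0.5))
peaks_wide = count_peaks(gaussian_kde(data, grid, bw_adjust=2))
```

If a KDE hides structure you expect to see, lowering `bw_adjust` is usually the first thing to try; if it shows spurious bumps, raise it.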

Resources

This skill includes reference materials for deeper exploration:

references/

  • function_reference.md - Comprehensive listing of all seaborn functions with parameters and examples
  • objects_interface.md - Detailed guide to the modern seaborn.objects API
  • examples.md - Common use cases and code patterns for different analysis scenarios
Load reference files as needed for detailed function signatures, advanced parameters, or specific examples.