Seaborn statistical plots
Seaborn understands DataFrames and adds statistical awareness — learn when pairplot, heatmap, catplot, and boxplot each reveal insight.
- Explain seaborn's positioning relative to matplotlib
- Identify when pairplot, heatmap, catplot, and boxplot each provide useful insight
- Describe what "statistical visualisation" means in practice
Matplotlib is a drawing library — it draws lines, rectangles, and text precisely where you tell it to. Seaborn is a statistical visualisation library. The difference is that seaborn understands DataFrames natively, knows about groups and categories, and applies sensible statistical defaults automatically.
When you call sns.boxplot(data=df, x="category", y="value"), seaborn groups
the data by category, computes the box statistics, and draws a labelled plot,
all in one line. With plain matplotlib, ax.boxplot() still computes the
quartiles for you, but you have to split the values by group yourself, pass
the arrays in, and label the ticks and axes by hand.
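To make the comparison concrete, here is a minimal sketch that draws the same grouped box plot both ways. The DataFrame is made-up toy data (only the column names come from the example above), and the Agg backend line is there only so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; omit this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical toy data; the column names match the example in the text.
df = pd.DataFrame({
    "category": ["a", "a", "a", "b", "b", "b", "c", "c", "c"],
    "value": [1.0, 2.0, 3.0, 2.0, 4.0, 6.0, 3.0, 3.5, 9.0],
})

fig, (ax_sns, ax_mpl) = plt.subplots(1, 2, figsize=(8, 3))

# seaborn: grouping, box statistics, and axis labels in one call
sns.boxplot(data=df, x="category", y="value", ax=ax_sns)

# matplotlib: split the values by group yourself, then label everything
groups = {name: g["value"].to_numpy() for name, g in df.groupby("category")}
ax_mpl.boxplot(list(groups.values()))  # boxes are drawn at positions 1..N
ax_mpl.set_xticks(list(range(1, len(groups) + 1)))
ax_mpl.set_xticklabels(list(groups.keys()))
ax_mpl.set_xlabel("category")
ax_mpl.set_ylabel("value")
```

Both panels show the same boxes; the difference is how much bookkeeping you write to get there.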
Seaborn draws on matplotlib under the hood. Everything it creates is a
matplotlib figure, so you can always retrieve the axes and customise with
standard ax.set_title() calls.
Four plots worth knowing
pairplot — multi-variable overview
sns.pairplot(df) takes a DataFrame and draws a grid where each numeric column
is plotted against every other. The diagonal shows each column's distribution;
the off-diagonal cells show scatter plots for each pair. It is the fastest way
to see all bivariate relationships in a dataset at once — a standard first step
in exploratory analysis.
Use it when you have four to ten numeric columns and want a broad overview. With more than ten columns the grid becomes unreadably dense.
heatmap — correlations and matrices
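sns.heatmap(df.corr()) converts a correlation matrix into a colour-coded grid.
With a diverging colormap centred on zero (for example cmap="coolwarm" with
center=0), strong positive correlations take one colour, strong negative
correlations the other, and near-zero correlations a neutral shade. Seaborn's
default colormap is sequential, so set these options explicitly for correlation
matrices. At a glance you can see which features move together.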
Heatmaps also work for any matrix of values — for example, a pivot table of sales by product and region.
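The following sketch shows one way to draw a correlation heatmap with a diverging colormap. The data is synthetic and all numeric, so df.corr() works as is. The columns "pos", "neg", and "noise" are invented to give one strong positive, one strong negative, and one near-zero correlation with "x".

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; omit this line in a notebook
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic data, assumed purely for illustration.
rng = np.random.default_rng(1)
x = rng.normal(size=200)
df = pd.DataFrame({
    "x": x,
    "pos": x + rng.normal(scale=0.3, size=200),    # moves with x
    "neg": -x + rng.normal(scale=0.3, size=200),   # moves against x
    "noise": rng.normal(size=200),                 # unrelated to x
})

corr = df.corr()

# Fix the colour scale to [-1, 1] and centre it on 0 so that the sign of a
# correlation maps to the hue and its strength to the intensity.
ax = sns.heatmap(corr, cmap="coolwarm", vmin=-1, vmax=1, center=0,
                 annot=True, fmt=".2f")
ax.set_title("Correlation matrix")
```

Pinning vmin and vmax keeps the colours comparable between datasets. Otherwise the scale stretches to whatever range the matrix happens to contain.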
catplot — grouped comparisons
sns.catplot(data=df, x="category", y="value", kind="bar") is a high-level
interface for categorical plots. Setting kind= to "bar", "box", "strip",
or "violin" switches the plot type without changing anything else. When you
also set hue= to a second categorical column, seaborn automatically splits
each group into coloured sub-bars or sub-boxes.
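A small sketch of the kind= switch, assuming a synthetic DataFrame with a second categorical column named "segment" (a made-up name) to pass as hue. Only the kind argument changes between calls.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; omit this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic data, assumed purely for illustration: 10 rows per
# (category, segment) combination.
rng = np.random.default_rng(2)
df = pd.DataFrame({
    "category": np.repeat(["a", "b"], 20),
    "segment": np.tile(["x", "y"], 20),
    "value": rng.normal(loc=10, scale=2, size=40),
})

grids = {}
for kind in ("bar", "box", "strip", "violin"):
    # Same data and same mapping; only the plot type differs.
    grids[kind] = sns.catplot(data=df, x="category", y="value",
                              hue="segment", kind=kind)
    grids[kind].ax.set_title(f"kind={kind!r}")
```

Each call returns a FacetGrid with its own figure, so the loop produces four separate figures.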
boxplot — distributions and outliers
sns.boxplot(data=df, x="group", y="value") is the most direct way to compare
distributions across groups. The box spans the interquartile range (IQR), each
whisker extends to the most extreme data point within 1.5× IQR of the box
edge, and points beyond that are plotted individually as outliers.
It is the right first tool when you suspect outliers or skewed distributions in
a grouped dataset.
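The whisker rule is easy to check by hand. The sketch below reproduces the default calculation with NumPy on a made-up array. It illustrates the rule and is not seaborn's internal code.

```python
import numpy as np

# Made-up sample with one obvious outlier.
values = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 30], dtype=float)

q1, q3 = np.percentile(values, [25, 75])   # box edges: 3.25 and 7.75
iqr = q3 - q1                              # 4.5

# Fences sit 1.5 × IQR beyond the box edges.
lower_fence = q1 - 1.5 * iqr               # -3.5
upper_fence = q3 + 1.5 * iqr               # 14.5

# Whiskers stop at the most extreme data points inside the fences,
# not at the fences themselves.
inside = values[(values >= lower_fence) & (values <= upper_fence)]
whisker_low, whisker_high = inside.min(), inside.max()   # 1.0 and 9.0

outliers = values[(values < lower_fence) | (values > upper_fence)]
print(whisker_low, whisker_high, outliers)  # 1.0 9.0 [30.]
```

The upper fence is 14.5, but the whisker ends at 9 because that is the largest observation inside the fence. The value 30 is drawn as an individual point.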
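Axes-level functions such as sns.boxplot() and sns.heatmap() return a
matplotlib Axes object. After calling one, you can use any ax. method on the
result, call plt.title() or plt.xlabel(), or retrieve the current axes with
plt.gca(). Figure-level functions such as sns.pairplot() and sns.catplot()
return a grid object (PairGrid or FacetGrid) that wraps one or more axes; reach
them through its .axes attribute or, for a single-panel catplot, .ax.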
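A short sketch of customising seaborn output through matplotlib, using made-up data. It contrasts a function that hands back an Axes with one that hands back a grid object.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; omit this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical toy data.
df = pd.DataFrame({
    "group": ["a", "a", "a", "b", "b", "b"],
    "value": [1.0, 2.0, 4.0, 3.0, 5.0, 8.0],
})

# Axes-level function: the return value is the matplotlib Axes it drew on.
ax = sns.boxplot(data=df, x="group", y="value")
ax.set_title("Value by group")
same_ax = plt.gca()   # the current axes is that same object

# Figure-level function: the return value is a FacetGrid with its own figure.
grid = sns.catplot(data=df, x="group", y="value", kind="box")
grid.ax.set_title("Same data via catplot")   # .ax works for a single panel
```

For grids with several panels, iterate over grid.axes.flat instead of using .ax.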
When to use each
| Plot | Best for |
|---|---|
| pairplot | First-look overview of all numeric columns |
| heatmap | Correlation matrix or any value grid |
| catplot | Comparing a numeric value across categories |
| boxplot | Spread, skew, and outliers within groups |
Where to go next
Next: seaborn in practice — running pairplot, heatmap, and catplot on an inline DataFrame so you can see exactly what each produces.