Code of the Day
Intermediate · Visualisation

Anatomy of a good chart

Data-ink ratio, accessible colour palettes, and targeted annotation — the principles that separate clear charts from cluttered ones.

Data Science · Intermediate · 6 min read
By the end of this lesson you will be able to:
  • Apply the data-ink ratio principle by identifying elements to remove
  • Choose the right colour palette type for ordered, diverging, and categorical data
  • Annotate a chart to highlight the interesting point without adding clutter

A chart can be technically correct and still be hard to read. The culprit is usually visual noise — elements that occupy space without carrying information. Three principles address the most common causes.

Data-ink ratio

Edward Tufte introduced the data-ink ratio: the fraction of the chart's ink that is doing the work of representing data. A high ratio means almost every pixel is a data point or a label that helps the reader interpret one. A low ratio means the chart is full of decoration.

In practice, the easiest wins are:

  • Remove the top and right spines (ax.spines["top"].set_visible(False)) — they frame the chart without adding information.
  • Reduce gridlines — one set of horizontal gridlines at a low opacity is usually sufficient. Vertical gridlines rarely help a bar or line chart.
  • Remove the legend box border: ax.legend(frameon=False) is cleaner.
  • Avoid 3-D effects — 3-D bar charts distort the visual comparison because depth creates an apparent height difference that does not exist in the data.

None of these changes alter what the chart says. They reduce the time a reader spends filtering noise before reaching the signal.
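
The clean-up steps above can be sketched in a few lines of matplotlib. This is a minimal illustration rather than a house style: the months and sales figures are made-up data, and the Agg backend is selected only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption so this runs headless
import matplotlib.pyplot as plt

# Illustrative data, not taken from the lesson
months = ["Jan", "Feb", "Mar", "Apr"]
sales = [12, 18, 9, 15]

fig, ax = plt.subplots()
ax.bar(months, sales, label="Sales")

# Remove the top and right spines: they frame the chart without adding information
for side in ("top", "right"):
    ax.spines[side].set_visible(False)

# Keep one set of faint horizontal gridlines and no vertical ones
ax.yaxis.grid(True, alpha=0.3)
ax.xaxis.grid(False)
ax.set_axisbelow(True)  # draw the gridlines behind the bars

# Legend without a box around it
legend = ax.legend(frameon=False)
```

The left and bottom spines stay in place because they carry the axis ticks that the reader needs to interpret the bars.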

Colour palette types

Colour is encoding. Choosing the wrong palette type obscures the data.

Sequential palettes (light-to-dark, single hue) are for data with a natural order, where a darker shade means a higher value. Typical uses are choropleth maps of population density and heatmaps of a single positive metric. Using a diverging palette here implies a meaningful midpoint where none exists.

Diverging palettes (two hues meeting at a neutral centre) are for data where the midpoint matters: correlation coefficients centred at zero, temperature anomalies relative to an average, profit/loss. The neutral centre is visually prominent; values pull away from it in two directions.

Categorical palettes (distinct, unordered colours) are for nominal groups: product names, countries, experiment conditions. These colours should be perceptually distinct and of similar luminance — no one group should visually dominate because it is brighter.
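
As a rough sketch of how the three palette types map onto matplotlib, the snippet below pairs each kind of data with one built-in colormap. The random data, the group names, and the specific colormap choices ("Blues", "RdBu_r", "tab10") are assumptions for illustration; other colormaps of the same type would serve equally well.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption so this runs headless
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
density = rng.uniform(0, 100, size=(5, 5))  # illustrative single positive metric
corr = rng.uniform(-1, 1, size=(5, 5))      # illustrative values centred on zero

fig, (ax_seq, ax_div, ax_cat) = plt.subplots(1, 3, figsize=(12, 3.5))

# Sequential: one hue, light to dark, for ordered data
im_seq = ax_seq.imshow(density, cmap="Blues")
ax_seq.set_title("Sequential")

# Diverging: symmetric limits keep the neutral centre of the colormap at zero
im_div = ax_div.imshow(corr, cmap="RdBu_r", vmin=-1, vmax=1)
ax_div.set_title("Diverging")

# Categorical: distinct, unordered colours for nominal groups
groups = ["A", "B", "C"]  # hypothetical group names
colours = plt.get_cmap("tab10").colors[:len(groups)]
ax_cat.bar(groups, [3, 5, 4], color=colours)
ax_cat.set_title("Categorical")
```

Setting vmin and vmax symmetrically around zero is what makes the diverging map honest: without it, the neutral colour lands on the midpoint of whatever range the data happens to cover.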

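Red-green combinations are the most common colour accessibility problem: roughly 8% of men have some form of red-green colour deficiency. Use palette tools that check for colour-blind safety. seaborn's "colorblind" palette is a reasonable starting point. matplotlib's "tab10" default is perceptually distinct, but it includes both a red and a green, so check it with a colour-blindness simulator before relying on it.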
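
If seaborn is not available, one possible alternative (not mentioned in this lesson) is matplotlib's bundled "tableau-colorblind10" style sheet, which swaps the default colour cycle for one intended to be colour-blind friendly. The sketch below applies it to a single figure; the region names and values are made up.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption so this runs headless
import matplotlib.pyplot as plt

x = [0, 1, 2, 3]
# Illustrative series, not taken from the lesson
series = {"North": [1, 2, 3, 4], "South": [2, 2, 3, 5], "West": [4, 3, 2, 1]}

# The style applies only to artists created inside the context block
with plt.style.context("tableau-colorblind10"):
    fig, ax = plt.subplots()
    lines = [ax.plot(x, y, label=name)[0] for name, y in series.items()]
    ax.legend(frameon=False)

line_colours = [line.get_color() for line in lines]
```

Using a context manager keeps the change local, so other figures in the same script keep whatever palette they already use.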

Annotate the interesting point, not every point

Annotation adds text directly to the chart. The temptation is to label everything — every bar value, every data point. That produces a chart that is technically complete and practically unreadable.

The better approach: annotate the one or two points that tell the story.

  • If a line chart shows a sudden spike, add a text label at the spike explaining what caused it.
  • If a scatter plot has one clear outlier, label that point with its identifier.
  • If a bar chart has a clear winner, annotate that bar's value; leave the others unlabelled.

The annotation should answer the reader's first question ("what is the interesting thing here?") before they have to hunt for it.
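
A minimal sketch of the spike case, using matplotlib's ax.annotate: the code finds the single peak and labels only that point. The visit counts and the stated cause ("newsletter sent") are invented for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption so this runs headless
import matplotlib.pyplot as plt

days = list(range(1, 11))
visits = [120, 125, 118, 130, 127, 410, 135, 129, 131, 126]  # illustrative data

fig, ax = plt.subplots()
ax.plot(days, visits)

# Locate the one point worth talking about: the peak
peak_index = max(range(len(visits)), key=visits.__getitem__)
peak_day, peak_value = days[peak_index], visits[peak_index]

# Label only that point, with an arrow from the text to the data
note = ax.annotate(
    "Spike: newsletter sent",  # hypothetical explanation
    xy=(peak_day, peak_value),                # the point being annotated
    xytext=(peak_day + 1, peak_value - 40),   # where the text sits
    arrowprops={"arrowstyle": "->"},
)
```

Offsetting the text with xytext keeps the label off the line itself, so the annotation points at the data instead of covering it.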

Putting it together

These three principles reinforce each other. A chart stripped of redundant ink has room for one or two annotations without becoming cluttered. Correct colour encoding means annotations are shorter because the colour already conveys the group. The result is a chart where the reader's eye goes directly to the data, not to the frame around it.

Where to go next

Next: lab — visual summary — build a four-panel figure from a single dataset, applying chart selection, matplotlib/seaborn code, and the design principles from this lesson all at once.
