Code of the Day

Choosing a chart

Match your chart type to the question you are asking — bar for comparison, line for trend, scatter for relationship, histogram for distribution, box for spread.

Data Science | Intermediate | 7 min read
By the end of this lesson you will be able to:
  • Match each of five core chart types to the question it answers
  • Identify common chart mistakes that distort interpretation
  • State the one question that determines which chart to use

Every chart answers a question. Pick the chart before you pick the library. The most common mistake in data visualisation is choosing a chart type first and finding a dataset to fit it — the result looks busy but says nothing.

The question that determines the chart: "What am I showing, and how are these values related to each other?" Five relationships cover the vast majority of analysis work.
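As a rough sketch, the pairing of question and chart can be written down as a lookup table. The names `CHART_FOR_QUESTION` and `choose_chart` are hypothetical, introduced only for illustration; the five pairings themselves come from this lesson.

```python
# Hypothetical lookup: maps the kind of question to the chart type this lesson pairs it with.
CHART_FOR_QUESTION = {
    "comparison": "bar",
    "trend": "line",
    "relationship": "scatter",
    "distribution": "histogram",
    "spread": "box",
}


def choose_chart(question_kind: str) -> str:
    """Return the chart type for a question kind, or raise if it is not one of the five."""
    key = question_kind.strip().lower()
    if key not in CHART_FOR_QUESTION:
        raise ValueError(f"Unknown question kind: {question_kind!r}")
    return CHART_FOR_QUESTION[key]


print(choose_chart("trend"))  # line
```

The point of the sketch is the direction of the lookup: you start from the question and arrive at the chart, never the other way round.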

Bar chart — comparison across categories

Use a bar chart when you are comparing a numeric value across discrete groups: revenue by product, headcount by department, average rating by country. The height (or length, for horizontal bars) of each bar encodes the value, and the categories sit along the other axis.

A classic mistake is truncating the y-axis so it does not start at zero. Because bar height encodes the value, a difference that looks dramatic in a truncated chart may be trivial in proportion to the actual values. If your axis starts at 90 and the bars sit at 91 and 99, the second bar is drawn nine times as tall as the first even though its value is only about 9% larger.
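The distortion from a truncated axis can be quantified by comparing how tall the bars are drawn with how large the values really are. The following is a minimal sketch using the numbers from the example above; `drawn_height_ratio` is a hypothetical helper name, not part of any plotting library.

```python
def drawn_height_ratio(small: float, large: float, axis_start: float = 0.0) -> float:
    """Ratio of the drawn bar heights when the y-axis begins at axis_start."""
    if axis_start >= small:
        raise ValueError("axis_start must be below the smaller value")
    # A bar is drawn from the axis baseline up to its value.
    return (large - axis_start) / (small - axis_start)


honest_ratio = drawn_height_ratio(91, 99)                    # axis starts at 0
truncated_ratio = drawn_height_ratio(91, 99, axis_start=90)  # axis starts at 90

print(f"Axis from 0:  tall bar looks {honest_ratio:.2f}x the short one")
print(f"Axis from 90: tall bar looks {truncated_ratio:.2f}x the short one")
```

With the axis at zero the drawn ratio matches the true ratio of the values; with the axis at 90 the same two values appear to differ by a factor of nine.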

Line chart — change over time

A line chart implies that the x-axis is a continuous or ordered sequence — typically time. The line connecting points signals that there is a meaningful relationship between adjacent values (each follows from the previous). Do not use a line chart for unordered categories; use a bar chart instead.
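Because each segment of a line connects a point to its successor, the data should be sorted along the x-axis before it is plotted. The sketch below uses made-up monthly readings (the values are hypothetical) to show the ordering step and the point-to-point changes that the segments would depict.

```python
from datetime import date

# Hypothetical monthly readings, deliberately listed out of order.
readings = [
    (date(2024, 3, 1), 130),
    (date(2024, 1, 1), 100),
    (date(2024, 2, 1), 120),
]

# Sort by date so that adjacent points really are consecutive in time.
ordered = sorted(readings, key=lambda pair: pair[0])

# Change between each point and the one before it: what each line segment shows.
changes = [later[1] - earlier[1] for earlier, later in zip(ordered, ordered[1:])]

print(changes)  # [20, 10]
```

For unordered categories there is no meaningful "previous" value, so these differences, and the line that implies them, would be arbitrary.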

Scatter plot — relationship between two numeric variables

A scatter plot puts one numeric variable on each axis and draws a point per observation. It is the right tool for asking "is there a relationship between X and Y?": correlation, clusters, and outliers all become visible. Encoding a third variable as point colour or size can be useful, but keep the encoding simple.
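A numeric companion to the scatter plot is the Pearson correlation coefficient, which measures how closely the points follow a straight line. The sketch below computes it by hand with the standard library; `pearson_r` is a hypothetical helper name and the study-hours data is invented for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: hours studied and test score for five students.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 70]

r = pearson_r(hours, scores)
print(f"r = {r:.3f}")  # close to 1: a strong positive linear relationship
```

A single coefficient only captures linear association. Clusters, curved relationships, and outliers still need the plot itself, so treat the number as a supplement rather than a replacement.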

Histogram — distribution of a single variable

A histogram bins a continuous variable and shows how many observations fall in each bin. It answers "what does the distribution of this variable look like?": is it symmetric, skewed, or bimodal, and does it have long tails? Use it before modelling to understand what you are working with.
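The binning step can be sketched without any plotting library. The function below assumes equal-width bins spanning the minimum to the maximum value, which is one common choice rather than the only one; `histogram_counts` is a hypothetical name and the sample values are invented.

```python
def histogram_counts(values, bin_count):
    """Count observations in bin_count equal-width bins from min(values) to max(values)."""
    if not values or bin_count < 1:
        raise ValueError("need at least one value and at least one bin")
    low, high = min(values), max(values)
    counts = [0] * bin_count
    if low == high:
        # All values identical: there is no width to divide, so use the first bin.
        counts[0] = len(values)
        return counts
    width = (high - low) / bin_count
    for value in values:
        # The maximum value would land one past the end, so clamp it into the last bin.
        index = min(int((value - low) / width), bin_count - 1)
        counts[index] += 1
    return counts


# Hypothetical sample with one large value far from the rest.
sample = [1, 2, 2, 3, 3, 3, 4, 4, 5, 10]

print(histogram_counts(sample, 3))  # [6, 3, 1]
print(histogram_counts(sample, 9))  # [1, 2, 3, 2, 1, 0, 0, 0, 1]
```

The same data looks quite different at three bins and at nine: the coarse version hides the peak at 3, while the fine version exposes both the peak and the gap before the value at 10. Trying more than one bin count is a cheap check before drawing conclusions.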

Box plot — spread and outliers across groups

A box plot summarises a distribution compactly: the box spans the interquartile range (25th to 75th percentile) and the line inside is the median. By the most common convention, the whiskers extend to the most extreme observations within 1.5 times the interquartile range of the box, and any points beyond that are plotted individually as outliers. Box plots are most useful when comparing the spread of the same variable across several groups.
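The numbers behind a box plot can be computed with the standard library. This sketch assumes the widely used Tukey convention, in which whiskers reach the most extreme points within 1.5 times the interquartile range of the box and anything beyond is an outlier. Other conventions exist, and quartile values depend on the interpolation method, so results may differ slightly from a given plotting library. `box_plot_summary` is a hypothetical name and the data is invented.

```python
from statistics import quantiles


def box_plot_summary(values, whisker_factor=1.5):
    """Quartiles, whisker ends, and outliers under the 1.5 * IQR (Tukey) convention."""
    data = sorted(values)
    # method="inclusive" treats the data as the whole population when interpolating.
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - whisker_factor * iqr
    upper_fence = q3 + whisker_factor * iqr
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],    # smallest observation inside the fences
        "whisker_high": inside[-1],  # largest observation inside the fences
        "outliers": outliers,
    }


# Hypothetical sample with one extreme value.
summary = box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
print(summary["median"], summary["whisker_high"], summary["outliers"])  # 5.5 9 [100]
```

Note that the upper whisker stops at 9, the largest value inside the fence, rather than stretching to 100. That is what keeps a single extreme point from flattening the rest of the plot.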

| Chart type | Best question | Common mistake |
|---|---|---|
| Bar | How do categories compare? | Truncated y-axis |
| Line | How does a value change over time? | Using for unordered categories |
| Scatter | Is there a relationship between X and Y? | Overplotting with too many points |
| Histogram | What is the distribution of this variable? | Too few or too many bins |
| Box | How does spread compare across groups? | Omitting the sample size |

Pie charts are conspicuously absent from this list. They work when you have two or three slices and the proportional story matters, for example "half of sales came from one customer." With more than three or four slices, readers cannot reliably compare the angles and arc lengths. Use a bar chart instead.

Where to go next

Next: matplotlib essentials — creating figures with plt.subplots(), plotting lines and bars, adding labels and legends, all in runnable code.
