Choosing a model

A practical decision guide — linear regression, decision trees, and k-NN — applied to the same dataset so you can see the difference.

Knowing the bias-variance tradeoff tells you what can go wrong. The next step is a working heuristic for which model to reach for first. No algorithm wins on every problem — but practical defaults narrow the field quickly.

A three-way decision guide

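Linear regression is the right starting point when you expect the relationship between inputs and output to be roughly linear, or when you need a model that is easy to inspect and explain. Coefficients are directly interpretable: each one is the change in the prediction per unit increase in its feature, with the other features held fixed, so doubling a feature value doubles that feature's contribution to the prediction. Coefficient magnitudes depend on each feature's units, so compare them across features only after scaling. If the residuals show a systematic pattern (for example, a curve when plotted against a feature), linear regression has hit its complexity ceiling.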
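A minimal sketch of the residual check described above, using only the standard library. The curved target, the noise level, and the helper names `fit_line` and `mean` are assumptions made for illustration. It fits a straight line to data that is actually quadratic, then averages the residuals over the left, middle, and right thirds of the input range.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def mean(values):
    return sum(values) / len(values)

random.seed(0)
xs = [i / 10 for i in range(-30, 31)]             # 61 evenly spaced points in [-3, 3]
ys = [x ** 2 + random.gauss(0, 0.3) for x in xs]  # hypothetical curved target plus noise

slope, intercept = fit_line(xs, ys)
residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]

# Average residual in the left, middle, and right thirds of the x range.
left, middle, right = residuals[:20], residuals[20:41], residuals[41:]
print(f"slope={slope:.2f} intercept={intercept:.2f}")
print(f"mean residual: left={mean(left):.2f} "
      f"middle={mean(middle):.2f} right={mean(right):.2f}")
```

If the line were an adequate model, the three averages would all sit near zero. Here the outer thirds should come out positive and the middle third negative, which is the curved pattern that signals the model is too simple.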

Decision trees shine when interpretability matters more than raw accuracy and the relationships are non-linear. A shallow tree (max depth 3–5) is human-readable: you can print it and follow the logic by hand. Trees handle mixed feature types natively, require no scaling, and model interactions automatically. Their weakness is variance — a deep tree memorises noise.
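To illustrate the "print it and follow the logic by hand" point, here is a short sketch that assumes scikit-learn is available. The synthetic data and the feature names `x1` and `x2` are hypothetical. `export_text` renders the fitted tree as indented if/else rules.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor, export_text

# Hypothetical data: a linear effect of x1 plus a curved effect of x2.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 2))
y = 3 * X[:, 0] + 2 * X[:, 1] ** 2 + rng.normal(0, 0.5, size=200)

# Capping the depth keeps the tree small enough to read.
tree = DecisionTreeRegressor(max_depth=3, random_state=0)
tree.fit(X, y)

# Each printed line is a threshold test on one feature or a leaf prediction.
rules = export_text(tree, feature_names=["x1", "x2"])
print(rules)
```

With `max_depth=3` the tree has at most eight leaves, so the whole model fits on one screen. Raising the depth usually lowers training error but makes the printout harder to follow and increases the risk of memorising noise.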

k-Nearest Neighbours (k-NN) is lazy in the technical sense: it stores the training set and, at prediction time, averages the target values of the k closest training examples. It works well on small datasets where the feature space is low-dimensional and the local structure of the data matters. It scales poorly to large datasets (prediction time grows with training set size) and degrades in high dimensions, where distances between points become less informative.
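The mechanism is small enough to sketch from scratch. The function name `knn_predict` and the toy data below are hypothetical, and this brute-force version is meant only to show the idea, not how a library implements it.

```python
import math

def knn_predict(train_X, train_y, query, k=3):
    """Predict by averaging the targets of the k nearest training points."""
    # "Lazy": there is no fit step. Every prediction scans the whole
    # training set, which is why cost grows with the number of examples.
    distances = sorted(
        (math.dist(x, query), y) for x, y in zip(train_X, train_y)
    )
    nearest = distances[:k]
    return sum(y for _, y in nearest) / len(nearest)

# Two small clusters of hypothetical training points with their targets.
train_X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
train_y = [1.0, 2.0, 3.0, 10.0, 12.0]

near_origin = knn_predict(train_X, train_y, query=(0.2, 0.2), k=3)
near_far_cluster = knn_predict(train_X, train_y, query=(5.5, 5.0), k=2)
print(near_origin, near_far_cluster)
```

Because distance drives everything, features on very different scales should normally be standardised before using k-NN. Otherwise the feature with the largest range dominates the neighbour search.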

Comparing all three on the same data

The cleanest way to apply the guide is empirically: fit all three, measure MSE on a held-out split, and let the numbers confirm or challenge your intuition.
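A sketch of such a comparison, assuming scikit-learn and NumPy are available. The data generator (a linear term in x1 plus an x2² term), the sample size, and the hyperparameters are assumptions chosen for illustration. All three models are fitted and scored on the same train/test split.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor

# Hypothetical synthetic data: y depends linearly on x1 and quadratically on x2.
rng = np.random.default_rng(42)
n = 500
X = rng.uniform(-3, 3, size=(n, 2))
y = 3 * X[:, 0] + 2 * X[:, 1] ** 2 + rng.normal(0, 0.5, size=n)

# One shared split, so every model is scored on the same held-out rows.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

models = {
    "linear regression": LinearRegression(),
    "decision tree (max_depth=5)": DecisionTreeRegressor(max_depth=5, random_state=0),
    "k-NN (k=5)": KNeighborsRegressor(n_neighbors=5),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    results[name] = mean_squared_error(y_test, model.predict(X_test))
    print(f"{name:30s} test MSE = {results[name]:.3f}")
```

To try the purely linear case, drop the squared term from the line that builds `y` and rerun.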


Run this and examine the numbers. Because the true relationship includes x2² (a non-linear term), linear regression pays a bias penalty — it cannot model the curve. The decision tree and k-NN can both capture it, so their MSE should be lower. If you remove the x2² term and make the relationship purely linear, linear regression wins.

That is the decision guide in action: match the model's assumed form to the actual structure of the data. When you do not know that structure, start simple (linear), measure, then move to more flexible models if the residuals show systematic patterns.

MSE comparisons are only valid when all models are trained and evaluated on the same train/test split. Using a single split is a fast sanity check; for robust comparisons use cross-validation, covered in the Model Evaluation module.

Where to go next

You can now choose a starting model. The lab that follows puts this into practice: compare three classifiers on the same dataset and identify which one generalises best.
