Lab: compare classification models

Fit three classifiers on the same dataset, measure accuracy and confusion matrices, and identify which generalises best.

Lab · optional · Data Science · Advanced · 25 min
By the end of this lesson you will be able to:
  • Fit LogisticRegression, DecisionTreeClassifier, and KNeighborsClassifier on the same dataset
  • Score each model on a held-out test set
  • Build and read confusion matrices
  • Write a one-paragraph interpretation of which model generalises best and why

The model-selection heuristics from the previous lesson become concrete when you run all three on the same classification problem. The goal here is not just to get numbers — it is to develop the habit of asking why the numbers differ and what that tells you about the data and the models.

The dataset

We use a synthetic binary classification problem: 300 samples, 2 informative features, modest class overlap. It is small enough that all three models run instantly, but large enough for the train/test split to give reasonably stable estimates.
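A minimal sketch of how the dataset and the three fits might look. It assumes the data comes from scikit-learn's make_classification; the values chosen for class_sep, flip_y, random_state, test_size, and n_neighbors are illustrative assumptions, not settings stated in this lesson.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# 300 samples, 2 informative features. class_sep and flip_y are assumed
# values chosen to give modest class overlap.
X, y = make_classification(
    n_samples=300,
    n_features=2,
    n_informative=2,
    n_redundant=0,
    n_clusters_per_class=1,
    class_sep=1.0,
    flip_y=0.05,
    random_state=0,
)

# Hold out 25% of the rows (assumed split size), keeping class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

models = {
    "LogisticRegression": LogisticRegression(),
    "DecisionTreeClassifier": DecisionTreeClassifier(random_state=0),
    "KNeighborsClassifier": KNeighborsClassifier(n_neighbors=5),
}

# Fit each model on the training rows and score it on the held-out rows.
test_accuracy = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    test_accuracy[name] = model.score(X_test, y_test)
    print(f"{name:24s} test accuracy = {test_accuracy[name]:.3f}")
```

Fixing random_state in both the generator and the split keeps the three scores comparable from run to run; change the seed to see how much the ranking moves on a test set this small.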


Accuracy is a starting point, but it does not tell you where each model makes mistakes. A model might be correct 85% of the time but wrong on exactly the cases that matter most. The confusion matrix shows the breakdown.

Checkpoint 2 — confusion matrices
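One way this checkpoint could be written, as a sketch rather than the lesson's own code. It rebuilds the same assumed dataset (make_classification with illustrative class_sep, flip_y, and seed values) so that it runs on its own, then prints one confusion matrix per model.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Assumed dataset settings; only the sample and feature counts come from the text.
X, y = make_classification(
    n_samples=300, n_features=2, n_informative=2, n_redundant=0,
    n_clusters_per_class=1, class_sep=1.0, flip_y=0.05, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

models = {
    "LogisticRegression": LogisticRegression(),
    "DecisionTreeClassifier": DecisionTreeClassifier(random_state=0),
    "KNeighborsClassifier": KNeighborsClassifier(n_neighbors=5),
}

confusion = {}
for name, model in models.items():
    y_pred = model.fit(X_train, y_train).predict(X_test)
    # Rows are true classes, columns are predicted classes, in label order 0, 1.
    cm = confusion_matrix(y_test, y_pred, labels=[0, 1])
    confusion[name] = cm
    # For a binary problem, ravel() yields TN, FP, FN, TP in that order.
    tn, fp, fn, tp = cm.ravel()
    print(name)
    print(cm)
    print(f"  TN={tn}  FP={fp}  FN={fn}  TP={tp}")
```

Passing labels=[0, 1] explicitly pins the row and column order, so the unpacking into TN, FP, FN, TP stays valid even if a model never predicts one of the classes.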


Reading the confusion matrix: each row is the true class; each column is the predicted class. The diagonal is correct predictions (true negatives and true positives). Off-diagonal entries are errors — false positives (predicted positive when actually negative) and false negatives (missed positives).

Two models with identical accuracy can have very different confusion matrices. One might favour false positives; another might favour false negatives. Which is worse depends entirely on the problem domain: a medical screening test, for example, usually tolerates false positives far better than missed diagnoses.
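A small worked example makes the point concrete. The two matrices below are invented for illustration (they are not outputs of the lab's models) and the helper summarize is a hypothetical name; both matrices describe 100 test cases with 50 true negatives and 50 true positives in the population.

```python
import numpy as np


def summarize(cm):
    """Return accuracy, precision, and recall for a 2x2 confusion matrix
    laid out as [[TN, FP], [FN, TP]]."""
    tn, fp, fn, tp = cm.ravel()
    return {
        "accuracy": (tp + tn) / cm.sum(),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
    }


# Hypothetical model that errs towards false positives: 10 FP, 0 FN.
leans_fp = np.array([[40, 10],
                     [0, 50]])

# Hypothetical model that errs towards false negatives: 0 FP, 10 FN.
leans_fn = np.array([[50, 0],
                     [10, 40]])

for label, cm in [("leans FP", leans_fp), ("leans FN", leans_fn)]:
    stats = summarize(cm)
    print(label, {k: round(float(v), 2) for k, v in stats.items()})
```

Both matrices give an accuracy of 0.90, but the first has recall 1.00 with precision about 0.83, while the second has precision 1.00 with recall 0.80. For a screening test, the first pattern would usually be the safer one.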

Checkpoint 3 — full metric sweep
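A possible version of the metric sweep, again a sketch on the same assumed dataset settings. It reports accuracy, precision, recall, and F1 for the positive class (label 1) on the held-out test set; the choice of these four metrics is an assumption about what the sweep covers.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Assumed dataset settings; only the sample and feature counts come from the text.
X, y = make_classification(
    n_samples=300, n_features=2, n_informative=2, n_redundant=0,
    n_clusters_per_class=1, class_sep=1.0, flip_y=0.05, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

models = {
    "LogisticRegression": LogisticRegression(),
    "DecisionTreeClassifier": DecisionTreeClassifier(random_state=0),
    "KNeighborsClassifier": KNeighborsClassifier(n_neighbors=5),
}

metrics = {}
for name, model in models.items():
    y_pred = model.fit(X_train, y_train).predict(X_test)
    # Precision, recall, and F1 default to the positive class (label 1).
    metrics[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred),
        "recall": recall_score(y_test, y_pred),
        "f1": f1_score(y_test, y_pred),
    }

# Print one row per model so the columns can be compared at a glance.
print(f"{'model':24s} {'acc':>6s} {'prec':>6s} {'rec':>6s} {'f1':>6s}")
for name, m in metrics.items():
    print(
        f"{name:24s} {m['accuracy']:6.3f} {m['precision']:6.3f} "
        f"{m['recall']:6.3f} {m['f1']:6.3f}"
    )
```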


Interpretation exercise

After running the three blocks above, work through your interpretation, either in your head or in a notebook. A complete interpretation addresses:

  1. Which model has the highest accuracy, and is the gap meaningful?
  2. Do precision and recall differ substantially across models? What would that mean for a use case where false negatives are costly?
  3. Does the confusion matrix pattern match what the accuracy numbers suggest?
  4. Based on the bias-variance framework: is any model likely overfitting at these settings? How would you test that hypothesis?
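For question 4, one common check is to compare training accuracy with test accuracy while varying model complexity: a large gap that shrinks when the model is constrained suggests overfitting. The sketch below assumes the same illustrative dataset settings as before, and the specific complexity values (max_depth=3, k=1, k=15) are arbitrary choices for demonstration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Assumed dataset settings; only the sample and feature counts come from the text.
X, y = make_classification(
    n_samples=300, n_features=2, n_informative=2, n_redundant=0,
    n_clusters_per_class=1, class_sep=1.0, flip_y=0.05, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Pair each flexible model with a more constrained variant of itself.
models = {
    "LogisticRegression": LogisticRegression(),
    "DecisionTree (unlimited depth)": DecisionTreeClassifier(random_state=0),
    "DecisionTree (max_depth=3)": DecisionTreeClassifier(max_depth=3, random_state=0),
    "KNeighbors (k=1)": KNeighborsClassifier(n_neighbors=1),
    "KNeighbors (k=15)": KNeighborsClassifier(n_neighbors=15),
}

train_scores = {}
gaps = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    train_acc = model.score(X_train, y_train)
    test_acc = model.score(X_test, y_test)
    train_scores[name] = train_acc
    # A large positive gap means the model fits the training rows
    # much better than rows it has not seen.
    gaps[name] = train_acc - test_acc
    print(f"{name:32s} train={train_acc:.3f} test={test_acc:.3f} gap={gaps[name]:+.3f}")
```

An unconstrained tree and 1-nearest-neighbour can memorise the training rows, so their training accuracy sits at or near 1.0 regardless of how well they generalise. The informative number is how far the test score falls below that.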

For this synthetic dataset, logistic regression often performs comparably to the tree and k-NN because the decision boundary is close to linear. If you increase n_informative or add non-linear structure, such as a polynomial decision boundary, the gap will usually widen. Changing the data and re-running is a fast way to build intuition about when each model earns its keep.
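To see the effect of a non-linear boundary, the sketch below swaps in scikit-learn's make_circles, where one class forms a ring around the other, so the true boundary is quadratic in the features. Using make_circles is an assumption made here for illustration, as are the noise, factor, and seed values.

```python
from sklearn.datasets import make_circles
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Two concentric rings: no straight line separates the classes.
X, y = make_circles(n_samples=300, noise=0.1, factor=0.5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

models = {
    "LogisticRegression": LogisticRegression(),
    "DecisionTreeClassifier": DecisionTreeClassifier(random_state=0),
    "KNeighborsClassifier": KNeighborsClassifier(n_neighbors=5),
}

circle_accuracy = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    circle_accuracy[name] = model.score(X_test, y_test)
    print(f"{name:24s} test accuracy = {circle_accuracy[name]:.3f}")
```

On data like this, logistic regression on the raw features has no linear boundary to find and should score close to chance, while the tree and k-NN can follow the ring. Adding squared features (for example with PolynomialFeatures) would be one way to give the linear model a fair chance again.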

Where to go next

The ML Concepts module is complete. Next: Sklearn in Practice — the uniform fit/predict/transform API that makes all of these models composable into pipelines.
