Binary search trees
Define the BST invariant, understand O(log n) average search and insertion, and see why balance matters when sorted input degrades a BST to O(n).
- State the BST invariant and explain why it enables binary search
- Describe O(log n) average-case search and insertion in a balanced BST
- Explain why inserting sorted values produces a degenerate O(n) BST
A binary search tree (BST) is a binary tree with a structural promise: for every node, every value in its left subtree is smaller, and every value in its right subtree is larger. This is the BST invariant.
5
/ \
3 8
/ \ \
1 4 9Node 5: left subtree contains 4 — all < 5. Right subtree contains 9 — all > 5. The invariant holds at every node, not just the root.
Why the invariant enables binary search
To search for a value k starting from the root, compare k to the current
node's value:
- If
kequals the node's value — found. - If
kis smaller — the value can only be in the left subtree; discard the entire right subtree. - If
kis larger — the value can only be in the right subtree; discard the entire left subtree.
Each comparison eliminates roughly half the remaining nodes. In a balanced tree of n nodes, the height is approximately log₂(n), so the search takes at most O(log n) comparisons.
Insertion follows the same logic: traverse down as if searching, then insert
the new value at the None slot where the search would have ended. This
preserves the invariant.
Average vs worst case
"Balanced" is the critical word. The O(log n) guarantee holds when the tree's height is O(log n) — roughly equal numbers of nodes on each side at every level.
Consider inserting the values 1, 2, 3, 4, 5 in sorted order:
1
\
2
\
3
\
4
\
5Every new value is larger than all existing values, so it always goes right. The result is a tree with height n − 1: a linked list in disguise. Searching for 5 requires visiting all 5 nodes — O(n), not O(log n).
Never insert sorted (or reverse-sorted) data into a plain BST. The height degrades to O(n) and all operations follow. In practice, shuffle the input before insertion, or use a self-balancing variant.
Self-balancing BSTs
AVL trees and red-black trees maintain a balance invariant via rotations — local structural rearrangements that restore balance after each insertion or deletion. Both guarantee O(log n) worst-case for search, insert, and delete.
Python's standard library does not expose a BST directly, but the sortedcontainers
third-party package provides a SortedList with O(log n) operations built on
a balanced structure. For most interview and course problems you implement a
plain BST and assume reasonably balanced input.
Inorder traversal revisited
The BST invariant explains something from the previous lesson: inorder traversal of any BST produces values in sorted ascending order. The proof follows directly from the invariant — every left subtree contains only smaller values (visited first) and every right subtree contains only larger values (visited last), so the root always falls in its correct sorted position.
This relationship is bidirectional: if inorder traversal of a binary tree produces a sorted sequence, the tree is a valid BST.
Where to go next
Next: lab — graphs and trees — implement BFS shortest path, connected components, BST insertion, and BST search in four graded checkpoints.