BFS and DFS
Implement breadth-first search and depth-first search on an adjacency list, use a visited set to avoid cycles, and find connected components.
- Implement BFS on an adjacency list using a deque and a visited set
- Implement DFS on an adjacency list recursively
- Use DFS to find all connected components in an undirected graph
Once a graph is in memory as an adjacency list, the two fundamental questions
are: "which nodes can I reach from here?" and "what order should I visit them
in?" Breadth-first search (BFS) answers both by spreading outward level by
level — it visits all nodes one edge away before any node two edges away.
Depth-first search (DFS) dives as deep as possible along each branch before
backtracking. Both need a visited set to avoid looping forever on cycles.
BFS — level by level
BFS uses a queue. You enqueue the start node, then repeatedly dequeue a node, record it as visited, and enqueue any unvisited neighbours. Because the queue is FIFO, you always finish one "level" of distance before moving to the next.
A visited set is not optional — without it, any cycle in the graph causes BFS
to enqueue the same node repeatedly and run forever.
DFS — depth first
DFS uses a stack — either the call stack (recursion) or an explicit stack. The recursive form is concise: visit the current node, then recurse on each unvisited neighbour. Like BFS, it must track which nodes have been seen.
Connected components
An undirected graph may consist of several disconnected pieces — connected components. To enumerate them, iterate over every node. If a node has not been visited yet, it belongs to a new component; run DFS from that node to visit every node in that component. Repeat until all nodes are covered.
Why the visited set matters
Consider the simplest possible cycle: nodes 0 and 1 with edges in both
directions. Without a visited set, BFS enqueues 1 from 0, then enqueues 0
from 1, then 1 again from 0, and so on forever. The seen set (added before
enqueuing, not after dequeuing) ensures a node is placed in the queue at most
once.
Add a node to seen when you enqueue it, not when you dequeue it.
Waiting until dequeue means the same node can be enqueued multiple times
before it is processed — wasting memory and potentially causing incorrect
results.
BFS vs DFS — when to use which
BFS guarantees the shortest path (fewest edges) to any reachable node in an unweighted graph. It is the right choice whenever distance matters: shortest-path problems, "degrees of separation" queries, level-by-level processing of trees or graphs.
DFS is lighter on memory (the call stack or explicit stack only needs to
hold one path at a time, not the entire frontier). It is natural for cycle
detection, topological sort, and problems where you want to enumerate all
paths. The connected_components function above uses an iterative DFS because
it avoids Python's recursion limit on large graphs.
Both algorithms visit every node and edge at most once: O(V + E) time and O(V) space for the visited set.
Where to go next
Next: tree fundamentals — a tree is a special graph (acyclic and connected) with its own vocabulary and traversal conventions.
Graph representation
Represent graphs as adjacency lists and adjacency matrices, compare their space and time tradeoffs, and distinguish directed, undirected, and weighted graphs.
Tree fundamentals
Define a tree as an acyclic connected graph, understand binary tree structure, and identify what inorder, preorder, and postorder traversals each produce.