BFS and DFS

Implement breadth-first search and depth-first search on an adjacency list, use a visited set to avoid cycles, and find connected components.

Once a graph is in memory as an adjacency list, the two fundamental questions are: "which nodes can I reach from here?" and "what order should I visit them in?" Breadth-first search (BFS) answers both by spreading outward level by level — it visits all nodes one edge away before any node two edges away. Depth-first search (DFS) dives as deep as possible along each branch before backtracking. Both need a visited set to avoid looping forever on cycles.

BFS — level by level

BFS uses a queue. You enqueue the start node, then repeatedly dequeue a node, record it as visited, and enqueue any unvisited neighbours. Because the queue is FIFO, you always finish one "level" of distance before moving to the next.

A visited set is not optional — without it, any cycle in the graph causes BFS to enqueue the same node repeatedly and run forever.

DFS — depth first

DFS uses a stack — either the call stack (recursion) or an explicit stack. The recursive form is concise: visit the current node, then recurse on each unvisited neighbour. Like BFS, it must track which nodes have been seen.

Connected components

An undirected graph may consist of several disconnected pieces — connected components. To enumerate them, iterate over every node. If a node has not been visited yet, it belongs to a new component; run DFS from that node to visit every node in that component. Repeat until all nodes are covered.

Python — editable, runs in your browser

from collections import deque

def bfs(graph, start):
  """Return nodes in BFS order starting from start."""
  visited_order = []
  seen = {start}
  queue = deque([start])
  while queue:
      node = queue.popleft()
      visited_order.append(node)
      for neighbour in graph.get(node, []):
          if neighbour not in seen:
              seen.add(neighbour)
              queue.append(neighbour)
  return visited_order

def dfs(graph, start, visited=None):
  """Return nodes in DFS order starting from start (recursive)."""
  if visited is None:
      visited = []
  visited.append(start)
  for neighbour in graph.get(start, []):
      if neighbour not in visited:
          dfs(graph, neighbour, visited)
  return visited

def connected_components(graph):
  """Return the number of connected components in an undirected graph."""
  visited = set()
  count = 0
  for node in graph:
      if node not in visited:
          count += 1
          # DFS from this node marks everything in its component.
          stack = [node]
          while stack:
              n = stack.pop()
              if n not in visited:
                  visited.add(n)
                  for nb in graph.get(n, []):
                      stack.append(nb)
  return count

# Undirected graph with two separate components:
#   0 -- 1 -- 3        4 -- 5
#   |
#   2
graph = {
  0: [1, 2],
  1: [0, 3],
  2: [0],
  3: [1],
  4: [5],
  5: [4],
}

print("BFS from 0:", bfs(graph, 0))
print("DFS from 0:", dfs(graph, 0))
print("Connected components:", connected_components(graph))

Why the visited set matters

Consider the simplest possible cycle: nodes 0 and 1 with edges in both directions. Without a visited set, BFS enqueues 1 from 0, then enqueues 0 from 1, then 1 again from 0, and so on forever. The seen set (added before enqueuing, not after dequeuing) ensures a node is placed in the queue at most once.

Add a node to seen when you enqueue it, not when you dequeue it. Waiting until dequeue means the same node can be enqueued multiple times before it is processed — wasting memory and potentially causing incorrect results.

BFS vs DFS — when to use which

BFS guarantees the shortest path (fewest edges) to any reachable node in an unweighted graph. It is the right choice whenever distance matters: shortest-path problems, "degrees of separation" queries, level-by-level processing of trees or graphs.

DFS is lighter on memory (the call stack or explicit stack only needs to hold one path at a time, not the entire frontier). It is natural for cycle detection, topological sort, and problems where you want to enumerate all paths. The connected_components function above uses an iterative DFS because it avoids Python's recursion limit on large graphs.

Both algorithms visit every node and edge at most once: O(V + E) time and O(V) space for the visited set.

Where to go next

Next: tree fundamentals — a tree is a special graph (acyclic and connected) with its own vocabulary and traversal conventions.

Finished reading? Mark it complete to track your progress.