Lab: choosing the right structure
Feel why a set or dict beats a list for the right job — then use them to solve real tasks.
- Experience the speed difference between a list scan and a set lookup
- Choose a data structure from the operation a task needs
- Solve real tasks with dicts and sets
This is an optional lab. It's built around a question the lessons answered in words — why does the structure matter? Here you'll feel it, then put it to work. Learn by doing.
The data-structures idea is easy to nod along to and easy to forget. So before any graded work, let's make it physical: the same membership check, on a list versus a set.
Feel the difference
Run this. It checks whether a number is present, first in a list (scanned one by one) and then in a set (a direct hashed lookup). Watch the timings:
Same answer, wildly different cost. The list has to walk every element to be sure; the set jumps straight to the answer regardless of size. That is why "use a set for membership" matters — and you can now see it, not just take it on faith. The structure, not cleverer code, made it fast.
Checkpoint 1 — count word frequencies
A dict shines when you're tallying things by name. Write count_words to
return how many times each word appears. (Hint: text.split() with no argument
splits on whitespace and gives [] for an empty string — which the checks rely
on.)
Write count_words(text) returning a dict mapping each word to its count. Split on whitespace.
count_words("a b a") → {"a": 2, "b": 1}The clean pattern is counts[word] = counts.get(word, 0) + 1 — get with a
default avoids a KeyError on the first sighting of a word.
Checkpoint 2 — find the first repeat
Now use a set for its real superpower: fast "have I seen this before?" Write
first_repeated to return the first element that appears a second time (or None
if every element is unique). Track what you've seen in a set as you go.
Write first_repeated(items) returning the first element that appears more than once, scanning left to right. Return None if there are no repeats.
first_repeated([1, 2, 3, 2, 1]) → 2first_repeated([1, 2, 3]) → NoneNotice you reached for a set here because of the operation — repeated membership checks — exactly the reasoning the feel-the-difference demo made concrete. You're now choosing structures by what they make cheap.
Done?
Two green checkpoints, and a gut-level sense of why structure matters. Carry that question — "what operation does this need to be cheap?" — into every program you write. Next in the module: handling things going wrong with errors and exceptions.