Hash Tables5 min read

Sets & Maps

Instant membership and key-value lookup — the everyday hash table
Set lookup: \(O(1)\) avgMap lookup: \(O(1)\) avgMap insert: \(O(1)\) avgspace: \(O(n)\)

Hash tables power two of the most useful data structures in everyday programming: Sets and Maps.

A Set is like a bouncer at a club with a guest list — you can ask "Is this person on the list?" and get an instant yes or no. No duplicates allowed. A Map (also called a dictionary or hash map) is like a phone book — given a name (key) you get a phone number (value) immediately.

Python — dict and set

Python's built-in dict is a hash map. Python's set is basically a dict with no values, just keys. Both use open addressing with a custom probing strategy.

pseudocode
1
2
3
4
5
6
7
freq = {} # Map: word → count
for word in words:
freq[word] = freq.get(word, 0) + 1
seen = set() # Set: membership test
if x not in seen:
seen.add(x)

JavaScript — Map, Set, and plain objects

JS has three options and they are not equivalent:

  • Object — key-value store but keys must be strings/symbols; has prototype pollution risk; no guaranteed insertion-order before ES2015
  • Map — keys can be any type; guaranteed insertion-order iteration; has .size property; preferred for algorithmic use
  • Set — unique values of any type; iteration in insertion order

Rule of thumb: use Map/Set when your keys aren't simple strings, or when you need insertion-order iteration. Use a plain object when you know keys are strings and you want JSON-serializable output.

Explore sets and maps

← → arrow keys to step · click elements to interact

Sets & Maps: Core Operations

# Set operations
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}
print(A | B) # union: {1, 2, 3, 4, 5, 6}
print(A & B) # intersection: {3, 4}
print(A - B) # difference: {1, 2}
print(3 in A) # membership: True (O(1))
# Map (dict) operations
scores = {'Alice': 95, 'Bob': 87}
scores['Carol'] = 91 # insert
scores['Alice'] = 98 # update
print(scores.get('Bob', 0)) # 87
print(scores.get('Dave', 0)) # 0 (default)
del scores['Bob'] # delete
print(list(scores.keys())) # ['Alice', 'Carol']
Output
{1, 2, 3, 4, 5, 6}
{3, 4}
{1, 2}
True
87
0
['Alice', 'Carol']

Common interview patterns using Map/Set

Two Sum — store each number's index in a Map. For each new number check if the complement exists in \(O(1)\).

Contains Duplicate — add elements to a Set. If .add() would create a duplicate, the Set size stops growing — or just check has() before adding.

Group Anagrams — sort each word to get a canonical key ("eat" → "aet"), use that as a Map key, append the word to the list.

Ordered vs Unordered

Standard hash-based Maps/Sets are unordered — iteration order is not guaranteed to match insertion order (language-dependent). If you need sorted iteration, use a TreeMap (Java) or SortedDict (Python sortedcontainers), which use a balanced BST internally and give \(O(\log n)\) operations.

Note: When you see a problem asking for duplicates, frequency counts, or whether something has been seen before — reach for a Set or Map first. It almost always turns an \(O(n^2)\) brute-force into \(O(n)\).

Complexity

Set.has / Set.addHash to bucket, done
Instant\(O(1)\) avg
Map.get / Map.setHash key, read/write value
Instant\(O(1)\) avg
Map.delete / Set.deleteHash key, remove entry
Instant\(O(1)\) avg
Iterate all entriesVisit every stored key-value pair
Linear\(O(n)\)
SpaceOne slot per unique key
Linear\(O(n)\)

Two Sum Using a Map

def two_sum(nums, target):
seen = {} # value -> index
for i, n in enumerate(nums):
complement = target - n
if complement in seen:
return [seen[complement], i]
seen[n] = i
return []
print(two_sum([2, 7, 11, 15], 9)) # [0, 1]
print(two_sum([3, 2, 4], 6)) # [1, 2]
Output
[0, 1]
[1, 2]
Challenge

Quick check

You want to count word frequencies in a document. Which structure is most natural?

Continue reading

Hash Functions
Turn any key into a bucket number — instantly
Frequency Patterns
Count once, answer anything — the hash map superpower
CachingSystem Design
Remember answers so you don't compute them twice
Dictionaries & SetsPython
Look things up by name instead of by number