Hash Tables5 min read

Sets & Maps

Instant membership and key-value lookup β€” the everyday hash table
Set lookup: \(O(1)\) avgMap lookup: \(O(1)\) avgMap insert: \(O(1)\) avgspace: \(O(n)\)

Hash tables power two of the most useful data structures in everyday programming: Sets and Maps.

A Set is like a bouncer at a club with a guest list β€” you can ask "Is this person on the list?" and get an instant yes or no. No duplicates allowed. A Map (also called a dictionary or hash map) is like a phone book β€” given a name (key) you get a phone number (value) immediately.

Python β€” dict and set

Python's built-in dict is a hash map. Python's set is basically a dict with no values, just keys. Both use open addressing with a custom probing strategy.

pseudocode
1
2
3
4
5
6
7
freq = {} # Map: word β†’ count
for word in words:
freq[word] = freq.get(word, 0) + 1
seen = set() # Set: membership test
if x not in seen:
seen.add(x)

JavaScript β€” Map, Set, and plain objects

JS has three options and they are not equivalent:

  • Object β€” key-value store but keys must be strings/symbols; has prototype pollution risk; no guaranteed insertion-order before ES2015
  • Map β€” keys can be any type; guaranteed insertion-order iteration; has .size property; preferred for algorithmic use
  • Set β€” unique values of any type; iteration in insertion order

Rule of thumb: use Map/Set when your keys aren't simple strings, or when you need insertion-order iteration. Use a plain object when you know keys are strings and you want JSON-serializable output.

β—ˆ Explore sets and maps

← β†’ arrow keys to step Β· click elements to interact

Sets & Maps: Core Operations

# Set operations
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}
print(A | B) # union: {1, 2, 3, 4, 5, 6}
print(A & B) # intersection: {3, 4}
print(A - B) # difference: {1, 2}
print(3 in A) # membership: True (O(1))
# Map (dict) operations
scores = {'Alice': 95, 'Bob': 87}
scores['Carol'] = 91 # insert
scores['Alice'] = 98 # update
print(scores.get('Bob', 0)) # 87
print(scores.get('Dave', 0)) # 0 (default)
del scores['Bob'] # delete
print(list(scores.keys())) # ['Alice', 'Carol']
Output
{1, 2, 3, 4, 5, 6}
{3, 4}
{1, 2}
True
87
0
['Alice', 'Carol']

Common interview patterns using Map/Set

Two Sum β€” store each number's index in a Map. For each new number check if the complement exists in \(O(1)\).

Contains Duplicate β€” add elements to a Set. If .add() would create a duplicate, the Set size stops growing β€” or just check has() before adding.

Group Anagrams β€” sort each word to get a canonical key ("eat" β†’ "aet"), use that as a Map key, append the word to the list.

Ordered vs Unordered

Standard hash-based Maps/Sets are unordered β€” iteration order is not guaranteed to match insertion order (language-dependent). If you need sorted iteration, use a TreeMap (Java) or SortedDict (Python sortedcontainers), which use a balanced BST internally and give \(O(\log n)\) operations.

Note: When you see a problem asking for duplicates, frequency counts, or whether something has been seen before β€” reach for a Set or Map first. It almost always turns an \(O(n^2)\) brute-force into \(O(n)\).

Complexity

Set.has / Set.add
Hash to bucket, done
Instant \(O(1)\) avg
Map.get / Map.set
Hash key, read/write value
Instant \(O(1)\) avg
Map.delete / Set.delete
Hash key, remove entry
Instant \(O(1)\) avg
Iterate all entries
Visit every stored key-value pair
Linear \(O(n)\)
Space
One slot per unique key
Linear \(O(n)\)

Two Sum Using a Map

def two_sum(nums, target):
seen = {} # value -> index
for i, n in enumerate(nums):
complement = target - n
if complement in seen:
return [seen[complement], i]
seen[n] = i
return []
print(two_sum([2, 7, 11, 15], 9)) # [0, 1]
print(two_sum([3, 2, 4], 6)) # [1, 2]
Output
[0, 1]
[1, 2]
Challenge

Quick check

You want to count word frequencies in a document. Which structure is most natural?

Continue reading