Introducing the Python defaultdict

Wednesday, Jun 23 2010

A defaultdict is a dictionary that supplies a default value for missing keys, so that a key with no explicitly assigned value can be looked up without raising an error. Let’s see it in action.

Frequencies

Suppose you are given a list of words and you are asked to compute frequencies. You could do something like this:

frequencies = {}
for word in wordlist:
    frequencies[word] += 1

Unfortunately, Python raises a KeyError on the very first iteration: frequencies[word] += 1 has to read the current value before it can increment it, and no value has been stored for that word yet. One workaround is to catch the KeyError exception:

for word in wordlist:
    try:
        frequencies[word] += 1
    except KeyError:
        frequencies[word] = 1

Or you could use an if/else block:

for word in wordlist:
    if word in frequencies:
        frequencies[word] += 1
    else:
        frequencies[word] = 1
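
To try the two workarounds side by side, here is a small self-contained sketch. The function names count_with_try and count_with_if and the sample wordlist are made up for illustration; the loops themselves are the ones shown above.

```python
# Sample input, invented for this example.
wordlist = ["spam", "eggs", "spam"]

# The naive version fails on the first word it sees.
try:
    naive = {}
    naive["spam"] += 1
except KeyError as err:
    print("KeyError:", err)  # KeyError: 'spam'

def count_with_try(words):
    frequencies = {}
    for word in words:
        try:
            frequencies[word] += 1
        except KeyError:
            # First occurrence of this word: start its count at 1.
            frequencies[word] = 1
    return frequencies

def count_with_if(words):
    frequencies = {}
    for word in words:
        if word in frequencies:
            frequencies[word] += 1
        else:
            frequencies[word] = 1
    return frequencies

print(count_with_try(wordlist))  # {'spam': 2, 'eggs': 1}
print(count_with_try(wordlist) == count_with_if(wordlist))  # True
```

Both versions produce the same counts; the only difference is whether the missing key is detected before the lookup or after it fails.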
 

The defaultdict solution

In Python 2.5 and later, though, the collections.defaultdict class comes to the rescue! A defaultdict is just like a regular Python dict, except that its constructor takes an additional first argument: a function (or any other callable), known as the default factory. If someone looks up a key with d[key] and no value has been assigned to it, that function is called (without arguments), and its return value is stored under the key and returned as the default. Clever, right?
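
To make the mechanism concrete, here is a minimal sketch that records when the factory runs. The names zero_factory, calls and counts are hypothetical and exist only for this illustration.

```python
from collections import defaultdict

calls = []  # records each time the factory runs (illustrative only)

def zero_factory():
    # Called with no arguments whenever a missing key is looked up with [].
    calls.append("called")
    return 0

counts = defaultdict(zero_factory)

first = counts["spam"]   # missing key: factory runs, 0 is stored and returned
second = counts["spam"]  # key now exists: factory is not called again

print(first, second)       # 0 0
print(len(calls))          # 1
print("spam" in counts)    # True, the default was stored on first access
print(counts.get("eggs"))  # None, .get() does not trigger the factory
print("eggs" in counts)    # False
```

Note that only the square-bracket lookup triggers the factory; membership tests and .get() behave exactly as they do on a plain dict.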

Going back to our example, we want the default value to be 0, so we pass the built-in int to the defaultdict constructor (the function itself, not the result of calling it). When called without arguments, int() simply returns 0.

from collections import defaultdict
frequencies = defaultdict(int)
for word in wordlist:
    frequencies[word] += 1
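
For a version you can run end to end, the sketch below supplies a small sample wordlist. The list is invented here; the snippet above assumes one already exists.

```python
from collections import defaultdict

# Sample input, invented for this example.
wordlist = ["spam", "eggs", "spam", "ham", "spam", "eggs"]

# int() called with no arguments returns 0, so every new word starts at 0.
print(int())  # 0

frequencies = defaultdict(int)
for word in wordlist:
    frequencies[word] += 1

# Convert to a plain dict for display.
print(dict(frequencies))  # {'spam': 3, 'eggs': 2, 'ham': 1}
```

The loop body is identical to the first, failing attempt; the only change is the type of the container.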