# List, Dict and Set Comprehensions By Example

One type of syntactic sugar that sets Python apart from more verbose languages is comprehensions. Comprehensions are a special notation for building up lists, dictionaries and sets from other lists, dictionaries and sets, modifying and filtering them in the process.

They allow you to express complicated looping logic in a tiny amount of space.

#### List Comprehensions

List comprehensions are the best known and most widely used. Let’s start with an example.

A common programming task is to iterate over a list and transform each element in some way, e.g.:

```python
>>> squares = []
>>> for num in range(10):
...     squares.append(num**2)
...
>>> squares
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```

That’s the kind of thing you might do if you were a Java programmer. Luckily for us, though, list comprehensions allow the same idea to be expressed in far fewer lines.

```python
>>> squares = [x**2 for x in range(10)]
>>> squares
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```

The basic syntax for list comprehensions is this: `[EXPRESSION FOR ELEMENT IN SEQUENCE]`.

Another common task is to filter a list and create a new list composed of only the elements that pass a certain condition. The next snippet constructs a list of every number from 0 to 9 that leaves a remainder of zero when divided by 2, i.e. every even number.

```python
>>> [x for x in range(10) if x % 2 == 0]
[0, 2, 4, 6, 8]
```

Using an IF-ELSE construct works slightly differently to what you might expect. Instead of putting the ELSE at the end, after the filtering IF, you need to use the ternary operator, `x if y else z`, in the expression position before the `for`.

The following list comprehension generates the squares of even numbers and the cubes of odd numbers in the range 0 to 9.

```python
>>> [x**2 if x % 2 == 0 else x**3 for x in range(10)]
[0, 1, 4, 27, 16, 125, 36, 343, 64, 729]
```
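To make the difference between the two kinds of `if` concrete, here is a small sketch that puts them side by side. The variable names (`numbers`, `evens_squared`, `labels`, `result`) are my own and only there for illustration.

```python
numbers = range(10)

# Filtering: an `if` placed after the `for` drops elements entirely.
evens_squared = [x**2 for x in numbers if x % 2 == 0]

# Conditional expression: an `if ... else` placed before the `for`
# keeps every element and only chooses which value to emit.
labels = ["even" if x % 2 == 0 else "odd" for x in numbers]

# Both can appear in one comprehension: filter to the even numbers,
# then square the multiples of 4 and cube the rest.
result = [x**2 if x % 4 == 0 else x**3 for x in numbers if x % 2 == 0]

print(evens_squared)  # [0, 4, 16, 36, 64]
print(result)         # [0, 8, 16, 216, 64]
```

The filter decides how many elements end up in the list; the conditional expression never changes the length.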

List comprehensions can also be nested inside each other. Here is how we can generate a two-dimensional list, populated with zeros. (I have wrapped the comprehension in pprint to make the output more legible.)

```python
>>> from pprint import pprint
>>> pprint([[0 for x in range(10)] for y in range(10)])
[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]]
```

(As you have probably noticed, it is possible to create list comprehensions that are utterly illegible, so please think about who has to touch your code after you and exercise some restraint.)

On the other hand, the syntax of basic comprehensions might seem complicated to you now, but I promise that with time it will become second nature.

#### Dict Comprehensions

On top of list comprehensions, Python now supports dict comprehensions, which allow you to express the creation of dictionaries at runtime using a similarly concise syntax.

A dictionary comprehension takes the form `{key: value for (key, value) in iterable}`. This syntax was introduced in Python 3 and backported as far as Python 2.7, so you should be able to use it regardless of which version of Python you have installed.

A canonical example is taking two lists and creating a dictionary where the item at each position in the first list becomes a key and the item at the corresponding position in the second list becomes the value.

```python
>>> import string
>>> {k: v for (k, v) in zip(string.ascii_lowercase, range(26))}
{'u': 20, 'c': 2, 'k': 10, 'v': 21, 'n': 13, 'l': 11, 'q': 16, 'g': 6, 'a': 0, 'm': 12, 'r': 17, 'e': 4, 'j': 9, 'd': 3, 'f': 5, 'z': 25, 'p': 15, 'x': 23, 's': 18, 'i': 8, 't': 19, 'b': 1, 'w': 22, 'h': 7, 'o': 14, 'y': 24}
```

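(Look how jumbled up it is. That output comes from an older interpreter, where dicts had no guaranteed ordering. Since Python 3.7, dicts preserve insertion order, so a current interpreter prints the keys from 'a' to 'z'.)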

The `zip` function used inside this comprehension returns an iterator of tuples, where each element in the tuple is taken from the same position in each of the input iterables. In the example above, the returned iterator yields the tuples `("a", 0)`, `("b", 1)`, etc., because `range(26)` starts at 0.
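If you want to see what `zip` hands to the comprehension, you can materialise it as a list. This is only a sketch; `pairs` and `letters` are names I made up for the example.

```python
import string

# zip is lazy in Python 3, so wrap it in list() to inspect the tuples.
pairs = list(zip(string.ascii_lowercase, range(26)))
print(pairs[:3])  # [('a', 0), ('b', 1), ('c', 2)]

# The comprehension simply unpacks each tuple into a key and a value.
letters = {k: v for (k, v) in pairs}
print(letters["z"])  # 25
```

When no transformation or filtering is needed, `dict(zip(keys, values))` builds the same dictionary; the comprehension earns its keep once you want to change or skip some of the pairs.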

Any iterable can be used in a dict comprehension, including strings. The following code might be useful if you wanted to generate a dictionary that stores letter frequencies, for instance.

```python
>>> {c: 0 for c in string.ascii_lowercase}
{'u': 0, 'c': 0, 'k': 0, 'v': 0, 'n': 0, 'l': 0, 'q': 0, 'g': 0, 'a': 0, 'm': 0, 'r': 0, 'e': 0, 'j': 0, 'd': 0, 'f': 0, 'z': 0, 'p': 0, 'x': 0, 's': 0, 'i': 0, 't': 0, 'b': 0, 'w': 0, 'h': 0, 'o': 0, 'y': 0}
```

(The code above is just an example of using a string as an iterable inside a comprehension. If you really want to count letter frequencies, you should check out collections.Counter.)
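For the curious, here is roughly what the `collections.Counter` route looks like. The sample string and the variable names are my own, chosen purely for illustration.

```python
from collections import Counter

text = "hello world"

# Counter accepts any iterable; a generator expression lets us skip
# everything that is not a letter (here, the space).
counts = Counter(c for c in text if c.isalpha())

print(counts["l"])            # 3
print(counts["z"])            # 0 (missing keys count as zero)
print(counts.most_common(1))  # [('l', 3)]
```

There is no need to pre-populate the keys with zeros, because a `Counter` reports zero for anything it has not seen.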

Dict comprehensions can use complex expressions and IF-ELSE constructs too. This one maps the numbers in a specific range to their cubes:

```python
>>> {x: x**3 for x in range(10)}
{0: 0, 1: 1, 2: 8, 3: 27, 4: 64, 5: 125, 6: 216, 7: 343, 8: 512, 9: 729}
```

And this one omits cubes that are not divisible by 4:

```python
>>> {x: x**3 for x in range(10) if x**3 % 4 == 0}
{0: 0, 8: 512, 2: 8, 4: 64, 6: 216}
```

#### Set Comprehensions

A set is an unordered collection of elements in which each element can only appear once. Although sets have existed in Python since 2.4, Python 3 introduced the set literal syntax.

```python
>>> nums = {1, 54, 124}
>>> nums
{1, 124, 54}
```

Python 3 also introduced set comprehensions.

```python
>>> nums = {num for num in range(10)}
>>> nums
{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
```

Prior to this, you could use a list comprehension and pass the result to the set built-in function.

```python
>>> nums = set([num for num in range(10)])
>>> nums
{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
```

The syntax for set comprehensions is almost identical to that of list comprehensions, but it uses curly brackets instead of square brackets. The pattern is `{EXPRESSION FOR ELEMENT IN SEQUENCE}`.

The result of a set comprehension is the same as passing the output of the equivalent list comprehension to the set function.
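A quick way to convince yourself of that equivalence is to build the same set both ways and compare. The list of words below is invented for the example.

```python
words = ["apple", "Avocado", "banana", "blueberry", "cherry"]

# Set comprehension: duplicates collapse as the set is built.
from_set_comp = {w[0].lower() for w in words}

# Same expression as a list comprehension, then converted to a set.
from_list_comp = set([w[0].lower() for w in words])

print(from_set_comp == from_list_comp)  # True
print(len(from_set_comp))               # 3
```

The set comprehension skips the intermediate list, which is a nice saving when the input is large.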

That’s it for the theory. Now let’s dissect some examples of comprehensions.

#### List of files with the .png extension

The os module contains a function called `listdir` that returns a list of filenames in a given directory. We can use the `endswith` method on the strings to filter the list of files.

```python
import os

def list_files_with_extension(where, ext):
    return [f for f in os.listdir(where)
            if f.endswith(ext)]
```

Here it is in use:

```python
>>> list_files_with_extension("/home/me/pics", ".png")
['grumpycat.png', 'alien.png']
```

#### Merge two dictionaries

Merging two dictionaries together can be achieved easily in a dict comprehension. The trick is to call the `items` method on each dictionary. In Python 3, `items` returns a view object rather than a list. Views don’t support the + syntax for joining one view onto another, so it is necessary to convert the views to lists first.

```python
def merge_dicts(d1, d2):
    return {k: v for (k, v)
            in list(d1.items()) + list(d2.items())}
```

Here is `merge_dicts` in action:

```python
>>> boys_ages = {"Tom": 14, "Patrick": 12, "Sean": 15}
>>> girls_ages = {"Jeanne": 14, "Marie": 12}
>>> merge_dicts(boys_ages, girls_ages)
{'Sean': 15, 'Tom': 14, 'Marie': 12, 'Patrick': 12, 'Jeanne': 14}
```
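One thing the example above doesn’t show is what happens when both dictionaries share a key. The sketch below repeats `merge_dicts` so that it runs on its own; the `defaults` and `overrides` dictionaries are made up for the illustration.

```python
def merge_dicts(d1, d2):
    return {k: v for (k, v)
            in list(d1.items()) + list(d2.items())}

defaults = {"colour": "red", "size": 10}
overrides = {"size": 12}

# Pairs from the second dictionary come later in the combined list,
# so they overwrite earlier values for the same key.
merged = merge_dicts(defaults, overrides)
print(merged)  # {'colour': 'red', 'size': 12}

# On Python 3.5 and later, dictionary unpacking gives the same result.
unpacked = {**defaults, **overrides}
print(unpacked == merged)  # True
```

Neither approach modifies the input dictionaries, so `defaults` still has a `size` of 10 afterwards.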

#### Sieve of Eratosthenes

The Sieve of Eratosthenes is an ancient algorithm for finding prime numbers. You might remember it from school. It works like this:

• Starting at 2, which is the first prime number, exclude all multiples of 2 up to n.
• Move on to 3. Exclude all multiples of 3 up to n.
• Keep going like that until you reach n.

And here’s the code:

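```python
def eratosthenes(n):
    # Every multiple of i, from 2*i up to but not including n, is not prime.
    not_prime = {j for i in range(2, n)
                 for j in range(i*2, n, i)}
    # Whatever was never marked as a multiple is prime.
    return {i for i in range(2, n)
            if i not in not_prime}
```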

The first thing to note about the function is the use of a double loop in the first set comprehension. Contrary to what you might expect, the leftmost loop is the outer loop and the rightmost loop is the inner loop. The pattern for double loops in list comprehensions is `[x for b in a for x in b]`.
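If the loop order is hard to picture, it can help to write the double loop out longhand. This sketch uses a small `n` of my choosing and compares the comprehension with the equivalent nested `for` statements.

```python
n = 12

# Comprehension form: the leftmost `for` is the outer loop.
from_comprehension = {j for i in range(2, n)
                      for j in range(i * 2, n, i)}

# The same thing as explicit loops, read top to bottom in the same
# order the `for` clauses appear left to right.
from_loops = set()
for i in range(2, n):
    for j in range(i * 2, n, i):
        from_loops.add(j)

print(from_comprehension == from_loops)  # True
print(sorted(from_loops))                # [4, 6, 8, 9, 10]
```

A handy rule of thumb: the `for` clauses in a comprehension appear in exactly the order you would indent them as ordinary loops.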

The third argument in the rightmost call to `range` represents the step size.

It would be possible to use a list comprehension for this algorithm, but the `not_prime` list would be filled with duplicates. It is better to use the automatic deduplication behaviour of the set to avoid that.
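To see how much duplication we are talking about, here is a rough sketch that builds the multiples as a list first and then converts it to a set. The value `n = 20` is just something I picked for the demonstration.

```python
n = 20

# List version: a number is appended once for every i that divides it.
as_list = [j for i in range(2, n)
           for j in range(i * 2, n, i)]

# Set version: each non-prime is stored exactly once.
as_set = set(as_list)

print(len(as_list))       # 23
print(len(as_set))        # 10
print(as_list.count(12))  # 4 (as a multiple of 2, 3, 4 and 6)
```

The gap between the two grows with `n`, and membership tests (`i not in not_prime`) are also much faster on a set than on a list.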

#### Exercises

1. Write a function called `generate_matrix` that takes two positional arguments – `m` and `n` – and a keyword argument `default` that specifies the value for each position. It should use a nested list comprehension to generate a list of lists with the given dimensions. If `default` is provided, each position should have the given value, otherwise the matrix should be populated with zeroes.

2. Write a function called `initcap` that replicates the functionality of the `str.title` method, except better. Given a string, it should split the string on whitespace, capitalize each element of the resulting list and join them back into a string. Your implementation should use a list comprehension.

3. Write a function called `make_mapping` that takes two lists of equal length and returns a dictionary that maps the values in the first list to the values in the second. The function should also take an optional keyword argument called `exclude`, which expects a list. Values in the list passed as `exclude` should be omitted as keys in the resulting dictionary.

4. Write a function called `compress_dict_keys` that takes a dictionary with string keys and returns a new dictionary with the vowels removed from the keys. For instance, the dictionary `{"foo": 1, "bar": 2}` should be transformed into `{"f": 1, "br": 2}`. The function should use a list comprehension nested inside a dict comprehension.

5. Write a function called `dedup_surnames` that takes a list of surnames and returns a set of surnames with the case normalized to uppercase. For instance, the list `["smith", "Jones", "Smith", "BROWN"]` should be transformed into the set `{"SMITH", "JONES", "BROWN"}`.

#### Solutions

1. Nest two list comprehensions to generate a 2D list with m rows and n columns. Use `default` for the value in each position in the inner comprehension.

```python
def generate_matrix(m, n, default=0):
    # The loop variables are never used, so name them _ by convention.
    return [[default for _ in range(n)]
            for _ in range(m)]
```

2. Disassemble the sentence passed into the function using `split`, then use the slicing syntax to call `upper` on the first character of each word, then use `join` to reassemble the sentence.

```python
def initcap(s):
    return ' '.join([w[0].upper() + w[1:]
                     for w in s.split()])
```

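3. Join the two lists a and b using `zip`, then use the zipped lists in the dictionary comprehension.

```python
def make_mapping(a, b, exclude=()):
    # An empty tuple as the default avoids a shared mutable default;
    # callers can still pass a list.
    return {k: v for (k, v) in zip(a, b)
            if k not in exclude}
```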

4. Iterate over the key-value pairs from the passed-in dictionary and, for each key, remove the vowels using a comprehension with an IF construct (written here as a generator expression passed straight to `join`).

```python
def compress_dict_keys(d):
    return {''.join(c for c in k
                    if c not in "aeiou"): v
            for (k, v) in d.items()}
```

5. Use the set comprehension syntax (with curly brackets) to iterate over the given list and call `upper` on each name in it. The deduplication will happen automatically due to the nature of the set data structure.

```python
def dedup_surnames(names):
    return {name.upper() for name in names}
```

I’ll leave it there for now. If you’ve worked your way through this post and given the exercises a good try, you should be ready to use comprehensions in your own code.

If you’ve got any questions or other remarks, let me know in the comments.