There are many aspects of Python that appeal to mathematicians. To mention only a few: built-in support for tuples, lists, and sets, all with notation close to conventional mathematical notation, and list comprehensions, which closely mirror set comprehension and the "set-builder" notation used for it.

Another set of features that are very appealing to the mathematically-minded
are Python's iterators and generators, and the related `itertools` package.
These tools make it easy to write elegant code that deals with such
mathematical objects as infinite sequences, stochastic processes, recurrence
relations, and combinatorial structures. This post covers my notes on iterators
and generators, and goes over some of my experiments while I was learning more
about these features of Python.

## Iterators

Iterators are objects that allow iteration over a collection. Such collections need not consist of objects that already exist in memory and, because of this, they need not be finite. For official documentation, see "Iterator Types" in Python's documentation.

Let's be a bit more precise in our definitions. An *iterable* is defined as an
object that has an `__iter__` method, which is required to return an iterator
object. An *iterator* in turn is an object that has the two methods
`__iter__` and `next` (or `__next__` in Python 3), with the former
returning an iterator object and the latter returning the next element of the
iteration. As far as I know, iterators always simply return `self` in their
`__iter__` method because they are their own iterators.

Generally speaking, you should avoid calling `__iter__` and `next`
directly. By using `for` loops or list comprehensions, Python will call them
for you automatically. If you have to call them manually, use Python's built-in
functions `iter` and `next`, passing the iterator or container as the
argument. For example, if `c` is
an iterable, then use `iter(c)` instead of `c.__iter__()`, and
similarly, if `a` is an iterator, use `next(a)` instead of
`a.next()`. This is similar to how you would use `len`.

Speaking of `len`, it is worth mentioning that iterators do not have to, and
often do not, have a well-defined length. As such, they often do not implement
`__len__`. If you would like to count the number of items in an iterator, you
will have to do so manually, or use `sum`. An example of this is shown at the
end of this article, under the "`itertools` Module" section.
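For a quick preview, the usual idiom is a generator expression inside `sum`. Note that this counts by consuming the iterator, which cannot be rewound afterwards:

```python
# Count the items an iterator yields by consuming it entirely.
it = iter([3, 1, 4, 1, 5])
count = sum(1 for _ in it)
print(count)  # -> 5

# The iterator is now exhausted, so counting again gives 0.
print(sum(1 for _ in it))  # -> 0
```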

Some iterables are not iterators and instead have other objects as their
iterators. For example, the `list` object is an iterable but not an iterator (it
implements `__iter__` but not `next`). Iterators for `list` objects are
of type `listiterator` as you can see in the example below. Also notice how
while `list` objects have a well-defined length, `listiterator` objects do
not.

```
>>> a = [1, 2]
>>> type(a)
<type 'list'>
>>> type(iter(a))
<type 'listiterator'>
>>> it = iter(a)
>>> next(it)
1
>>> next(it)
2
>>> next(it)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
>>> len(a)
2
>>> len(it)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: object of type 'listiterator' has no len()
```

When an iterator is exhausted, the interpreter expects it to raise a
`StopIteration` exception. However, as mentioned, iterators can iterate over
an infinite set. For such iterators, it is the user's responsibility to ensure
that their use does not lead to an infinite loop. An example of this is shown in a bit.
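One simple safeguard when working with infinite iterators is `itertools.islice`, which caps the number of items consumed. For example, taking the first five values of the infinite `itertools.count`:

```python
import itertools

# islice bounds how many items we pull from the infinite iterator,
# so the list() call is guaranteed to terminate.
first_five = list(itertools.islice(itertools.count(), 5))
print(first_five)  # -> [0, 1, 2, 3, 4]
```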

A very basic example of an iterator is given below. This iterator will simply
start counting at `0` and go up indefinitely. It is a simpler version of
`itertools.count`.

```
class count_iterator(object):
    n = 0

    def __iter__(self):
        return self

    def next(self):
        y = self.n
        self.n += 1
        return y
```

A usage example is below. Note that the last line attempts to convert the iterator object to a list, which results in an infinite loop since this particular iterator never ends.

```
>>> counter = count_iterator()
>>> next(counter)
0
>>> next(counter)
1
>>> next(counter)
2
>>> next(counter)
3
>>> list(counter) # This will result in an infinite loop!
```

Finally, to be accurate we should make an amendment to the above: objects that
do not have the method `__iter__` defined can still be iterable if they
define `__getitem__`. In this case, Python's built-in function `iter`
returns an iterator of type `iterator` for the object, which uses
`__getitem__` to go over its items. If either `StopIteration`
or `IndexError` is raised by `__getitem__`, then the iteration stops. Let's
see an example of this:

```
class SimpleList(object):
    def __init__(self, *items):
        self.items = items

    def __getitem__(self, i):
        return self.items[i]
```

And its use:

```
>>> a = SimpleList(1, 2, 3)
>>> it = iter(a)
>>> next(it)
1
>>> next(it)
2
>>> next(it)
3
>>> next(it)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
```

Now let's look at a more interesting example: generating the Hofstadter Q sequence
given initial conditions using an iterator. Hofstadter first mentioned this
nested recurrence in his book "Gödel, Escher, Bach: An Eternal Golden Braid",
and since then the problem of proving that the sequence is well-defined for all
values of \(n\) has been open. The code below uses an iterator to generate a
sequence given by the nested recurrence

\[ Q(n) = Q(n - Q(n-1)) + Q(n - Q(n-2)) \]

given a list as the initial conditions. For example, `qsequence([1, 1])` will
generate exactly the Hofstadter sequence. We use the `StopIteration`
exception to indicate that the sequence cannot go any further because an invalid
index would be required to generate the next element. For example, if the initial
conditions are `[1, 3]`, then the sequence "terminates" immediately, since the
very first step asks for an element three places back in a list of length two.

```
class qsequence(object):
    def __init__(self, s):
        self.s = s[:]

    def next(self):
        try:
            q = self.s[-self.s[-1]] + self.s[-self.s[-2]]
            self.s.append(q)
            return q
        except IndexError:
            raise StopIteration()

    def __iter__(self):
        return self

    def current_state(self):
        return self.s
```

And here's how you can use it:

```
>>> Q = qsequence([1, 1])
>>> next(Q)
2
>>> next(Q)
3
>>> [next(Q) for __ in xrange(10)]
[3, 4, 5, 5, 6, 6, 6, 8, 8, 8]
```

## Generators

Generators are iterators that are defined using a simpler, function-based notation.
More specifically, a generator is a function that uses the `yield` expression
somewhere in it. Generators do not return values; instead they yield results
when they are ready. Python automates the process of remembering a generator's
context, that is, where its current control flow is, what the values of its local
variables are, and so on. Each time a generator is asked for a value via `next`, it yields
the next value in the iteration. The `__iter__` method is also
automatically implemented, meaning generators can be used anywhere an iterator
is needed. Let's look at an example. This implementation is functionally
equivalent to our iterator class above, but is much more compact and readable.

```
def count_generator():
    n = 0
    while True:
        yield n
        n += 1
```

Now let's see it in action:

```
>>> counter = count_generator()
>>> counter
<generator object count_generator at 0x106bf1aa0>
>>> next(counter)
0
>>> next(counter)
1
>>> iter(counter)
<generator object count_generator at 0x106bf1aa0>
>>> iter(counter) is counter
True
>>> type(counter)
<type 'generator'>
```

Now, let's implement Hofstadter's Q sequence using a generator. Notice that
the implementation is much simpler. However, we can no longer implement
methods such as `current_state` above. As far as I know,
it is NOT possible to access the variables stored in the context of
a generator from outside the generator, and as such `current_state` cannot be
accessed from the generator object. (OK, there is the `gi_frame.f_locals`
dictionary, but it is implementation specific, namely specific to CPython,
and is not a standard part of the language. As such it is *not* recommended
for use.) One possible solution in these circumstances is to yield both the
result and the list so far, but I will leave that as an exercise.

```
def hofstadter_generator(s):
    a = s[:]
    while True:
        try:
            q = a[-a[-1]] + a[-a[-2]]
            a.append(q)
            yield q
        except IndexError:
            return
```

Notice that a simple `return` statement with no parameter ends the iteration
of a generator. Internally, this raises a `StopIteration` exception.

The next generator example comes from the Groupon randomness extraction
interview question. See
the article here.
Here we use two generators: first to implement a Bernoulli process, which is an
infinite sequence of random Boolean values with `True` having probability \(p\) and
`False` having probability \(q = 1 - p\), and then to implement a von Neumann extractor, which takes
as input a Bernoulli process with \(0 < p < 1\) as a source of entropy and returns a
fair Bernoulli process with \(p = 0.5\).

```
import random

def bernoulli_process(p):
    if p > 1.0 or p < 0.0:
        raise ValueError("p should be between 0.0 and 1.0.")
    while True:
        yield random.random() < p

def von_neumann_extractor(process):
    while True:
        x, y = next(process), next(process)
        if x != y:
            yield x
```
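As a rough sanity check, we can feed the extractor a heavily biased source and confirm that its output is close to fair. The sketch below restates the two generators using the built-in `next` and `range` so it runs unchanged on Python 2 and 3; the seed is arbitrary and only there for reproducibility.

```python
import random

def bernoulli_process(p):
    # Infinite stream of random booleans: True with probability p.
    if p > 1.0 or p < 0.0:
        raise ValueError("p should be between 0.0 and 1.0.")
    while True:
        yield random.random() < p

def von_neumann_extractor(process):
    # Examine non-overlapping pairs; emit the first bit when they differ.
    # P(True, False) == P(False, True), so the output is unbiased.
    while True:
        x, y = next(process), next(process)
        if x != y:
            yield x

random.seed(42)  # arbitrary seed, for reproducibility only
extractor = von_neumann_extractor(bernoulli_process(0.3))
bits = [next(extractor) for _ in range(10000)]
print(sum(bits) / float(len(bits)))  # close to 0.5
```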

Finally, generators are great tools for implementing discrete dynamical systems. The following example shows how the famous tent map dynamical system can be implemented nicely using generators. (On a side note, have a look at how numerical inaccuracies begin to creep in relatively fast and then grow exponentially, a key feature of dynamical systems such as the tent map.)

```
>>> def tent_map(mu, x0):
... x = x0
... while True:
... yield x
... x = mu * min(x, 1.0 - x)
...
>>>
>>> t = tent_map(2.0, 0.1)
>>> for __ in xrange(30):
... print t.next()
...
0.1
0.2
0.4
0.8
0.4
0.8
0.4
0.8
0.4
0.8
0.4
0.8
0.4
0.8
0.4
0.8
0.4
0.799999999999
0.400000000001
0.800000000003
0.399999999994
0.799999999988
0.400000000023
0.800000000047
0.399999999907
0.799999999814
0.400000000373
0.800000000745
0.39999999851
0.79999999702
```

Another somewhat similar example would be the Collatz sequence.

```
def collatz(n):
    yield n
    while n != 1:
        n = n / 2 if n % 2 == 0 else 3 * n + 1
        yield n
```

Notice that here, too, we do not have to raise `StopIteration` manually,
since it is raised automatically when control flow reaches the end of the
function.

Example of the `collatz` generator in use:

```
>>> # If the Collatz conjecture is true then list(collatz(n)) for any n will
... # always terminate (though your machine might run out of memory first!)
>>> list(collatz(7))
[7, 22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1]
>>> list(collatz(13))
[13, 40, 20, 10, 5, 16, 8, 4, 2, 1]
>>> list(collatz(17))
[17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1]
>>> list(collatz(19))
[19, 58, 29, 88, 44, 22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1]
```

## Recursive Generators

Generators can be recursive, just like any other function. Let's
see an example of this by implementing our own simpler version of
`itertools.permutations`, namely a generator that generates all permutations
of a given list of items. (Note: In practice use `itertools.permutations`
instead of the method given here. It's faster!) The basic idea here is simple:
for each element in the list, we move it to be the first element by swapping it
with the first element, and then we permute the rest of the list.

```
def permutations(items):
    if len(items) == 0:
        yield []
    else:
        pi = items[:]
        for i in xrange(len(pi)):
            pi[0], pi[i] = pi[i], pi[0]
            for p in permutations(pi[1:]):
                yield [pi[0]] + p
```

```
>>> for p in permutations([1, 2, 3]):
... print p
...
[1, 2, 3]
[1, 3, 2]
[2, 1, 3]
[2, 3, 1]
[3, 1, 2]
[3, 2, 1]
```

## Generator Expressions

Generator expressions let you define generators using an even simpler, inline,
notation. This notation is very similar to Python's list comprehension
notation. For example, the following provides a generator that iterates over
all perfect squares. Note how the results of generator expressions are an
objects of type `generator`, and as such they implement both `next` and
`__iter__` methods.

```
>>> import itertools
>>> g = (x ** 2 for x in itertools.count(1))
>>> g
<generator object <genexpr> at 0x1029a5fa0>
>>> next(g)
1
>>> next(g)
4
>>> iter(g)
<generator object <genexpr> at 0x1029a5fa0>
>>> iter(g) is g
True
>>> [next(g) for __ in xrange(10)]
[9, 16, 25, 36, 49, 64, 81, 100, 121, 144]
```

We can also implement the Bernoulli process from the previous section using a
generator expression, in this example with \(p = 0.4\). If a generator
expression requires another iterator to be used as the "loop counter", and
the generator expression is to produce an infinite sequence, then
`itertools.count` is a good choice. Otherwise, `xrange` can be used.

```
>>> g = (random.random() < 0.4 for __ in itertools.count())
>>> [next(g) for __ in xrange(10)]
[False, False, False, True, True, False, True, False, False, True]
```

As mentioned, generator expressions can be passed to any function that requires an iterator as an argument. For example, we can sum the first ten perfect squares by writing:

```
>>> sum(x ** 2 for x in xrange(10))
285
```
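Other consumers of iterables work the same way. For instance, the built-ins `any` and `max` both accept generator expressions, and `any` stops consuming values as soon as it finds a `True` one:

```python
# any() short-circuits: it stops pulling values once one satisfies the test.
squares_are_big = any(x ** 2 > 50 for x in range(10))
# max() consumes the whole generator expression.
largest = max(x ** 2 for x in range(10))
print(squares_are_big)  # -> True
print(largest)          # -> 81
```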

More examples of generator expressions are given in the next section.

## The `itertools` Module

The module `itertools` provides a set of iterators that can help simplify
working with permutations, combinations, Cartesian products,
and other such combinatorial constructs. You can read up on all the
functions here.

Before reading the rest, note that none of the algorithms given below are optimal in practice; they are presented only as examples of how to use permutations and combinations. In practice, you should avoid enumerating all permutations or combinations unless you have good reason to believe it is necessary (e.g. you know the problem is NP-complete), since such enumerations grow exponentially in size.

Now let's go over some interesting use cases. For our first example, let's look at an alternative way of writing a relatively common pattern: looping over all indices of a 3-dimensional array, and also looping over all indices with \(0 \le i < j < k < n\).

```
from itertools import combinations, product

n = 4
d = 3

def visit(*indices):
    print indices

# Loop through all possible indices of a 3-D array
for i in xrange(n):
    for j in xrange(n):
        for k in xrange(n):
            visit(i, j, k)

# Equivalent using itertools.product
for indices in product(*([xrange(n)] * d)):
    visit(*indices)

# Now loop through all indices 0 <= i < j < k < n
for i in xrange(n):
    for j in xrange(i + 1, n):
        for k in xrange(j + 1, n):
            visit(i, j, k)

# And equivalent using itertools.combinations
for indices in combinations(xrange(n), d):
    visit(*indices)
```

The alternatives using `itertools` enumerators have two advantages: they are
shorter to write (one line, instead of 3 in this case) and they are much easier
to generalize to higher dimensions. That being said, I have not tested the
difference in performance between the two, which might be significant for
larger \(n\). Use your best judgement, and do some performance tests if
you're concerned.
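If you do want to measure the difference, a quick (and unscientific) comparison with the standard `timeit` module might look like the sketch below. The sizes here are arbitrary, and the absolute numbers will vary by machine and Python version; the sketch uses `range` so it runs on both Python 2 and 3.

```python
import itertools
import timeit

n, d = 30, 3

def nested_loops():
    # Three explicit loops over all n**3 index triples.
    total = 0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                total += 1
    return total

def with_product():
    # Same enumeration via itertools.product.
    total = 0
    for _ in itertools.product(range(n), repeat=d):
        total += 1
    return total

# Both enumerate exactly n**d index tuples.
assert nested_loops() == with_product() == n ** d

print(timeit.timeit(nested_loops, number=100))
print(timeit.timeit(with_product, number=100))
```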

For the second example, let's do something more mathematically interesting:
using generator expressions, `itertools.combinations`, and
`itertools.permutations` to calculate the inversion number of a permutation, and then to sum the
inversion numbers of all the permutations of a list. The sum of the inversion
numbers of all permutations of \(n\) elements is OEIS A001809, and it is relatively easy to show
that it has the closed form \(\frac{n! \, n(n-1)}{4}\). In practice, using this
formula will be much more efficient than the code given below, but we are
writing it as an exercise in using `itertools` enumerators.

```
import itertools
import math

def inversion_number(A):
    """Return the number of inversions in list A."""
    return sum(1 for x, y in itertools.combinations(xrange(len(A)), 2) if A[x] > A[y])

def total_inversions(n):
    """Return the total number of inversions in all permutations of n elements."""
    return sum(inversion_number(A) for A in itertools.permutations(xrange(n)))
```

And usage:

```
>>> [total_inversions(n) for n in xrange(10)]
[0, 0, 1, 9, 72, 600, 5400, 52920, 564480, 6531840]
>>> [math.factorial(n) * n * (n - 1) / 4 for n in xrange(10)]
[0, 0, 1, 9, 72, 600, 5400, 52920, 564480, 6531840]
```

For the third example, let's calculate the
rencontres numbers using a
brute-force counting method. Given \(n\) and \(k\), the rencontres number \(D_{n,k}\) is
defined as the number of permutations of a set of size \(n\) that have exactly \(k\)
fixed points. First we write a function that
uses a generator expression inside a `sum` to count the number of fixed points of
a permutation. Then, using `itertools.permutations` and another generator
expression inside a `sum`, we count the total number of permutations of \(n\) elements
that have exactly \(k\) fixed points. Of course, this
is a *very* inefficient way of calculating rencontres numbers and should not be
used in practice. Again, this is just an exercise in using generator expressions
and `itertools` functions and is not meant to be usable code.

```
def count_fixed_points(p):
    """Return the number of fixed points of p as a permutation."""
    return sum(1 for x in p if p[x] == x)

def count_partial_derangements(n, k):
    """Return the number of permutations of n elements with exactly k fixed points."""
    return sum(1 for p in itertools.permutations(xrange(n)) if count_fixed_points(p) == k)
```

Usage:

```
>>> [count_partial_derangements(6, i) for i in xrange(7)]
[265, 264, 135, 40, 15, 0, 1]
```

## Further Reading

Pramode's article on generators contains a very interesting example implementing an efficient but simple sieve of Eratosthenes.

David Beazley also has an excellent page on generators found here. Make sure to check it out for more interesting examples and use cases.

## Acknowledgments

Some improvements to this article were made based on suggestions by reddit user jms_nh.
