The last month was a very busy one, with the ACM ICPC regionals taking place last week. This post is based on a problem on the competition, "Crusher's Code". You can see the full list of problem statements here.

First, a brief explanation of the problem:

Two sort algorithms are given. Both take an input array \(A\) of size \(n\) as input. AlgorithmMontychooses random numbers \(i\) and \(j\) with \(0 \le i, j < n,\) swapping them if necessary to ensure \(i < j\) and then swaps \(A[i]\) and \(A[j]\) if \(A[i] > A[j].\) AlgorithmCarloson the other hand picks a random \(i\) with \(0 \le i < n -1\) and swaps \(A[i]\) and \(A[i+1]\) if \(A[i] > A[i+1].\) Both algorithms continue until the array is sorted. How many iterations are each expected to run for before terminating?

My solution to this problem started with coming up with a recurrence for the
expected number of iterations in each case. For `Monty` we reduce the problem
to a "smaller" problem as soon as we randomly come across a a pair \((i,
j)\) that form an inversion of the array. Let's say the set of inversions of
\(A\) is denoted by \(I(A).\) Then choosing a pair \((i, j)\) at
random has a total of \(n^2\) possibilities, of which exactly
\(2|I(A)|\) result in an inversion. Hence this can be seen as a Bernoulli
process with success rate \(p = \frac{2|I(A)|}{n^2}.\) The expected number
of trials before a success is therefore simply \(\frac{n^2}{2|I(A)|}.\)

This gives us the following recursion for the expected number of iterations of
`Monty`:

where by \(A_{i,j}\) here I mean the array \(A\) with its \(i\) and \(j\) indices swapped. The \(\frac{1}{|I(A)}\) is the probability that any particular inversion is the one that we get to first during the sort.

Following a similar line of thought, for `Carlos` we can let \(I'(A)\)
denote the set of indices \(i\) such that \(A[i] > A[i+1].\) Then we
get the following recurrence for the expected number of iterations of `Carlos`:

Of course, there are a lot of overlaps in the recurrence, so memoizing the values are key to a fast enough algorithm. During the competition I achieved this in Java by first getting the inversion sequence of the array, and the representing the inversion sequence as a base \(n\) number, and then using this representation as a key for a hash-table to be used as a cache.

In Python, this can be done much simpler by just hashing with the array itself as the key, converted to a tuple. Here is an implementation in Python for the above algorithm. Note that \(I(A)\) and \(I'(A)\) are implemented as separate functions and passed to the main driver as parameters. The "cost" functions are also passed as parameters similarly, allowing for a simple driver function to calculate both numbers.

```
from __future__ import division
from sys import stdin
def expected_iterations(A, cache, candidates, cost):
n = len(A)
h = tuple(A)
if h in cache:
return cache[h]
E = 0.0
C = candidates(A)
if C:
for (i, j) in C:
A[i], A[j] = A[j], A[i]
E += expected_iterations(A, cache, candidates, cost)
A[i], A[j] = A[j], A[i]
E += cost(n)
E /= len(C)
cache[h] = E
return E
def inversions(A):
return [(i, j)
for i in xrange(len(A) - 1)
for j in xrange(i + 1, len(A))
if A[i] > A[j]]
def adj_inversions(A):
return [(i, i + 1)
for i in xrange(len(A) - 1)
if A[i] > A[i + 1]]
monty_cost = lambda n: n * n / 2.0
carlos_cost = lambda n: n - 1
def process_test_case(A):
carlos_cache = {}
monty_cache = {}
m = expected_iterations(A, monty_cache, inversions, monty_cost)
c = expected_iterations(A, carlos_cache, adj_inversions, carlos_cost)
print 'Monty {:.6f} Carlos {:.6f}'.format(m, c)
def main():
T = int(stdin.readline())
for __ in xrange(T):
A = [int(x) for x in stdin.readline().split(' ')]
# ignore the first element since is is length of the array
process_test_case(A[1:])
if __name__ == '__main__':
main()
```

## Comments