# Expected Running Time of Two Randomized Sort Algorithms (ACM ICPC 2013 PACNW Regional)

The last month was a very busy one, with the ACM ICPC regionals taking place last week. This post is based on a problem on the competition, "Crusher's Code". You can see the full list of problem statements here.

First, a brief explanation of the problem:

Two sort algorithms are given. Both take an input array $$A$$ of size $$n$$ as input. Algorithm Monty chooses random numbers $$i$$ and $$j$$ with $$0 \le i, j < n,$$ swapping them if necessary to ensure $$i < j$$ and then swaps $$A[i]$$ and $$A[j]$$ if $$A[i] > A[j].$$ Algorithm Carlos on the other hand picks a random $$i$$ with $$0 \le i < n -1$$ and swaps $$A[i]$$ and $$A[i+1]$$ if $$A[i] > A[i+1].$$ Both algorithms continue until the array is sorted. How many iterations are each expected to run for before terminating?

My solution to this problem started with coming up with a recurrence for the expected number of iterations in each case. For Monty we reduce the problem to a "smaller" problem as soon as we randomly come across a a pair $$(i, j)$$ that form an inversion of the array. Let's say the set of inversions of $$A$$ is denoted by $$I(A).$$ Then choosing a pair $$(i, j)$$ at random has a total of $$n^2$$ possibilities, of which exactly $$2|I(A)|$$ result in an inversion. Hence this can be seen as a Bernoulli process with success rate $$p = \frac{2|I(A)|}{n^2}.$$ The expected number of trials before a success is therefore simply $$\frac{n^2}{2|I(A)|}.$$

This gives us the following recursion for the expected number of iterations of Monty:

\begin{equation*} E_M(A) = \frac{n^2}{2|I(A)|} + \sum_{(i, j) \in I(A)} \frac{1}{|I(A)|} E_M(A_{i,j}) \end{equation*}

where by $$A_{i,j}$$ here I mean the array $$A$$ with its $$i$$ and $$j$$ indices swapped. The $$\frac{1}{|I(A)}$$ is the probability that any particular inversion is the one that we get to first during the sort.

Following a similar line of thought, for Carlos we can let $$I'(A)$$ denote the set of indices $$i$$ such that $$A[i] > A[i+1].$$ Then we get the following recurrence for the expected number of iterations of Carlos:

\begin{equation*} E_C(A) = \frac{n - 1}{|I'(A)|} + \sum_{i \in I'(A)} \frac{1}{|I'(A)|} E_C(A_{i,i+1}). \end{equation*}

Of course, there are a lot of overlaps in the recurrence, so memoizing the values are key to a fast enough algorithm. During the competition I achieved this in Java by first getting the inversion sequence of the array, and the representing the inversion sequence as a base $$n$$ number, and then using this representation as a key for a hash-table to be used as a cache.

In Python, this can be done much simpler by just hashing with the array itself as the key, converted to a tuple. Here is an implementation in Python for the above algorithm. Note that $$I(A)$$ and $$I'(A)$$ are implemented as separate functions and passed to the main driver as parameters. The "cost" functions are also passed as parameters similarly, allowing for a simple driver function to calculate both numbers.

from __future__ import division
from sys import stdin

def expected_iterations(A, cache, candidates, cost):
n = len(A)
h = tuple(A)
if h in cache:
return cache[h]
E = 0.0
C = candidates(A)
if C:
for (i, j) in C:
A[i], A[j] = A[j], A[i]
E += expected_iterations(A, cache, candidates, cost)
A[i], A[j] = A[j], A[i]
E += cost(n)
E /= len(C)
cache[h] = E
return E

def inversions(A):
return [(i, j)
for i in xrange(len(A) - 1)
for j in xrange(i + 1, len(A))
if A[i] > A[j]]

return [(i, i + 1)
for i in xrange(len(A) - 1)
if A[i] > A[i + 1]]

monty_cost = lambda n: n * n / 2.0
carlos_cost = lambda n: n - 1

def process_test_case(A):
carlos_cache = {}
monty_cache = {}
m = expected_iterations(A, monty_cache, inversions, monty_cost)
c = expected_iterations(A, carlos_cache, adj_inversions, carlos_cost)
print 'Monty {:.6f} Carlos {:.6f}'.format(m, c)

def main():