I was looking at ACM ICPC world final problems the other day when this particular problem caught my eyes. Having spent a few hours working on the Fibonacci primitive roots problem it felt like a good choice for a problem to try to tackle. You can find the source here. It is also on the UVA online judge as problem 1282.
Fibonacci words are defined by letting and where is the string concatenation operation. Hence is a string for Specifically, these strings are referred to as Fibonacci words.
The problem requires an efficient algorithm to find the number of occurrences of a given pattern in a given Fibonacci word Noting that the length of Fibonacci words grows exponentially, at an exponential rate given by the golden ratio , generating the Fibonacci word in memory and then performing a search is simply not an option for large
For the rest of this post, let be the regular Fibonacci sequence given by and and for
Let be the number of occurrences of in
My solution involved first noticing the following about Fibonacci words:
- The lengths of Fibonacci words are given by
- is a prefix of
- is a suffix of Hence is a suffix of for all
- For we have
- Similarly, we have
Let be the smallest number that is large enough such that Then for there are four exclusive possibilities for :
- it is not contained in or
- it is contained in or
- it is contained in or
- it occurs in and overlaps both, or
- it occurs in and overlaps both.
We can now find a recurrence relation for the number of times occurs in each of the above sub-cases. The first case is simply counted by and the second by Let be the number of times occurs in and overlaps both, and the number of times occurs in and overlaps both. Then for we have
For given that we have
Similarly, for given that we have
This is enough information to come up with a rather efficient algorithm:
- First find as described above, by finding the first number such that
- Then find the number of occurrences of in using an efficient pattern matching algorithm such as KMP. Classify each occurrence as one of the sub-cases above, and by doing so find the values for and which are going to be our base cases.
- Now, based on the recurrences above, and in a bottom to top approach, calculate the value of for the given
Since we have and so So the KMP string matching part of this algorithm will take time and the rest will take time. Hence the total time is for finding a pattern of length in This is definitely reasonable since is much shorter compared to
Here's the Java code for the above algorithm. I use a KMP implementation that I translated over from an older Python project I did. I might write a post on KMP itself soon!
import java.util.LinkedList;
import java.util.Scanner;
public class Problem1282 {
public static String[] F;
public static long[] f;
// Computes F up until F[m] such that M < F[m-3]
public static void precomputeF(long M) {
F = new String[101];
F[0] = "0";
F[1] = "1";
for (int i = 1; i < 3 || F[i - 3].length() < M; i++) {
F[i + 1] = F[i] + F[i - 1];
}
}
public static void precomputef(int M) {
f = new long[M + 1];
f[0] = 1;
f[1] = 1;
for (int i = 1; i < M; i++) {
f[i + 1] = f[i] + f[i - 1];
}
}
public static long occurrences(int n, String p) {
if (p.equals("0")) {
if (n < 2) {
return n == 0 ? 1 : 0;
}
return f[n - 2];
}
if (p.equals("1")) {
if (n < 2) {
return n == 0 ? 0 : 1;
}
return f[n - 1];
}
int m = 3;
while (f[m - 3] <= p.length()) {
m++;
}
if (m > n) {
m = n;
}
long a = 0; // This will start with O(m-3)
long b = 0; // And this one with O(m-2)
long[] o = new long[2]; // Overlapping count
// Only need to find occurrences up until index f[m-1] since past that
// we are back to F(n-2) and since F(n) starts with F(n-2) we have
// already found all the occurrences in F(n-2) already.
LinkedList<Integer> occurrences = KMP.search(F[m], p, (int) f[m - 1]);
for (int found : occurrences) {
if (found < f[m - 2]) {
if (found + p.length() - 1 < f[m - 2]) {
b++;
} else {
o[0]++;
}
} else if (found < f[m - 1]) {
if (found + p.length() - 1 < f[m - 1]) {
a++;
} else {
o[1]++;
}
}
}
int i;
for (i = 0; i <= n - m + 1; i++) {
long old_b = b;
b = b + a + o[i % 2];
a = old_b;
}
return b;
}
public static void main(String[] args) {
precomputeF(100000);
precomputef(100);
java.util.Scanner scanner = new Scanner(System.in);
for (int c = 1; scanner.hasNext(); c++) {
int n = Integer.parseInt(scanner.nextLine());
String p = scanner.nextLine();
System.out.format("Case %d: %d\n", c, occurrences(n, p));
}
}
}
class KMP {
// Calculate the border array (i.e. failure function) of string x.
// Direct translation of my older Python code
private static int[] borderArray(String s) {
char[] x = s.toCharArray();
int n = x.length;
int[] beta = new int[n];
int b = 0;
for (int i = 1; i < n; i++) {
while (b > 0 && x[i] != x[b]) {
b = beta[b - 1];
}
beta[i] = b = (x[i] == x[b] ? b + 1 : 0);
}
return beta;
}
// Return an array of indices where p occurs in x.
// Overlapping matches are allowed. For example kmpSearch("ababa", "aba")
// yields 0 and 2.
public static LinkedList<Integer> search(String xx, String pp, int max_index) {
char[] x = xx.toCharArray();
char[] p = pp.toCharArray();
int[] beta = borderArray(pp);
int i = 0, m = 0;
LinkedList<Integer> results = new LinkedList<Integer>();
while (m + i < x.length) {
if (m > max_index) {
break;
}
if (x[m + i] == p[i]) {
if (i == p.length - 1) {
results.add(m);
} else {
i++;
continue;
}
}
if (i > 0) {
m = m + i - beta[i - 1];
i = beta[i - 1];
} else {
m = m + 1;
i = 0;
}
}
return results;
}
}
Comments