Frictionless Bananas

Team name: Frictionless Bananas
Team members: Jeremy Sawicki (jeremy at sawicki dot us)
Code: submission.tar.gz (authorization token removed from the code)
Data: myproblems.json, status.json

The Frictionless Bananas had only one team member this year. This page describes my 2013 ICFP Programming Contest entry.

Implementation

My solution was written in C++. I used Jansson for JSON manipulation and libcurl for server communication.

Representation

I represented a program as a sequence of operations in postfix order, which can be evaluated using a stack. For example, the zero operation pushes the number 0 onto the stack, and the plus operation pops the top two values and pushes their sum. The fold operation invokes a loop that executes a second sequence of operations representing the lambda body.
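To make the representation concrete, here is a minimal sketch of this kind of stack-based evaluator for a handful of the operations (the Op type and the particular subset shown are illustrative, not taken directly from my code):

    #include <cstdint>
    #include <vector>

    // Illustrative subset of the operations, evaluated in postfix order.
    enum class Op { Zero, One, Var, Not, Shl1, Plus, And };

    uint64_t evaluate(const std::vector<Op>& prog, uint64_t x) {
        std::vector<uint64_t> stack;
        for (Op op : prog) {
            switch (op) {
            case Op::Zero: stack.push_back(0); break;      // push constant 0
            case Op::One:  stack.push_back(1); break;      // push constant 1
            case Op::Var:  stack.push_back(x); break;      // push the input value
            case Op::Not:  stack.back() = ~stack.back(); break;
            case Op::Shl1: stack.back() <<= 1; break;
            case Op::Plus: {                               // pop two values, push their sum
                uint64_t b = stack.back(); stack.pop_back();
                stack.back() += b;
                break;
            }
            case Op::And: {
                uint64_t b = stack.back(); stack.pop_back();
                stack.back() &= b;
                break;
            }
            }
        }
        return stack.back();                               // the result is the value left on top
    }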

Generation

The core of my program is a program generator that enumerates all programs of a given size using only operations in a given set. For each generated program, it invokes a callback function (an std::function<> object, typically a lambda expression provided by the caller).

My initial implementation generated the sequence of operations in postfix order, choosing the first operation and then calling itself recursively to choose the rest. I used that implementation for the lightning round, but it ultimately proved difficult to work with. It required awkward logic to avoid pushing more values onto the stack than can be consumed by the remaining operations, and it was not clear how to extend it to prune the search by prohibiting certain combinations of operations.

My final implementation works top-down, choosing the top-level operation and the size of each subexpression, then recursively generating subexpressions of the chosen sizes. To support operations with more than one subexpression, I used continuation-passing style. The program generator calls itself recursively to generate one subexpression, passing a continuation (a lambda expression) to generate the next subexpression. It was strange to see so many functional programming concepts like lambda expressions and continuations appearing in C++ code.
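The sketch below shows the general shape of this approach for a tiny grammar with three leaves, one unary operator, and one binary operator, reusing the Op type from the evaluator sketch above. The names and structure are illustrative; the real generator also handles the full operation set, fold, and the pruning described later:

    #include <functional>
    #include <vector>

    // Sketch of a top-down generator in continuation-passing style for a tiny
    // grammar: three leaves (Zero, One, Var), one unary operator (Not), and one
    // binary operator (Plus).  `prog` holds the postfix program built so far.
    struct Generator {
        std::vector<Op> prog;
        std::function<void(const std::vector<Op>&)> emit;  // called once per complete program

        // Append every expression of exactly `size` operations, invoking the
        // continuation `k` for each one.
        void gen(int size, const std::function<void()>& k) {
            if (size == 1) {
                static const Op leaves[] = {Op::Zero, Op::One, Op::Var};
                for (Op leaf : leaves) {
                    prog.push_back(leaf);
                    k();
                    prog.pop_back();
                }
                return;
            }
            // Unary operator: one subexpression of size-1.
            gen(size - 1, [&] {
                prog.push_back(Op::Not);
                k();
                prog.pop_back();
            });
            // Binary operator: choose a size split, then generate the right
            // subexpression in the continuation passed while generating the left.
            for (int left = 1; left <= size - 2; ++left) {
                gen(left, [&, left] {
                    gen(size - 1 - left, [&] {
                        prog.push_back(Op::Plus);
                        k();
                        prog.pop_back();
                    });
                });
            }
        }

        void run(int size) { gen(size, [&] { emit(prog); }); }
    };

Binding emit to a lambda that tests each candidate against the known input/output pairs and calling run(n) then visits every program of size n exactly once.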

Counting

I also implemented a routine to quickly compute the number of programs for a given size and operation set. It uses a dynamic programming algorithm that builds a table of program counts for each size in increasing order. When I started pruning the search space by prohibiting certain combinations of operations, I added additional dimensions to the table to represent the fact that different sets of expressions are permitted depending on the context in which the expression appears.
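In its simplest form, without the extra context dimensions, the table for the same tiny grammar can be filled in like this (an illustrative sketch, not my actual code):

    #include <cstdint>
    #include <vector>

    // count[n] = number of expressions of exactly n operations, given `leaves`
    // leaf choices, `unaryOps` unary operators, and `binaryOps` binary operators.
    std::vector<uint64_t> countPrograms(int maxSize, uint64_t leaves,
                                        uint64_t unaryOps, uint64_t binaryOps) {
        std::vector<uint64_t> count(maxSize + 1, 0);
        if (maxSize >= 1)
            count[1] = leaves;
        for (int n = 2; n <= maxSize; ++n) {
            count[n] = unaryOps * count[n - 1];             // a unary operator over size n-1
            for (int left = 1; left <= n - 2; ++left)       // a binary operator over a size split
                count[n] += binaryOps * count[left] * count[n - 1 - left];
        }
        return count;
    }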

Having an accurate program count is useful for several reasons. For one thing, it provides a useful estimate of the difficulty of a given problem. More importantly, the program generator is able to use the counts to generate a subset of the possible programs corresponding to an arbitrary range of indices, efficiently skipping parts of the search space and knowing exactly how many programs were skipped. I used that capability to generate arguments to commutative binary operators in only one of two possible orders. After an expression of index k is chosen for the left subexpression, the search for the right subexpression is restricted to indices greater than k. Another possible use is to divide the search space into equal-sized pieces for parallelization, but I never took advantage of that possibility.
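One way to picture the index-based skipping is an "unranking" function that decodes a single index into a program. My generator works with index ranges rather than single indices, but the arithmetic is the same. The sketch below assumes the tiny grammar above and a table built with countPrograms(maxSize, 3, 1, 1):

    #include <cstdint>
    #include <vector>

    // Sketch of "unranking": decode index k (0 <= k < count[n]) into the k-th
    // program of size n, following the same enumeration order as the Generator
    // sketch above (leaves, then the unary operator, then binary size splits).
    void unrank(int n, uint64_t k, const std::vector<uint64_t>& count,
                std::vector<Op>& prog) {
        if (n == 1) {
            static const Op leaves[] = {Op::Zero, Op::One, Op::Var};
            prog.push_back(leaves[k]);
            return;
        }
        uint64_t unaryBlock = count[n - 1];                 // one unary operator (Not)
        if (k < unaryBlock) {
            unrank(n - 1, k, count, prog);
            prog.push_back(Op::Not);
            return;
        }
        k -= unaryBlock;
        for (int left = 1; left <= n - 2; ++left) {         // one binary operator (Plus)
            uint64_t rightCount = count[n - 1 - left];
            uint64_t block = count[left] * rightCount;
            if (k < block) {
                unrank(left, k / rightCount, count, prog);          // left subexpression
                unrank(n - 1 - left, k % rightCount, count, prog);  // right subexpression
                prog.push_back(Op::Plus);
                return;
            }
            k -= block;
        }
    }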

Strategy

I used a relatively simple approach. My program solves problems in real time, first submitting an initial set of inputs to evaluate and then generating and testing as many candidate programs as possible until a correct solution is found or time runs out. I did not precompute anything ahead of time, and I only used a single CPU core for computations. I chose to focus my efforts on improving the set of generated programs rather than utilizing more computing resources.

My initial implementation generated programs of exactly the target size, and included the constraint that the program must contain every operation from the target set at least once. My final implementation generates programs less than or equal to the target size in increasing order, and permits some of the target operations to be unused.
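In outline, the resulting solving loop looks roughly like this. The requestEval and submitGuess functions are hypothetical stand-ins for the libcurl/JSON server communication; their names and signatures are assumptions, not my actual interface:

    #include <cstddef>
    #include <cstdint>
    #include <string>
    #include <vector>

    // Hypothetical stand-ins for the server protocol (the real entry talks to
    // the contest server via libcurl and JSON).
    std::vector<uint64_t> requestEval(const std::string& problemId,
                                      const std::vector<uint64_t>& inputs);
    bool submitGuess(const std::string& problemId, const std::vector<Op>& prog,
                     uint64_t* counterInput, uint64_t* counterOutput);

    // Outline of the real-time generate-and-test loop, reusing the evaluate()
    // and Generator sketches from above.
    void solve(const std::string& problemId, int targetSize,
               std::vector<uint64_t> inputs) {
        std::vector<uint64_t> outputs = requestEval(problemId, inputs);

        Generator g;
        bool solved = false;
        g.emit = [&](const std::vector<Op>& prog) {
            if (solved) return;
            for (std::size_t i = 0; i < inputs.size(); ++i)
                if (evaluate(prog, inputs[i]) != outputs[i])
                    return;                                  // fails a known example
            uint64_t in = 0, out = 0;
            if (submitGuess(problemId, prog, &in, &out)) {
                solved = true;                               // guess accepted
            } else {
                inputs.push_back(in);                        // keep the counterexample
                outputs.push_back(out);
            }
        };
        // Try sizes in increasing order, up to the problem's stated size.
        for (int size = 1; size <= targetSize && !solved; ++size)
            g.run(size);
    }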

Evaluation

My initial evaluation uses the following inputs:

The random-looking values are constants taken from the SHA-512 hash function.

Pruning

To reduce the search space, my program avoids generating combinations of operations that are equivalent to other generated expressions. The following constructs are not generated:

After reading my writeup, another contestant pointed out that (if0 (not a) b c) is not equivalent to (if0 a c b), because not is bitwise: (not a) is zero only when a is all ones, not whenever a is nonzero. That means my program would have to search a bit harder to find an equivalent of (if0 (not a) b c), for example (if0 (plus 1 a) b c).

Heuristics

I took advantage of certain patterns observed in the training problems to further reduce the search space:

I observed that for some problem types if0 tends to appear only once, but I did not take advantage of that. I also did not think of the approach that some other teams came up with: independently searching for a program that produces the correct answer for half of the inputs and using it as one branch of an if0. I did not spend much time on bonus problems in general.

Precomputation

At one point I started an implementation that would precompute a table of programs and corresponding sample inputs/outputs which could quickly be queried, but I eventually abandoned that effort.

I am not convinced that precomputation is very helpful. When solving a problem in real time, it is easy to restrict the search to the set of operations used in that problem. When precomputing, it is necessary to generate either a much larger set of programs using all operations or a separate set of programs for each problem, neither of which is very appealing. For example, if I look at the simple (fold-free) problems of size 14 in my problem set, I must generate between 2 million and 700 million programs for each problem, depending on the set of operations used in the problem. My code can generate the largest of those sets within roughly 5 minutes. On the other hand, to generate programs of size 14 using all operators, I would have to generate 12.9 billion programs, likely taking an hour or two. If I knew in advance which problems were difficult, it might be worth precomputing programs for just those problems using the correct set of operations. But most problems can be solved easily, with only a few requiring large, difficult-to-find solutions, so I would be forced to do a lot of precomputation for all problems when only a few need it. Overall it did not seem like a good approach.

Results

I scored 425 points in the lightning round, which should be around 15th place based on the score distribution published by the organizers (10th place is 457, 25th place is 345). I greatly underestimated the time required to communicate with the server and solve problems due to rate limiting, so I was not able to run as many problems as I intended before the time expired. Still, I only intended to solve problems during the lightning round where I was fairly confident that my program could exhaustively enumerate all possible solutions.

I scored 1359 points in the full contest, which should also be around 15th place (11th place is around 1400, 25th place is around 1250). Despite my experience with the lightning round, I again underestimated the amount of time necessary to communicate with the server and try all of the problems. I finished running the non-bonus problems about five minutes before the contest ended, leaving little time to attempt the bonus problems. (I also somehow missed one of the tfold problems.) Here are the statistics:

Looking at only the non-bonus problems, I scored 1346 points. Based on unofficial scores published by some teams, the next highest score I have seen for the non-bonus problems is 1273, so I think there is a chance that my code was among the top performers in the contest. Unfortunately I did not successfully make use of it in time. I also did not implement a sophisticated solution for if0 in the bonus problems, so I still may not have been competitive with the best teams on the full set of problems.

Overall impression

This year’s task proved to be quite fun. It was a straightforward problem, not involving an unreasonable amount of preliminary code or complex reverse engineering to get started.

I am always a little skeptical of tasks where the judges will not run code submitted by participants. Of course that makes the judging process simpler, and it also lets teams work with any platforms and tools they choose, but it can sometimes make the contest less of a true “programming” contest because you do not need to write code that can handle any input it may be presented with. I think the judges succeeded with this year’s task.

The only downside I see to this year’s task was the large amount of time required to solve all of the problems due to rate limiting by the server. Ideally it should be possible to submit solutions as fast as your code is able to generate them. I understand, though, that the server’s job was not easy, since it involved proving the equivalence of programs.


The Frictionless Bananas