Frictionless Bananas

Nickname: Frictionless Bananas
Name: Jeremy Sawicki (jeremy at sawicki dot us)
Language: C++
Final results: 5th place

This page describes my CodinGame Summer Challenge 2025 entry.

The task

The task is to play a game where you control a team of 3-5 agents, battling another team with water balloons and squirt guns.

Solution approach

My solution uses Monte Carlo Tree Search (MCTS) for movement and heuristics for combat. Each agent has a separate MCTS tree, so the branching factor is at most 5.

Movement: Multi-tree MCTS

Each agent maintains a separate MCTS tree, with nodes having up to five children representing the different movement directions. Each agent independently chooses a desired direction using the usual UCT formula. Then, heuristics are used to choose combat actions for each team, with knowledge of that team’s moves but not the opposing team’s moves. All of the movement and combat actions are combined to simulate the new game state. The process repeats until all trees reach leaf nodes, or the game ends. The value of the resulting game state is determined using an evaluation function, and the MCTS nodes are updated with that value. When the desired movement direction differs from the actual direction, the desired direction is used to update the tree nodes, while the actual direction is used to traverse to child nodes.

Combat heuristics

Combat actions for each team are chosen via a set of heuristics.

First, I attempt to determine as much as possible about the locations where each agent may end up. For a team’s own agents, there is usually only one possible location, though in cases of potential collision with the opposing team there may be two locations. For the opposing team, there are more possibilities, but some locations can be ruled out based on cover or collisions with the team’ own agents.

Next, bomb actions are chosen. Bombs are never thrown where they may hit the same team’s agents, and they are only thrown where they are guaranteed to hit at least one of the opposing team’s agents. Bomb locations are chosen to maximize the number of guaranteed hits, and as a tie-breaker, to maximize the expected number of hits (assuming all opposing team moves are equally likely).

Finally, shooting actions are chosen. Again, actions are chosen to maximize guaranteed damage with expected damage as a tie-breaker, but there are a few inaccuracies in the calculation. If a team’s own agent can move or not move depending on collinions, I assume that the movement takes place. Also, I assume that agents will not hunker down when their cooldown is zero, since they should be shooting.

I made some tweaks to the above to try to favor attacking agents with higher wetness and agents which deal higher damage per turn (power / (cooldown + 1)), though I’m not sure it was very effective.

State evaluation

State evaluation is difficult because there are two independent scoring methods that can determine the game result: wetness and territory. It is possible to be ahead on one and behind on the other. At first, I combined them in an ad hoc way, but later I tried an approach that is more sound (at least in theory).

During the game, there are three quantities that are changing, and that can eventually result in the game ending: the wetness of the two teams, and the territory score (the difference between the two team scores). When any of the quantities changes enough, the game ends. To combine them, I attempt to determine the number of turns that it will take for each game ending condition to be reached. For wetness, I divide the remaining dryness (100 - wetness) by an estimate of the amount of damage that the opposing team can inflict per turn. For the territory score, I assume that the current control zones will remain the same, and the score will change by that much per turn. The game ends when the absolute value of the territory score exceeds 600, or when 100 turns are reached, whichever comes first.

The final value of a game state is a weighted average of the values of each game ending condition, where the weights are the reciprocals of the number of turns to reach each condition. The value of a game ending condition is either -1 or 1 when the game ends before 100 turns, or a value between -1 and 1 when the game ends due to reaching the 100-turn limit.

Performance

I implemented a SIMD version of the score calculation (territory / control zones) to help with performance. About 5% of my program’s runtime is spent in that code. My program performs around 3000 MCTS rollouts in 45 ms.

The Frictionless Bananas