Solving Mastermind With Python

Last Christmas I received a cryptic-looking game as a gift. It came with no instructions, just a board, and a couple of dozen colored wooden pegs.

I looked up the only identifying words associated with the board: Master Mind. With that search, I discovered a challenging two-player code cracking game.

In Mastermind, one player (the Codebreaker) tries to uncover a secret pattern made by the opposing player (the Codemaker) by making repeated guesses and receiving hints.

I also discovered that the version of Mastermind I had was way more difficult than the original version! The original has 4 slots and 6 colors, giving 6•6•6•6 = 1,296 possible codes. It also gives the Codebreaker 10 chances to guess the answer. My version has 5 slots and 5 colors, giving 5•5•5•5•5 = 3,125 possible codes. It also only gives the Codebreaker six chances. Players of the version I received need to deal with more than twice as many potential codes, and four fewer guesses!

After losing far more times than I care to admit, I decided that I wanted to crush this game using my programming knowledge. I started doing some research and saw that Donald Knuth, a computer science legend, had actually written an algorithm to solve Mastermind.

Perhaps it was born out of the same frustration that I was feeling with the game. I decided to implement Knuth’s paper: “The Computer As Master Mind”. If I couldn’t beat the game, I would least get the satisfaction of writing a program to conquer the game for me.

The Rules

Start screen of the Mastermind written using Python and Pygame

I think the rules of the game are a little confusing until you’ve played a few times, but I’ll do my best to explain them here. If you want to get some games in first you can play the original in your browser here. My implementation is available on Github. If you already know how to play Mastermind, feel free to skip this section.

At the beginning of the game, the Codemaker constructs a secret pattern made up of 5 pegs of varying colors. There are no constraints on how many times a color is used. For example, the Codemaker could choose Red, Red, Red, Red, Red (all of one color, a weak code) or Red, Green, Blue, Yellow, Pink (all colors, a better code). The Codemaker hides this pattern, and the Codebreaker has 6 tries to guess this hidden pattern correctly.

To make a guess, the Codebreaker constructs their own pattern and places it on the board in the Decoding Slot. The Codemaker then places either black or white pegs on the Hint Slot using the following logic:

Black: Placed for each guess peg that is both the correct color and in the correct slot. If the guess was Red, Red, Red, Red, Blue and the hidden answer was Red, Green, Green, Green, Blue, the Codemaker would place two black pegs for the correctly placed Red and Blue pegs.
White: Placed for each guess peg that is the correct color but in the incorrect slot. If the guess was Red, Red, Red, Red, Blue and the hidden answer was Blue, Green, Green, Green, Red, the Codemaker would place two white pegs for the incorrectly placed Red and Blue pegs.

The Codebreaker uses information gained from previous guesses to hone in on the only possible answer. If they can accomplish this in 6 turns or less, then they win.

So far, this sounds a lot like the popular game Wordle. The twist that really amps up the difficulty is that hints are randomized. Getting a black peg in the first hole doesn’t necessarily mean that the peg in the first hole is correct, it just means that one of the five pegs placed by the Codebreaker is in the correct spot.

The equivalent analogy for Wordle would be if when you guessed a word, the game just showed you a blank green tile, leaving you to figure out which letter was actually in the correct spot.

I usually count myself lucky if I win at all in Mastermind, but the baseline for the computer is 5 guesses or less. Let’s examine how it wins so consistently!

The Algorithm

In order to solve this problem, we will maintain two separate data structures. The first is a list of all possible answers for the game. This list will be exactly 3,125 answers long since we have five possible colors and five pegs to place these colors in. We’ll call this the Answer List. The second is a massive dictionary that enumerates the score of all possible guess-answer combinations. We’ll call this the Score Dictionary.

Visualization of the Answer List and Score Dictionary data structures

Knuth’s algorithm is all about reducing the number of potential answers as much as possible with each guess. On the first guess, the algorithm is hardcoded to always choose (Red, Red, Green, Green, Blue). If it is simply chose randomly, we can’t guarantee that the algorithm would always terminate in five or fewer guesses. A choice of all red pegs doesn’t give as much information as the guess stated previously. After the initial guess, the algorithm will choose different subsequent guesses depending on the hint feedback given by the Codemaker.

For example, let’s imagine that after its first guess the algorithm receives a hint of (Black, White, Black). This tells us that two pegs are in the correct place, and one is in the wrong place. Additionally, two pegs that are in the answer are missing from the guess. Using this information, we can take our Answer List and remove any answer where:

_Score((Red, Red, Green, Green, Blue), answer) != (Black, White, Black)_

This pseudocode corresponds to lines 11–12 of the Python code below.

After pruning the Answer List of all impossible answers, we can move on to the Score Dictionary. For each guess in the Score Dictionary, we remove the answer and its associated score if the answer is not in the new Answer List. We’re basically shrinking both the Answer List and the Score Dictionary to only have values that are possible given the information we acquired from our last guess.

Now that we’ve shrunk both the Answer List and Score Dictionary, we need to figure out what combination to guess next. Knuth’s algorithm uses worst-case analysis to decide on an optimal combination. For each guess in the Score Dictionary, we look at all possible answers and corresponding scores associated that guess-answer combination.

Then we group those answers by their scores, and count the number of answers in each group. The largest group represents the worst-case scenario for a guess: if we choose a guess and the correct answer is in that largest group, then we will receive the least possible amount of information.

We run this analysis for every possible guess in the Score Dictionary, building up a list of these worst-case numbers. After iterating through every guess, we take the minimum of this list. This gives us the best possible worst-case scenario. In other words, as Donald Knuth wrote, we are “minimizing the maximum number of remaining possibilities”. Strangely, the guess that accomplishes this may be an invalid answer. While the algorithm prioritizes guesses that can actually solve the game, it may occasionally guess combinations that are known to be impossible in order to gain more information.

The guessing algorithm in action (GIF by author)

Using this algorithm, it is guaranteed that the computer will guess the correct answer in six moves or less, meaning it never loses. Not quite as good as Knuth’s five guesses, but we have 1,829 more potential answers to contend with. On average, the computer guesses the correct answer in 4.44 moves.

Optimizations

Although the algorithm I described above is how my implementation solved Mastermind, it can be quite slow. The Score Dictionary in particular takes quite a long time to create since for each of the 3,125 possible guesses it calculates a score against each of the 3,125 possible answers. This produces a dictionary with 3,125 • 3,125 or 9,765,625 entries. This takes about a minute to generate each time I want to play a game. That’s pretty unacceptable. To fix this, I decided to create the Score Dictionary just once and store it as a binary file, loading it afresh each time I wanted to solve a new combination. This took solve time down to about three seconds.

However, I also realized that the algorithm is completely deterministic. For any given secret code, the algorithm will always play the same sequence of guesses. So I decided to run it against all 3,125 possible answers and just save the sequence of guesses to a dictionary. Now for any answer, the computer can just look up the sequence of guesses it needs to make.

This approach is a bit less flexible and doesn’t allow me to change the “difficulty” level of the computer opponent. With the original approach, if I wanted to make a worse computer opponent, I could just give it a heuristic that doesn’t always minimize the maximum number of remaining possibilities. That’s impossible with this approach because the algorithm is just a static lookup table with the optimal set of moves. This tradeoff is more than acceptable to me because solving time has become instantaneous.

Last Thoughts

I wanted to mention that Donald Knuth’s paper was a bit hard to understand at times. To get an idea of what I mean, see below:

Page 4 of "The Computer As Master Mind" by Donald Knuth. This was pretty intimidating at first glance.

Whenever I got too confused by Knuth’s paper, I turned to this excellent talk by Adam Forsyth explaining how to implement the algorithm in terms I could easily understand. This project would have taken me much longer had I not come across this resource. If anything I’ve said about the algorithm is unclear, I would definitely encourage you to check out Adam’s presentation.

Also, if you want to play against this algorithm yourself or just check out the underlying logic, all my code is publicly available on Github. Thanks for reading!