Designing a JavaScript Sudoku Puzzle: An Adventure in Algorithms

As part of my neverending quest towards improving my programming skills and learning new patterns and algorithms, I had the idea of finding myself a well-defined challenge I can conquer from start to finish. After a bit of searching, I found one: programming a Sudoku puzzle game.

good_luck_morgan_freeman


Why thank you, Mr. Freeman, but I don’t need luck, I just need my trusty keyboard, my text editor, and my wit; because how hard can a Sudoko game be… (ahem)

A tl;dr Special Note

This article is meant to detail my adventures in planning, designing, failing and fixing a standalone javascript implementation of a Sudoku game with a solver. In this case, the process itself was the benefit — so that’s what this article will be about. If you don’t care about the process and you’re only here for the final code, you can skip to the bottom of the page. The code is available under GPL license, and you’re more than welcome to use it, improve it, and tweak it yourself. And if your experience is as educational as mine was, share it in the comments!

Understanding Sudoku Puzzles

I would assume that, by now, everyone knows what Sudoku puzzles are (and if you don’t, there’s a ton of articles explaining them online.) But there’s a difference between “knowing what the game is” and knowing it enough to actually design and implement it. So, as a proper scientist, my first step was to do some hands-on research. I went online and looked up the puzzle rules and some tips and tricks on how to solve it repeatedly. I also practiced a bit with a couple of easy puzzles, just to get the feel for it.

The rules

The rules are really straight forward:

  1. The board is comprised of a 9×9 matrix, divided into 9 sub sections
  2. Each square can have a number from 1 to 9
  3. Numbers must be unique per row
  4. Numbers must be unique per column
  5. Numbers must be unique per section

Deciding the Mechanics

The rules might be simple, but the mechanics of the puzzle can vary. Now that I know how the game is played and what is expected, I tried to make a list of relevant questions I should sort out for my version of the game:

  1. How many game boards do I want to be able to play concurrently?
    (One at a time)
  2. Is the validation done on each inserted number, or only at the end?
    (Let the user decide between validation on insertion and validation on demand)
  3. Should the inputs be limited to numbers?
    (No, let the user insert whatever they want, but mark these as invalid)
  4. Should I include a solver?
    (Yes!)

Right. Yes. Well, number 4 turned out to be slightly harder than I thought… but we’ll get to that.

More importantly, however, I started to think what each of those decisions meant to the structure and mechanics of the puzzle itself.

The Pattern

Since I decided on a one-at-a-time game board, the chosen structure is a singleton.

I also decided I will separate the logic of the singleton (user accessed public methods) from the logic of the actual game, by creating an encapsulated game object that will be referenced from inside the singleton. This is useful for a couple of reasons: First, it has a much more organized structure I could reference more logically. Second, having the game logic itself encapsulated in its own object meant I (or anyone else) could take it out of a singleton and into other patterns relatively easily.

Repeated Validation

Validation of a specific cell means checking whether the inserted number exists in the 8 other values in the row, then the 8 other values in the column, then the 8 other values in the section. So, each time the user inserts a number, I will have to go over 8+8+8=24 values to check for duplicates.

Considering the fact I have a 9x9x9 matrix, validating the entire matrix this way would mean having (9^2*24) = 1944 tests…

maybe-not

Yeah, that’s a bit excessive. I wasn’t entirely sure I can get rid of all of those tests, but at the very least I wanted to keep this number in mind, so I can aim towards a more efficient validation method.

Planning a solver

Planning ahead for the option of a solver means I should account for things like choosing a random number from a certain range and having some “bank” of available numbers per cell. This turned out to be a good thing to plan for, as you’ll see.

So, I had some sort of basic plan on how to design my puzzle, it was time to sit down and start implementing it.

proud_to_press_keyboard

Implementation

As I mentioned before, I decided to implement the puzzle with a singleton wrapping a game object. The basic structure looks like this:

I could then start building the GUI and add in the actual game logic.

The GUI

This was one of the easier parts of the implementation. The method produces a 9×9 table, each cell holding an <input> element that’s limited to 1 character. I included two ‘section’ styles: “sudoku-section-one” and “sudoku-section-two” so I could separate the sections visually. Each cell received one of these styles depending on its position in the matrix. More specifically (if you care about the math) each row and column are divided by three to produce a section. Then, using the trusty modulus, each section is given ‘-one’ or ‘-two’ class.

The result was this basic operational game board:

sudoku_base

Input Validation

Each <input> element listened to a ‘keyup’ event which did the following:

  1. If the user chose ‘validate on insertion’:
    • The inserted number is validated per its row, column and section.
    • The number is added to three validation arrays for its row, column and sections.
  2. The inserted number is stored in a matrix cache.

The matrix cache is a reflection of the values in the board. It serves two main purposes: First, it saved multiple calls to the element’s .val() by storing the values, and second, it always held the value before validation – so I could use it to always check what the previous number was at that cell.

The validation arrays were my solution to the multiple unnecessary comparison tests. The idea was to keep dynamically growing arrays that hold the existing values in each row, column and section. When a new number is inserted, I search for duplicates only in the values that are in the array. If the number already exists, it is invalid. If it doesn’t, it’s valid, and is inserted into the validation arrays. If the number replaced another number (so, the user changed some cell from ‘1’ to ‘2’ for example) I used the caching matrix object to recall the old number and remove it from the validation arrays so it is available for insertion again.

So now, instead of always checking 8+8+8 values on each validation, I check only the relevant, existing numbers. The only time I end up with a full 8+8+8 validation is when the entire row, column and section are filled.

Expand the code below if you want to see the full methods.

 

That works.

cpnkirkapproved

It was time to deal with the solver.

Sudoku Solver: Attempt #1

Admittedly, I don’t really play Sudoku very often, but for the purpose of this exercise, I sat to solve a couple of boards. After manually trying to solve a couple of games, I thought I had the bestest idea ever.

My initial idea was simple:

  1. Start with a matrix that serves as a “bank” of numbers. At first, the bank is full; each row, column and section has all the numbers from 1 to 9 available.
  2. Go over empty boxes one by one and pick out a random digit from the available numbers in the bank.
  3. Use that number to fill in the box.
  4. Take that number out of the respective row/column/segment bank.
  5. Rinse, repeat.

It sounded like an awesome idea. I even wrote a method that computes an intersection of a group of array elements, so all I had to do is ask for the intersection of the current cell’s row, column and segment arrays and receive a new array with available numbers. I then pick one at random and carry on.

It sounded like such a simple, elegant idea, such a great solution, such a wonderful plan.

So much so, in fact, that I smugly sat in front of my browser, and gave it a go, ready and willing to pat myself in the back:

Solver attempt #1

Solver attempt #1

Yes! I did it! It works, it– wait a minute.

dean-winchester-supernatural

What is that hole in there? And another, and another. Holes! My solutions are peppered with holes and missing numbers. That’s no solution!

That was rather anti climactic, and I could have sworn I heard the air bursting out of the huge bubble that my ego previously occupied. It was time to sit and examine my solver strategy more closely.

The Problem of the Missing Numbers

The issue, it turns out, is that solving puzzles is not just a matter of applying “what fits here”, but rather applying “what fits the entire board”. That makes things a bit more complicated.

Let’s take a second look at how my faulty solver worked (or, well, failed.) I went over each empty box and took the intersection of the available numbers in the row, in the column, and in the section, then plugged one of those available numbers into the input, taking it out of the banks, and carried on. At some point along the way, however, between checking for the remaining available numbers and taking numbers out of the array bank, I was left with cells that had no available numbers at all.

Take a look at the middle segment in the third row, for example. When the solver reached the third line (right after the number 5) it took the intersection of the available numbers in the row, column and section:

  • Available numbers in the row: 8, 9
  • Available numbers in the column: 2, 4, 6, 8
  • Available numbers in the section: 1, 4

Intersection: NONE!

computertoss

Alright, alright, don’t panic. These things happen. It’s not really a failure, it’s just a slight, tiny, hole-filled setback. It’s time to get back to the drawing board.

Sudoku Solver: Attempt #2

Undiscouraged, I decided to surrender myself to the endless pit of educational bliss that is Wikipedia. I went over articles about Sudoku-solving algorithms, about exact-cover problems, about recursion and backtracking and brute-force.

My initial goal was to plan, design and implement a Sudoku game. I ended up learning about new algorithms and new types of problems and solutions I had no idea existed. Win win.

So it turns out I wasn’t completely off track with my first attempt, I was just missing an extension to the idea. I was missing backtracking.

Note: There are actually a couple of ways to write a puzzle-solving algorithm, as I mentioned above (brute force, exact-cover, etc) but I settled on the algorithm that I felt most comfortable with theoretically.

The Backtracking Algorithm

The backtracking algorithm is a pretty cool way of solving puzzles by trial and error. The algorithm works by plugging in a legal possible number into empty squares in sequence, very much like my first attempt. The main difference, however, is that when the algorithm gets “stuck” with a cell that has no more legal numbers, it backtracks to the previous value, replaces it with a different possible number, and tries forward again. If no more legal numbers were found, it backtracks again, tries yet another legal number, and so on, recursively.

Not only does this make sense in theory, it also makes sense in practice. Check out this pasuedocode (taken from this post on StackOverflow)

It’s a pretty cool, rather elegant algorithm. As the posters on StackOverflow state, the efficiency of the implementation depends mostly on how the system finds a new empty square, and on how it finds the legal values.

Finding the Closest Empty Square

This one’s pretty straight forward. Starting from the current square, I walk the matrix to find the next value. Nothing too special see here.

Finding Legal Numbers per Cell

That one was a bit more elaborate, but it’s also where my first attempt helped. Starting with a ‘bank’ of numbers from 1 to 9, I go over the row, the column and the segment, taking out the numbers that are already in use. The elements that are left are ones that I can use in the given cell.

Testing

This time, however, I was curious to see how much time the solving method takes. I added another configuration option ‘show_solver_timer’ that, if enabled, prints the elapsed solving time to the console.

Satisfied, I tried my solver again on an empty board:

Elapsed time: 41ms

Elapsed time: 41ms

Okay.. okay, not too bad, not too bad at all.

But notice the first row — it goes by the sequence 1-9, and the sequence continues more or less in broken patterns throughout the board. This is because my ‘available numbers’ array is ordered, and the backtracking algorithm takes each option in order itself. This results in the ordered sequence, and also in the fact that the same exact solution will appear for each scenario.

That sounded like it should be improved, so I decided to add shuffling to my ‘available numbers’ array, and try again on an empty board:

random_solver

Elapsed time: 19ms

That’s more like it.

It’s also faster. Much faster.

… Wait a minute. Uh.. Why is it much faster?

huhsideways

The Counter-intuitive Speed Factor

This baffled me.

Intuitively, I expected the shuffled, randomized solver to take more time than the non-shuffled one, simply because I added yet another array loop each time a value is inserted. It should, theoretically, take longer. And yet, it’s faster.

After careful thought, I suspected this has to do with the recursion and backtracking. When I shuffle the numbers, the inputted numbers are distributed a lot better than if they go in order. I suspected that this distribution makes it easier for the algorithm to solve with fewer recursions and backtracks, which makes things faster in the bigger scheme of things.

It was time to test my hypothesis. I added yet another configuration option to the game: ‘show_recursion_counter’ that, if set to true, counts and displays the amount of recursions and backtracks that the solver performed. With this in my code, I began my tests. All solving routines were done on an empty board:

Non randomized solver:

  • Recursions: 392 (always the same)
  • Backtracks: 310 (always the same)

Randomized/Shuffled solver:

  • Recursions: 92-162 (range of 10 attempts)
  • Backtracks: 10-80 (range of 10 attempts)

Woah! What a difference!

drwhodance

That makes perfect sense, though, doesn’t it? The better distribution lowers the number of recursions and backtracks the solver performs. By a lot. Yay!

Summary

I don’t think I can stress enough just how AWESOME this experience was. Seriously. I mean, really:

lizlemon_highfive_myself

Let me bask in the glory that is my success. But it’s not just the success I’m basking in. To be perfectly honest, I enjoyed the challenge — and the learning itself — much more than the final working puzzle.

So, to summarize, here are the lessons I’m going to take with me.

  • Planning is important.
    Before I even started typing a single line of code, I already had a tentative list of what I should expect to deal with. From having to cache values to constructing valid number “banks”, keeping these in mind made the difference. I could concentrate on developing and improving the working code instead of rewriting the whole thing again and again.
  • Time efficiency might be counter intuitive.
    Something may sound less efficient, but under closer scrutiny, it fixes a bunch of issues that makes the process faster.
  • Don’t assume. Check.
    This relates to the counter-intuitiveness of things. Checking, testing and validating your hunches (or checking why your hunches were wrong) is the best way to gain experience and learn.
  • Algorithms matter!
    Really. They do. A lot.
    While it was a lot of fun trying to devise my own algorithm for solving the game, learning about existing algorithms helps me plan future ones on my own. It’s just like my days in Classical Mechanics 101 — learning the existing equations helps you solve problems. Learning why they work and how they were devised helps in formulating your own strategy for modelling a new system.

So, you! yes, you! Go out there, find yourself a challenge, and attack it head-on. And then come back and share your thoughts and lessons in the comments, to let me know this animated blog post was useful.

Your brain on algorithms.

Your brain on algorithms.

** Many Thanks to John Christopher Woodgate for some really cool questions that made me improve the code!

The Full Code

The full and documented code is available here for your pleasure.

It’s shared under GPL licence. Feel free to use it as you like, but please keep the attribute. Also, if you have more ideas on how to make it even more algorithm-awesome, go for it!

Resources

Animated gifs on this page were taken from http://reactiongifs.me/http://www.reactiongifs.com/ and http://www.reactiongifs.us/

Tags: ,