Writing a simple Sudoku solver in F#

To continue learning about F# and to practice programming functionally, I have written a simple Sudoku solver. I use the word simple, because some simple Sudoku problems can be solved only by going over the board and removing entries without having to make “guesses” and backtracking.

This guessing and backtracking usually involves implementing a depth-first search when writing a Sudoku algorithm.

In the interest of time (there’s another project which I have been delaying for months and need to be starting) I have only implemented a solution to the easy type of Sudokus.

The first part is writing code that checks whether a board is correctly solved. This part can be done first and it’s better to do so since we will need it to check whether our Sudoku solver actually works.

The process is pretty easy, in Board.fs I implement the types I need to represent the board (Value) and it’s state (BoardSolved).

Checking the board implies collecting all it’s rows, columns and boxes (which are nine 3×3 regions on the board) and then checking them for duplicates.

To solve a board we go over all rows, columns and boxes and for each item in these collections we remove from the list of possible values those values which have already been selected in another item. I call this “locking” the values.

This operation is applied successively until the board is solved. The most complicated part came because I used a 2D array to represent the board but I needed to convert the rows, columns and boxes values to lists. Converting the lists generated from the boxes was the single most complicated and longest part of the whole program.

I guess it pays to just use lists for everything and skip on using arrays. The language and it’s libraries are pretty much made for working with lists and sequences so when you deviate from that you end up having to write additional code.

Here is the GitHub repo with the whole thing.

To make this solver capable of solving any Sudoku, we would need to change the

| BoardUnlocked -> failwith “Board not yet solved”

line with calling a depth-first search which would select the first unlocked value, try and select the first possible value for this item and then continue recursively trying to solve the board until either it succeeds or it fails. When/if it fails backtrack and continue it’s search using another value.

Genesis Procedural Content Generator

One of the things I have been working on is Genesis, a procedural content generator written in F#. It generates random names and terrain maps.

The project has almost nothing to do with what I had envisioned when I started it.

Here are some sample maps :

 

 

Mountains are in grey, grasslands/plains in pale green, forests in dark green, desert/arid climate in brown and water in blue.

It’s still lacking rivers and lakes and the output isn’t satisfactory. Sadly I think I would need to start over to get realistic terrain generation, which was one of my goal. The main problem is that I used midpoint displacement and noise functions which while good enough for video games don’t generate photo-realistic results alone.

To try and correct that, I devised my own solution using a crude simulation of water erosion rather than look into and implement standard algorithms.

It was a learning experience and a lot of fun but I would have had quicker and better results by copying some existing algorithm.

I don’t know what to do with the project right now as there are other things I want to spend my time on, but I would still like to work on it some more in the future. Maybe take a whole different approach and generate fantasy maps with brushes and icons which are much less detailed.

How to create an F# project under Linux

There isn’t a ton of info about starting out a new F# project under Linux so I’ve decided to document how I do it.

Install Mono

Following the instructions on the official F# site, install Mono and F#:

sudo apt-get install mono-complete fsharp

You can test that F# is installed by typing

fsharpc

to bring up the F# compiler.

Install Visual Studio Code

You could use any editor, I chose Visual Studio Code because it offers superb F# integration when combined with the Ionide plugin.

After installing VS Code press Ctr-Shift-X to open the Extension window and search for “Ionide” and install the following extensions:

  • Ionide-fsharp
  • Ionide-Paket
  • Ionide-FAKE

Start a new project

Using Ctr-P bring up the command window and type:

>f#:new project

Follow the instruction and select a class library project.

This will create the base project scaffolding including some files and folders. Now might be a good time to do a git init.

Paket

To build your project you must use the build.sh script. If you try this now, the command will fail because it won’t be able to find the Paket bootstrapper.

Paket is the package manager that downloads, installs and manages dependencies,  much like NuGet, Cargo or RubyGem on other platforms.

So head on over and download the latest Paket bootstrapper specifically paket.bootstrapper.exe and drop it in the .paket folder that was created by Ionide.

You can now run

./build.sh

And Paket will create a paket.dependencies and paket.lock.

Adding some code to the project

There should be a folder with the project name you specified to Ionide . Let’s add a new file to this project.

You have two choices here:

  1. Add it manually
  2. Use Ionide to automatically add the file to the project

Add it manually

To add the file manually you must create a new file with a .fs extension using the file system or VS Code.

Next you must add your file to the project by editing the .fsproj file.

Then open your .fsproj file and locate the ItemGroup section which includes the Compile Include tags. Similar to this:
<CompileInclude=”genesis2.fs”/>
<NoneInclude=”Script.fsx”/>
Add a new entry:
<CompileInclude=”NewFile.fs”/>
<CompileInclude=”genesis2.fs”/>
<NoneInclude=”Script.fsx”/>
Be careful, the order of the elements is important. If a file A is a dependency for file B, it must come before file B in this listing. In this example NewFile.fs can be a dependency for genesis2.fs but the reverse can’t be true.
The manual technique, even though it is tedious, is useful, particularly to debug your fsproj.

Use Ionide

The alternative is to use Ionide a to add a new file for you.

Create the new .fs file like before, but this time use Ctr-P, type in

>f#

And select, Add current file to project. Voila!

Adding a new dependency to our existing project

Let’s add a new dependency to our project, MathNet.Numerics, a library that provides methods and algorithms for numerical computations.

Again let’s see how to add it both manually and using the Ionide plugin.

Before installing any packages make sure that the .paket/paket.exe file that was previously downloaded by the bootstrapper  is executable. To do so, change the permissions via the command line or the GUI.

Add it manually

First open the paket.dependencies file in your project root and add the following line:

nuget MathNet.Numerics
And then in your project folder (for each of the projects where you want to install the dependencies) locate the paket.references file and add this line:
MathNet.Numerics
Then run:
.paket/paket.exe install
To check that everything is working you can by adding the following in one of your source file:
open MathNet.Numerics.Distributions
And build your project again.

Use Ionide

Open the .fsproj file of the project you want to add the dependency to, type in Ctr-P and then:

> paket: Add NuGet to current project

Debugging the project

Twice I’ve had the project refusing to build because of the paket dependencies after a Mono upgrade. In this case your best bet is to delete the dependency info in the fsproj and add them again.

Algorithm to generate random names in F#

I remade and improved my random name generator algorithm I had done in Ruby several years ago, but this time in F#.

It works by taking a sample file which contains names, the names should be thematically similar, and uses it to create chains of probabilities. That is, when we find the letter A in the sample, what are the possible letters that can follow this A and what probability is there for each of these letter to come up.

This probability chain can have any length bigger than one.

Here are the steps for this algorithm:

  1. Build a probability table from the input file.
  2. Generate name length info from the input file.
  3. Generate a name with the name length and probability table.

Build a probability table

Here’s what a probability table looks like:

{“probabilities”:{” “:{“al”:0.973913,”am”:0.965217,”ar”:0.93913,”at”:0.930435,”au”:0.921739,”ba”:0.904348,”be”:0.886957,”bi”:0.878261,”bo”:0.86087,”bu”:0.852174,”ca”:0.843478,”co”:0.834783,”da”:0.808696,”de”:0.791304,”do”:0.782609,”dr”:0.773913,”el”:0.756522,”eo”:0.747826,”fa”:0.73913,”ga”:0.721739,”gh”:0.713043,”gi”:0.704348,”gr”:0.695652,”gu”:0.669565,”ha”:0.643478,”ho”:0.626087,”is”:0.617391,”je”:0.6,”ju”:0.591304,”ka”:0.582609,”ko”:0.556522,”ku”:0.547826,”la”:0.53913,”li”:0.521739,”lo”:0.513043,”lu”:0.495652,”ma”:0.486957,”me”:0.478261,”mh”:0.46087,”mi”:0.452174,”mo”:0.426087,”no”:0.4,”on”:0.391304,”or”:0.373913,”pa”:0.356522,”ph”:0.330435,”pu”:0.321739,”qa”:0.313043,”qu”:0.304348,”ra”:0.278261,”rh”:0.269565,”ri”:0.26087,”ro”:0.234783,”ru”:0.226087,”sa”:0.191304,”se”:0.182609,”sh”:0.173913,”ta”:0.147826,”th”:0.13913,”to”:0.130435,”tu”:0.121739,”ul”:0.113043,”va”:0.086957,”vo”:0.078261,”wa”:0.069565,”wi”:0.06087,”xa”:0.052174,”xe”:0.043478,”yu”:0.026087,”ze”:0.017391,”zi”:0.008696,”zu”:0.0},”a”:{“ba”:0.970874,”be”:0.951456,”de”:0.941748,”di”:0.932039,”ev”:0.92233,”go”:0.893204,”gu”:0.883495,”hd”:0.873786,”hr”:0.864078,”ie”:0.854369,”ig”:0.84466,”im”:0.834951,”

,”nameLengthInfo”:{“mean”:7.4273504273504276,”standardDeviation”:1.7261995176214626}}

This table can be serialized to prevent recomputing it each time we call the algorithm.

The algorithm works with sub strings of size X where a small X will provide more random results (less close to the original result) but a larger X will provide results more closely aligned with the sample file.

Results more closely aligned with the sample file better reflect the sample but face a higher risk of ending up as a pastiche of 2 existing names or in some cases being one of the sample’s name as is.

Here’s how the sub strings work:

If we have the following name in our sample file:

Gimli

Using a sub string length of 2 would add all these sub strings in our probability table:

G i

i m

m l

l i

While using a sub string length of 3 would add these sub strings:

G im

i ml

m li

And so on as we increase the length of the sub strings.

After we have counted all the possible occurrences of each sub string over the whole file we assign a probability to each one.

For example, if for our whole file we would have the following possible sub strings for G

G im

G lo

G an

Each of these would be assigned a probability of 33.3%.

Generate name length info from the input file

The generate names of a length representative of our input sample we simply count the  length of each name and derive a mean value and standard deviation. Using the mean and standard deviation will then easily allow us to draw a value from the normal distribution of word lengths.

Additionally it’s best to enforce a minimum word length. Even if our sample contains shorter names (2 or 3 letters long), from experience the algorithm doesn’t produce convincing results on these shorter lengths.

This is because it doesn’t differentiate sub strings for long and short names.

Generate a name with the name length and probability table

To generate a name we start by finding our desired name length using our name length info. Then we select our first character, the white space character.

We then generate a number between 0.0 and 1.0 (or 0 and 100) and using a prebuilt dictionary containing the probability table, find the next item.

Code

The code is also available on GitHub in a more readable format.

Sample and Examples

The larger the sample the better. Also the more thematically aligned the sample, the better. What I mean by thematically aligned is if you include the names of all Greek masculine mythological figures, you will get results that resemble the names of the Greek heroes and Gods.

For example:

Thedes
Kratla
Pourseus

On the other hand if you build your samples with names from the Lord of The Rings but include an equal part of Hobbits, Dwarf, Elven and Orcish names you will end up with a mishmash that does not make much sense.

Finally here are some results of the algorithm using the this sample file containing the names of some of the locations in the games Final Fantasy XI and Final Fantasy XIV:

Bastok SanDoria Windurst Jeuno Aragoneu Derfland Elshimo Fauregandi Gustaberg Kolshushu Kuzotz LiTelor Lumoria Movalpolos Norvallen Qufim Ronfaure Sarutabaruta Tavnazian TuLia Valdeaunia Vollbow Zulkheim Arrapago Halvung Oraguille Jeuno Rulude Selbina Mhaura Kazham Norg Rabao Attohwa Garlaige Meriphataud Sauromugue Beadeaux Rolanberry Pashhow Yuhtunga Beaucedine Ranguemont Dangruf Korroloka Gustaberg Palborough Waughroon Zeruhn Bibiki Purgonorgo Buburimu Onzozo Shakhrami Mhaura Tahrongi Altepa Boyahda RoMaeve ZiTah AlTaieu Movalpolos Batallia Davoi Eldieme Jugner Phanauet Delkfutt Bostaunieux Ghelsba Horlais Ranperre Yughott Balga Giddeus Horutoto Toraimarai Lufaise Misareaux Phomiuna Riverne Xarcabard Gusgen Valkurm Ordelle LaTheine Konschtat Arrapago Carteneau Thanalan Coerthas Noscea Matoya MorDhona Gridania Rhotano Uldah Limsa Lominsa Dravanian Ishgard Doma Sastasha Tamtara Halatali Haukke Qarn Aurum Amdapor Pharos Xelphatol Daniffen Aldenard Garlea Eorzea Vanadiel

Note that this sample is very small and not thematically consistent, still here are the results using a sub string length of 2:

Gazormon
Vasamaur
Ltemenolp
Zonaone
Ldausa
Zorvaie
Kugo
Limoruxeab
Raullaiat
Jelphorshi

A sub string length of 3:

Arzergid
Mhaure
Phowindugh
Rhonearlos
Tuto
Saltabaolo
Qangarle

And a sub string length of 5:

Arronfais
Sautotara
Tahimorut
Batahranperr
Movernguegan

I feel that the algorithm could still use some improvements but is still very satisfactory considering the bad quality of the sample file used.