In the era of modern tabletop games such as Apples to Apples, Dixit, Sushi Go, Hive, Zombie Dice, and Tokaido (all great games), the classics are often left behind. While many of these games deserve to be forgotten (Monopoly, Jenga, Risk, Life, and Snakes and Ladders; just to name a few), the venerable Boggle still reigns as one of the few great word games. Boggle has a lot to offer: it is a fast paced game that scales well to many players, everyone plays at the same time, winning requires skill (pattern recognition and a decent vocabulary) and strategy (trying to find words that other players aren't likely to spot) rather than luck, and it has a high replay value (every game is different). It is also easy to level the playing field for younger players, e.g. by counting their words even if found by other players or reducing the value of longer words.
Boggle is interesting from a programmer's perspective as well. While crafting an algorithm to efficiently find all of the words in a Boggle game is pretty straight forward, answering questions such as:
Boggle is interesting from a programmer's perspective as well. While crafting an algorithm to efficiently find all of the words in a Boggle game is pretty straight forward, answering questions such as:
- What is the average number of words/points in a game and how does this differ between different sets of dice?
- What percentage of total points do 3-letter words constitute? 4-letter words? Etc.
- How can we determine whether a given arrangement constitutes a valid board for a particular game?
- How many words in a given dictionary cannot be spelled using any possible arrangement of the provided dice?
- What is the highest scoring Boggle board?
About Boggle
The Boggle game consists of 16 6-sided
dice, each side containing a letter. The dice are shuffled and
arranged onto a 4x4 grid where players are given 3 minutes to find as
many words as possible by stringing together adjacent letters from
the top face of each die. Words may be 3 letters or longer and a die
cannot be used more than once in the formation of a word. Scoring is
based on the length of the word. Players only score points for words
that nobody else finds, encouraging the scouting out of less familiar
words. Dictionary words are allowed, proper nouns and contractions
are not.
Boggle was originally launched in 1973
and there have been over a dozen official versions released over the
last 40 years. Changes include different letters printed on the dice
and different board sizes including a 5x5 version and, more recently,
a 6x6 version (Super Big Boggle). There are now dozens of popular variations available
in digital form including Zynga's “Scramble with Friends” and MAG
Interactive's “Ruzzle”, both of which incorporate letter values
and letter and word multipliers into the scoring. Sven Studios'
“Word Hero” is another popular variation that uses letter values
and a score multiplier based on word length.
All of these games share the same basic characteristics: a square grid of letters, usually populated from (virtual) dice where the goal is to obtain the highest score by finding as many words as possible. WGS can be used to solve and answer questions about any of these games, many of which are pre-configured in the stock distribution of WGS. While the focus here will be on two versions of 4x4 Boggle, the same principles apply to similar games.
All of these games share the same basic characteristics: a square grid of letters, usually populated from (virtual) dice where the goal is to obtain the highest score by finding as many words as possible. WGS can be used to solve and answer questions about any of these games, many of which are pre-configured in the stock distribution of WGS. While the focus here will be on two versions of 4x4 Boggle, the same principles apply to similar games.
Finding
the Words
The first order of business is
constructing a list of valid words on a given board for analysis. This is
not a particularly difficult problem and there are two common
approaches: 1) iterate through a dictionary trying to find a spelling
for each word on the board and 2) walk the board building up letters
and check each combination against a dictionary. I went with the
second approach. The basic idea is to build a trie
from the dictionary at the beginning and then walk the game grid,
consulting the trie at each step to determine if the current path
could potentially lead to a valid word (in other words, is the
current set of letters a prefix of a word in the dictionary). If a
given path consists of letters that do not form the prefix of any
word, that path is not explored further (it is pruned). This
path-pruning is essential for the algorithm to work efficiently as
the vast majority of paths do not leads to valid word prefixes.
As words are
found, the current path is processed by a scoring function which
produces a solution object which is added to the solution list. It
is necessary to keep the path for scoring purposes (some games use
letter-multipliers so two instances of the same letter may need to be
treated differently) as well as certain analysis purposes (requesting
the number of words that can be formed using the die at a specific
position). When the solution list is built, the provided command
determines how it is used: the solve command will display the
solutions using the provided format and the analyze command
will output the requested board statistics. It is possible to
analyze about 2000 boards per second on a modern CPU core and the
work is very easily distributed across multiple cores.
Generating
and Analyzing the Boards
With a mechanism to find words, scoring
and analyzing the words found is straight-forward, the same
configuration file that describes the game dice and board geometry
also describes the scoring rules. The WGS program has a create
command which uses a Fisher-Yates
shuffle to generate a random board based on the dice configured
for a particular game. The analyze command solves boards and
provides the specified information for each board. Used together,
the create and analyze operations make it possible to easily generate
and analyze a large number of random boards.
The 4x4 versions of Boggle manufactured
prior to 1992 all use one set of letters printed on the dice, the
versions made in 1992 and since use another. The configuration that
comes bundled with the WGS program contains both of these, referred
to as “Boggle (Old)” and “Boggle (New)”. For this analysis,
one billion random boards were generated for each die configuration
using a command that looks something like this (the actual work was
parallelized but the command is equivalent to the one used):
wgs
wgs.json create "Boggle (Old)" 10000000000 | wgs wgs.json
analyze "Boggle (Old)" "%B %C %S %3C %4C %5C %6C %7C
%8+C\n" dump-words >board_data.txt 2>word_data.txt
The first part of the command pipeline
generates a billion random boards and the second part runs them
through the solver which will output, for every board, a line
containing the information requested in the format string: the board
letters, the number of unique words found, the total score of those
words, and the number of 3-letter words, 4-letter words, 5-letter
words, 6-letter words, 7-letter words, and 8+-letter words. A list
of all words found along with a count of each word will also be
printed to stderr via the dump-words option. The board_data.txt file
looks something like this:
TLEOWSAITEZPNSYI
191 285 53 70 48 17 3 0
TMNTWAHEXFVLSGTE
101 132 47 37 13 2 1 1
GLLASUERSASEVTIC
273 429 68 101 68 28 8 0
VOONTLNIHJSGYHAD
108 158 41 40 18 5 3 1
ASMREHNAJTTIYLYN
133 193 45 43 32 12 1 0
...
And the word_data.txt looks something
like this:
AAH
39833743
AAHED
3050316
AAHING
162183
AAHS
11656084
AAL
35165324
…
This data is then processed by various
custom aggregators to produce statistics and generate graphic
representations of the data.
Note: The notation used to represent an individual boggle board is a string of 16 letters which are meant to be populated in a 4x4 board left-to-right, top-to-bottom. For example NNICSEDRMAREBOTS represents the board:
Below are the two different die configurations analyzed, each set of 6 letters represents the 6 sides of one die.
Boggle Old Dice
AACIOT, AHMORS, EGKLUY, ABILTY,
ACDEMP, EGINTV, GILRUW, ELPSTU, DENOSW, ACELRS, ABJMOQ, EEFHIY,
EHINPS, DKNOTU, ADENVZ, BIFORX
Boggle New Dice
AAEEGN, ELRTTY, AOOTTW, ABBJOO,
EHRTVW, CIMOTU, DISTTY, EIOSST, DELRVY, ACHOPS, HIMNQU, EEINSU,
EEGHNW, AFFKPS, HLNNRZ, DEILRX
The SOWPODS dictionary was used to
score the randomly generated boards.
Results
Boggle Old | Boggle New | |
---|---|---|
Avg Words | 122.99 | 133.62 |
Std Dev | 61.51 | 66.58 |
Avg Score | 171.79 | 193.88 |
Std Dev | 108.7 | 123.8 |
A Boggle New board averages about 133 words for a total of about 193 points, a modest increase from older versions of the game. The graphs below show the word count and point total distributions of both versions of the game.
Word Length | Frequency | Percent of words | Percent of Points |
3-letter words | 45.59 | 37.07% | 26.54% |
4-letter words | 45.54 | 37.03% | 26.51% |
5-letter words | 21.91 | 17.82% | 25.51% |
6-letter words | 7.652 | 6.222% | 13.36% |
7-letter words | 1.893 | 1.539% | 5.51% |
8-letter words | 0.349 | 0.2838% | 2.235% |
9-letter words | 0.04709 | 0.03829% | 0.3015% |
10-letter words | 0.004769 | 0.003878% | 0.03054% |
11-letter words | 0.0003689 | 0.0002999% | 0.002362% |
12-letter words | 2.187e-05 | 1.778e-05% | 0.00014% |
13-letter words | 9.61e-07 | 7.814e-07% | 6.154e-06% |
14-letter words | 2.7e-08 | 2.195e-08% | 1.729e-07% |
Word Length | Frequency | Percent of words | Percent of Points |
3-letter words | 46.56 | 34.84% | 24.01% |
4-letter words | 49.02 | 36.68% | 25.28% |
5-letter words | 25.47 | 19.06% | 26.28% |
6-letter words | 9.495 | 7.106% | 14.69% |
7-letter words | 2.502 | 1.872% | 6.452% |
8-letter words | 0.4965 | 0.3716% | 2.817% |
9-letter words | 0.0734 | 0.05493% | 0.4165% |
10-letter words | 0.007964 | 0.00596% | 0.04519% |
11-letter words | 0.0006584 | 0.0004928% | 0.003736% |
12-letter words | 4.271e-05 | 3.196e-05% | 0.0002423% |
13-letter words | 2.049e-06 | 1.533e-06% | 1.163e-05% |
14-letter words | 7.7e-08 | 5.762e-08% | 4.369e-07% |
15-letter words | 2e-09 | 1.497e-09% | 1.135e-08% |
There is not a major difference in the word length stats between the two versions. 3-letter and 4-letter words are each worth 1 point in Boggle and together they comprise about 70% of the words and 50% of the points on an average board. 5-letter and 6-letter words, worth 2 and 3 points respectively, account for about 25% of the words and 40% of the points. A little less than half of all boards contain an 8-letter word, the chance of a longer word occurring decreases by about a factor or 10 for each letter beyond 8.
Best and Worst Boards
Boards with No Words
146,909 Boggle Old boards (about one
out of every 6800 boards) and 134,641 (about one out of 7400 boards)
Boggle New boards had no words. An example of such a board is: WKTMSDTCVMRNDTFQ
Word Dense Boards
The Boggle (Old) board with the highest
number of words found (842) was: ALESSTIDTERANAME and is worth 2126 points. Out of a billion Boggle (Old) boards,
only 15 had 800 words or more.
The Boggle (New) board with the highest
number of words (906) was: TLGNAEIASTRSDEHB which is worth 2566 points. This was the only Boggle (New) board to
have 900 or more words, 31 boards had 800 or more words.
Highest-scoring Boards
The Boggle (Old) board with the highest
number of points (2447) was: DSLRETAIPRNTUDES It was the only
board to break the 2400 point barrier and only one of 23 boards
containing 2000 points or more. The board contains 790 words
including 72 words with 8 or more letters.
The Boggle (New)
board with the highest number of points was (2567) was: TSLNEIAENTRTBESO
Board
Validation
By “board validation” I mean
determining whether it is possible to create a given board grid
using a given set of dice. For example, Boggle only has one 'Z' so
any board containing more than one Z is not valid. Similarly, any
board with two 'B's is not a valid Boggle (New) board, the dice for
recent versions of Boggle contain two 'B's but they are both on the
same die so it is not possible to have them both appear on the same
board.
XLRV
VENT
SSFC
CUMB
Is this a valid Boggle
board?
Board validation is useful for
verifying user input (the WGS GUI does this for manually entered
boards). Board validation allows us to answer questions like “What
percent of Boggle (Old) boards are also valid Boggle (New) boards?”
and vice versa as well as the calculate the chance that a random
string of letters constitutes a valid Boggle board, the latter which will yield an estimate of the number of distinct Boggle boards. Finally,
analyzing this problem sets the foundation for the next section.
This kind of board validation is a
classic bipartite matching problem in which we seek to find the
maximum matching between a set of dice and a given board. If the
maximum matching consists of the entire provided board, the board is
valid, otherwise it is not. Note that it is possible to validate a
partial board, that is ZZ is in invalid partial board and AB is a
valid partial board. Note also that there may be multiple ways to
construct the board using the dice, some algorithms can provide one
or all solutions, others will simply tell us if the board is valid.
There are several ways to solve the
maximum bipartite matching problem including the venerable augmenting
paths algorithm and the Hopcroft-Karp
algorithm. The problem can also be reduced to a max-flow
problem and solved with an algorithm such as Ford-Fulkerson
which is how WGS does it.
The following command will create a
million random Boggle New boards and validate them against the Boggle
Old dice. The stats option will cause validator stats to be
printed at the end which will provide the information we need.
wgs
wgs.json create "Boggle (New)" 1000000 | wgs wgs.json
check-board "Boggle (Old)" stats >/dev/null
The
same command can be used to perform the opposite by swapping “New”
and “Old”.
About 90.4% of valid Boggle (New)
boards are also valid Boggle (Old) boards.
Only about 51.5% of valid Boggle (Old)
boards are also valid Boggle (New) boards.
8.47% of random strings of 16 letters
are valid Boggle (New) boards, 26.6% are valid Boggle (Old) boards.
Below is a breakdown of the chance that a random n-letter string can
be formed using Boggle dice.
Random String Length | Boggle (Old) | Boggle (New) |
1 | 100% | 100% |
2 | 99.2% | 98.3% |
3 | 97.4% | 95.2% |
4 | 94.9% | 90.8% |
5 | 91.7% | 85.3% |
6 | 87.8% | 78.8% |
7 | 83.4% | 71.7% |
8 | 78.6% | 64.3% |
9 | 73.3% | 56.6% |
10 | 67.8% | 48.7% |
11 | 61.9% | 41.1% |
12 | 56.0% | 33.7% |
13 | 49.8% | 26.9% |
14 | 43.4% | 20.6% |
15 | 36.5% | 14.5% |
16 | 26.6% | 8.5% |
Since every English letter is
represented at least once on both Boggle dice variations, 100% of
1-letter strings are valid partial boards. Most of the 2-letter combinations are
valid for both versions. JJ, QQ, XX, and ZZ are not valid for either version as only one of each of these letters exists on the dice. JQ is not valid for the older version as J and Q appear on the same die. In the newer version, BB, FF, KK, BJ, FK are not valid; the two Bs as well as the J exist on the same die and the two Fs exist on the die that contains the only K.
Distinct Boggle Boards
The
results of board validity for random 16-letter strings can be used to
obtain an approximation of the number of distinct Boggle boards. The
number of ways to permute 16 letters is 26^16, multiplying this by
.266 or .085 yields an approximation for the number of distinct
boggle boards using the old and new die configurations, respectively.
Thus there are roughly 1.16e22 Boggle Old boards and about 3.69e21
Boggle New boards (not accounting for rotations, reflections, and
other transformations).
Word
Checking
A particularly interesting question, at
least from an algorithmic standpoint, is “What
words cannot be spelled using any
possible valid board?”. We already know from above that the words
like BABY cannot be spelled because both of the letter 'B's in a
newer Boggle game are on the same die. The percent of words that
can be spelled with a particular die set is one factor to consider in
determining the efficacy of a configuration, the other being the
average words per board. A reasonable goal when designing the
letters for each die would be to strike a balance between
relatively high-scoring boards and being able to form a substantial
set of words. A game which provides an average of 10 words per board
won't be very satisfying but neither will a game that provided more
words with lots of repetition across different boards due to few overall unique word
possibilities.
One the surface,
this appears to be the same problem as partial board validation and
in many cases it is. There is a subtle difference though that is exposed when we try to use this solution
with dice that contain multiple letters on a face. For example, all
4x4 Boggle versions contain a “Qu” face and some of the 5x5
Boggle versions contain a die that has other two-letter combinations
such as “In”, “Er”, and “Th”. When multiple letters
appear, all of the letters must be used, in the provided order, to
spell a word with that die. For example, there is a “Qu” face
but no “Q” and as such words like BURQA cannot be spelled using
the traditional rules.
The problem with
multiple letters is that they turn the spell-able question into one
that can no longer be solved using bipartite matching because the
target word no longer has a single spelling but instead has multiple
spellings. For example, consider the word THEREIN when, in addition
to single letters, the combinations Th, Er, and In are available.
There are now 8 distinct ways to spell this word:
T H E R E I N
T H E R E In
T H Er E I N
T H Er E In
Th E R E I N
Th Er E In
Th Er E I N
Th E R E In
The problem is no
longer one of a single bipartite matching but a group of such
problems, the size of which can quickly explode for certain multi-letter
combinations.
With the addition
of multiple letters per die, the spell-able question turns into a set
cover constraint satisfaction problem. This is an NP-complete
problem which has no known general method of solving in less
than super-polynomial time. While it may be tempting to create a
special case for Qu as this is the only multi-letter combination used
by the 4x4 versions of Boggle, this doesn't allow for support of
customizable die configuration with arbitrary multiple letters.
Luckily there are
several optimizations that can be performed that allow this to work
well in practice. We start out by using Ford-Fulkerson, as before,
and fall back to solving the set cover problem (WGS uses Knuth's
Algorithm
X to do this) only when no solution is found and at least
one of the dice contains a multi-letter face which is a substring of
the word we are trying to form. We also quit as soon as a solution
is found and can perform checks such as ensuring that the length of
the target word is not longer than the longest letter that can be
formed using the provided dice. While it is possible to manufacture
circumstances in which the analysis of a word will take a substantial
amount of time, this doesn't occur when checking any of the
4x4, 5x5, or 6x6 Boggles against any of the popular dictionaries.
We can determine whether a given word can be spelled with a given dice configuration using the WGS's check-word command which will output the provided word with a leading minus if it cannot be spelled and a leading plus sign if it can. A whole dictionary can be checked for words that cannot be spelled like so:
wgs wgs.json check-word "Boggle (Old)" < /usr/share/dict/words|grep ^-
wgs wgs.json check-word "Boggle (Old)" < /usr/share/dict/words|grep ^-
The results of checking several Boggle versions against three popular dictionaries is provided in the below table.
TWL06 178,691 words | SOWPODS 267,651 words | ENABLE1 172,820 words | |
Boggle (Old) | 913 | 1563 | 3345 |
Boggle (New) | 9789 | 15299 | 12004 |
Big Boggle (1979) | 8529 | 13658 | 9078 |
Boggle Master | 9055 | 13623 | 8510 |
Big Boggle (2011) | 9135 | 13848 | 8715 |
Super Big Boggle | 340 | 539 | 297 |
Some observations about the data include:
Perhaps the goal was actually to prevent certain words from
being spelled. For example, in the newer configuration, both of the
“F”s and the only “K” now reside on one die making it
impossible to spell a certain 4-letter word (which appears on about 1 out of every 1000 Boggle Old boards). In fact, 3 of the
“Seven
Dirty Words” cannot be spelled on in the newer configuration,
along with numerous variations, whereas they could all be spelled in
the original configuration. The remaining 4 "Dirty Words" could not be eliminated
without seriously affecting the ability to spell a substantial number
of common words, significantly hampering game play. Could it be that the increase in unspell-able words is the fallout of an attempt at censorship? It seems plausible, especially considering that Boggle was marketed as a family game.
- SOWPODS and TWL06 contain no words with a length of more than 15 letters.
- ENABLE1 contains 4274 words of 16 letters or longer.
- Almost all of the 15-letter words can be spelled using the Boggle (Old) dice.
- There are 16 17-letter words that can be spelled using the Boggle (Old) dice, they are: ACQUAINTANCESHIPS, ACQUISITIVENESSES, CONSEQUENTIALNESS, COUNTERQUESTIONED, DISEQUILIBRATIONS, DISQUALIFICATIONS, INCONSEQUENTIALLY, INQUISITIVENESSES, PICTURESQUENESSES, PREQUALIFICATIONS, QUADRICENTENNIALS, QUARRELSOMENESSES, QUATERCENTENARIES, QUATTUORDECILLION, SESQUICENTENARIES, and SESQUICENTENNIALS.
- Of those 17-letter words, all except the following can be spelled using the 1992 dice: ACQUISITIVENESSES, DISQUALIFICATIONS, INQUISITIVENESSES, and PICTURESQUENESSES.
- There are 8 24-letter words that can be spelled using the Big Boggle (1979) dice, they are: DICHLORODIFLUOROMETHANES, ELECTROCARDIOGRAPHICALLY, ELECTROENCEPHALOGRAPHERS, ELECTROENCEPHALOGRAPHIES, INTERCOMPREHENSIBILITIES, MICROSPECTROPHOTOMETRIES, OVERINTELLECTUALIZATIONS, PHOSPHATIDYLETHANOLAMINE.
- The 25-letter word PHOSPHATIDYLETHANOLAMINES can be spelled using the Boggle Master dice.
- The longest words that can be spelled using Super Big Boggle are the 28-letter ETHYLENEDIAMINETETRAACETATES and the 27-letter words ELECTROENCEPHALOGRAPHICALLY and ETHYLENEDIAMINETETRAACETATE.
Notice the significant difference between the number of words that cannot be spelled between the Boggle Old and New versions with the TWL06 and SOWPODS dictionaries: There are about 10 times as many words that cannot be spelled using the newer configuration. We have already seen that the newer configuration results in a minor increase in the average board score but while the original configuration managed to kept the unspell-able words quite low, the newer configuration makes it impossible to spell a significant number of words. Why might this be?
Generating
High Scoring Boards
The last item on
the agenda is generating boards that have artificially high (or low) point values in an attempt to answer the question “What is
the best Boggle board?”. Such a capability is also useful for game balance purposes. For
example, the Scramble with Friends game uses a set of 16 dice to
generate their boards but the boards generated consistently have a
much higher scoring potential than the expected average of simple
random board generation. It is obvious that some mechanism for "enhancing" the randomly generated boards is employed to optimize the experience for the player.
There are several
approaches to generating such enhanced boards, most starting with a randomly generated board followed by
various operations performed in an attempt to improve the board
score. The operations that can be performed on an existing board
include changing the value of a particular die and swapping the
positions of two die. Each such change can result in either an
overall increase, a decrease, or no change in the board score. The
success of such an approach depends on the algorithm used to
determine when to move forward with a modified board vs. when to fall
back to a previous board and try again, and what type of
manipulations to perform.
The simplest
approach would be to randomly change a letter or swap dice, replacing
the current board with the new board whenever the score of the new
board exceeds, or possibly matches, the existing board. This is a
good start and will certainly find much higher scoring boards on
average than randomly rolled boards. The problem is that a
significant number of high scoring boards come from making a set of
changes that include manipulations that temporarily reduce the board
score, a small reduction in score from one manipulation may open the
door to future manipulations that significantly increases the overall
board score. Such cases would never be discovered by an algorithm
that insists on a constantly increasing board score and such cases
turn out to be significant in practice.
To address the
local minima problem, the algorithm should allow changes that reduce
the overall board score in certain situations, the details of when to
allow this are the core of such an approach. The technique employed
by WGS is simulated
annealing where a decrease in board score has the potential to be
allowed based on the impact of the board score (energy) and the
stability of the board (temperature). This allows disruptive changes
during initial board modifications but allows the board to stabilize
by only allowing successively smaller degradations throughout the
process. The current implementation is tweaked based on some testing
but is not particularly sophisticated and certainly has room for
further improvements. For example, the decision regarding which die
to manipulate is random, it may be more fruitful to consider
information such as how many words each die contributes to and
changing one that participates in fewer words. The program has
access to this information but doesn't currently use it.
Another approach
that has been used is the application of genetic algorithms to create
child boards that contain a combination of two high-scoring parent
boards in an attempt to create a board better than either parent,
repeating this process until the best board from that lineage is
found. A successful application of this approach may certainly be
possible but the available literature of such attempts didn't prove
particularly compelling, nor did it seem to produce superior results
so I didn't explore it in great detail.
The WGS program
provides optional parameters to the create command allowing
the user to specify word and/or score limits on the generated board.
The program will try to satisfy both limits when generating the
boards, employing the techniques described above to do so, ultimately
delivering the best effort if the requirements couldn't be met in a
reasonable number of manipulations.
100,000 requests
for optimized boards (using both Boggle versions) yielded the following best boards which are all
valid for both Boggle die configurations. The majority of these were
discovered in the first 10,000 iterations. The command used to generate the boards was:
wgs
wgs.json create “Boggle (Old)” 100000 9999 9999
Highest Scoring
Boggle Boards
SEGSRNTREIAESLPS
1435 4644
BESHLATEPIRSSENG
1418 4624
SPLBEIAENRTSGSET
1373 4623
SLPSEIAERNTRDESA
1452 4603
SRESGTAPENILDRES
1424 4532
Most Word Dense
Boards
SETSPIRDLANESETS
1482 4497
SDESTRNTEIAESLPS
1474 4390
SPLSEAIERTNROSED
1468 4507
SPLSEAIERNTRDESA
1463 4431
SRESOTAPENILSDES
1458 4192
A number of
transformations of the above boards were also found but are not
listed (they have the same word list and point values). These
boards, along with most of the optimized boards, far exceed the very
best boards found in the one billion random sample, a good sign that
we are in the territory of the best possible boards. But are these
the best boards? That's hard to answer but we can compare with the results of others who have tackled this problem:
A simulated annealing approach yielded SERSPATGLINESERS which is a transformation of the best board we found with 4644 points.
Another simulated annealing approach that yielded PERSLATGSINETERS as the best board. This board clocks in at 4512 points using SOWPODS.
While this doesn't prove we have found the best possible board, it does suggest that we are at least in the ballpark and that if there is a better board, simulated annealing may not be the best way to find it.
Summary of Results
- There are two die configurations, one used prior to 1992
and once used since. The new configuration results in about a 10%
increase in the number of words and the score of an average boggle
board but reduces to total number of words that can be spelled,
including a number of profane words.
- The average pre-1992 Boggle board contains 123 words and 172 points, the average post-1992 Boggle board contains 134 words and 194 points.
- There are about 3.69e21 distinct Boggle boards.
- The chances of rolling a Boggle board with no words is
about 1 in 7400.
- About half of randomly rolled Boggle boards contain an
8-letter word. The chance of a 10-letter word showing up less than
1 in 100, the chance of a 12-letter word is less than 1 in 10,000.
- 3-letter and 4-letter words, together, comprise about 70%
of the words on an average boggle board and about 50% of the points.
There are an average of about 25 5-letter words on a Boggle board
which account for about 25% of the points.
- The best known Boggle Board (using the SOWPODS dictionary) is
SEGSRNTREIAESLPS
and contains 1435 words and 4644 points. The most
word-dense board found is SETSPIRDLANESETSwhich
contains 1482 words and 4497 points.
This comment has been removed by the author.
ReplyDeleteIf no char can be shared between 2 words then the maximum number of words will be so that the sum of their chars is equal to the number of spaces on the board.
DeleteHi, I agree with you. Really this blog is very informative.
ReplyDeleteVery Interesting and wonderfull information keep sharing
ReplyDeletemobdro
Youre so cool! I dont suppose Ive read anything like this before. So nice to find somebody with some original thoughts on this subject. realy thank you for starting this up. this website is something that is needed on the web, someone with a little originality. useful job for bringing something new to the internet! ball games
ReplyDeleteCould anyone come up with an efficient way to generate a 5x5 (Big) Boggle board which contains specific words? I'm looking to prove that a board that contains the words for the numbers one through eighteen is impossible. I can get one through seventeen, one through eighteen except fifteen, one through eighteen except five, but not one through eighteen.
ReplyDeleteAny takers?
Wow, this is awesome! I am looking to find the highest scoring Boggle board that includes a 16-letter word, however I am having a hard time finding a source for the words since SOWPODS only goes up to 15-letter words.
ReplyDeleteI was wondering about how to find every configuration of any 16-letter word on the classic 4x4 Boggle board in which the 16-letter word could be played legally. I could do this using 1, 2, 3, 4... etc. so that at the end I would just have to plug in letters and run it through an automatic Boggle solver. But my problem is finding those configurations. :/
*not a bot*
ReplyDeleteThank you Rob, for this very informative post. I ended up using it in the development on my Boggle game online - the task was to find a way to present shuffles of the letters that resulted in the most words, and your graphs above helped us to find a solution.
We're writing up a blog post outlining the development of the game, and will be linking to this post.
Thanks again.
Matt
founder https://wordshake.com/boggle
Is there a program, or any way at all actually, to determine which letter combination applied to each of the 16 die �� would give the user the greatest probability to find the most amount of words possible? Using, for example, one university's findings, of the commonality of each letter in the English alphabet being used. Wherein each letter is used X number of times greater than the letter they found to be used last of all, that letter, of course, Q. The total percentage of each letter having been used more often than Q are as follow:
ReplyDeleteE 57
A 43
R 39
I 38
O 37
T 35
N 34
S 29
L 28
C 23
U 19
D 17
P 16
M 15+
H 15
G 13
B 11
F 9+
Y 9
W 7
K 6
V 5
X 1+
Z 1+
J 1+
Q 1
If someone was creating their own game, like Boggle, for personal use and decided to use 17 die �� as opposed to the common 16, and there is a formula, using the letter frequency above, that shows which 6 letter combination would be optimal for placement on each one of the 17 giving the user highest probability for the greatest amount of words that could potentially be found, would you mind sharing the formula? <-sorry for the massive run-on sentence. I'd be much appreciative as algebra is my least strongest academic subject. Great article and research by the way. You put a lot of effort and dedication into your work and it shows. Thanks so much for sharing.
Megan