Simulating Wordle: in search of the perfect strategy

I have written a Python Wordle simulator and let it play about 1,000,000 games.
- Will using green/yellow/gray letters result in a 100% win rate?
- Is a 100% rate even possible?
- And what is the best starting word after all?

––––––––––––––––––––––––––––––
Wasting Time by Sapajou & Yorgo H
Creative Commons — Attribution 3.0 Unported — CC BY 3.0
––––––––––––––––––––––––––––––
Track: Growth, not Stagnation — Artificial.Music & Gator Tots [Audio Library Release]
Music provided by Audio Library Plus
––––––––––––––––––––––––––––––
Track: Retro Future — Another Kid [Audio Library Release]
Music provided by Audio Library Plus
––––––––––––––––––––––––––––––
Track: Back to 1981 — Iaio [Audio Library Release]
Music provided by Audio Library Plus
––––––––––––––––––––––––––––––
Comments

Hey everybody
Thank you for all the comments and suggestions. A few updates:
- I did see 3b1b’s latest video; it was also mentioned in the comments. Yes, I got the double letters gameplay wrong too. (Can’t say it was a bug though, that was me being a bit arrogant and ignoring one of the most fundamental principles of software development: “Never assume!”)
- New video is in the making, with 3 additional approaches to try to play/solve Wordle. That double letter thing is really a wrench in the works…
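
A minimal sketch of the double-letter rule mentioned above, assuming the standard Wordle behaviour: greens are assigned first, and each letter of the answer can colour at most one letter of the guess. The function name `feedback` and the `G`/`Y`/`-` encoding are illustrative choices, not the code from the video.

```python
from collections import Counter

def feedback(guess: str, answer: str) -> str:
    """Score a guess: 'G' = green, 'Y' = yellow, '-' = gray."""
    pattern = ["-"] * len(guess)
    # Only answer letters that were not matched exactly can produce yellows.
    unmatched = Counter(a for g, a in zip(guess, answer) if g != a)
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            pattern[i] = "G"
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g != a and unmatched[g] > 0:
            pattern[i] = "Y"
            unmatched[g] -= 1  # this answer letter is now used up
    return "".join(pattern)

print(feedback("speed", "abide"))  # --Y-Y : only one of the two E's turns yellow
print(feedback("llama", "label"))  # GYY-- : the second A stays gray
```

The second E in SPEED stays gray because ABIDE contains only one E, which is the case that is easy to get wrong.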

GamesComputersPlay

In addition to what others have said, the letters could be analyzed to not only see how common they are in words, but in what positions they are more likely, because getting a green is more powerful in eliminating incorrect words when you continue to use all available space to delve for new info.
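
A rough sketch of position-aware letter counting, using a tiny made-up word list (the real answer list would be needed for meaningful numbers). The names `positional_counts` and `positional_score` are hypothetical helpers for illustration.

```python
from collections import Counter

def positional_counts(words):
    """One Counter per slot: how often each letter appears in that position."""
    counts = [Counter() for _ in range(5)]
    for word in words:
        for i, letter in enumerate(word):
            counts[i][letter] += 1
    return counts

def positional_score(word, counts):
    """Sum of how often each letter of `word` appears in its own position."""
    return sum(counts[i][letter] for i, letter in enumerate(word))

# Toy list, an assumption for the example only.
WORDS = ["crane", "slate", "trace", "erase", "shine", "point", "apple"]
counts = positional_counts(WORDS)

print(counts[0]["e"], counts[4]["e"])     # 1 6 : E is rare first, common last
print(positional_score("slate", counts))  # 14
```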

Aimlesswaves.

It's not always better to stick with the green letters. For the most efficient strategy you want the word with the potential of eliminating the most possibilities from the remaining search space. Keeping the green is actually a waste because you get no new information.

jamesflames

The two main improvements I can think of:

1) Guesses should not use green letters unless it is the last guess or you have found all 5 green letters. Each guess should try to find the most information possible, and repeatedly guessing using known letters wastes valuable information. I expect this to increase the average length of a game, but also increase the win rate.

2) Instead of guessing words which have the best weight, where weight is calculated as the sum of the occurrences of its unique letters in all other words, you should use letters which are mutually exclusive and cover the most words. For example, "e" may be the most common letter, but it might be a bad guess combined with the letter "a" if the words that contain a and/or e form a smaller set than, say, the words that contain e and/or m. This may mean that the best starting words aren't real words at all, but are instead nonsense strings.
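
The coverage idea in point 2 can be illustrated with a small sketch. The word list below is a toy assumption chosen so that the effect is visible, and `coverage` is a hypothetical helper name.

```python
def coverage(letters, words):
    """Count the words that contain at least one of the given letters."""
    wanted = set(letters)
    return sum(1 for word in words if wanted & set(word))

# Toy list (assumption): "a" and "e" tend to appear in the same words here.
WORDS = ["bread", "cream", "least", "heart", "mount", "mimic", "human"]

print(coverage("ea", WORDS))  # 5 : a and e mostly overlap
print(coverage("em", WORDS))  # 7 : e and m together reach every word
```

In this list "a" is individually more common than "m", yet the pair e/m covers more words than e/a because its letters overlap less.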

zactron

There's still a big piece of information your algorithm doesn't account for: the positioning of the letters. Maybe I missed something but it seems you haven't included the fact that specific letters can't go in slots if they have previously come up yellow. If they could go in that slot, they would have been green. Yellow therefore actually gives you two pieces of information: a) this letter is in the word b) this letter does not go in this slot.
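
A minimal sketch of how both pieces of yellow information can be used at once: keep only the words that would have produced exactly the observed pattern. The helper names and the five-word list are assumptions for the example.

```python
from collections import Counter

def feedback(guess, answer):
    """Score a guess: 'G' = green, 'Y' = yellow, '-' = gray."""
    pattern = ["-"] * len(guess)
    unmatched = Counter(a for g, a in zip(guess, answer) if g != a)
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            pattern[i] = "G"
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g != a and unmatched[g] > 0:
            pattern[i] = "Y"
            unmatched[g] -= 1
    return "".join(pattern)

def consistent(candidates, guess, pattern):
    """Keep only the words that would have produced exactly this pattern."""
    return [w for w in candidates if feedback(guess, w) == pattern]

# Suppose LATER came back with only the T (third slot) in yellow.
words = ["tipsy", "motto", "misty", "pithy", "moody"]

naive = [w for w in words if "t" in w]  # uses only "T is in the word"
print(naive)                                # ['tipsy', 'motto', 'misty', 'pithy']
print(consistent(words, "later", "--Y--"))  # ['tipsy', 'misty']
```

MOTTO and PITHY drop out because a T in the third slot would have been green, not yellow.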

theepicosityofpizza

And I realize every lost game after 2:11 is one letter off

iaongacheang

Great video, fun project!! Hope to see a part 2 implementing a strategy that sets the green letters aside and, instead of guessing them again, eliminates more words to find the full word faster.

ddxaidan

hell yeah i was hoping you'd do this one

Chloe-jujp

I realize this video is almost two months old. By now I'm sure you realize that, at times, you should use a "burner word," as I believe it is known: a word that you know cannot be a solution, but that is played with the idea of gaining information.

For example, assume you've identified the last four letters as "ATCH" and that you also know that no "L" is in the solution. The answer to the puzzle could be BATCH, CATCH, HATCH, MATCH, PATCH, or WATCH. (Yes, keen observers will notice that GATCH, NATCH and RATCH are also words and are acceptable _guesses_ in Wordle, but these three words are not in the list of 2,309 words that might be _answers_.)

If you just start guessing, you run the risk of not solving the puzzle at all. There are six possible answers... and by now assume you've already used a row or two. Thus, instead of guessing, you must (should) use a burner word. If you want to lower your solution row expectancy, and in this case solve the puzzle, you must play a word which you know isn't the solution to the puzzle... it won't fit at all... but will provide maximum feedback.

In this case you could play the word WIMPY, which uses a W, an M, and a P.

If the W receives a green box, the solution is WATCH.
If the M receives a yellow box, the solution is MATCH.
If the P receives a yellow box, the solution is PATCH.

The only other possibility is that you don't receive any colored boxes at all, and if so, you've just eliminated all three of those words. You can then enter another burner word, like CYBER, that contains a C and a B, to guarantee a solution in five rows. (If the C gets a green box the word is CATCH. If the B receives a yellow box the word is BATCH. If the C is only yellow and the B is gray, you know the word is the only one left, HATCH, for a solution in 5 rows.)
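
The WIMPY and CYBER example can be checked with a short sketch that groups the remaining answers by the feedback each burner word would produce. The helper names `feedback` and `partition` are illustrative assumptions.

```python
from collections import Counter, defaultdict

def feedback(guess, answer):
    """Score a guess: 'G' = green, 'Y' = yellow, '-' = gray."""
    pattern = ["-"] * len(guess)
    unmatched = Counter(a for g, a in zip(guess, answer) if g != a)
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            pattern[i] = "G"
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g != a and unmatched[g] > 0:
            pattern[i] = "Y"
            unmatched[g] -= 1
    return "".join(pattern)

def partition(guess, candidates):
    """Group the candidate answers by the pattern the guess would show."""
    groups = defaultdict(list)
    for word in candidates:
        groups[feedback(guess, word)].append(word)
    return dict(groups)

ATCH = ["batch", "catch", "hatch", "match", "patch", "watch"]

first = partition("wimpy", ATCH)
print(first)
# {'-----': ['batch', 'catch', 'hatch'], '--Y--': ['match'],
#  '---Y-': ['patch'], 'G----': ['watch']}

second = partition("cyber", first["-----"])
print(second)
# {'Y-Y--': ['batch'], 'G----': ['catch'], 'Y----': ['hatch']}
```

After the second burner word every group holds exactly one answer, which is what makes the final guess certain.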

Also, note some valid and possible guesses are much better than other valid and possible guesses. For example, in a recent game I knew the possible answers were SLOOP, SLOSH, SLUMP, SLURP, SLUSH, or SLYLY. All six words are equally possible answers. However, of the six, SLYLY would have been a _terrible_ guess. Why? Because if it is wrong, it won't help my case any. All five of the other words would still be possible! SLYLY won't help to eliminate any of the others. A better guess would have been SLUMP or SLURP. They also have the same chance of success as SLYLY... but if they are wrong they _will_ help to eliminate one or two of the others, narrowing down my search space.
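
A small sketch that measures this for the six SL- words: for each guess, count how many answers would share its most common feedback pattern. `worst_case` is a hypothetical helper name; a lower number means the guess separates the remaining answers better.

```python
from collections import Counter

def feedback(guess, answer):
    """Score a guess: 'G' = green, 'Y' = yellow, '-' = gray."""
    pattern = ["-"] * len(guess)
    unmatched = Counter(a for g, a in zip(guess, answer) if g != a)
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            pattern[i] = "G"
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g != a and unmatched[g] > 0:
            pattern[i] = "Y"
            unmatched[g] -= 1
    return "".join(pattern)

def worst_case(guess, candidates):
    """Size of the largest group of answers that give the same pattern."""
    return max(Counter(feedback(guess, w) for w in candidates).values())

SL = ["sloop", "slosh", "slump", "slurp", "slush", "slyly"]
for guess in SL:
    print(guess, worst_case(guess, SL))
# sloop 2
# slosh 3
# slump 2
# slurp 2
# slush 2
# slyly 5
```

SLYLY leaves five answers indistinguishable when it misses, while SLUMP or SLURP leave at most two.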

I suspect your initial program did not do either of these two things.

Every puzzle can be solved in five rows or less. Some of the other Wordle bots have already proven this. (And with many of them, SALET has been shown to be the best possible guess, that will lower the search space more than others.)

Nice graphics and thanks for taking the time to create this video.

MrEdwardCollins

As others have pointed out, there are opportunities to optimise further by looking at positions. So while E is a very common letter in general, in actual usage it's much, much more likely to be at the end of a word than the beginning, for example.

edzeppelin

I think the best possible algorithm would be one that tries to narrow down the search space as much as possible, in the worst case. If we start with the 2315 possible answers, and we guess "arise" (or any of its anagrams like "raise"), then no matter what pattern of colors the computer gives back, there will be a MAXIMUM of 168 possible answers that could fit that pattern. This value of 168 is the best possible value for any starting word, there is no word that can always guarantee an answer space of 167 or less after starting with it.
We can keep applying this logic, reducing the number of possible answers as much as possible, until we either only have 1 left, or we stumble across the answer by luck.
However, this still isn't perfect. If the only two possible answers are "dizzy" and "fizzy" then the algo might throw a word like "faded" to first distinguish the two (which will require two guesses to get it right), instead of just guessing "dizzy" and taking the 50/50 chance of getting it in just one guess. To fix this, in case of a tie for the best possible guess, then we will prioritize guesses that are actually in the answer space.
The algorithm is deterministic, so I ran it once for each of the 2315 possible words to come up with an accurate probability distribution. I used all 12972 words for guessing, but the 2315 for the possible answers (which is how actual Wordle does it). Here is the score distribution:
1 - 1
2 - 53
3 - 1001
4 - 1165
5 - 59
>5 - 0
The average score is 3.56. Every single word was guessed within 5 guesses, 97.5% were guessed within 4 guesses, and over 45% were guessed within 3 guesses.
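
The selection rule described above (minimize the worst-case number of remaining answers, and break ties in favour of words that could themselves be the answer) might look roughly like this. It is a sketch under the assumption that ties are broken only by answer-space membership; `best_guess` is an illustrative name, not the commenter's code.

```python
from collections import Counter

def feedback(guess, answer):
    """Score a guess: 'G' = green, 'Y' = yellow, '-' = gray."""
    pattern = ["-"] * len(guess)
    unmatched = Counter(a for g, a in zip(guess, answer) if g != a)
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            pattern[i] = "G"
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g != a and unmatched[g] > 0:
            pattern[i] = "Y"
            unmatched[g] -= 1
    return "".join(pattern)

def best_guess(allowed, candidates):
    """Pick the allowed word with the smallest worst-case group of answers.

    Ties go to words that are themselves possible answers.
    """
    candidate_set = set(candidates)

    def key(guess):
        groups = Counter(feedback(guess, answer) for answer in candidates)
        # False sorts before True, so possible answers win ties.
        return (max(groups.values()), guess not in candidate_set)

    return min(allowed, key=key)

# FADED separates the two answers, but so does simply guessing one of them.
print(best_guess(["faded", "dizzy", "fizzy"], ["dizzy", "fizzy"]))  # dizzy
```

Without the second element of the key, `min` would return FADED here, because all three words have a worst case of one remaining answer.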

If we allow the correct answer to be any one of the 12972 words from the larger set, then the best starting word is "serai", which narrows it down to 697 possibilities in the worst case (this time, we can't anagram it). The distributions are as follows:
1 - 1
2 - 66
3 - 1741
4 - 6405
5 - 4060
6 - 658
7 - 37
8 - 4
>8 - 0
The average score in this case is 4.28, slightly better than the algorithm in the video. We can solve 99.69% of words within 6 guesses. The four words that required 8 guesses are "gills", "tests", "vests", and "zests". Interesting that the maximum number of guesses is so much higher in this case.
The reason that "gills" is so high is that after doing just three guesses, we are left with the words: vills, jills, hills, bills, fills, pills, kills, zills, gills, mills, cills
For the other three words, after doing three guesses we have yests, fests, vests, bests, wests, zests, pests, gests, kests, tests, hests, lests, jests
As you can imagine, this really sucks for the algorithm.

Also, fun fact: There are 11 different ways to make a set of 5 5-letter words that cover 25 unique letters with zero repeats. A potentially viable but very risky strategy is to start with one of these sets of 5 words, then try to guess the answer on your 6th and final guess. They are:
clipt jumby kreng vozhd waqfs
glent jumby prick vozhd waqfs
chunk fjord gymps vibex waltz
fjord gucks nymph vibex waltz
bemix clunk grypt vozhd waqfs
brick glent jumpy vozhd waqfs
blunk cimex grypt vozhd waqfs
jumby pling treck vozhd waqfs
brung kempt vozhd waqfs xylic
brung cylix kempt vozhd waqfs
bling jumpy treck vozhd waqfs
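
A quick sketch to check the sets listed above: each should use 25 distinct letters, leaving exactly one letter of the alphabet out. The helper name `missing_letters` is an illustrative assumption.

```python
import string

SETS = [
    "clipt jumby kreng vozhd waqfs",
    "glent jumby prick vozhd waqfs",
    "chunk fjord gymps vibex waltz",
    "fjord gucks nymph vibex waltz",
    "bemix clunk grypt vozhd waqfs",
    "brick glent jumpy vozhd waqfs",
    "blunk cimex grypt vozhd waqfs",
    "jumby pling treck vozhd waqfs",
    "brung kempt vozhd waqfs xylic",
    "brung cylix kempt vozhd waqfs",
    "bling jumpy treck vozhd waqfs",
]

def missing_letters(words):
    """Letters of the alphabet that none of the words use."""
    used = set("".join(words))
    return sorted(set(string.ascii_lowercase) - used)

for line in SETS:
    words = line.split()
    distinct = len(set("".join(words)))
    print(line, distinct, missing_letters(words))
```

Every set reports 25 distinct letters; the unused letter is x, q or j depending on the set.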

Eclpsed

I think it comes down to minimizing the worst-case size of the decision tree (or minimizing the branch with the highest probability if considering letter/position frequencies). Each turn has 3^5 outcomes (colors per position), and each outcome reduces the set of possible words. The best greedy move gives the biggest reduction in words for the worst case of the 3^5 outcomes. One way to represent this problem is to make a 5*26 array of colors per letter/position for each word. For a set of possible words and one choice of letter/position, the worst possible outcome is the color shared by the most words for that letter/position, e.g. `def worst_case(letter, position): return max(np.sum(array[words, letter, position] == c) for c in colors)` - this is usually grey, but in the leftmost case at 12:10 all remaining words have "R2 A3 C4 K5" as green. Therefore, the best individual letter for a position has the least shared colors among remaining words, e.g. `np.argmin([worst_case(letter, position) for letter in letters])`. Choosing the best move over all letter/positions is harder (26^5 moves, can be brute forced but slow), but choosing each position independently should be a good place to start, and should cover the situation at 12:10. I think there's also a best overall sequence of moves for a fixed word distribution, but I can't think of a way to efficiently construct the entire search tree.

Love the video and channel BTW. It's exactly what I've wanted to see or make on youtube. Hope you keep growing!

Sagaciux

This was really interesting! Makes me want to strike out and do my own testing!

Also a lot of fun comments with good ideas. I can only hope for a sequel/part 2 one day.

mme

Love this episode, very visually pleasing, and something I'm already interested in.

I was working on this myself a few weeks ago. I basically created a massive matrix of every 'guess' and every 'solution', and the combination of greens, yellows and greys it gives you. At each step, it then calculates a heuristic value for each possible guess, based on the worst possible scenario in terms of words left (RAISE and its anagram are the best starting words by this alone: the worst-case scenario here is matching no letters, with 168 possible answers remaining), and the geometric mean of the number of words remaining after that guess (no real reason for doing it this way, I just tried a few things and it worked the best). The guess with the minimum value is used, and the process repeats.
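
A rough sketch of the two ingredients described here: a precomputed guess-by-solution feedback matrix, plus the worst case and the geometric mean of the words remaining. How the two numbers are combined into one heuristic value is not stated in the comment, so this sketch only returns both; all names and the six-word list are assumptions.

```python
import math
from collections import Counter

def feedback(guess, answer):
    """Score a guess: 'G' = green, 'Y' = yellow, '-' = gray."""
    pattern = ["-"] * len(guess)
    unmatched = Counter(a for g, a in zip(guess, answer) if g != a)
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            pattern[i] = "G"
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g != a and unmatched[g] > 0:
            pattern[i] = "Y"
            unmatched[g] -= 1
    return "".join(pattern)

def build_matrix(guesses, solutions):
    """matrix[guess][solution] -> feedback pattern, computed once up front."""
    return {g: {s: feedback(g, s) for s in solutions} for g in guesses}

def heuristics(matrix, guess, remaining):
    """Return (worst-case words left, geometric mean of words left)."""
    sizes = Counter(matrix[guess][s] for s in remaining)
    worst = max(sizes.values())
    # For each possible solution, the words left are those sharing its pattern.
    log_sum = sum(math.log(sizes[matrix[guess][s]]) for s in remaining)
    return worst, math.exp(log_sum / len(remaining))

WORDS = ["sloop", "slosh", "slump", "slurp", "slush", "slyly"]
matrix = build_matrix(WORDS, WORDS)
for guess in ("slyly", "slump"):
    worst, geo = heuristics(matrix, guess, WORDS)
    print(guess, worst, round(geo, 2))
# slyly 5 3.82
# slump 2 1.26
```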

As pointed out in a different comment below, this method is entirely deterministic, so I can just test it for the 2315 possible answers and confirm that it always wins. With normal Wordle rules, it always wins in 5 turns or fewer; in hard mode, it always wins in 6 or fewer. The algorithm often picks words that do not match the yellows/greens it already knows, which is why hard mode, where it is forced to match them, takes slightly longer.

It starts a normal game with the word LATEN, and a hard mode game with the word SWEAL, so it never wins on the first turn :(

misterbobEA

There's one more metric you overlooked in the video: the position of the letters is important too. Lots of words end with e but far fewer start with it, for example.
I think it's probably quite difficult to have a strategy that weights the two against each other properly, especially with an algorithm that avoids wasted guesses. When I play myself I tend to avoid reusing greens until I have 4-5 letters or I'm on guess 4+, to get more value out of each guess. But it's hard to formulate an ideal strategy around this; that's something I do by gut feel.

empty

A suggestion I haven't seen in these comments is to use an n-gram-like approach, looking not only at one letter at a time but also at 2-, 3- and 4-letter combinations to contribute to your weights. You could use this in combination with the position to add a lot of information to your guesses. This would certainly be computationally expensive, so it may not be the most practical solution.
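
As a starting point, position-aware n-gram counts could be gathered like this. It is a sketch on a toy word list; how the counts would feed into guess weights is left open, and `positional_ngrams` is a hypothetical name.

```python
from collections import Counter

def positional_ngrams(words, n):
    """Count every run of n letters together with the slot where it starts."""
    counts = Counter()
    for word in words:
        for start in range(len(word) - n + 1):
            counts[(start, word[start:start + n])] += 1
    return counts

# Toy list, an assumption for the example only.
WORDS = ["batch", "catch", "hatch", "match", "patch", "watch", "teach"]

bigrams = positional_ngrams(WORDS, 2)
trigrams = positional_ngrams(WORDS, 3)
print(bigrams[(3, "ch")])    # 7 : every word ends in CH
print(bigrams[(1, "at")])    # 6
print(trigrams[(2, "tch")])  # 6
```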

joeyharrington

Too bad there haven't been any new videos for a long time. I keep rewatching the old ones; they are very good.
And best of all, the accent is easy to understand and so pleasant, like one of our own :)

EugeneStorozh

I use ROYAL SETUP as my first two words, usually followed by a word ending with ING

jirkakalecky

The strategy I use is to start with a word with common letters, but then I use the second guess to position the possible yellow letters, completely ignoring greens. If there are no yellows, I use common letters not used yet. This way I eliminate many more letters and I think it should win 100% of games.

Henrix

Do. Not. Fear. The. Gray. Letters. When you come up with gray letters, start stretching your mind around letters that would fit better, especially if you have yellow or green letters in your first two words. It's important to think not only one step ahead, but two steps ahead, and try to see if you can't get a win on your third round instead of your fourth or later. I personally favor playing five letters in the first round that match what probability shows to be the ideal range, and then playing the second round with five totally different letters. You are extremely unlikely to win by round 1 or 2, but 3 is an open book. And if you fail round 3, round 4 is almost guaranteed.

cylenalag