Versatility
Dec. 22nd, 2012 08:44 pm
What is the most versatile set of three distinct letters that one can have in Scrabble?
This is the kind of thing that randomly goes through my head from time to time. I was wondering if there was any set of 3 letters for which all six unique permutations of the letters are actual English words.
So for example, if you have the letters PTO, you can write POT, TOP, and OPT, but TPO, PTO, and OTP are not real words (fandom acronym appropriations notwithstanding).
What is the most permutable set of three English letters? Are there any that have all six possibilities covered? Using the list at this site, I did some research and some data manipulation in R to find the answer.
The Data Set:
> tlw = read.table("threeletterwords.txt")
> tlw[,1] = as.character(tlw[,1])
> tlw[1:10,]
[1] "AAL" "AAS" "ABA" "ABO" "ABS" "ABY" "ACE" "ACT" "ADD" "ADO"
> nrow(tlw)
[1] 1011
There are 1011 valid three-letter English words. Not all of them have three unique letters, however. For example, take the first entry, AAL:
> unlist(strsplit(tlw[1,1],""))
[1] "A" "A" "L"
> table(unlist(strsplit(tlw[1,1],"")))

A L
2 1
> length(table(unlist(strsplit(tlw[1,1],""))))
[1] 2
It has two A's, so the number of unique characters is 2. Assigning that count to each of the entries in my data frame, using the "apply" function:
> tlw$NumUniq = apply(tlw,1,function(x){length(table(unlist(strsplit(x[1],""))))})
> tlw[1:10,]
tlw NumUniq
1 AAL 2
2 AAS 2
3 ABA 2
4 ABO 3
5 ABS 3
6 ABY 3
7 ACE 3
8 ACT 3
9 ADD 2
10 ADO 3
> sum(tlw$NumUniq == 3)
[1] 905
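The distinct-letter count can also be tried out on a plain character vector. This is only a minimal sketch: the toy vector `words` and the helper `count_distinct` are hypothetical names standing in for the first column of `tlw`, not part of the original analysis.

```r
# Toy word vector (an assumption; the real list comes from threeletterwords.txt).
words <- c("AAL", "ABO", "ADD", "POT", "EEE")

# Count the distinct letters in a single word.
count_distinct <- function(w) length(unique(strsplit(w, "")[[1]]))

# vapply returns a plain integer vector, one count per word.
num_uniq <- vapply(words, count_distinct, integer(1), USE.NAMES = FALSE)
num_uniq
# [1] 2 3 2 3 1

# Keep only the words with three distinct letters.
words[num_uniq == 3]
# [1] "ABO" "POT"
```

Using `unique` instead of `table` gives the same count here, since only the number of different letters matters.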
Out of the 1011 words, 905 have 3 distinct letters. Snag them and reorder the letters of each one so that words made of the same letters all look the same: e.g., OPT, TOP, and POT all get mapped to the alphabetically ordered "OPT". This is done by splitting the word into its constituent characters, sorting the characters, and pasting them back together into a new string.
> tlw.uniq=tlw[(tlw$NumUniq == 3),1]
> tlw.uniq.sort = unlist(lapply(tlw.uniq, function(x){paste(sort(unlist(strsplit(x,""))), collapse="" )}))
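To see the sorting trick on its own, here is a small sketch run on a handful of words. The helper name `sort_letters` is hypothetical, introduced only for illustration.

```r
# Sort the letters of a word alphabetically to get its canonical key.
sort_letters <- function(w) paste(sort(strsplit(w, "")[[1]]), collapse = "")

# Words built from the same letters collapse to the same key.
keys <- vapply(c("POT", "TOP", "OPT", "TEA"), sort_letters,
               character(1), USE.NAMES = FALSE)
keys
# [1] "OPT" "OPT" "OPT" "AET"
```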
Now we have a list of 905 sorted letter sequences, in which a sequence repeats whenever its letters make more than one word. We just have to tabulate them:
> tlw.uniq.sort.count = rev(sort(table(tlw.uniq.sort)))
> length(tlw.uniq.sort.count)
[1] 640
So there are fewer unique 3-letter groups than words, owing to groups like OPT, PIN, BAG, etc. that make more than one word. The 905 words fall into 640 groups. Do any have six permutations?
> table(tlw.uniq.sort.count)
tlw.uniq.sort.count
  1   2   3   4   5
422 175  40   2   1
No! About 66% of the combinations have only 1 valid word (422/640). The highest we have is one set with 5 permutations, then 2 sets with 4 permutations, and then we jump all the way to 40 sets with 3. ETA: I should also say, there are 26 choose 3 = 2600 different ways to select three distinct letters out of 26, so there should be a 0 column in that table, with a count of 2600 - 640 = 1960 letter combinations that make no valid words.
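The arithmetic here is easy to double-check in R. In this sketch the vector `counts` is typed in by hand from the table above rather than computed from the word list.

```r
# Number of ways to pick 3 distinct letters out of 26.
n_possible <- choose(26, 3)
n_possible
# [1] 2600

# Group counts copied from the tabulation: name = words per group.
counts <- c("1" = 422, "2" = 175, "3" = 40, "4" = 2, "5" = 1)

n_groups <- sum(counts)                            # groups with at least one word
n_words  <- sum(as.integer(names(counts)) * counts)  # words those groups account for
n_empty  <- n_possible - n_groups                  # groups with no valid word

c(groups = n_groups, words = n_words, empty = n_empty)
# groups  words  empty 
#    640    905   1960 

# Share of groups with exactly one word, as a percentage.
round(100 * counts[["1"]] / n_groups)
# [1] 66
```

The word total coming back as 905 is a nice sanity check that the table accounts for every word with three distinct letters.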
Any guesses as to the three most popular?
> tlw.uniq.sort.count[1:3]
tlw.uniq.sort
AET APS AHS
  5   4   4
So that's the answer. The most versatile set of letters is:
AET: for which ATE, EAT, ETA, TAE, and TEA are all valid words
Runners up:
APS: for which ASP, SAP, PAS, and SPA are all valid words. I think, give it a few more years, and this group will join into the 5-permutation crowd, since we are well on the way to APS being a word.
AHS: for which AHS, ASH, HAS, and SHA are all words.
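One way to pull out the actual words behind each group, rather than just the counts, is to split the words by their sorted-letter key. This is a sketch on a toy vector (the names `words`, `keys`, and `groups` are hypothetical); with the full data, `tlw.uniq` and `tlw.uniq.sort` would play the roles of `words` and `keys`.

```r
# Toy subset of the word list (an assumption, for illustration only).
words <- c("ATE", "EAT", "ETA", "TAE", "TEA",
           "ASP", "SAP", "PAS", "SPA",
           "POT", "TOP", "OPT")

# Canonical key for each word: its letters in alphabetical order.
keys <- vapply(words,
               function(w) paste(sort(strsplit(w, "")[[1]]), collapse = ""),
               character(1), USE.NAMES = FALSE)

# Group the words by key; each list element holds one letter set's words.
groups <- split(words, keys)
groups[["AET"]]
# [1] "ATE" "EAT" "ETA" "TAE" "TEA"

# Group sizes, largest first.
sort(lengths(groups), decreasing = TRUE)
# AET APS OPT 
#   5   4   3 
```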
So there's the answer. No set of three letters is fully permutable, and only a few groups come close. I wonder what 4-letter words would look like!
This post brought to you by procrastination and geekery.