Home » Algorithm to generate anagrams

Algorithm to generate anagrams

Solutons:


Most of these answers are horribly inefficient and/or will only give one-word solutions (no spaces). My solution will handle any number of words and is very efficient.

What you want is a trie data structure. Here’s a complete Python implementation. You just need a word list saved in a file named words.txt You can try the Scrabble dictionary word list here:

http://www.isc.ro/lists/twl06.zip

MIN_WORD_SIZE = 4 # min size of a word in the output

class Node(object):
    def __init__(self, letter="", final=False, depth=0):
        self.letter = letter
        self.final = final
        self.depth = depth
        self.children = {}
    def add(self, letters):
        node = self
        for index, letter in enumerate(letters):
            if letter not in node.children:
                node.children[letter] = Node(letter, index==len(letters)-1, index+1)
            node = node.children[letter]
    def anagram(self, letters):
        tiles = {}
        for letter in letters:
            tiles[letter] = tiles.get(letter, 0) + 1
        min_length = len(letters)
        return self._anagram(tiles, [], self, min_length)
    def _anagram(self, tiles, path, root, min_length):
        if self.final and self.depth >= MIN_WORD_SIZE:
            word = ''.join(path)
            length = len(word.replace(' ', ''))
            if length >= min_length:
                yield word
            path.append(' ')
            for word in root._anagram(tiles, path, root, min_length):
                yield word
            path.pop()
        for letter, node in self.children.iteritems():
            count = tiles.get(letter, 0)
            if count == 0:
                continue
            tiles[letter] = count - 1
            path.append(letter)
            for word in node._anagram(tiles, path, root, min_length):
                yield word
            path.pop()
            tiles[letter] = count

def load_dictionary(path):
    result = Node()
    for line in open(path, 'r'):
        word = line.strip().lower()
        result.add(word)
    return result

def main():
    print 'Loading word list.'
    words = load_dictionary('words.txt')
    while True:
        letters = raw_input('Enter letters: ')
        letters = letters.lower()
        letters = letters.replace(' ', '')
        if not letters:
            break
        count = 0
        for word in words.anagram(letters):
            print word
            count += 1
        print '%d results.' % count

if __name__ == '__main__':
    main()

When you run the program, the words are loaded into a trie in memory. After that, just type in the letters you want to search with and it will print the results. It will only show results that use all of the input letters, nothing shorter.

It filters short words from the output, otherwise the number of results is huge. Feel free to tweak the MIN_WORD_SIZE setting. Keep in mind, just using “astronomers” as input gives 233,549 results if MIN_WORD_SIZE is 1. Perhaps you can find a shorter word list that only contains more common English words.

Also, the contraction “I’m” (from one of your examples) won’t show up in the results unless you add “im” to the dictionary and set MIN_WORD_SIZE to 2.

The trick to getting multiple words is to jump back to the root node in the trie whenever you encounter a complete word in the search. Then you keep traversing the trie until all letters have been used.

For each word in the dictionary, sort the letters alphabetically. So “foobar” becomes “abfoor.”

Then when the input anagram comes in, sort its letters too, then look it up. It’s as fast as a hashtable lookup!

For multiple words, you could do combinations of the sorted letters, sorting as you go. Still much faster than generating all combinations.

(see comments for more optimizations and details)

See this assignment from the University of Washington CSE department.

Basically, you have a data structure that just has the counts of each letter in a word (an array works for ascii, upgrade to a map if you want unicode support). You can subtract two of these letter sets; if a count is negative, you know one word can’t be an anagram of another.

Related Solutions

Why not drop the “auto” keyword? [duplicate]

Your proposal would be rejected on the basis of backward compatibility alone. But let's say for the sake of argument that the standards committee like your idea. You don't take into account the numerous ways you can initialize a variable widget w; // (a) widget...

Recursive to iterative using a systematic method [closed]

So, to restate the question. We have a function f, in our case fac. def fac(n): if n==0: return 1 else: return n*fac(n-1) It is implemented recursively. We want to implement a function facOpt that does the same thing but iteratively. fac is written almost in...

How can I match values in one file to ranges from another?

if the data file sizes are not huge, there is a simpler way $ join input1 input2 | awk '$5<$4 && $3<$5 {print $2, $5-$3+1}' B100002 32 B100043 15 B123465 3 This Perl code seems to solve your problem It is a common idiom: to load the entire...

Javascript difference between “=” and “===” [duplicate]

You need to use == or === for equality checking. = is the assignment operator. You can read about assignment operators here on MDN. As a quick reference as you are learning JS: = assignment operator == equal to === equal value and equal type != not equal !==...

Compiler complains about misplaced else [closed]

Your compiler complains about an misplaced else because, well, there is an else without a preceding if: // ... for (j=1; j<n-i; j++) { if(a[j]<=a[j+1]) { // ... } // END OF IF } // END OF FOR else { continue; } // ... The else in your code does not follow...

Bootstrap – custom alerts with progress bar

/* !important are just used to overide the bootstrap css in the snippet */ .alertContainer { border-radius: 0 !important; border-width: 0 !important; padding: 0 !important; height: auto !important; position: absolute !important; bottom: 15px !important; left:...

How to Garbage Collect an external Javascript load?

Yes, s.onload = null is useful and will garbage collect! As of 2019, it is not possible to explicitly or programmatically trigger garbage collection in JavaScript. That means it collects when it wants. Although there is cases where setting to null may do a GC...

Math programming with python

At first, what you are looking for is the modulo operator and the function math.floor() Modulo from wikipedia: In computing, the modulo operation finds the remainder after division of one number by another (sometimes called modulus). for example: 12%12=0...

Android slide over letters to create a word [closed]

Here some advice you can use: First for each cell you can create an object that represents the state of that cell: class Cell { char mChar; int row,column; boolean isSelected; } then you can create a 2D array of your cells Cell[][] mTable = ... For views you...

Sum two integers in Java

You reused the x and y variable names (hence the variable x is already defined in method main error), and forgot to assign the ints read from the Scanner to the x and y variables. Besides, there's no need to create two Scanner objects. public static void...

Extend three classes that implements an interface in Java

Using this simplified implementation of the library, using method() instead of M(): interface IFC { void method(); } class A implements IFC { public void method() { System.out.println("method in A"); }; } As akuzminykh mentions in their comment You'd write a...

How to set the stream content in PHPExcel? [closed]

Okey, First thing first PHPExcel_Worksheet_MemoryDrawing() can't solve your problem if you insist to use stream content and pass that to your worksheet your PDF will not render your image. But you can use `PHPExcel_Worksheet_Drawing()' if you want to render...

How to remove all files from a directory?

Linux does not use extensions. It is up to the creator of the file to decide whether the name should have an extension. Linux looks at the first few bytes to figure out what kind of file it is dealing with. To remove all non-hidden files* in a directory use: rm...

Hacker used picture upload to get PHP code into my site

Client side validation The validation code you have provided is in JavaScript. That suggests it is code that you use to do the validation on the client. Rule number one of securing webapps is to never trust the client. The client is under the full control of...

First Time HTML5/CSS Site

Semantically, I would suggest using HTML5 elements more. For example, instead of... <div id="header"> <div id="logo"></div> </div> Use instead: (the ID can stay if you want it to) <header> <div id="logo"></div>...