How to find list of possible words from a letter matrix [Boggle Solver]

The fastest solution you're going to get will probably involve storing your dictionary in a trie. Then, create a queue of triplets (x, y, s), where each element in the queue corresponds to a prefix s of a word which can be spelled in the grid, ending at location (x, y). Initialize the queue with N x N elements (where N is the size of your grid), one element for each square in the grid. Then, the algorithm proceeds as follows:

While the queue is not empty:
  Dequeue a triple (x, y, s)
  For each square (x', y') with letter c adjacent to (x, y):
    If s+c is a word, output s+c
    If s+c is a prefix of a word, insert (x', y', s+c) into the queue

If you store your dictionary in a trie, testing if s+c is a word or a prefix of a word can be done in constant time (provided you also keep some extra metadata in each queue datum, such as a pointer to the current node in the trie), so the running time of this algorithm is O(number of words that can be spelled).

[Edit] Here's an implementation in Python that I just coded up:

#!/usr/bin/python

class TrieNode:
    def __init__(self, parent, value):
        self.parent = parent
        self.children = [None] * 26
        self.isWord = False
        if parent is not None:
            parent.children[ord(value) - 97] = self

def MakeTrie(dictfile):
    dict = open(dictfile)
    root = TrieNode(None, '')
    for word in dict:
        curNode = root
        for letter in word.lower():
            if 97 <= ord(letter) < 123:
                nextNode = curNode.children[ord(letter) - 97]
                if nextNode is None:
                    nextNode = TrieNode(curNode, letter)
                curNode = nextNode
        curNode.isWord = True
    return root

def BoggleWords(grid, dict):
    rows = len(grid)
    cols = len(grid[0])
    queue = []
    words = []
    for y in range(cols):
        for x in range(rows):
            c = grid[y][x]
            node = dict.children[ord(c) - 97]
            if node is not None:
                queue.append((x, y, c, node))
    while queue:
        x, y, s, node = queue[0]
        del queue[0]
        for dx, dy in ((1, 0), (1, -1), (0, -1), (-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1)):
            x2, y2 = x + dx, y + dy
            if 0 <= x2 < cols and 0 <= y2 < rows:
                s2 = s + grid[y2][x2]
                node2 = node.children[ord(grid[y2][x2]) - 97]
                if node2 is not None:
                    if node2.isWord:
                        words.append(s2)
                    queue.append((x2, y2, s2, node2))

    return words

Example usage:

d = MakeTrie('/usr/share/dict/words')
print(BoggleWords(['fxie','amlo','ewbx','astu'], d))

Output:

['fa', 'xi', 'ie', 'io', 'el', 'am', 'ax', 'ae', 'aw', 'mi', 'ma', 'me', 'lo', 'li', 'oe', 'ox', 'em', 'ea', 'ea', 'es', 'wa', 'we', 'wa', 'bo', 'bu', 'as', 'aw', 'ae', 'st', 'se', 'sa', 'tu', 'ut', 'fam', 'fae', 'imi', 'eli', 'elm', 'elb', 'ami', 'ama', 'ame', 'aes', 'awl', 'awa', 'awe', 'awa', 'mix', 'mim', 'mil', 'mam', 'max', 'mae', 'maw', 'mew', 'mem', 'mes', 'lob', 'lox', 'lei', 'leo', 'lie', 'lim', 'oil', 'olm', 'ewe', 'eme', 'wax', 'waf', 'wae', 'waw', 'wem', 'wea', 'wea', 'was', 'waw', 'wae', 'bob', 'blo', 'bub', 'but', 'ast', 'ase', 'asa', 'awl', 'awa', 'awe', 'awa', 'aes', 'swa', 'swa', 'sew', 'sea', 'sea', 'saw', 'tux', 'tub', 'tut', 'twa', 'twa', 'tst', 'utu', 'fama', 'fame', 'ixil', 'imam', 'amli', 'amil', 'ambo', 'axil', 'axle', 'mimi', 'mima', 'mime', 'milo', 'mile', 'mewl', 'mese', 'mesa', 'lolo', 'lobo', 'lima', 'lime', 'limb', 'lile', 'oime', 'oleo', 'olio', 'oboe', 'obol', 'emim', 'emil', 'east', 'ease', 'wame', 'wawa', 'wawa', 'weam', 'west', 'wese', 'wast', 'wase', 'wawa', 'wawa', 'boil', 'bolo', 'bole', 'bobo', 'blob', 'bleo', 'bubo', 'asem', 'stub', 'stut', 'swam', 'semi', 'seme', 'seam', 'seax', 'sasa', 'sawt', 'tutu', 'tuts', 'twae', 'twas', 'twae', 'ilima', 'amble', 'axile', 'awest', 'mamie', 'mambo', 'maxim', 'mease', 'mesem', 'limax', 'limes', 'limbo', 'limbu', 'obole', 'emesa', 'embox', 'awest', 'swami', 'famble', 'mimble', 'maxima', 'embolo', 'embole', 'wamble', 'semese', 'semble', 'sawbwa', 'sawbwa']

Notes: This program doesn't output 1-letter words, or filter by word length at all. That's easy to add but not really relevant to the problem. It also outputs some words multiple times if they can be spelled in multiple ways. If a given word can be spelled in many different ways (worst case: every letter in the grid is the same (e.g. 'A') and a word like 'aaaaaaaaaa' is in your dictionary), then the running time will get horribly exponential. Filtering out duplicates and sorting is trivial to due after the algorithm has finished.


My answer works like the others here, but I'll post it because it looks a bit faster than the other Python solutions, from setting up the dictionary faster. (I checked this against John Fouhy's solution.) After setup, the time to solve is down in the noise.

grid = "fxie amlo ewbx astu".split()
nrows, ncols = len(grid), len(grid[0])

# A dictionary word that could be a solution must use only the grid's
# letters and have length >= 3. (With a case-insensitive match.)
import re
alphabet = ''.join(set(''.join(grid)))
bogglable = re.compile('[' + alphabet + ']{3,}$', re.I).match

words = set(word.rstrip('\n') for word in open('words') if bogglable(word))
prefixes = set(word[:i] for word in words
               for i in range(2, len(word)+1))

def solve():
    for y, row in enumerate(grid):
        for x, letter in enumerate(row):
            for result in extending(letter, ((x, y),)):
                yield result

def extending(prefix, path):
    if prefix in words:
        yield (prefix, path)
    for (nx, ny) in neighbors(path[-1]):
        if (nx, ny) not in path:
            prefix1 = prefix + grid[ny][nx]
            if prefix1 in prefixes:
                for result in extending(prefix1, path + ((nx, ny),)):
                    yield result

def neighbors((x, y)):
    for nx in range(max(0, x-1), min(x+2, ncols)):
        for ny in range(max(0, y-1), min(y+2, nrows)):
            yield (nx, ny)

Sample usage:

# Print a maximal-length word and its path:
print max(solve(), key=lambda (word, path): len(word))

Edit: Filter out words less than 3 letters long.

Edit 2: I was curious why Kent Fredric's Perl solution was faster; it turns out to use regular-expression matching instead of a set of characters. Doing the same in Python about doubles the speed.