如何从字母矩阵中找到可能的单词列表 [Boggle求解器]

Question

9 浏览2023年5月22日

匿名的 2022年10月13日

0 Comments

最近我一直在我的iPhone上玩一个叫Scramble的游戏。你们中的一些人可能会把这个游戏称作Boggle。本质上，当游戏开始时，你会得到一个字母矩阵，如下所示：

F X I E
A M L O
E W B X
A S T U

游戏的目标是尽可能多地找到可以通过将字母链接在一起形成的单词。你可以从任何字母开始，周围所有的字母都可以使用，一旦你移动到下一个字母，所有周围的字母都可以使用，但是之前使用过的字母除外。因此，在上面的网格中，例如，我可以想到单词LOB、TUX、SEA、FAME等等。单词必须至少为3个字符，但不超过NxN个字符，这在这个游戏中是16个，但在某些实现中可能会有所不同。虽然这个游戏很有趣，很容易上瘾，但是我显然不太擅长它，所以我想作弊一点，写一个能够给我最好单词的程序（单词越长，得分越高）。不幸的是，我对算法或效率并不太熟悉。我的第一次尝试使用一个词典（像这个）（~2.3MB），并通过线性搜索尝试将组合与字典条目匹配。这需要很长时间才能找到可能的单词，而且由于每轮只有2分钟，这根本不足够。

我很感兴趣看看是否有Stackoverflow用户能够提出更有效的解决方案。我主要寻找使用大三P(即Python、PHP和Perl)的解决方案，尽管任何使用Java或C++的解决方案也很酷，因为速度至关重要。

当前的解决方案：

Adam Rosenfield，Python，~20s
John Fouhy，Python，~3s
Kent Fredric，Perl，~1s
Darius Bacon，Python，~1s
rvarcher，VB.NET，~1s
Paolo Bergantino，PHP (现场链接)，~5s(~2s本地)

admin 更改状态以发布 2023年5月22日

0

2 答案

匿名的 · Answer 1 · 2022-10-13T20:57:58+00:00

最快的解决方案可能涉及将字典存储在 Trie 中。然后，创建一个三元组队列（x，y，s），其中队列中的每个元素对应于可以在网格中拼写的单词的前缀s，以（x，y）结束的位置。使用 N x N 元素初始化队列（其中 N 是网格的大小），每个网格中的方块一个元素。然后，算法如下进行：

While the queue is not empty:
  Dequeue a triple (x, y, s)
  For each square (x', y') with letter c adjacent to (x, y):
    If s+c is a word, output s+c
    If s+c is a prefix of a word, insert (x', y', s+c) into the queue

如果在Trie中存储字典，则可以在常量时间内测试 s+c 是否是单词或单词的前缀（前提是在每个队列数据中还保留一些额外的元数据，例如Trie中当前节点的指针），因此此算法的运行时间为O（可以拼写的单词数）.

[编辑] 这是我刚刚编写的Python实现：

#!/usr/bin/python
class TrieNode:
    def __init__(self, parent, value):
        self.parent = parent
        self.children = [None] * 26
        self.isWord = False
        if parent is not None:
            parent.children[ord(value) - 97] = self
def MakeTrie(dictfile):
    dict = open(dictfile)
    root = TrieNode(None, '')
    for word in dict:
        curNode = root
        for letter in word.lower():
            if 97 <= ord(letter) < 123:
                nextNode = curNode.children[ord(letter) - 97]
                if nextNode is None:
                    nextNode = TrieNode(curNode, letter)
                curNode = nextNode
        curNode.isWord = True
    return root
def BoggleWords(grid, dict):
    rows = len(grid)
    cols = len(grid[0])
    queue = []
    words = []
    for y in range(cols):
        for x in range(rows):
            c = grid[y][x]
            node = dict.children[ord(c) - 97]
            if node is not None:
                queue.append((x, y, c, node))
    while queue:
        x, y, s, node = queue[0]
        del queue[0]
        for dx, dy in ((1, 0), (1, -1), (0, -1), (-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1)):
            x2, y2 = x + dx, y + dy
            if 0 <= x2 < cols and 0 <= y2 < rows:
                s2 = s + grid[y2][x2]
                node2 = node.children[ord(grid[y2][x2]) - 97]
                if node2 is not None:
                    if node2.isWord:
                        words.append(s2)
                    queue.append((x2, y2, s2, node2))
    return words

使用示例：

d = MakeTrie('/usr/share/dict/words')
print(BoggleWords(['fxie','amlo','ewbx','astu'], d))

输出：

['fa'，'xi'，'ie'，'io'，'el'，'am'，'ax'，'ae'，'aw'，'mi'，'ma'，'me'，'lo'，'li'，'oe'，'ox'，'em'，'ea'，'ea'，'es'，'wa'，'we'，'wa'，'bo'，'bu'，'as'，'aw'，'ae'，'st'，'se'，'sa'，'tu'，'ut'，'fam'，'fae'，'imi'，'eli'，'elm'，'elb'，'ami'，'ama'，'ame'，'aes'，'awl'，'awa'，'awe'，'awa'，'mix'，'mim'，'mil'，'mam'，'max'，'mae'，'maw'，'mew'，'mem'，'mes'，'lob'，'lox'，'lei'，'leo'，'lie'，'lim'，'oil'，'olm'，'ewe'，'eme'，'wax'，'waf'，'wae'，'waw'，'wem'，'wea'，'wea'，'was'，'waw'，'wae'，'bob'，'blo'，'bub'，'but'，'ast'，'ase'，'asa'，'awl'，'awa'，'awe'，'awa'，'aes'，'swa'，'swa'，'sew'，'sea'，'sea'，'saw'，'tux'，'tub'，'tut'，'twa'，'twa'，'tst'，'utu'，'fama'，'fame'，'ixil'，'imam'，'amli'，'amil'，'ambo'，'axil'，'axle'，'mimi'，'mima'，'mime'，'milo'，'mile'，'mewl'，'mese'，'mesa'，'lolo'，'lobo'，'lima'，'lime'，'limb'，'lile'，'oime'，'oleo'，'olio'，'oboe'，'obol'，'emim'，'emil'，'east'，'ease'，'wame'，'wawa'，'wawa'，'weam'，'west'，'wese'，'wast'，'wase'，'wawa'，'wawa'，'boil'，'bolo'，'bole'，'bobo'，'blob'，'bleo'，'bubo'，'asem'，'stub'，'stut'，'swam'，'semi'，'seme'，'seam'，'seax'，'sasa'，'sawt'，'tutu'，'tuts'，'twae'，'twas'，'twae'，'ilima'，'amble'，'axile'，'awest'，'mamie'，'mambo'，'maxim'，'mease'，'mesem'，'limax'，'limes'，'limbo'，'limbu'，'obole'，'emesa'，'embox'，'awest'，'swami'，'famble'，'mimble'，'maxima'，'embolo'，'embole'，'wamble'，'semese'，'semble'，'sawbwa'，'sawbwa']

说明：该程序不输出单字母单词，也不过滤单词长度。这很容易添加，但与问题无关。如果可以用许多不同的方法拼写某个给定单词（最坏情况：网格中的每个字母都相同（例如，“A”）并且您的字典中有像“aaaaaaaaaa”这样的单词），那么运行时间将变得极其指数化。在算法完成后，删除重复项和排序是很容易做到的。

匿名的 · Answer 2 · 2022-10-13T20:57:58+00:00

我的答案与其他人的答案一样，但我会发帖，因为它看起来比其他Python解决方案快一些，这是因为在更快地设置字典。（我验证了John Fouhy的解决方案。）设置后，解决时间降到了噪音中。\n

grid = "fxie amlo ewbx astu".split()
nrows, ncols = len(grid), len(grid[0])
# A dictionary word that could be a solution must use only the grid's
# letters and have length >= 3. (With a case-insensitive match.)
import re
alphabet = ''.join(set(''.join(grid)))
bogglable = re.compile('[' + alphabet + ']{3,}$', re.I).match
words = set(word.rstrip('\n') for word in open('words') if bogglable(word))
prefixes = set(word[:i] for word in words
               for i in range(2, len(word)+1))
def solve():
    for y, row in enumerate(grid):
        for x, letter in enumerate(row):
            for result in extending(letter, ((x, y),)):
                yield result
def extending(prefix, path):
    if prefix in words:
        yield (prefix, path)
    for (nx, ny) in neighbors(path[-1]):
        if (nx, ny) not in path:
            prefix1 = prefix + grid[ny][nx]
            if prefix1 in prefixes:
                for result in extending(prefix1, path + ((nx, ny),)):
                    yield result
def neighbors((x, y)):
    for nx in range(max(0, x-1), min(x+2, ncols)):
        for ny in range(max(0, y-1), min(y+2, nrows)):
            yield (nx, ny)

\n示例用法：\n

# Print a maximal-length word and its path:
print max(solve(), key=lambda (word, path): len(word))

\n编辑：过滤掉长度小于3个字母的单词。\n编辑2：我对Kent Fredric的Perl解决方案为什么更快感到好奇，结果它使用正则表达式匹配而不是一组字符。在Python中做同样的事情可以将速度提高一倍。