Solving Word Grids (Word Search Puzzles)

Posted: 4 September 2021

It's been a long time since my last blog post. I have a few ideas for things to write about, including writing about my experiences implementing a MQTT client library and parsing JSON using lemon/re2c, but there always seems to be something more interesting to do than writing about things I've already done. Being in lockdown yet again, I was getting a little bored with manipulating JSON and executing tests ad nauseam for a project I am working on (to be released soon), so when a word grid puzzle arrived in my inbox last night along with a weekly update from an organisation I volunteer with, I felt uncharacteristically motivated to take up the challenge. Naturally, I thought of approaching it how any programmer would... by cheating automating.

The task was a simple one. Find as many collective nouns for animals (e.g. a flock of seagulls or a murder of crows) in the word grid below. This post documents my approach. It's not intended as a tutorial, it's just a bit of fun. Show me the answers!

Animal collective nouns word grid
Challenge accepted!

Approach

First of all, let me address the obvious. I am aware there are sites online that help to solve these puzzles. No, I didn't want to use them and I didn't go searching for one. I did, however, try to find the puzzle source so I could avoid either having to manually enter the entire word grid by hand or dealing with OCR issues. Attempts at using OCR on the supplied image were an unmitigated disaster, so if you know of a site that works better (or software that runs on Windows/FreeBSD), please let me know. In the end, I just typed out the characters and converted them into a C array using regular expressions.

Often these sorts of puzzles would be accompanied by a list of words to match. Unfortunately, no such list was provided, so I had to search the Internet for a comprehensive list. Fortunately, Wikipedia had a list of animal collective nouns even subject matter experts hadn't heard of. That should suffice. I noticed a glaringly obvious word that wasn't in the list. It seemed mess would be a candidate for a collective noun, and indeed it appears it can be used to describe a group of iguanas. I found a second comprehensive list in a blog post and merged this with the original Wikipedia list. In both cases, a few quick regular expressions were used to format the lists.

Efficiency of code wasn't an objective. Quick and dirty was the favoured approach, however, I wanted the solution to at least be generic, capable of supporting different sized grids. The puzzle as provided is 20 x 20, or 400 characters. Processed vertically, horizontally and at 45 and -45 degree angles, both forwards and backwards, that's only 236 lines in which to search for dictionary entries. For a square grid, the number of diagonal lines in each direction is 2n - 1, where n is the number of rows/columns. The dictionary size was also very small, with 237 unique animal collective nouns identified. Given the data size, even a poorly implemented search would take a fraction of a second to run. I actually considered taking a deliberately inefficient approach, going letter-by-letter and recursively searching for surrounding words. The final code reflects this approach, with lines constructed by calculating the offset of the next character, albeit not recursively. A quick search online reveals a letter-by-letter approach is actually quite common (usually without recursion), but for a large dictionary scan, building lines first should be faster, particularly with further optimisations to only scan for words we know to be candidates. Letter-by-letter searches will perform better when you know the words you are looking for in advance, as the list is typically quite small and these implementations can stop looking once a specific word is found. Doing so, however, would result in duplicates not being found, so in practice the whole grid should be scanned regardless of the approach.

The final code works by building each line and looking for each dictionary entry within the line, then reversing the string and testing against the dictionary again, before moving on to the next line. Rows are built and tested first, then columns. The 45 and -45 degree lines are built next in ostensibly the same way, but because the lines start on different axes, we first process lines starting on one axis and then the other, skipping the common corner on one of the scans to avoid scanning duplicate lines which could lead to duplicate matches. While it's possible to start in any corner, I chose to work from the top left/bottom left, moving up and right for each letter, and from top right/bottom right moving up and to the left.

The code below omits the full list of animal collective nouns and the grid for clarity.

#include <stdio.h>
#include <string.h>

char grid[] = {
    's','t','e','l','b','b','a','u','q','s','f','m','e','s','s','t','p','h','o','n',
...
    'p','r','o','w','l','e','e','i','v','b','n','o','i','t','i','l','a','o','c','d'
};

char *dict[] = {
"aerie",
...
"zeal",
0
};

int nrows = 20;
int ncols = 20;

char *reverse_string(char *s)
{
    char tmp, *sin = s, *s2 = s + strlen(s) - 1;
    while (s < s2) {
        tmp = *s;
        *s = *s2;
        *s2 = tmp;
        ++s;
        s2--;
    }
    return sin;
}

void check_dictionary(char *dictionary[], char *line)
{
    int i;

    for(i = 0; dictionary[i]; i++) {
        if(strstr(line, dictionary[i])) {
            printf("%s\n", dictionary[i]);
        }
    }
}

int right(int n)
{
    n++;
    return n % ncols == 0 ? -1 : n;
}

int down(int n)
{
    n += ncols;
    return n > nrows * ncols ? -1 : n;
}

int up_left(int n)
{
    n = n - ncols - 1;
    return n % ncols == ncols - 1 ? -1 : n;
}

int up_right(int n)
{
    n = n - ncols + 1;
    return n % ncols ==  0 ? -1 : n;
}

void check_line(char *dictionary[], char *line)
{
    check_dictionary(dictionary, line);
    reverse_string(line);
    check_dictionary(dictionary, line);
}

char *get_line(char *line, int (*next)(int), int n)
{
    int i;

    for(i = 0; n >= 0; n = next(n), i++) {
        line[i] = grid[n];
    }
    line[i] = '\0';

    return line;
}

int main(int argc, char **argv) {

    int i;
    char line[256];

    // Parse rows -
    for(i = 0; i < ncols * nrows; i += ncols) {
        check_line(dict, get_line(line, right, i));
    }

    // Parse columns |
    for(i = 0; i < ncols; i++) {
        check_line(dict, get_line(line, down, i));
    }

    // Parse 45 / Shift up/right from top left to bottom left then bottom left to bottom right
    // TLBL
    for(i = 0; i < ncols * nrows; i += ncols) {
        check_line(dict, get_line(line, up_right, i));
    }

    // BLBR, skipping leftmost column (done above)
    for(i = ncols * (nrows - 1) + 1; i < ncols * nrows; i++) {
        check_line(dict, get_line(line, up_right, i));
    }

    // Parse 45 \ Shift up/left from top right to bottom right then bottom right to bottom left
    // TRBR
    for(i = ncols - 1; i < ncols * nrows; i += ncols) {
        check_line(dict, get_line(line, up_left, i));
    }

    // BRBL, skipping rightmost column (done above)
    for(i = ncols * nrows - 2; i >= ncols * (nrows - 1); i--) {
        check_line(dict, get_line(line, up_left, i));
    }

    return 0;
}

Results

The above initially produced 64 results with the Wikipedia list, 59 of which were unique. After combining the two lists, the total number of unique animal collective nouns was 62 (67 total). Results from the programme are output in the order found and sorted for your convenience below. Words in italics were added after merging the second list.

Answers: ambush, arc (2x), army, aurora, bale, band, bed, bevy, blessing, bloat, bouquet, cackle, charm, clan (2x), cloud, cluster, coalition, congregation, conspiracy, crash, cry, dazzle, deceit, embarrassment, flamboyance, flock, gang, herd, hive, host, husk, intrusion, leap, memory, mess, mischief, mob (3x), nest, pack, parliament, pod (2x), prickle, pride, prowl, raft, rasp, romp, sawt, shiver, sneak, surfeit, swarm, team, thunder, tower, trip, troop, waddle, walk, whoop, wisdom, zeal.

10 September 2021 - The puzzle answers have been supplied. The following words were missing from the dictionary: barrel, bob, pandemonium and squabble. The word 'shadow' (Jaguar) was included in the official results, but it is spelt incorrectly in the puzzle. Additional words not in the official results, but found in the puzzle were: arc, cry and hive.

Answers
Answers, plotted after updating the code to output co-ordinates (below).

But wait, there's more!

6 September 2021

While being able to find words in a dictionary is handy if you don't know the words you're searching for, this type of puzzle is usually accompanied by a list of words to find. Fortunately, noting the locations requires only a few minor changes.

To do this, we simply need to record the starting and end points of the word on the grid and convert these to human-friendly coordinates. I prefer numbering from 0, but for the purpose of this exercise, the top left will be coordinate (1,1). I chose to create a second array called index in main() in which to store the location of each letter. This array will be populated while building the line, so when a word is found within, we can directly access the array to obtain the grid index, and finally print the coordinates. A few functions need to be updated to handle this array, and the array will also need to be reversed when we reverse the line.

#include <stdio.h>
#include <string.h>

char grid[] = {
    's','t','e','l','b','b','a','u','q','s','f','m','e','s','s','t','p','h','o','n',
...
    'p','r','o','w','l','e','e','i','v','b','n','o','i','t','i','l','a','o','c','d'
};

char *dict[] = {
"aerie",
...
"zeal",
0
};

int nrows = 20;
int ncols = 20;

int *reverse_ints(int array[], int len)
{
    int i0, i1;
    int ix;

    for(i0 = 0, i1 = len - 1; i0 < i1; i0++, i1--) {
        ix = array[i0];
        array[i0] = array[i1];
        array[i1] = ix;
    }
    return array;
}

char *reverse_string(char *s)
{
    char tmp, *sin = s, *s2 = s + strlen(s) - 1;
    while (s < s2) {
        tmp = *s;
        *s = *s2;
        *s2 = tmp;
        ++s;
        s2--;
    }
    return sin;
}

void array_coord(int index, int *x, int *y)
{
    *x = index % ncols;
    *y = index / ncols;
}

void check_dictionary(char *dictionary[], char *line, int index[])
{
    int i;
    char *s;

    for(i = 0; dictionary[i]; i++) {
        if((s = strstr(line, dictionary[i]))) {
            int x0, y0, x1, y1;
            int loc;

            loc = s - line;
            array_coord(index[loc], &x0, &y0);
            array_coord(index[loc + strlen(dictionary[i]) - 1], &x1, &y1);
            printf("%s (%d,%d) -> (%d,%d)\n", dictionary[i], x0 + 1, y0 + 1, x1 + 1, y1 + 1);
        }
    }
}

int right(int n)
{
    n++;
    return n % ncols == 0 ? -1 : n;
}

int down(int n)
{
    n += ncols;
    return n > nrows * ncols ? -1 : n;
}

int up_left(int n)
{
    n = n - ncols - 1;
    return n % ncols == ncols - 1 ? -1 : n;
}

int up_right(int n)
{
    n = n - ncols + 1;
    return n % ncols ==  0 ? -1 : n;
}

void check_line(char *dictionary[], char *line, int index[])
{
    check_dictionary(dictionary, line, index);
    reverse_string(line);
    reverse_ints(index, strlen(line));
    check_dictionary(dictionary, line, index);
}

char *get_line(char *line, int index[], int (*next)(int), int n)
{
    int i;

    for(i = 0; n >= 0; n = next(n), i++) {
        line[i] = grid[n];
        index[i] = n;
    }
    line[i] = '\0';

    return line;
}

int main(int argc, char **argv) {

    int i;
    char line[256];
    int index[256];

    // Parse rows -
    for(i = 0; i < ncols * nrows; i += ncols) {
        check_line(dict, get_line(line, index, right, i), index);
    }

    // Parse columns |
    for(i = 0; i < ncols; i++) {
        check_line(dict, get_line(line, index, down, i), index);
    }

    // Parse 45 / Shift up/right from top left to bottom left then bottom left to bottom right
    // TLBL
    for(i = 0; i < ncols * nrows; i += ncols) {
        check_line(dict, get_line(line, index, up_right, i), index);
    }

    // BLBR, skipping leftmost column (done above)
    for(i = ncols * (nrows - 1) + 1; i < ncols * nrows; i++) {
        check_line(dict, get_line(line, index, up_right, i), index);
    }

    // Parse 45 \ Shift up/left from top right to bottom right then bottom right to bottom left
    // TRBR
    for(i = ncols - 1; i < ncols * nrows; i += ncols) {
        check_line(dict, get_line(line, index, up_left, i), index);
    }

    // BRBL, skipping rightmost column (done above)
    for(i = ncols * nrows - 2; i >= ncols * (nrows - 1); i--) {
        check_line(dict, get_line(line, index, up_left, i), index);
    }

    return 0;
}

The complete output is now as below.

mess (12,1) -> (15,1)
herd (2,2) -> (5,2)
bevy (11,3) -> (8,3)
clan (8,4) -> (11,4)
mob (3,7) -> (5,7)
mob (7,7) -> (5,7)
pride (6,9) -> (2,9)
clan (8,11) -> (11,11)
rasp (4,11) -> (7,11)
arc (17,12) -> (19,12)
bouquet (7,13) -> (13,13)
waddle (10,15) -> (5,15)
gang (3,16) -> (6,16)
raft (8,17) -> (5,17)
flamboyance (7,18) -> (17,18)
wisdom (15,19) -> (20,19)
blessing (13,19) -> (6,19)
prowl (1,20) -> (5,20)
coalition (19,20) -> (11,20)
cluster (1,4) -> (1,10)
pod (1,13) -> (1,15)
whoop (1,16) -> (1,20)
thunder (2,1) -> (2,7)
ambush (2,16) -> (2,11)
romp (2,20) -> (2,17)
bed (3,11) -> (3,9)
cry (4,12) -> (4,10)
sawt (4,17) -> (4,14)
embarrassment (5,5) -> (5,17)
bloat (6,1) -> (6,5)
troop (6,5) -> (6,9)
army (7,1) -> (7,4)
team (7,10) -> (7,7)
cloud (8,11) -> (8,15)
cackle (8,11) -> (8,6)
sneak (10,1) -> (10,5)
band (12,14) -> (12,17)
charm (12,5) -> (12,1)
tower (13,13) -> (13,9)
flock (15,8) -> (15,4)
surfeit (17,3) -> (17,9)
host (18,1) -> (18,4)
aurora (19,2) -> (19,7)
conspiracy (19,20) -> (19,11)
congregation (20,12) -> (20,1)
leap (20,17) -> (20,14)
bale (7,6) -> (10,3)
memory (9,5) -> (4,10)
pack (6,10) -> (3,13)
prickle (9,8) -> (15,2)
mischief (5,14) -> (12,7)
husk (18,1) -> (15,4)
pod (1,20) -> (3,18)
swarm (15,14) -> (19,10)
crash (18,6) -> (14,2)
arc (16,4) -> (18,6)
zeal (16,6) -> (19,9)
parliament (20,14) -> (11,5)
intrusion (10,6) -> (18,14)
walk (13,11) -> (16,14)
nest (17,16) -> (14,13)
trip (12,11) -> (9,8)
hive (2,2) -> (5,5)
shiver (1,1) -> (6,6)
deceit (18,19) -> (13,14)
mob (9,16) -> (11,18)
dazzle (1,15) -> (6,20)

That's enough time spent on this trivial little task. The complete source is located here (public domain).

 
 

 

©2021 Inveigle.net
Privacy | Contact