Generating Mazes
Introduction
- recursion
- noun See recursion
Long ago, I came across a technique that permitted me to construct a rectangular maze that contained one (and only one) path from one point to any other point in the maze. This technique used recursion to tunnel a path out from a starting point in the maze. I have played around with this recursive maze builder for quite a while.
Technical Notes
Algorithm
The maze is constructed from a rectilinear array (or "field") of cells. In any cell, the SOUTH wall of the cell is also the NORTH wall of the cell directly below it, while the EAST wall of the cell corresponds to the WEST wall of the cell to the right. We construct the maze by moving from cell to an adjacent cell selected at random, removing the walls between the two cells, until we have visited all the cells in the field. If this random walk has already visited the neighbour cell, the logic will not not proceed into that cell, and will leave the intervening wall intact.
+---+---+---+ | | <- | +---+---+ ^ + | | *-> ! | +---+---+---+ | | | | +---+---+---+
When we can proceed no further, we backtrack to the first cell that has travel options, and random walk from there.
When we have exhausted all our travel options (detected by returning to our starting cell, and not finding any unvisited adjacent cells), we terminate the algorithm.
In reality, our maze is constructed from a grid of half-cells. Each half-cell consists of a floor, and walls to the SOUTH and EAST.
0 1 2 > 0 | | | ---+ ---+ ---+ 1 | | | ---+ ---+ ---+ 2 | | | ---+ ---+ ---+ V
We cap off our grid by constructing a solid WEST wall, and a solid NORTH wall across the entire grid.
0 1 2 > + ---+ ---+ ---+ 0 | | | | + ---+ ---+ ---+ 1 | | | | + ---+ ---+ ---+ 2 | | | | + ---+ ---+ ---+ V
Now, as we traverse the grid, we only have to worry about breaking one wall at a time: We either break the SOUTH or EAST wall of the cell that we are leaving (if we travel correspondingly South or East) or we break the SOUTH or EAST wall of the cell we travel to (if we travel correspondingly North or West).
We "mark" each cell as we occupy it, and only move from that cell to surrounding cells that have not been "marked" already. This prevents us from entering a cell that we have already entered once before, and ensures that there is one (and only one) path through the maze.
As the program only needs to store three pieces of yes/no information about each cell (whether or not the cell has a WEST wall, whether or not the cell has a SOUTH wall, and whether or not the exploration logic has visited the cell already), we choose to use single bytes to represent cells, and three bits within each of those bytes to represent the three pieces of information. A single cell will consist of 8 bits, of which only three are of any use to us: x x x x x O E S
(where 'x' is "unused", 'O' is "Occupied/free", 'E' is "East Wall/no east wall", and 'S' is "South Wall/no south wall").
Additionally, while the maze is conceived as an array of rows and columns of cells, our code can only allocate a linear array of bytes for us to store the maze in. We compensate for this one-dimensionality by using a computation to determine the location of any cell given it's horizontal and vertical distance from the origin (the NorthWest corner of the maze). For any cell at row R
and column C
, it's position within the linear array is ((R * number-of-columns-in-a-row) + C)
, and it's address is (address-of-start-of-array + (R * number-of-columns-in-a-row) + C)
The Code
Standard library overhead
At the beginning of the code, I #include
several standard headers. stdio.h
contains definitions for the formatted output functions the program will use, stdlib.h
contains definitions for a number of general functions and constants that the program will use, and string.h
contains the definition of the string-handling functions that the program will use.
#include <stdio.h> #include <stdlib.h> #include <string.h>
"Global" variables
Next comes three "global" variables (actually, variables "at file scope"):
- Max_col, which holds a count of the maximum number of (North/South) columns in the maze,
- Max_row, which holds a count of the maximum number of (West/East) rows in the maze, and
- field, which holds a pointer to the field on which the program will build the maze itself
As the various functions in this program need access to these values, I have stored them in "global" variables rather than "local" variables. If I had used "local" variables, I would have had to code the program to pass the values along to each function that needs them, which would have inflated my coding quite a bit. With this "global" method, any function that needs these values can access them directly, without additional programmer overhead.
int Max_col, Max_row; char *field;
The logic of the maze building/exploration and printing processes require a variety of numeric constants to express direction and to indicate the state of each cell in the maze. Rather than code these numeric constants directly into the program, I give them symbolic names, and reference the names in the program code. This way, I can give the constants a meaningful name to guide the programming, and I can write generic code and adjust the values of the constants later.
Program MACROs
The code starts off with a definition of the START
, WEST
, EAST
, NORTH
, and SOUTH
directions. The program will use START
to indicate the direction of travel when entering the "base camp" cell of our maze exploration, while WEST
, EAST
, NORTH
, and SOUTH
will indicate both direction of travel away from, and a direction to return to each cell as we explore from it. The logic that picks our next direction will return one of these four direction values.
Rather than select numerically sequential values for WEST
, EAST
, NORTH
, and SOUTH
, I use bit positions. This will later permit me to give these constants a double duty, both as direction indicators, and as values that I can use as bitmasks when determining whether or not the program has already travelled in a particular direction. More on that in the section on function exploreCells()
.
Following the definitions of the compass directions, I have definitions for the various values associated with one cell of the maze. Again, I express the values for WALL_SOUTH
, WALL_EAST
, and OCCUPIED
as single bit bitmasks. This way, all the indicators for one cell can fit within a C char
entity.
Finally, as the recursive nature of the maze exploration function requires the code to distinguish between cells that are already "occupied" (and thus, cannot be explored further), and cells that are unoccupied (and can be recursively explored), we will have the exploration function return a value that indicates the state of exploration at any particular cell. The function will return a value of
WAS_OCCUPIED
when the cell in question has already been recursively explored (and thus, should not be explored again), orWAS_FREE
when the cell in question was previously unexplored, and was now recursively explored by the function.
/* ** compass direction encoded in four bits ** 0000 - no direction (starting point) ** 0001 - ---W - West ** 0010 - --E- - East ** 0100 - -N-- - North ** 1000 - S--- - South */ #define START 0 /* no bits on */ #define WEST (1 << 0) /* 1's bit on */ #define EAST (1 << 1) /* 2's bit on */ #define NORTH (1 << 2) /* 4's bit on */ #define SOUTH (1 << 3) /* 8's bit on */ /* ** single cell indicators encoded in three bits ** xx1 - South Wall ** x1x - East Wall ** 1xx - Cell has been occupied */ #define WALL_SOUTH (1 << 0) #define WALL_EAST (1 << 1) #define OCCUPIED (1 << 2) #define WAS_OCCUPIED 1 #define WAS_FREE 0
For convenience sake, I also define two macros to reduce the amount of coding, and simplify the reading of the logic of the exploration and printing functions.
When provided with a cell's row and column address, the Cell(x,y)
macro computes the address of the cell within the linear array used to hold the maze. This resolves to a pointer-to-char, which points at the byte holding that cell's wall and occupation information.
Rather than code an oblique (and possibly changable) expression to compute a new direction, the getHeading()
macro will generate a random number between 0 and 3, and use that value to generate a direction (see the definitions of NORTH, SOUTH, EAST, and WEST above).
/* convenience macro for single cell addressing */ #define Cell(x,y) (field + (y * Max_col) + x) /* convenience macro for generating an exploration heading */ #define getHeading() (1 << (rand() % 4))
The setupCells() function
The setupCells()
function sets up the global variables, and initializes each cell of the maze.
It saves
- the number of rows in the maze in the
Max_row
global variable, - the number of columns in the maze in the
Max_col
global variable, and - the pointer to the linear space in which we will build our maze in
field
.
It then initializes all the cells of the maze so that each cell has a West wall,and a South wall, and has not been Occupied. The initialization consists of processing each cell, column by column, row by row, setting the contents of each cell to the bitflags that represent the two walls and the "occupied/free" status. As the program uses a linear storage for these cells, and all cells get the same initial value, the logic here could have simply looped through each element of the linear storage. However, in order to reinforce the two-dimensional view of the maze, and because the initialization overhead is fairly low, the logic uses the more complex, row and column Cell()
addressing to access each cell in the maze.
/* ** Initialize the entire "field" so that each cell ** has a South wall, an East wall, and is not occupied */ void setupCells(int col, int row, char *thefield) { Max_row = row; Max_col = col; field = thefield; for (row = 0; row < Max_row; ++row) for (col = 0; col < Max_col; ++col) *Cell(col,row) = (WALL_SOUTH | WALL_EAST) & ~OCCUPIED; }
The exploreCells() function
Now, we come to the meat of the program; the exploreCells()
function. This recursive function will, starting from the indicated base-camp cell, explore outward to the North, South, East, and West until there are no more unexplored cells within it's reach.
The two tricks to a recursive function are:
- the activity that the function performs has to be repetitive, and
- there has to be a clearly defined condition that will stop the repetition
In our case, the repetitive activity is to treat every destination cell (cell that we explore, from the current cell) in the same manner as the cell we are currently in. We mark the current cell as occupied, and repeatedly explore cells bounding on our current cell. In each of these bounding cells, we mark the cell as occupied, and explore the bounding cells around that bounding cell. And so on.
The stop condition is well defined: if the cell we are currently trying to explore is outside the bounds of the maze, or the cell has already been explored, then we do not explore the cell further (nor do we explore any of it's boundary cells).
The exploreCells()
function takes three arguments:
col
is an integer value representing the West-East position (the column) of the cell to be explored,row
is an integer value representing the North-South position (the row) of the cell to be explored, andcamefrom
is an integer value representing the direction (NORTH, SOUTH, EAST, WEST, or START) that leads to the previous base-camp cell.
The function returns an integer value:
WAS_OCCUPIED
when the cell is found to have already been explored by a previous invocation of the function, or when the cell is outside the bounds of the maze, orWAS_FREE
when the cell is found to be within the bounds of the maze, and not been previously explored.
So, upon entering the exploreCells() function, we determine if the cell we've been asked to explore is one that we may explore. If the cell is outside of the field (either too far WEST, too far EAST, too far NORTH, or too far SOUTH), or the has already been explored (the cell is marked OCCUPIED), we do not proceed, and return a value of WAS_OCCUPIED.
However, if this logic determines that we can explore the cell, we then proceed. We first flag this cell as being OCCUPIED, so that future recursions through this code will avoid re-exploring the cell.
Next, we remove the wall between this cell and the cell we came from. We only do this if we came from the SOUTH or EAST, as those are the only directions that traverse a wall in this cell.
Once we've taken care of all this housekeeping, we can (recursively) explore other cells from here. Think of our recursive exploration as setting a series of "base camps"; from the base camp, we explore all the cells within reach, returning to the base camp after each journey. Once we've exhausted all the cells reachable from this base camp, we go back to the previous base camp, and explore in a different direction. When we have retreated back to our starting base camp, and exhausted all the explorable cells around it, we have explored the entire maze.
In the function exploreCells(), the logic up to this point has established our new "base camp". Now, we have to explore the adjacent cells. We can explore in (up to) four directions (North, South, East and West), and we do not need to explore the cell that we came from to. Starting with all the unexplored directions noted, we loop through our "exploration" logic randomly exploring in a direction, until we have exhausted all the unexplored directions.
With each loop, we first randomly select a direction to explore. Once we've selected the direction, we determine how to move our base camp in that direction. If we chose NORTH or WEST, we will explore, and (if the exploration determines that the explored cell was previously "unexplored") we will remove the intervening wall. If we chose SOUTH or EAST, we will explore, and leave the removal of the intervening wall to that exploration.
Once we have exhausted all the explorable directions, we will exit the function, returning WAS_FREE to indicate that the cell we had temporarily placed our base camp in was not previously explored.
/* ** Occupy and explore outwards from the indicated cell, ** Return whether the cell was already occupied or not */ int exploreCells(int col, int row, int camefrom) { int exploring, heading; /* are we allowed to explore this cell? */ if ((col < 0) /* cell is off the WEST edge of the map */ || (col >= Max_col) /* cell is off the EAST edge of the map */ || (row < 0) /* cell is off the NORTH edge of the map */ || (row >= Max_row) /* cell is off the SOUTH edge of the map */ || ( *Cell(col,row) & OCCUPIED)) /* been here before */ return WAS_OCCUPIED; /* this cell is TABU */ /* * mark territory as occupied, break down intervening wall * * If we camefrom the SOUTH, we remove the SOUTH wall of the current cell * the EAST, we remove the EAST wall of the current cell * the NORTH, we don't remove any walls * the WEST, we don't remove any walls * the START, we don't remove any walls */ *Cell(col,row) |= OCCUPIED; switch (camefrom) { case SOUTH: *Cell(col,row) &= ~WALL_SOUTH; break; case EAST: *Cell(col,row) &= ~WALL_EAST; break; } /* * explore new territory from here - stop when we've exhausted all directions * * We don't have to explore in the direction we camefrom, so we remove it * from the list of directions we are to explore. * * We will continue to explore until we have explored in all directions */ for (exploring = (WEST | EAST | NORTH | SOUTH) & ~camefrom ; /* directions we can go */ exploring; /* directions unexplored */ exploring &= ~heading) /* remove last heading */ { /* chose a new direction to explore in, see if we've gone that way before */ switch (exploring & (heading = getHeading())) { case WEST: /* haven't tried West yet */ /* we go WEST, and remove the EAST wall of the cell above*/ exploreCells(col-1, row, EAST); break; case EAST: /* haven't tried East yet */ /* if we can go EAST, we remove the EAST wall of this cell*/ if (exploreCells(col+1,row,WEST) == WAS_FREE) *Cell(col,row) &= ~WALL_EAST; break; case NORTH: /* haven't tried North yet */ /* we go NORTH and remove the SOUTH wall of the cell above */ exploreCells(col, row-1,SOUTH); break; case SOUTH: /* haven't tried South yet */ /* if we can go SOUTH, we remove the SOUTH wall of this cell*/ if (exploreCells(col,row+1,NORTH) == WAS_FREE) *Cell(col,row) &= ~WALL_SOUTH; break; default: break; } } return WAS_FREE; }
The displayCells() function
The displayCells() function prints a map of the maze we've generated. This function does not need recursion, and uses nested loops instead. It will print the maze out row by row, from the North edge to the South edge, starting each row on the West edge, and ending it on the East edge. As our maze exploration/construction neglects to create entrance and exit openings, this function also arbitrarily knocks out two West wall segments to act as entrance and exit.
First up, the function prints a continuous horizontal wall extending from West to East. This acts as the (unbroken) North wall of the northernmost row of cells. (Remenber, each cell consists of a South wall and an East wall, so the northernmost row of cells have no North wall to enclose them.)
Next, we loop through all the rows of the maze, printing their cell configurations as we go. We start at the northernmost row, and print each row until we reach and print the southernmost row.
We process each row from West to East, and as each row consists of both a cell floor and a set of walls, we generate two print lines for each row. The first line contains the cell floor and the east wall, while the second line contains the south wall. Of course, as the westernmost cell in the row does not have a west wall, we fake it and print one there. The only exceptions are the first and last rows, where we leave that initial West wall missing; this gives us our entrance and exit openings.
So, for the first line of the row, we print (or not) the initial West wall, then loop through the cells in the row, printing the cell floors and the cell East walls. For the floor, if the cell has been visited, we print a space (' '), otherwise, we print an octothorpe ('#'). Since we've supposedly explored the entire maze, every cell should have been visited, and all the floors should show as spaces. But, if we made a mistake and missed exploring a cell, this map will highlight that the cell was unexplored. Once we've printed the cell floor, we then print (or not) the East wall. If the cell still has an East wall, we print a vertical bar ('|'), otherwise we print a space (' '). This gives us our visual walls between each cell. When we reach the end of the row, we print the newline that completes the row.
The second line of print begins again at the easternmost cell of the row. This time, after accomodating the phantom West wall, we print the cell floors. For each cell, if the cell has a South wall, we print a series of dashes ('---'), otherwise, we print a series of spaces (' '). We then print the cornerpost of the cell. Once we reach the end of the row of cells, we print another newline to terminate the map row.
Once we've iterated through all the rows of the maze, we've completed printing the map of our maze.
/* ** Print the map of the maze, cell at a time ** West wall of NorthWest cell removed for entrance ** West wall of SouthWest cell removed for exit */ void displayCells(void) { int row, col, line; /* show a continious wall on the northernmost edge */ putchar('\t'); putchar('+'); for (col = 0; col < Max_col; ++col) printf("---+"); puts(""); /* Draw each row of cells */ for (row = 0; row < Max_row; ++row) { /* ** first two lines contain the floor and East wall ** of each cell. Also, we draw the Western wall of ** the westernmost cell. ** ** NB: For the cell in the NW corner, and the cell ** in the SW corner, we don't draw in their western ** wall, but instead draw in the entrance and exits. */ for (line = 0; line < 2; ++line) { putchar('\t'); /* Entrance, Exit, or Westmost wall */ if ((row == 0) || (row == Max_row - 1)) putchar(' '); /* entrance or exit */ else putchar('|'); /* westernmost wall */ /* floor and east wall of each cell */ for (col = 0; col < Max_col; ++col) { if (*Cell(col,row) & OCCUPIED) printf(" "); /* we've seen this cell */ else printf(" # "); /* should never happen */ if (*Cell(col,row) & WALL_EAST) putchar('|'); else putchar(' '); } puts(""); } /* third line contains South walls */ putchar('\t'); putchar('+'); for (col = 0; col < Max_col; ++col) { if (*Cell(col,row) & WALL_SOUTH) printf("---+"); else printf(" +"); } puts(""); } }
The buildMaze() function
In the buildMaze() function, we pull everything together to build and map a maze of a given size.
This function accepts two parameters:
- col, which is an integer representing the width of the maze (in cells), and
- row, which is an integer representing the height of the maze (in cells).
From these parameters, the function will
- allocate enough space from the free memory pool to construct the maze in (if the function fails to allocate enough space, it will immediately terminate with a returncode of 1),
- initialize this space so that every cell in the maze has both a WEST and SOUTH wall, and is marked as "unoccupied",
- explore the maze (using the exploreCells() function), starting at the centre of the maze, and
- print a map of the resulting explorations (using the displayCells() function)
At this point, the function will terminate with a returncode of 0.
NB: This function contains an error of omission: before it terminates, it fails to free() the space it allocated for the maze construction. For this implementation, we can afford to let this happen, as the overall program will build only one maze, and the system will reclaim the allocated memory once the program terminates. However, should we change the overall program to build multiple mazes (by invoking buildMaze() multiple times), we will have a "memory leak" that we will have to fix.
/* ** Build a maze */ int buildMaze(unsigned col, unsigned row) { char *thefield; srand((unsigned int)time(NULL)); if ((thefield = malloc(row * col)) == NULL) return 1; setupCells(col,row,thefield); exploreCells((col+1)/2,(row+1)/2,START); displayCells(); return 0; }
The usage() function
The usage() function takes care of reporting program usage errors. It accepts two parameters:
- a character pointer name, which points to the program name string, and
- a character pointer message, which points to an optional error message string
If message is not a NULL pointer, and the string that message points to is at least one character long, usage() will print the given string (prefixed with "Error: ") to stderr as a single line of text. Otherwise, message will be ignored.
After that, usage() prints a standard program usage message to stderr, giving the name of the program (as provided by name) and the expected parameters.
Finally, usage() returns EXIT_FAILURE, to be used as a returncode from main() if necessary.
int usage(char *name, char *message) { if (message && strlen(message)) fprintf(stderr,"Error: %s\n",message); fprintf(stderr,"Usage: %s width height\n",name); return EXIT_FAILURE; }
The main() function
The main() function accepts the user's maze width and height requirements, edits these values (and terminates with an error message if the requirements are unreasonable), and uses the buildMaze() function to build and diagram a maze with the given dimensions.
main() accepts two arguments:
- the integer value argc, which contains a count of the number of character string arguments passed to the program, and
- the character pointer array value argv[], which contains character pointers to each of the argc string arguments.
The function first allocates two unsigned integer variables, width and height, which it will use to store the user's selected maze width and height values extracted from the argv[] array.
Next, the function validates the given argument list.
- If the list does not contain exactly three arguments (programname, width, and height), the function uses usage() to issue a usage error message, and terminates using the returncode (EXIT_FAILURE) returned from usage().
- If the 2nd argument (argv[1]), when converted from a string to an integer, is less than 1 (either "0", a negative value, or a non-numeric argument provided), the function uses usage() to issue a usage error message with a width requirement explanation, and terminates using the returncode (EXIT_FAILURE) returned from usage(). Otherwise, the function stores the numeric value derived from the 2nd argument as the width value.
- If the 3nd argument (argv[2]), when converted from a string to an integer, is less than 1 (either "0", a negative value, or a non-numeric argument provided), the function uses usage() to issue a usage error message with a height requirement explanation, and terminates using the returncode (EXIT_FAILURE) returned from usage(). Otherwise, the function stores the numeric value derived from the 3rd argument as the height value.
The main() function then invokes the buildMaze() function, providing it with the determined width and height values. If buildMaze() returns a non-zero return value (returned only if the buildMaze() function cannot allocate enough space to build a maze of the given dimensions), the main() function will issue a usage error (via usage()) and an explanation that a maze of the given dimensions could not be built. With this error, the function will terminate with the EXIT_FAILURE returncode provided by the usage() function.
Finally, should execution get this far, the main() function will terminate with the EXIT_SUCCESS termination code that indicates that the program was successful.
int main(int argc, char *argv[]) { unsigned width, height; if (argc != 3) return usage(argv[0],NULL); if ((width = atoi(argv[1])) <= 1) return usage(argv[0],"width must be greater than 1"); if ((height = atoi(argv[2])) <= 1) return usage(argv[0],"height must be greater than 1"); if (buildMaze(width, height) != 0) return usage(argv[0],"your dimensions are too big"); return EXIT_SUCCESS; }
Sample execution
~/code/maze $ cc -o maze maze.c ~/code/maze $ ./maze 7 10 +---+---+---+---+---+---+---+ | | | | + + +---+ +---+---+ + | | | | | | | | | | + + +---+ + +---+---+ | | | | | | | | | | + + + +---+---+---+ + | | | | | | | | | | +---+---+ + + +---+---+ | | | | | | | | | | + + + +---+---+ + + | | | | | | | | | | | | + + + + +---+---+ + | | | | | | | | | | + +---+---+ + +---+---+ | | | | | | | | +---+---+ + +---+---+ + | | | | | | | | | | + + +---+---+ + +---+ | | | | +---+---+---+---+---+---+---+ ~/code/maze $
Attachment | Size |
---|---|
maze.c | 6.51 KB |