Commit b96d71c

feat(blog): add entry about maze data structures
1 parent 83d7b48 commit b96d71c

2 files changed

Lines changed: 162 additions & 1 deletion

File tree

Lines changed: 160 additions & 0 deletions
@@ -0,0 +1,160 @@
1+
---
2+
date: 2023-10-02T23:00:00Z
3+
categories:
4+
- Maze
5+
- Data Structures
6+
- C++
7+
- Optimization
8+
- Memory
9+
- Cache
10+
- Vector
11+
- Bitfield
12+
- Maze Generation
13+
authors:
14+
- tolstenko
15+
---
16+
17+
# Memory-efficient Data Structure for Maze Generation
18+
19+
In this post, you will learn how to create a memory-efficient data structure for maze generation. The main idea is to use a single array to store the walls of the maze, laid out so that every bit represents one wall. Compared with a naive pointer-based design, this reduces memory consumption by roughly 160x.
20+
21+
<!-- more -->
22+
23+
Problem statement: You need to generate mazes dynamically, and you need to break or add walls between rooms. For example, how can we store the data for a simple 3x3 maze like this:
24+
25+
```text
26+
_ _ _
27+
| | |
28+
| | | |
29+
|_ _|_|
30+
```
31+
32+
The naive approach is to create a data structure like this:
33+
34+
```c++
35+
class Node {
  // Each declarator needs its own '*'; "Node* top, right" would make only top a pointer.
  Node *top, *right, *bottom, *left;
  bool top_wall, right_wall, bottom_wall, left_wall;
};
39+
```
40+
41+
This one above will work, but it is:
42+
43+
- Cache unfriendly;
44+
- Random access to any element will be slow;
45+
- Memory inefficient;
46+
- Huge memory consumption;
47+
- Redundant data usage;
48+
49+
**Cache Unfriendly**: Cache locality is hurt by the extensive use of dynamic allocation (4 pointers per node) and by not reserving contiguous memory for newly created objects.
50+
51+
**Random Access**: To access the room `{x,y}`, we have to walk from node to node through the neighbor pointers. Reaching a room can take up to rows + cols hops, so for a square maze of side n the access cost is O(n), that is O(sqrt(rows*cols)), instead of the O(1) of an array index. For small mazes it is not a problem, but for big mazes it will be.
52+
53+
**Memory inefficiency**: The memory allocated for each room is 4 pointers and 4 booleans. If a pointer is 8 bytes and a boolean is 1 byte, we might think each room takes 4*8 + 4 = 36 bytes, right? No! The compiler adds padding so that the struct size is a multiple of the pointer alignment, so each room takes 40 bytes. For a 1000x1000 maze, that is 40MB of memory allocated for the maze. It is a lot of memory for a simple maze.
54+
55+
**Data redundancy**: Each wall is stored in both of the rooms it separates. To break a wall, we have to update it in two places. It is not a big deal, but it wastes memory.
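
The padding effect described in the memory inefficiency point can be checked directly. The following is a minimal sketch; `NaiveNode`, `kPayloadBytes` and `naive_maze_bytes` are names made up for this illustration, and the exact sizes depend on the target platform.

```c++
#include <cstddef>

// Same members as the naive node above (hypothetical name for this sketch).
struct NaiveNode {
  NaiveNode *top, *right, *bottom, *left;
  bool top_wall, right_wall, bottom_wall, left_wall;
};

// Sum of the member sizes, ignoring any padding the compiler adds.
constexpr std::size_t kPayloadBytes = 4 * sizeof(NaiveNode*) + 4 * sizeof(bool);

// Total bytes for a maze of cols x rows rooms, padding included.
// On a typical 64-bit target sizeof(NaiveNode) is 40, but this is platform dependent.
constexpr std::size_t naive_maze_bytes(std::size_t cols, std::size_t rows) {
  return cols * rows * sizeof(NaiveNode);
}
```

Comparing `sizeof(NaiveNode)` with `kPayloadBytes` on your own compiler shows how many bytes per room are pure padding.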
56+
57+
## Optimization
58+
59+
Well, let's try to optimize it. The first step is to use a single array of data. Then we need to reduce the duplication of data.
60+
61+
By removing all the pointers and storing the wall data in a single array following matrix linearization, we drop the memory consumption to 4 bytes per room (10x improvement). It is a huge improvement, but we can do better. For now, we can create an array of WallData as follows:
62+
63+
```c++
64+
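#include <vector>

struct WallData {
  bool top, right, bottom, left;  // one byte each: 4 bytes per room
};
int width = 0;               // rooms per row, set when the maze is created
std::vector<WallData> data;  // row-major storage, width * height entries
// Matrix linearization: room (x, y) lives at index y * width + x.
WallData get_wall(int x, int y) {
  return data[y * width + x];
}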
71+
```
72+
73+
The size of WallData is 4 bytes. But we can reduce it by declaring the members as bit fields:
74+
75+
```c++
76+
struct WallData {
77+
bool top:1, right:1, bottom:1, left:1;
78+
};
79+
```
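
As a rough check of the two layouts, the sketch below places them side by side. `WallDataBools`, `WallDataBits` and `pack` are names invented for this example, and the size of a bit-field struct is implementation-defined, so the assertion only states what mainstream compilers do.

```c++
// Hypothetical names: the plain layout and the bit-field layout side by side.
struct WallDataBools { bool top, right, bottom, left; };
struct WallDataBits  { bool top : 1, right : 1, bottom : 1, left : 1; };

// Bit-field layout is implementation-defined; on common compilers this is 1 byte vs 4 bytes.
static_assert(sizeof(WallDataBits) <= sizeof(WallDataBools),
              "bit fields are expected not to grow the struct");

// Copies the four flags into the packed representation.
WallDataBits pack(const WallDataBools& w) {
  WallDataBits p;
  p.top = w.top;
  p.right = w.right;
  p.bottom = w.bottom;
  p.left = w.left;
  return p;
}
```

Reading and writing the packed members looks the same as with plain bools; the compiler emits the masking and shifting.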
80+
81+
In this version, WallData uses 1 byte per room (40x improvement), but only 4 of its 8 bits are used. Another way of optimizing it is to use one vector of bools for each type of wall.
82+
83+
```c++
84+
std::vector<bool> topWalls, rightWalls, bottomWalls, leftWalls;
85+
```
86+
87+
But a vector on its own, depending on the implementation, has to store its size, its capacity, and a pointer to its data, which costs around 24 bytes or more per vector on common 64-bit platforms. To go further, you can use a single `vector<bool>` in which every bit is a wall. Each room then takes only 4 bits, at the cost of some index math to find the right bit (80x improvement). Can we do better?
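
The "some math to get the right bit" for the single-vector, 4-bits-per-room layout could look like the sketch below. `FourBitMaze` and `Side` are hypothetical names, and the bit order within a room (top, right, bottom, left) is an assumption made only for this example.

```c++
#include <cstddef>
#include <vector>

// Assumed bit order inside each room's group of 4 bits.
enum Side { Top = 0, Right = 1, Bottom = 2, Left = 3 };

struct FourBitMaze {
  int width, height;
  std::vector<bool> bits;  // 4 bits per room, all walls present initially

  FourBitMaze(int w, int h)
      : width(w), height(h), bits(static_cast<std::size_t>(w) * h * 4, true) {}

  // Row-major room index, times 4, plus the offset of the requested side.
  std::size_t index(int x, int y, Side s) const {
    return (static_cast<std::size_t>(y) * width + x) * 4 + s;
  }
  bool get(int x, int y, Side s) const { return bits[index(x, y, s)]; }
  void set(int x, int y, Side s, bool v) { bits[index(x, y, s)] = v; }
};
```

Each interior wall still appears twice in this layout, once per neighboring room.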
88+
89+
Yes. As you might have noticed, every wall is stored redundantly in two neighboring nodes. If each wall is stored only once, we go from 40 bytes (320 bits) to 2 bits per room (approximately 160x improvement). But in order to achieve that, you have to follow a strict set of rules.
90+
91+
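1. Every intersection (node) stores two bits: the even bit is its top wall (the vertical segment going up from the intersection) and the odd bit is its right wall (the horizontal segment going right from it);
2. Every dimension of the maze is increased by one unit so that the borders can be addressed properly.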
93+
94+
```text
95+
_ _ _
96+
|_|_|_|
97+
|_|_|_|
98+
```
99+
100+
This 3x2 maze will be represented by a 4x3 linearized matrix. It is easier to understand if you look at the walls as edges and the wall intersections as nodes. A 3x2 maze has 4 columns of vertical walls and 3 rows of horizontal walls, which gives 4x3 intersections. If we write 1 when a wall is present and 0 when it is not, and store only the top and right walls of each node (intersection), we get:
101+
102+
```text
103+
This fully blocked 3x2 maze
104+
_ _ _
105+
|_|_|_|
106+
|_|_|_|
107+
108+
Will give us 4x3 pairs of bits:
109+
01 01 01 00
110+
11 11 11 10
111+
11 11 11 10
112+
113+
Linearized as:
114+
010101001111111011111110
115+
```
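
The bit string above can be reproduced with a few lines of code. This is a sketch under the layout described in the rules (even bit = segment above the intersection, odd bit = segment to its right); `fully_blocked_bits` is a name invented for this example.

```c++
#include <string>
#include <vector>

// Returns the linearized bits of a fully walled maze with cols x rows rooms.
std::string fully_blocked_bits(int cols, int rows) {
  const int nodes_x = cols + 1;  // one more intersection than rooms per row
  const int nodes_y = rows + 1;  // one more intersection than rooms per column
  std::vector<bool> walls(nodes_x * nodes_y * 2, true);

  // The top row of intersections has no vertical segment above it.
  for (int x = 0; x < nodes_x; ++x) walls[x * 2] = false;
  // The last column of intersections has no horizontal segment to its right.
  for (int y = 0; y < nodes_y; ++y) walls[(y * nodes_x + nodes_x - 1) * 2 + 1] = false;

  std::string out;
  for (bool bit : walls) out += bit ? '1' : '0';
  return out;
}
```

Calling `fully_blocked_bits(3, 2)` yields `010101001111111011111110`, the same sequence shown for the 3x2 maze.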
116+
Just to recapitulate: we went from 40 bytes (320 bits) per room to approximately 2 bits per room. A 128x128 maze map goes from 128*128*320/8 = 655,360 bytes (640KB) to 129*129*2/8, rounded up to 4,161 bytes. That is about 157.5 times more densely packed. It is a huge improvement.
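
The arithmetic in this recap can be written as two small compile-time functions. This is only a sketch: `naive_bytes` and `packed_bytes` are hypothetical helpers, and the 40 bytes per room is the figure assumed earlier for a 64-bit target.

```c++
#include <cstddef>

// Naive layout: 40 bytes per room (figure assumed from the discussion above).
constexpr std::size_t naive_bytes(std::size_t cols, std::size_t rows) {
  return cols * rows * 40;
}

// Packed layout: 2 bits per intersection, rounded up to whole bytes.
constexpr std::size_t packed_bytes(std::size_t cols, std::size_t rows) {
  return ((cols + 1) * (rows + 1) * 2 + 7) / 8;
}

static_assert(naive_bytes(128, 128) == 655360, "640 KB for the naive layout");
static_assert(packed_bytes(128, 128) == 4161, "about 4 KB for the packed layout");
```

The extra row and column of intersections is why the ratio is about 157.5 rather than exactly 160.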
117+
118+
Notes about vectors:
119+
120+
1. `vector<bool>` is a specialization that is allowed to pack its elements as a bitfield; common implementations store 8 bools per byte and do the shifting and masking for us.
2. `vector<bool>` is arguably an antipattern because it does not behave like a common vector: element access returns a proxy object instead of a `bool&`, and every access pays for the bit manipulation, which goes against the C++ idea of zero-cost abstractions.
3. For our intent, this is exactly what we want, so we can use it; just check that your standard library actually implements it as a bitfield, because the standard does not require the packing.
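
The proxy behaviour mentioned in the notes can be seen in a short sketch. `flip_first_through_proxy` is a name made up for this example; the two `static_assert`s rely only on `std::vector<bool>::reference` being a class type, as the standard specifies.

```c++
#include <type_traits>
#include <vector>

// vector<bool>::reference is a proxy class, unlike the plain reference of other vectors.
static_assert(!std::is_same<std::vector<bool>::reference, bool&>::value,
              "vector<bool> does not hand out bool&");
static_assert(std::is_same<std::vector<char>::reference, char&>::value,
              "a regular vector hands out T&");

// Flips the first element through the proxy and returns the stored value.
bool flip_first_through_proxy(std::vector<bool>& bits) {
  auto proxy = bits[0];  // 'auto' deduces the proxy type, not bool
  proxy = !proxy;        // the assignment writes through to the vector
  return bits[0];
}
```

This is why `auto x = v[i];` on a `vector<bool>` does not give an independent copy, which can surprise code written for ordinary vectors.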
123+
124+
Here goes a simple implementation of a data structure to hold the maze data:
125+
126+
```c++
127+
#include <vector>

struct Maze {
private:
  // Two bits per intersection of a (width+1) x (height+1) grid:
  // even bit = vertical segment above the intersection,
  // odd bit  = horizontal segment to the right of the intersection.
  std::vector<bool> walls;
  std::vector<bool> visited;
  int width, height;
public:
  Maze(int width, int height): width(width), height(height) {
    walls.resize((width+1)*(height+1)*2, true);
    for(int x = 0; x <= width; x++)  // top row of intersections: nothing above it
      walls[x*2] = false;
    for(int y = 0; y <= height; y++) // last column of intersections: nothing to its right
      walls[(y*(width+1) + width)*2 + 1] = false;
    visited.resize(width*height, false); // no room is visited yet
  }

  bool GetVisited(int x, int y) const { return visited[y*width + x]; }
  void SetVisited(int x, int y, bool val) { visited[y*width + x] = val; }

  // Room (x, y) reads its walls from the intersections at its corners.
  bool GetNorthWall(int x, int y) const { return walls[(y*(width+1) + x)*2 + 1]; }
  bool GetSouthWall(int x, int y) const { return walls[((y+1)*(width+1) + x)*2 + 1]; }
  bool GetEastWall(int x, int y) const { return walls[((y+1)*(width+1) + x+1)*2]; }
  bool GetWestWall(int x, int y) const { return walls[((y+1)*(width+1) + x)*2]; }

  void SetNorthWall(int x, int y, bool val) { walls[(y*(width+1) + x)*2 + 1] = val; }
  void SetSouthWall(int x, int y, bool val) { walls[((y+1)*(width+1) + x)*2 + 1] = val; }
  void SetEastWall(int x, int y, bool val) { walls[((y+1)*(width+1) + x+1)*2] = val; }
  void SetWestWall(int x, int y, bool val) { walls[((y+1)*(width+1) + x)*2] = val; }
};
155+
```
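
Because each wall is a single shared bit, breaking the wall between two rooms is one write, whichever room you start from. The sketch below illustrates that with a stripped-down grid; `WallGrid`, `east_index`, `south_index` and `carve` are hypothetical names, the index formulas mirror the ones used by the getters above, and there is no bounds checking.

```c++
#include <vector>

// Minimal stand-in for the maze: same two-bits-per-intersection layout.
struct WallGrid {
  int width, height;
  std::vector<bool> bits;
  WallGrid(int w, int h) : width(w), height(h), bits((w + 1) * (h + 1) * 2, true) {}

  // Bit of the wall between room (x, y) and its east neighbour (x + 1, y).
  int east_index(int x, int y) const { return ((y + 1) * (width + 1) + x + 1) * 2; }
  // Bit of the wall between room (x, y) and its south neighbour (x, y + 1).
  int south_index(int x, int y) const { return ((y + 1) * (width + 1) + x) * 2 + 1; }
};

// Removes the wall between two orthogonally adjacent rooms.
// Returns false (and changes nothing) if the rooms are not adjacent.
bool carve(WallGrid& g, int x0, int y0, int x1, int y1) {
  if (y0 == y1 && x1 == x0 + 1) { g.bits[g.east_index(x0, y0)] = false; return true; }
  if (y0 == y1 && x0 == x1 + 1) { g.bits[g.east_index(x1, y1)] = false; return true; }
  if (x0 == x1 && y1 == y0 + 1) { g.bits[g.south_index(x0, y0)] = false; return true; }
  if (x0 == x1 && y0 == y1 + 1) { g.bits[g.south_index(x1, y1)] = false; return true; }
  return false;
}
```

A maze generator such as a depth-first backtracker would call something like `carve` once per step, with no second neighbor record to keep in sync.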
156+
157+
## Further ideas
158+
159+
1. Is it possible to exploit the structure even further?
2. Is it possible to do the same for hexagonal grids? Every node will have 3 walls instead of the 4 of the square grid.

mkdocs.yml

Lines changed: 2 additions & 1 deletion
@@ -13,6 +13,7 @@ nav:
1313
- AI:
1414
- courses/artificialintelligence/README.md
1515
- Spatial Quantization: courses/artificialintelligence/readings/spatial-quantization.md
16+
- Maze Data Structure: blog/posts/MazeDataStructure/MazeDataStructures.md
1617
- Assignments:
1718
- courses/artificialintelligence/assignments/README.md
1819
- Setup: courses/artificialintelligence/assignments/README.md
@@ -134,7 +135,7 @@ theme:
134135
# icon: material/brightness-4
135136
# name: Switch to light mode
136137
font:
137-
text: jost
138+
text: Roboto
138139
code: Roboto Mono
139140
# favicon: assets/favicon.png
140141
# icon:
