I'm working on a simple Java database built on a B+Tree structure. My goal is to make it lightweight and efficient. I will also use JNI (Java Native Interface) to link it with NVIDIA GPUDirect through CUDA/C++ for fast GPU-based data operations.
It's a bit challenging, but where's the fun in doing easy things?
Here are some key features of our B+Tree:
Internal Nodes:
- The internal nodes hold keys, offsets and pointers, but not the actual data.
- These keys guide us through the tree, while the pointers direct us to child nodes.
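As a rough sketch of how the keys can guide a lookup, the snippet below picks which child to descend into. It assumes one pointer per key, with each key acting as the lower bound of its child's range, and it compares keys as unsigned bytes. The class and method names (ChildLookupSketch, childIndex) are hypothetical and only here for illustration; the real implementation may differ.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class ChildLookupSketch {
    // Returns the index of the last key that is <= target.
    // Assumption: keys are sorted, and child i covers keys >= keys[i].
    // If target is smaller than every key, we fall back to child 0.
    static int childIndex(byte[][] keys, byte[] target) {
        int found = 0;
        for (int i = 1; i < keys.length; i++) {
            if (Arrays.compareUnsigned(keys[i], target) <= 0) {
                found = i;
            } else {
                break; // keys are sorted, so no later key can match
            }
        }
        return found;
    }

    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[][] keys = { bytes("b"), bytes("f"), bytes("m") };
        System.out.println(childIndex(keys, bytes("g"))); // child under "f"
        System.out.println(childIndex(keys, bytes("z"))); // child under "m"
    }
}
```

A linear scan keeps the sketch short; since the keys are sorted, a binary search would be a natural replacement.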
Leaf Nodes:
- The leaf nodes store the real data as key-value (KV) pairs.
Understanding the Basics:
Each node starts with a header. This 4-byte header includes 2 bytes for the node type and 2 for the number of keys.
Pointers (8 bytes each) link to child nodes, with their total size being 8 times the number of keys.
Offsets show the position of each key and are 2 bytes per key.
KeyLength and ValLength are each 2 bytes and indicate the size of keys and their associated values.
Then comes the actual key and its corresponding value.
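To make the layout concrete, here is a minimal sketch of it on top of java.nio.ByteBuffer. It is not the actual implementation: the class name NodeLayoutSketch, the helper names, the node type value 2 for a leaf, and the big-endian byte order (ByteBuffer's default) are all assumptions for illustration. Only the field sizes and their order come from the description above.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class NodeLayoutSketch {
    static final int PAGE_SIZE = 4096;
    static final int HEADER_SIZE = 4;   // 2 bytes node type + 2 bytes key count
    static final int POINTER_SIZE = 8;  // one pointer per key
    static final int OFFSET_SIZE = 2;   // one offset per key

    // The offset array starts right after the header and the pointer array.
    static int offsetsStart(int numKeys) {
        return HEADER_SIZE + POINTER_SIZE * numKeys;
    }

    // The KV area starts right after the offset array.
    static int kvStart(int numKeys) {
        return offsetsStart(numKeys) + OFFSET_SIZE * numKeys;
    }

    static void writeHeader(ByteBuffer page, int nodeType, int numKeys) {
        page.putShort(0, (short) nodeType);
        page.putShort(2, (short) numKeys);
    }

    static int readNodeType(ByteBuffer page) {
        return page.getShort(0) & 0xFFFF; // mask to read the short as unsigned
    }

    static int readNumKeys(ByteBuffer page) {
        return page.getShort(2) & 0xFFFF;
    }

    // Writes [KeyLength, ValLength, Key, Value] at pos; returns the position just past it.
    static int writeKV(ByteBuffer page, int pos, byte[] key, byte[] val) {
        page.putShort(pos, (short) key.length);
        page.putShort(pos + 2, (short) val.length);
        ByteBuffer dup = page.duplicate(); // shares content, has its own position
        dup.position(pos + 4);
        dup.put(key);
        dup.put(val);
        return dup.position();
    }

    // Reads back the key of the KV pair stored at pos.
    static byte[] readKey(ByteBuffer page, int pos) {
        int keyLen = page.getShort(pos) & 0xFFFF;
        byte[] key = new byte[keyLen];
        ByteBuffer dup = page.duplicate();
        dup.position(pos + 4);
        dup.get(key);
        return key;
    }

    public static void main(String[] args) {
        ByteBuffer page = ByteBuffer.allocate(PAGE_SIZE);
        writeHeader(page, 2, 1); // assumption: type 2 marks a leaf node
        int pos = kvStart(1);    // 4 + 8 + 2 = 14
        int end = writeKV(page, pos,
                "id".getBytes(StandardCharsets.UTF_8),
                "42".getBytes(StandardCharsets.UTF_8));
        System.out.println("keys=" + readNumKeys(page) + " kvStart=" + pos + " end=" + end);
        System.out.println(new String(readKey(page, pos), StandardCharsets.UTF_8));
    }
}
```

For a node with 3 keys, the offset array would start at byte 28 (4 + 8 * 3) and the KV area at byte 34 (28 + 2 * 3).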
Page Size:
- The standard page size for our nodes is 4096 bytes.
- You can check your system's page size using "getconf PAGESIZE" on Unix-like systems.
Key and Value Sizes:
- We've set maximum sizes for keys and values in the B+Tree:
public static final int BPLUSTREE_MAX_KEY_SIZE = 1000;
public static final int BPLUSTREE_MAX_VAL_SIZE = 3000;
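A quick sanity check on these limits: a node holding a single KV pair of maximum size should still fit in one 4096-byte page. The sketch below adds up the pieces, assuming such a node carries exactly one pointer and one offset alongside the pair. The class name PageFitCheck and its helper are hypothetical; the sizes are the ones listed earlier.

```java
class PageFitCheck {
    static final int PAGE_SIZE = 4096;
    static final int HEADER_SIZE = 4;     // node type + number of keys
    static final int POINTER_SIZE = 8;
    static final int OFFSET_SIZE = 2;
    static final int KV_LENGTHS_SIZE = 4; // KeyLength + ValLength, 2 bytes each
    static final int BPLUSTREE_MAX_KEY_SIZE = 1000;
    static final int BPLUSTREE_MAX_VAL_SIZE = 3000;

    // Size of a node that holds exactly one KV pair of maximum size.
    static int nodeSizeWithOneMaxPair() {
        return HEADER_SIZE + POINTER_SIZE + OFFSET_SIZE + KV_LENGTHS_SIZE
                + BPLUSTREE_MAX_KEY_SIZE + BPLUSTREE_MAX_VAL_SIZE;
    }

    public static void main(String[] args) {
        int size = nodeSizeWithOneMaxPair();
        System.out.println(size);              // 4018
        System.out.println(size <= PAGE_SIZE); // true
    }
}
```

That works out to 4 + 8 + 2 + 4 + 1000 + 3000 = 4018 bytes, which leaves 78 bytes to spare in a 4096-byte page.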
I've already set up the B+Tree to handle [Header, Pointer, Offset, KeyLength, ValueLength, Key, Value]. I will share more detailed articles soon. For questions, feel free to contact me at email@rishabhrahul.com.