brendan-williamson
CSC 213 – Large Scale Programming
Lecture 38: BTrees
Today’s Goal
Look at using advanced Tree structures
  Examine BTree implementation of (a,b)-Tree
  Discuss how to size a BTree
Examine how to implement these structures
  How we can write classes so trees work well
  Better ways to manipulate these file systems
What is “the BTree?”
BTree: common implementation of (a,b)-Tree
  Every BTree has an order
  Usually talk about “BTree of order m”
  Internal nodes then have ⌈m/2⌉ to m children
  Root node is exempt from the lower bound: it has m or fewer children
Many variants of BTree actually exist
  Differences between them are very minor
  Sticking to vanilla BTrees for this lecture
BTree Order
Select order to minimize paging
  Full node, including entries and references to children, fills a page with no space left over
  Each node is at least half full (roughly m/2 entries)
  So each page used is at least 50% full
How many pages are touched during an operation?
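One way to make the sizing question concrete is the rough sketch below. It assumes a node stores m child references and m - 1 entries with no other header, and that every non-root internal node has at least ⌈m/2⌉ children; the class name, method names, and byte sizes are all illustrative and not taken from the lecture.

```java
final class BTreeSizing {
    private BTreeSizing() {}

    /**
     * Largest order m for which a full node still fits in one page,
     * assuming the node holds m child references and m - 1 entries.
     */
    static int orderFor(int pageBytes, int entryBytes, int childRefBytes) {
        // Solve m * childRefBytes + (m - 1) * entryBytes <= pageBytes for m.
        return (pageBytes + entryBytes) / (entryBytes + childRefBytes);
    }

    /**
     * Worst-case number of levels (one page per level) that a search
     * touches in a tree of the given order holding n entries.
     */
    static int maxLevels(long n, int order) {
        if (order < 3) {
            throw new IllegalArgumentException("order must be at least 3");
        }
        int minChildren = (order + 1) / 2; // ceil(m/2)
        int levels = 1;
        long minEntries = 1;   // the root alone holds at least 1 entry
        long nodesOnNext = 2;  // a non-leaf root has at least 2 children
        while (true) {
            // Fewest entries the tree could hold with one more level.
            long withNext = minEntries + nodesOnNext * (minChildren - 1);
            if (withNext > n) {
                return levels;
            }
            minEntries = withNext;
            nodesOnNext *= minChildren;
            levels++;
        }
    }
}
```

With 4096-byte pages, 16-byte entries, and 8-byte child references, orderFor gives 171, and maxLevels(1_000_000, 171) gives 3, so under these assumptions a search among a million entries reads at most three pages.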
Removal from BTree
Swap entry with successor on bottom level, then remove it there
If node now has fewer than m/2 entries (underflow):
  When possible, move entry from sibling up to parent and steal one from parent
  Otherwise, merge node with sibling & steal entry from parent
  But this might propagate underflow to parent node!
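The two repair cases above can be sketched in a few lines. This is a simplified in-memory illustration, not the lecture's implementation: the Node class, fixUnderflow, and merge are hypothetical names, entries are plain integers, and the caller is assumed to pass a parent with at least two children.

```java
import java.util.ArrayList;
import java.util.List;

final class UnderflowSketch {
    private UnderflowSketch() {}

    /** Hypothetical node: sorted entries plus children (empty on the bottom level). */
    static final class Node {
        final List<Integer> entries = new ArrayList<>();
        final List<Node> children = new ArrayList<>();
    }

    /** Repairs parent.children.get(i), which has dropped below minEntries. */
    static void fixUnderflow(Node parent, int i, int minEntries) {
        Node node = parent.children.get(i);
        Node left = i > 0 ? parent.children.get(i - 1) : null;
        Node right = i + 1 < parent.children.size() ? parent.children.get(i + 1) : null;

        if (left != null && left.entries.size() > minEntries) {
            // Transfer: separator moves down, left sibling's largest entry moves up.
            node.entries.add(0, parent.entries.get(i - 1));
            parent.entries.set(i - 1, left.entries.remove(left.entries.size() - 1));
            if (!left.children.isEmpty()) {
                node.children.add(0, left.children.remove(left.children.size() - 1));
            }
        } else if (right != null && right.entries.size() > minEntries) {
            // Transfer: separator moves down, right sibling's smallest entry moves up.
            node.entries.add(parent.entries.get(i));
            parent.entries.set(i, right.entries.remove(0));
            if (!right.children.isEmpty()) {
                node.children.add(right.children.remove(0));
            }
        } else if (left != null) {
            merge(parent, i - 1); // no sibling can spare an entry
        } else {
            merge(parent, i);
        }
    }

    /** Fuses children i and i + 1, pulling separator i down from the parent. */
    static void merge(Node parent, int i) {
        Node keep = parent.children.get(i);
        Node gone = parent.children.remove(i + 1);
        keep.entries.add(parent.entries.remove(i));
        keep.entries.addAll(gone.entries);
        keep.children.addAll(gone.children);
    }
}
```

After a merge the parent has one entry fewer, so the caller would check the parent for underflow next; that repeated check is the propagation mentioned on the slide.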
Where to Find BTrees
Databases are a very common place to find them
  Contain far more data than fits in machine’s RAM
  Perform lots of data accesses, insertions
  Need simple, efficient organization
Databases also store data permanently
  Do not want to ever lose information
  RAM contents lost when powered off
  But files stored on hard drive are slow (s-l-o-w)
Database Implementation
Maintain BTree in memory…
  … but keep copy of records on disk
  Each Entry has unique ID & its location in file
Entry changes written to disk immediately
  So file is always kept up-to-date
  In case of program crash, just re-read file
Ignore virtual memory & instead use file
  Records in file stored in random order
  Order of Entrys may change as program runs
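A minimal sketch of this arrangement follows, under several assumptions: a TreeMap stands in for the in-memory BTree, each record is an int ID followed by a string, and changed records are simply appended. RecordStore and its methods are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.TreeMap;

final class RecordStore implements AutoCloseable {
    private final RandomAccessFile file;
    // Stand-in for the in-memory BTree: record ID -> byte position in the file.
    private final TreeMap<Integer, Long> index = new TreeMap<>();

    RecordStore(String path) throws IOException {
        file = new RandomAccessFile(path, "rw");
        // After a restart or crash, rebuild the index by re-reading the file.
        long pos = 0;
        while (pos < file.length()) {
            file.seek(pos);
            int id = file.readInt();
            file.readUTF();              // skip over the payload
            index.put(id, pos);          // a later copy of an ID replaces an earlier one
            pos = file.getFilePointer();
        }
    }

    /** Writes the record to the file right away, then updates the index. */
    void put(int id, String payload) throws IOException {
        long pos = file.length();
        file.seek(pos);
        file.writeInt(id);
        file.writeUTF(payload);
        index.put(id, pos);
    }

    /** Looks up the record's position in memory, then reads it from the file. */
    String get(int id) throws IOException {
        Long pos = index.get(id);
        if (pos == null) {
            return null;
        }
        file.seek(pos);
        file.readInt();                  // skip the stored ID
        return file.readUTF();
    }

    @Override
    public void close() throws IOException {
        file.close();
    }
}
```

Records land in the file in arrival order, not key order, which is why each index entry has to carry the record's position.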
Better Ways To Access Data
BTrees do not read & write file sequentially
  Instead they must jump around to different locations in file
  Also need way to specify each of the Entrys that exist within file
Java’s solution: RandomAccessFile
RandomAccessFile
Create new files or work with existing ones
  RandomAccessFile raf = new RandomAccessFile("file.txt", "rw");
Opens file.txt, creating it if it does not already exist (existing contents are kept)
Throws IOException (specifically FileNotFoundException) when problem arises
Allows program to read & write to the file
Use raf to access/modify the file
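A small sketch of opening a file safely: try-with-resources closes the file even if an exception is thrown. OpenDemo and openAndMeasure are illustrative names only. Note that "rw" mode leaves an existing file's contents in place, so the reported length is whatever was already there.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

final class OpenDemo {
    private OpenDemo() {}

    /** Opens path for reading and writing and reports its current length in bytes. */
    static long openAndMeasure(String path) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "rw")) {
            return raf.length(); // 0 for a newly created file
        }
    }
}
```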
Reading RandomAccessFile
Read from RandomAccessFile instance using:
  boolean readBoolean(), int readInt(), double readDouble()…
    Reads and returns the appropriate value
  int read(byte[] b)
    Reads up to b.length bytes & stores them in b
    Returns number of bytes read, or -1 at end of file
Writing to RandomAccessFile
Write to RandomAccessFile instance using:
  void writeInt(int i), void writeDouble(double d)…
    Writes the value to the next location in the file
    Extends the file when at the end of the file
    Otherwise overwrites whatever data had been there
  void write(byte[] b)
    Writes contents of array b to the file
    Overwrites/extends file as it is needed
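The extend-versus-overwrite behavior can be seen in a short round trip. This is only a sketch: RoundTrip and demo are made-up names, and the file is reopened between steps so that each step starts at position 0.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

final class RoundTrip {
    private RoundTrip() {}

    /** Writes an int and a double, overwrites the int, then reads both back. */
    static String demo(String path) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "rw")) {
            raf.setLength(0);       // start the demo from an empty file
            raf.writeInt(42);       // bytes 0-3: extends the file
            raf.writeDouble(2.5);   // bytes 4-11: extends the file
        }
        try (RandomAccessFile raf = new RandomAccessFile(path, "rw")) {
            raf.writeInt(7);        // a newly opened file starts at byte 0: overwrites
        }
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            int i = raf.readInt();
            double d = raf.readDouble();
            return i + " " + d + " " + raf.length();
        }
    }
}
```

Calling demo returns "7 2.5 12": the int was replaced in place, the double is untouched, and the file is still 12 bytes long.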
Typical File I/O
Ordinarily we read and write files sequentially
  RandomAccessFile raf = new …;
  char c = '\0';
  while (c != 's') {
    c = raf.readChar();
  }
raf: This is an example file we access
Typical File I/O
Each read or write continues from where the previous one stopped
  RandomAccessFile raf = new …;
  char c = '\0';
  while (c != 's') {
    c = raf.readChar();
    raf.writeChar(c);
  }
raf before the loop:  This is an example file we access
raf after 1st pass:   TTis is an example file we access
raf after 2nd pass:   TTii is an example file we access
raf after 3rd pass:   TTii  s an example file we access
raf after 4th pass:   TTii  ssan example file we access
Loop stops after the 4th pass, since the char just read was 's'
(Positions shown in chars; readChar() and writeChar() each use 2 bytes of the file)
Skipping Around The File
Can position RandomAccessFile to read from/write to anywhere in file
  void seek(long pos) moves to position in file
  Positions specified as bytes from beginning of file
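Because positions are byte offsets, fixed-size values make the arithmetic simple: the k-th int starts at byte k * 4. The sketch below writes a few ints and then jumps straight to one of them; SeekDemo and readKth are illustrative names.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

final class SeekDemo {
    private SeekDemo() {}

    /** Writes the ints 0, 10, 20, … and then reads back only the one at index k. */
    static int readKth(String path, int count, int k) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "rw")) {
            raf.setLength(0);                    // start the demo from an empty file
            for (int i = 0; i < count; i++) {
                raf.writeInt(i * 10);
            }
            raf.seek((long) k * Integer.BYTES);  // byte offset of the k-th int
            return raf.readInt();
        }
    }
}
```

For example, readKth(path, 5, 3) returns 30 without reading the three ints in front of it.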
RandomAccessFile I/O
Ordinarily we read and write files sequentially; seek lets us jump to any position instead
  RandomAccessFile raf = new …;
  char c;
  raf.seek(raf.length() - 2);   // readChar() reads 2 bytes, so start 2 bytes before the end
  c = raf.readChar();
  raf.seek(0);
  raf.writeChar(c);
raf before:  This is an example file we access
raf after:   shis is an example file we access
How do we use this?
Use positions to simplify everything
  Entry contains position of record within file
Simplify building nodes from start of program
  Record new nodes at end of file
  Store nodes’ size & number of Entrys at file start
  Node records ID & position of each of its children
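One possible file layout matching these points is sketched below. Everything specific here is an assumption made for illustration: the 8-byte header, the fixed node size, the (int ID, long position) pair per child, and the NodeFile class with its method names.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

final class NodeFile implements AutoCloseable {
    private static final int CHILD_BYTES = Integer.BYTES + Long.BYTES;
    private final RandomAccessFile file;
    private final int nodeBytes;

    /** Starts a fresh file whose header holds the node size and the Entry count. */
    NodeFile(String path, int nodeBytes) throws IOException {
        this.nodeBytes = nodeBytes;
        file = new RandomAccessFile(path, "rw");
        file.setLength(0);
        file.writeInt(nodeBytes);  // header bytes 0-3: size of every node
        file.writeInt(0);          // header bytes 4-7: number of Entrys
    }

    /** Updates the Entry count stored at the start of the file. */
    void setEntryCount(int count) throws IOException {
        file.seek(Integer.BYTES);
        file.writeInt(count);
    }

    /** Records a new node at the end of the file and returns its position. */
    long appendNode(int[] childIds, long[] childPositions) throws IOException {
        if (childIds.length != childPositions.length
                || Integer.BYTES + childIds.length * CHILD_BYTES > nodeBytes) {
            throw new IllegalArgumentException("children do not fit in one node");
        }
        long pos = file.length();
        file.seek(pos);
        file.writeInt(childIds.length);
        for (int i = 0; i < childIds.length; i++) {
            file.writeInt(childIds[i]);        // child's ID
            file.writeLong(childPositions[i]); // child's position in the file
        }
        file.setLength(pos + nodeBytes);       // pad so all nodes have equal size
        return pos;
    }

    /** Reads the stored file position of one child of the node at nodePos. */
    long childPosition(long nodePos, int childIndex) throws IOException {
        file.seek(nodePos + Integer.BYTES + (long) childIndex * CHILD_BYTES + Integer.BYTES);
        return file.readLong();
    }

    @Override
    public void close() throws IOException {
        file.close();
    }
}
```

Following a child reference is then a single seek to the stored position, with no searching through the file.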
For Next Lecture
Review end of graphs, (a,b)-Tree, & BTree
  Come with any questions you still have
  Last of these problem days for the year…