<h1 id="chapter-3-storage-and-retrieval">Chapter 3 Storage and Retrieval<a aria-hidden="true" class="anchor-heading icon-link" href="#chapter-3-storage-and-retrieval"></a></h1>
<ul>
<li>
Hash indexes
</li>
<li>
Append-only logs are good because appending and segment merging are sequential writes, which are generally much faster than random writes.
</li>
<li>
Concurrency and crash recovery are much simpler if the files are append-only.
</li>
<li>
If we only ever append, we risk running out of storage.
</li>
<li>
One strategy is to divide the log into segments: close a segment when it reaches a certain size, then perform compaction on the closed segments.
</li>
<li>
Compaction means throwing away duplicate keys in the log, keeping only the most recent value for each key.
</li>
<li>
Sorted String Tables (SSTables) and Log-Structured Merge-Trees (LSM-trees)
</li>
<li>
Keep key-value files in storage, with each file sorted by key.
</li>
<li>
Keep a sparse index in memory that maps some of the keys to their offsets in the file.
</li>
<li>
On retrieval, if the key is in the in-memory index, read its value directly from the indexed offset.
</li>
<li>
If it is not in the index, find the closest preceding key that is, then scan the file from that key's offset.
</li>
<li>
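Recent writes can be buffered in an in-memory balanced binary search tree (red-black or AVL), often called a memtable, so that keys can be inserted in any order but read back in sorted order when they are written out as a sorted file.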
</li>
<li>
B-Trees
</li>
<li>
If an index stores the row value for the key inside the index (as opposed to storing a reference to the value), it's called a clustered index.
</li>
<li>
If the index stores only some columns of the row inside the index, it's a covering index.
</li>
</ul>
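
<p>A minimal sketch of the hash index notes above, assuming everything lives in memory rather than in files on disk. The class name <code>SegmentedLog</code>, the <code>max_segment_records</code> limit, and the method names are all made up for illustration. A real store would keep byte offsets into segment files and would usually compact in a background thread.</p>

```python
class SegmentedLog:
    """Toy append-only key-value log with a hash index (illustrative only)."""

    def __init__(self, max_segment_records=4):
        self.max_segment_records = max_segment_records  # hypothetical size limit
        self.segments = [[]]  # each segment is a list of (key, value) records
        self.index = {}       # hash index: key -> (segment number, position)

    def put(self, key, value):
        # Close the active segment once it is full and start a new one.
        if len(self.segments[-1]) >= self.max_segment_records:
            self.segments.append([])
        segment = self.segments[-1]
        segment.append((key, value))  # writes only ever append
        self.index[key] = (len(self.segments) - 1, len(segment) - 1)

    def get(self, key):
        seg_no, pos = self.index[key]
        return self.segments[seg_no][pos][1]

    def compact(self):
        """Merge closed segments, keeping only the latest value per key."""
        closed, active = self.segments[:-1], self.segments[-1]
        latest = {}
        for segment in closed:          # oldest segment first
            for key, value in segment:  # later records overwrite earlier ones
                latest[key] = value
        self.segments = [list(latest.items()), active]
        # Rebuild the index; newer segments and positions win.
        self.index = {}
        for seg_no, segment in enumerate(self.segments):
            for pos, (key, _) in enumerate(segment):
                self.index[key] = (seg_no, pos)

    def record_count(self):
        return sum(len(segment) for segment in self.segments)


log = SegmentedLog(max_segment_records=2)
for key, value in [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("a", 5)]:
    log.put(key, value)

records_before = log.record_count()
log.compact()
records_after = log.record_count()
```

<p>In this run the stale record for <code>a</code> in the closed segments is dropped by compaction, while reads through the index still return the latest value for every key.</p>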
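
<p>The SSTable lookup described in the notes can be sketched roughly as follows. This assumes the sorted file is just a Python list and that list positions stand in for byte offsets. The names <code>sstable</code>, <code>SPARSE_EVERY</code>, and <code>lookup</code> are invented for the example.</p>

```python
import bisect

# A sorted segment: records ordered by key (stand-in for a file on disk).
sstable = [
    ("apple", 1), ("banana", 2), ("cherry", 3),
    ("date", 4), ("fig", 5), ("grape", 6),
]

# Sparse index: only every third key is kept in memory, with its position.
SPARSE_EVERY = 3  # hypothetical sampling interval
sparse_offsets = list(range(0, len(sstable), SPARSE_EVERY))
sparse_keys = [sstable[offset][0] for offset in sparse_offsets]


def lookup(key):
    """Return the value for key, or None if the key is absent."""
    # Find the last indexed key that is <= the key we are looking for.
    i = bisect.bisect_right(sparse_keys, key) - 1
    if i < 0:
        return None  # key sorts before the first record in the file
    # Scan forward from that offset; sorted order lets us stop early.
    for candidate, value in sstable[sparse_offsets[i]:]:
        if candidate == key:
            return value
        if candidate > key:
            break
    return None
```

<p>Here <code>lookup("date")</code> hits an indexed key directly, while <code>lookup("cherry")</code> starts at <code>apple</code> and scans forward. Because the file is sorted, a missing key such as <code>"coconut"</code> is detected as soon as the scan passes where it would have been.</p>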

Hi! I'm Param! This is a place for my personal notes.

My website is https://param.codes, and I write more readable
stuff on my substack: https://newsletter.param.codes.

I've found that taking notes helps me remember things, and it's nice
to look back on information that you processed years ago. I jot down random things, there's no real structure, but some of these
notes could eventually become blog posts.

I also keep notes on books I read [[here|books]].

This is built using [[Dendron|engineering.dendron]] and hosted using
[GitHub Pages](https://github.com/paramsingh/notes).

Get in touch via [Twitter](https://twitter.com/iliekcomputers) or email me at `me [at] param [dot] codes`!