[FEAT] Write Ahead Log #5

Open
3 tasks
AlecDivito opened this issue Jul 25, 2023 · 0 comments

Overview

Track all changes to the user's data in an append-only log. When a write happens, the data is written in two places:

  1. An in-memory data structure called the memtable, which is flushed to an SST file later
  2. A write-ahead log (WAL) on disk

This issue tracks writing to the WAL and memtable on every write.
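
A minimal sketch of that write path, under stated assumptions: the `Store` type, its `put`/`get` methods, and the record layout are hypothetical and not taken from this project. It uses synchronous `std::io` and any `Write` sink in place of the async file handle the real WAL would likely use.

```rust
use std::collections::BTreeMap;
use std::io::{self, Write};

/// Hypothetical store pairing an in-memory memtable with an append-only log.
pub struct Store<W: Write> {
    memtable: BTreeMap<Vec<u8>, Vec<u8>>,
    wal: W,
}

impl<W: Write> Store<W> {
    pub fn new(wal: W) -> Self {
        Store { memtable: BTreeMap::new(), wal }
    }

    /// Append the record to the WAL first, then apply it to the memtable.
    pub fn put(&mut self, key: &[u8], value: &[u8]) -> io::Result<()> {
        // Assumed record layout: key_len (u32 LE) | value_len (u32 LE) | key | value
        self.wal.write_all(&(key.len() as u32).to_le_bytes())?;
        self.wal.write_all(&(value.len() as u32).to_le_bytes())?;
        self.wal.write_all(key)?;
        self.wal.write_all(value)?;
        // With a real file, durability would also need an fsync (e.g. File::sync_data).
        self.wal.flush()?;
        // Only after the log append succeeds does the write become visible.
        self.memtable.insert(key.to_vec(), value.to_vec());
        Ok(())
    }

    pub fn get(&self, key: &[u8]) -> Option<&[u8]> {
        self.memtable.get(key).map(|v| v.as_slice())
    }

    /// Expose the underlying log sink (handy for inspection in tests).
    pub fn wal(&self) -> &W {
        &self.wal
    }
}
```

Writing to the log before touching the memtable means a failed log append never leaves a value in memory that could not be recovered later.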

Purpose of WAL

If a failure occurs, the WAL can be used to recover the data that was in the memtable.

A single WAL captures write logs for all users in the database.
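
Recovery could look roughly like the sketch below: replay the log from the start and rebuild the memtable, with later records overwriting earlier ones. The `replay` function and the record layout (`key_len` and `value_len` as little-endian `u32`, followed by the key and value bytes) are assumptions made for illustration. A production WAL would normally also carry a checksum per record, which is omitted here.

```rust
use std::collections::BTreeMap;
use std::io::{self, ErrorKind, Read};

/// Rebuild a memtable by replaying every complete record in the log.
/// A truncated record at the tail (e.g. from a crash mid-write) ends the replay.
pub fn replay<R: Read>(mut wal: R) -> io::Result<BTreeMap<Vec<u8>, Vec<u8>>> {
    let mut memtable = BTreeMap::new();
    loop {
        // Assumed header: key_len (u32 LE) followed by value_len (u32 LE).
        let mut header = [0u8; 8];
        match wal.read_exact(&mut header) {
            Ok(()) => {}
            // Clean end of log, or a torn header: stop replaying.
            Err(e) if e.kind() == ErrorKind::UnexpectedEof => break,
            Err(e) => return Err(e),
        }
        let key_len =
            u32::from_le_bytes([header[0], header[1], header[2], header[3]]) as usize;
        let value_len =
            u32::from_le_bytes([header[4], header[5], header[6], header[7]]) as usize;

        let mut key = vec![0u8; key_len];
        let mut value = vec![0u8; value_len];
        let body = wal
            .read_exact(&mut key)
            .and_then(|_| wal.read_exact(&mut value));
        match body {
            // Later records for the same key overwrite earlier ones.
            Ok(()) => {
                memtable.insert(key, value);
            }
            // Torn body: the last write never completed, so drop it.
            Err(e) if e.kind() == ErrorKind::UnexpectedEof => break,
            Err(e) => return Err(e),
        }
    }
    Ok(memtable)
}
```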

Goals

Configuration

WALs can be configured in 2 ways:

  1. Maximum number of writes allowed to be stored
  2. Maximum size of the WAL

If either limit is reached, we need to rotate the WAL and flush the memtable to an SST file.
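
One possible shape for these two limits is sketched below. `WalConfig`, `WalStats`, and `should_rotate` are hypothetical names, and treating `None` as "no limit" is an assumption rather than something this issue specifies.

```rust
/// Hypothetical rotation limits; `None` disables that limit.
#[derive(Debug, Clone, Copy)]
pub struct WalConfig {
    /// Maximum number of writes stored before rotating.
    pub max_writes: Option<u64>,
    /// Maximum size of the WAL in bytes before rotating.
    pub max_bytes: Option<u64>,
}

/// Running counters for the currently active WAL.
#[derive(Debug, Default)]
pub struct WalStats {
    pub writes: u64,
    pub bytes: u64,
}

impl WalStats {
    /// Account for one appended record of `record_len` bytes.
    pub fn record(&mut self, record_len: u64) {
        self.writes += 1;
        self.bytes += record_len;
    }

    /// True once either configured limit has been reached.
    pub fn should_rotate(&self, config: &WalConfig) -> bool {
        let writes_hit = config.max_writes.map_or(false, |max| self.writes >= max);
        let bytes_hit = config.max_bytes.map_or(false, |max| self.bytes >= max);
        writes_hit || bytes_hit
    }
}
```

The write path would call `record` after each append and check `should_rotate` before accepting the next write.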

WAL rotation

WAL rotation happens when one of the limits configured above is reached, or when a Schema decides it is no longer in use and closes its connection. On close, the Schema flushes all its changes to disk. This causes the current WAL to be archived and a new one to be created.

let wal = Wal::new();
wal.flush().await?; // write memtable and create a new WAL

WAL archive process

All Schemas write to the same WAL. However, it's possible for a

TODO

  1. Require users to provide a Schema when creating a database (record the Schema in the Manifest files)
  2. Each Schema allows users to create tables on it (record the Tables in the Manifest files)
  3. When a Table is written to, push that write and the updates it triggers to the WAL at once (collect all writes before sending them to the WAL) (read more below)
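
Item 3 could be approached with a write batch, sketched here under stated assumptions: `WriteBatch`, `append_batch`, and the batch layout (an operation count followed by length-prefixed key/value records) are hypothetical and not part of this project. The sketch uses synchronous `std::io` for brevity.

```rust
use std::io::{self, Write};

/// Hypothetical batch that collects a table write plus the updates it triggers.
#[derive(Default)]
pub struct WriteBatch {
    ops: Vec<(Vec<u8>, Vec<u8>)>,
}

impl WriteBatch {
    /// Queue one key/value write; nothing reaches the WAL yet.
    pub fn put(&mut self, key: &[u8], value: &[u8]) {
        self.ops.push((key.to_vec(), value.to_vec()));
    }

    pub fn len(&self) -> usize {
        self.ops.len()
    }

    /// Assumed layout: op_count (u32 LE), then for each op
    /// key_len (u32 LE) | value_len (u32 LE) | key | value.
    pub fn encode(&self) -> Vec<u8> {
        let mut buf = Vec::new();
        buf.extend_from_slice(&(self.ops.len() as u32).to_le_bytes());
        for (key, value) in &self.ops {
            buf.extend_from_slice(&(key.len() as u32).to_le_bytes());
            buf.extend_from_slice(&(value.len() as u32).to_le_bytes());
            buf.extend_from_slice(key);
            buf.extend_from_slice(value);
        }
        buf
    }
}

/// Send the whole batch to the WAL in a single append and return its size in bytes.
pub fn append_batch<W: Write>(wal: &mut W, batch: &WriteBatch) -> io::Result<usize> {
    let bytes = batch.encode();
    wal.write_all(&bytes)?;
    wal.flush()?;
    Ok(bytes.len())
}
```

Encoding the batch into one buffer before appending keeps related writes adjacent in the log, so recovery can treat them as a unit.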

WAL rotation will be handled in another ticket. First focus on just writing to the WAL.

Notes on Number 3:

  • Should we collect all changes (which would mean updating data structures) and then commit them to the database? This would freeze all of those data structures.
  • Or should we do the initial write and then lazily update all the listeners at some point in the future?
    One thing is for sure: we need to track table row versions, and on recovery we need to restore the generated tables if they do not have the correct values.
@AlecDivito AlecDivito self-assigned this Jul 25, 2023
@AlecDivito AlecDivito added this to the MVP milestone Jul 26, 2023
@AlecDivito AlecDivito added the enhancement New feature or request label Jul 26, 2023