
SequenceId seems unnecessary #43

Open
scottcarey opened this issue Nov 21, 2019 · 0 comments
Labels
enhancement (New feature or request) · question (Need more discussion)

Comments

@scottcarey

I may be wrong, but I believe that SequenceId is unnecessary. It costs 8 bytes per entry in memory, and >=16 bytes per entry on disk (one 8-byte copy for each record in the tombstone, index, and data files).

My reasoning:

  • One reason SequenceId is required is so that during initialization, tombstones (deletes) and index updates (put/replace) do not process out of order. Tombstones are processed last, but must not apply to keys that were updated 'after' them.
  • If tombstones were instead written into the index file in order, as they occur, then rebuilding the index in the order it was written would resolve the above without SequenceId.
  • The other reason SequenceId is required is to support concurrent initialization by multiple threads. For simple 'puts', the fileId is enough to resolve which update should win. But if tombstones are interleaved with index records as I suggest above, then concurrent threads doing puts and deletes on the same key from different files will race: if the thread reading file 2 applies a delete on key X, and the thread reading file 1 then applies a put, it would not know that file 2 has already removed the key.
  • I see two solutions to the above, assuming tombstones are sequenced in the index/data file in the order they happened relative to the 'puts':
    1. When a delete happens, leave the fileId in the in-memory map with the key, marked as deleted (possible with closed addressing, but not with the current data structures).
    2. Initialize the database in order, from oldest file to newest file, and split the data by hash so that one thread per segment can apply the updates. This would be limited by how fast a single thread can compute the hashes of the keys in the tombstone/index file. One optimization: compaction could merge and split files so that each output file covers a disjoint key-hash range. For example, if compacting files 1, 2, 3, and 4 together yielded four files "4.a, 4.b, 4.c, 4.d", each representing a distinct hash range (perhaps split by the top two bits of the hash into 4 buckets), then initialization could read all four in parallel, since their updates would be disjoint and apply to different Segments.
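To make the ordered-replay idea concrete, here is a minimal sketch (hypothetical; the `Entry` record and `rebuild` method are invented for illustration and are not HaloDB's actual API or file format). Because tombstones sit in the index file in the order they occurred relative to the puts, replaying records in file order, oldest file to newest, with "last record for a key wins" reproduces the correct final state without any per-entry sequence number:

```java
import java.util.*;

public class OrderedReplaySketch {
    // A single index-file record: a put maps the key to a fileId,
    // a tombstone deletes the key. (Invented layout for this sketch.)
    public record Entry(String key, int fileId, boolean tombstone) {}

    // Rebuild the in-memory index by replaying files oldest-to-newest.
    // Write order alone disambiguates puts vs. deletes, so no sequenceId
    // is stored or compared.
    public static Map<String, Integer> rebuild(List<List<Entry>> filesOldestFirst) {
        Map<String, Integer> index = new HashMap<>();
        for (List<Entry> file : filesOldestFirst) {
            for (Entry e : file) {
                if (e.tombstone()) index.remove(e.key());
                else index.put(e.key(), e.fileId());
            }
        }
        return index;
    }

    public static void main(String[] args) {
        // file 1: put X; file 2: delete X, then put Y.
        List<Entry> f1 = List.of(new Entry("X", 1, false));
        List<Entry> f2 = List.of(new Entry("X", 2, true), new Entry("Y", 2, false));
        System.out.println(rebuild(List.of(f1, f2))); // prints {Y=2}
    }
}
```

The race described above only appears once multiple threads replay different files concurrently; the disjoint hash-range split in solution 2 avoids it by ensuring no two threads ever touch the same key.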

My biggest concern with such a change is that it is a major overhaul of the file format and in-memory layout, and would not easily share code with the older implementation. Making the code support the old and new formats simultaneously could easily be most of the effort.

@bellofreedom added the enhancement (New feature or request) and question (Need more discussion) labels, and added and then removed the HaloDB-1.0 (Feature for HaloDB 1.0) label on Dec 26, 2019