Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature/support large keys (#42) #49

Open
wants to merge 11 commits into
base: master
Choose a base branch
from

Conversation

scottcarey
Copy link

@scottcarey scottcarey commented Dec 5, 2019

This change allows for keys to be up to 2047 bytes long for both non-pooled and memory-pool implementations. Performance is moderately improved in the process, and the RAM required for the memory pool is reduced by 2+ bytes per key.

Each individual commit should build and test successfully, this was developed incrementally and tested at each step.
It may be easier to follow by first perusing the commits individually, then looking at the combined diff after that.

Read the full individual commit messages for more detailed information. A few highlights:

In the file format, 'version' is now a number between 0 and 31. 'key size' is a number between 0 and 2047.
For the in-memory data structures, the 5 bytes that used to hold key size and value size are now split with 11 bits for the key (size between 0 and 2047) and 29 bits for the value size (max value size 512MB).
MemoryPoolSegments are now addressed by their slot rather than their offset, which allows one more byte per entry to be removed, leading to 4-byte MemoryPoolAddress values, saving a byte per slot in the master table and a byte per slot in the pool.
The non-pooled implementation supports large keys in a simple way, by allocating a larger entry.
The memory pool implementation uses multiple slots (chained together) to hold larger key data. If keys are equal to or smaller than the fixedKeyLength, these function as before. If keys are larger, slots are used to hold the 'overflowing' portion of the key.
Lastly, the memory pool implementation was refactored a bit to encapsulate the chunk offsets inside of a new MemoryPoolChunk.Slot class, which makes the code a lot easier to read and removes a lot of scope for error as the raw offsets are not passed around between chunks and addresses loosely.

I confirm that this contribution is made under the terms of the license found in the root directory of this repository's source tree and that I have the authority necessary to make this contribution on behalf of its copyright owner.

Scott Carey added 2 commits November 20, 2019 13:00
  Modify index, record, and tombstone file serialization so that the
  prior version byte and key size byte now represent a 5 bit version
  and an 11 bit key length.

  Because we are at version zero,
  old code can still read data written by new code as long as
  key sizes are not larger than 127 bytes.  If so, old code
  will interpret large key sizes as an incompatible version number.

  Data written by old code is fully compatible with new code because
  a version 0 byte still represents 0 after this change, and also does
  not impact the key.

  Prior:
  version byte -> version number, 0 to 255
  key size byte -> key size, 0 to 127

  After this commit:
  version byte -> top 5 bits are the version number, 0 to 31
                  bottom 3 bits are the three MSB of key size
  key size byte -> lower 8 bits of key size

  -------------------
  |  vbyte | ksbyte |   old version and key size -- 8 bits each
  -------------------
  | ver | key size  |   new version and key size -- 5 / 11 bit split
  -------------------
@scottcarey
Copy link
Author

scottcarey commented Dec 5, 2019

The two stack traces in the failed build make no sense at all. A byte array's length field is being passed in to another byte array initializer, which gives a negative array size exception. A byte array can't have a negative size... I ran the tests locally a dozen times without issue. I'll kick off travis again.

@scottcarey
Copy link
Author

scottcarey commented Dec 5, 2019

I take that back, I can see that these tests have a ~ 4% or so chance of failing due to the test set-up.

Scott Carey added 9 commits December 5, 2019 16:08
  Move key length into InMemoryIndexMetaData.
  Use 5 bytes to represent the key size and vlaue size.
  Use 11 bits for the key size, and 29 for the value size,
  limiting keys to 2047 bytes and values to 512MB.

  Introduce HashEntry and HashEntrySerializer
  to abstract over key access inside the hash entry, and
  disambiguate user values from the in memory hash data.
  Record is the public user visible class that contains minimal data.
  RecordEntry represents an entry in the data file.
  RecordIterated is needed internally to transmit sequenceNumber
  through for testing.
  Key/Value sizes are serialized in a common way
  shared across HashEntry implementations.  5 bytes
  are split into 11 bits for the key size and 29 bits
  for the value size.
  Location information -- such as the file and offset
  of a record, is handled separately, as Segment
  implementations can choose where to serialize
  these bits of data relative to the key and each
  other.
  Change the valid chunk range from 0-127 to 1-255
  and treat the byte used for the chunk as unsigned.
  A chunk value of zero indicates an empty
  MemoryPoolAddress.  This nearly doubles the
  effective maximum index size and efficiently uses
  all of the bits in the chunk index byte.
  Place the reference to the next slot first
  Followed by the key/value lengths
  Then the fixedKeyLength reserved key space
  Then the location data (fileId, fileOffset, sequenceId)
  Keys longer than fixedKeyLength 'overflow' into the next
  slot in the chain.  An entry for a key that is equal to
  or smaller than fixedKeyLength remains the same as before,
  fitting into a single slot.  If the key is longer,
  'overflow' slots are allocated that contain only the 'next'
  pointer and additional key data.

  Additionally, the tests related to the memory pool variant
  have been significantly enhanced to improve coverage.
  MemoryPoolAddress now contains a byte for the chunk
  and an int for the slot, instead of containing the
  offset of the slot.  This will allow MemoryPoolAddress
  to become smaller in a later commit.
…oryPool

  MemoryPoolChunk.Slot is introduced in order to encapsulate
  access to a slot.  This leads to cleaner code in
  SegmentWithMemoryPool as it no longer needs to manage
  slotOffset information and weave that between the chunk
  and MemoryPoolAddress.  The result is significantly
  safer as well, the slot offset is hidden inside the Slot
  and can not accidentally cause out of bounds access.

  This reduces the number of places we have to validate the
  offset in a typical operation which may improve performance
  slightly as well, and if not at least decreases the exception
  paths to test.
  MemoryPoolAddress is changed from 5 bytes to 4.
  One byte remains for the chunk index,
  but only three bytes are kept for the slot.
  Three bytes is up to 16.77 million slots
  in a single chunk.    For a chunk configured for
  8 byte fixedKeySize, this would imply 550MB + per chunk,
  with 255 total chunks allowed, for over 130GB of
  data in a single segment.  As segments can scale themselves,
  this shrink does not introduce any real impediment to
  the maximum size of a db.

  However, it does save at least 2 bytes per entry.
  One for the table slot, and one for each memory pool slot.
@wangtao724
Copy link
Contributor

If no dependency between the commits, I would suggest split to multiple PRs with good test coverage and have the changes in step by step.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants