-
Notifications
You must be signed in to change notification settings - Fork 150
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RCORE-2094 Compressing Integer Arrays #7668
base: next-major
Are you sure you want to change the base?
Conversation
…rectory_current_core
…rectory_current_core
…/new-layout-experiment
…ut_local_directory
…ut_local_directory
…ut_local_directory
4004852
to
2b9969d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some initial comments
src/realm/array.cpp
Outdated
} | ||
} | ||
start += sizeof(int64_t) * 8 / no0(w) * chunks; | ||
if (!is_compressed() && m_integer_compressor.get_encoding() == NodeHeader::Encoding::WTypBits) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the first condition is redundant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep, It is too defensive.. You are right…
src/realm/integer_compressor.cpp
Outdated
for (size_t i = 0; i < sz; ++i) { | ||
auto item = arr.get(i); | ||
values.push_back(item); | ||
REALM_ASSERT_DEBUG(values.back() == item); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps an assertion too much. Are you in doubt that push_back works?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep too conservative, originally I was in doubt whether our integer compression algo worked :-) , but there is also an assertion (only in debug mode) that checks the whole array after it gets committed, so I can remove this.
src/realm/integer_compressor.cpp
Outdated
REALM_ASSERT_DEBUG(values[indices[i]] == arr.get(i)); | ||
} | ||
|
||
#if REALM_DEBUG |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this not tested above?
src/realm/integer_compressor.cpp
Outdated
auto last = std::unique(values.begin(), values.end()); | ||
values.erase(last, values.end()); | ||
|
||
for (size_t i = 0; i < arr.size(); ++i) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use sz
@@ -13,6 +13,7 @@ set(REALM_SOURCES | |||
array_blobs_big.cpp | |||
array_decimal128.cpp | |||
array_fixed_bytes.cpp | |||
array_aggregate_optimizations.cpp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The files should be added to Package.swift file too.
src/realm/array.cpp
Outdated
@@ -190,38 +189,108 @@ using namespace realm::util; | |||
|
|||
void QueryStateBase::dyncast() {} | |||
|
|||
size_t Array::bit_width(int64_t v) | |||
Array::Array(Allocator& allocator) noexcept |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not inlined anymore?
const auto mask = c.v_mask(); | ||
BfIterator ndx_iterator{data, offset, ndx_w, ndx_w, start}; | ||
BfIterator data_iterator{data, 0, v_w, v_w, static_cast<size_t>(*ndx_iterator)}; | ||
while (start < end) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should use the same approach as in the parallel find, where we first search the values and then search for the relevant indices afterwards.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mmm, interesting.. I am not sure we are going to become faster or slower.. I'll try..
src/realm/array.hpp
Outdated
@@ -176,19 +176,21 @@ class Array : public Node, public ArrayParent { | |||
template <size_t w> | |||
void set(size_t ndx, int64_t value); | |||
|
|||
int64_t get(size_t ndx) const noexcept; | |||
inline int64_t get(size_t ndx) const noexcept; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sene to add inline
to a function definition? It happens a lot in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, because the getter is defined below, it is a left over, I'll delete it ..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am missing more basic testing of the compressed arrays, Group_ArrayCompression_Correctness
is hardly enough.
What, How & Why?
Main Logic for compressing integers. it includes: