Custom pointer type support ? #87
Comments
This is a restriction I inherited from the original Abseil source code. Can you tell me what kind of pointer type you would like to use? |
Boost Interprocess offset pointer. Please observe the following: Some implementations choose the offset 0 (that is, an offset_ptr pointing to itself) as the null pointer representation, but this is not valid for many use cases, since structures like linked lists or nodes from STL containers often point to themselves (the end node in an empty container, for example) and the 0 offset value is needed. An alternative is to store, in addition to the offset, a boolean to indicate whether the pointer is null. However, this increases the size of the pointer and hurts performance. Consequently, offset_ptr defines offset 1 as the null pointer, meaning that this class can't point to the byte after its own this pointer: |
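The representation described above can be illustrated with a minimal sketch. `toy_offset_ptr` and `node` are hypothetical names introduced here; this is not Boost's `offset_ptr` implementation, only a toy that stores a byte offset relative to its own address and reserves offset 1 for null, so that a self-referencing pointer (offset 0) remains a valid non-null value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical toy type for illustration only; not Boost's offset_ptr.
template <class T>
class toy_offset_ptr {
public:
    toy_offset_ptr() noexcept : offset_(null_offset) {}
    toy_offset_ptr(T* target) noexcept : offset_(encode(target)) {}

    // A copy lives at a different address, so the offset must be recomputed.
    toy_offset_ptr(const toy_offset_ptr& other) noexcept
        : offset_(encode(other.get())) {}
    toy_offset_ptr& operator=(const toy_offset_ptr& other) noexcept {
        offset_ = encode(other.get());
        return *this;
    }

    T* get() const noexcept {
        if (offset_ == null_offset) return nullptr;
        return reinterpret_cast<T*>(reinterpret_cast<std::uintptr_t>(this) +
                                    static_cast<std::uintptr_t>(offset_));
    }
    T& operator*() const noexcept {
        assert(offset_ != null_offset);
        return *get();
    }
    T* operator->() const noexcept { return get(); }
    explicit operator bool() const noexcept { return offset_ != null_offset; }

private:
    // Offset 1 is reserved for null, so the byte directly after `this`
    // cannot be addressed. Offset 0 (pointing to itself) stays usable.
    static constexpr std::ptrdiff_t null_offset = 1;

    std::ptrdiff_t encode(const T* target) const noexcept {
        if (target == nullptr) return null_offset;
        return static_cast<std::ptrdiff_t>(
            reinterpret_cast<std::uintptr_t>(target) -
            reinterpret_cast<std::uintptr_t>(this));
    }

    std::ptrdiff_t offset_;
};

// A list node whose link is its first member: linking the node to itself
// is the "end node of an empty container" case from the text.
struct node {
    toy_offset_ptr<node> next;
    int value = 0;
};

bool self_link_is_distinct_from_null() {
    node n;
    n.next = &n;                 // self-reference: valid and non-null
    toy_offset_ptr<node> empty;  // default state: null (offset 1)
    return n.next.get() == &n && static_cast<bool>(n.next) &&
           !empty && empty.get() == nullptr;
}
```

Because only a relative offset is stored, a structure built from such pointers stays consistent when the whole memory region is mapped at a different base address, which is the property needed for a memory-mapped file.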
Thanks! Maybe it is doable to support this. I'm wondering how I could test it. Would you have a small example with a boost unordered_map/set using a custom pointer type allocator? |
Yes, I can make a small example. |
This is the easy case: But we actually need scoped allocation with piecewise construction as well; I can add an example of this later. |
Great, thanks @kleunen, let me look at that first! |
This is with a scoped allocation example: Vector in map using a scoped allocator |
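For readers unfamiliar with the idea, a vector stored in a map through a scoped allocator can be sketched with the standard library alone. This is not the example posted in the thread: `counting_allocator` is a hypothetical stand-in for a Boost.Interprocess allocator, and `std::unordered_map` stands in for the Boost container. It only shows, under those assumptions, how `std::scoped_allocator_adaptor` hands the map's allocator to each inner vector, including with piecewise construction.

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <scoped_allocator>
#include <tuple>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stateful allocator: counts allocations through a shared counter.
template <class T>
struct counting_allocator {
    using value_type = T;
    std::size_t* counter;

    explicit counting_allocator(std::size_t* c) : counter(c) {}
    template <class U>
    counting_allocator(const counting_allocator<U>& other) : counter(other.counter) {}

    T* allocate(std::size_t n) {
        ++*counter;
        return std::allocator<T>().allocate(n);
    }
    void deallocate(T* p, std::size_t n) { std::allocator<T>().deallocate(p, n); }
};

template <class T, class U>
bool operator==(const counting_allocator<T>& a, const counting_allocator<U>& b) {
    return a.counter == b.counter;
}
template <class T, class U>
bool operator!=(const counting_allocator<T>& a, const counting_allocator<U>& b) {
    return !(a == b);
}

using inner_vec = std::vector<int, counting_allocator<int>>;
using entry     = std::pair<const int, inner_vec>;
using map_alloc = std::scoped_allocator_adaptor<counting_allocator<entry>>;
using map_t     = std::unordered_map<int, inner_vec, std::hash<int>,
                                     std::equal_to<int>, map_alloc>;

bool inner_vectors_share_map_allocator() {
    std::size_t allocations = 0;
    counting_allocator<entry> base(&allocations);
    map_alloc alloc(base);
    map_t m(alloc);

    // operator[] constructs the inner vector with the map's allocator,
    // even though inner_vec has no default-constructible allocator.
    inner_vec& a = m[1];
    a.push_back(42);

    // Piecewise construction: key from (2), vector from (3, 7) plus the
    // allocator appended by the scoped adaptor, giving three elements of 7.
    auto it = m.emplace(std::piecewise_construct,
                        std::forward_as_tuple(2),
                        std::forward_as_tuple(3, 7)).first;

    return a.get_allocator().counter == &allocations &&
           it->second.get_allocator().counter == &allocations &&
           it->second.size() == 3 && allocations > 0;
}
```

With a shared-memory or memory-mapped allocator in place of `counting_allocator`, the same mechanism keeps the nested vectors inside the same segment as the map itself.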
Thanks. Please give me a couple days to look into it. |
I cleaned up the example a little bit more: |
Hello @greg7mdp, do you think this is possible? Or are there any show-stopping issues? |
What was the error you got? |
I have actually worked around the issue now by implementing a custom allocator that keeps the mmap region open even after resizing. This way, the pointers don't need to be offset pointers, because the old region also stays mapped. I do still think it would be useful if parallel-hashmap supported custom pointers. I think you should consistently use Allocator::pointer and not convert to raw pointers. |
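The suggestion to keep `Allocator::pointer` rather than a raw pointer can be sketched as follows. `slot_array` is a hypothetical helper, not part of parallel-hashmap; it assumes only the standard `std::allocator_traits` interface. The array handle is stored as the allocator's own pointer type, and a raw pointer is derived only for the duration of a single element access.

```cpp
#include <cstddef>
#include <memory>

// Hypothetical helper for illustration; not parallel-hashmap code.
template <class T, class Alloc = std::allocator<T>>
class slot_array {
    using traits = std::allocator_traits<Alloc>;

public:
    // For std::allocator this is T*; for a Boost.Interprocess allocator it
    // would be an offset pointer type.
    using pointer = typename traits::pointer;

    explicit slot_array(std::size_t n, const Alloc& a = Alloc())
        : alloc_(a), data_(traits::allocate(alloc_, n)), size_(n) {
        // Sketch only: no cleanup if a constructor throws part-way through.
        for (std::size_t i = 0; i < size_; ++i)
            traits::construct(alloc_, raw(i));
    }
    ~slot_array() {
        for (std::size_t i = 0; i < size_; ++i)
            traits::destroy(alloc_, raw(i));
        traits::deallocate(alloc_, data_, size_);
    }
    slot_array(const slot_array&) = delete;
    slot_array& operator=(const slot_array&) = delete;

    T& operator[](std::size_t i) { return *raw(i); }
    std::size_t size() const { return size_; }

private:
    // Pointer arithmetic happens on the allocator's pointer type; the raw
    // address is taken only at the last moment. C++20 code could use
    // std::to_address here instead.
    T* raw(std::size_t i) const { return std::addressof(*(data_ + i)); }

    Alloc alloc_;
    pointer data_;   // stored handle is never a raw T* unless the allocator says so
    std::size_t size_;
};

int sum_of_indices(std::size_t n) {
    slot_array<int> slots(n);
    for (std::size_t i = 0; i < slots.size(); ++i)
        slots[i] = static_cast<int>(i);
    int sum = 0;
    for (std::size_t i = 0; i < slots.size(); ++i)
        sum += slots[i];
    return sum;
}
```

The design point is that nothing persistent holds a raw address, so the stored state remains meaningful if the backing region is remapped elsewhere.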
How can I do that since I allocate the hashmap array itself using the allocator and need to access the entries into this array? |
You don't actually have to manage the array yourself; you can use std::vector to allocate the array of pairs for you. You can make the actual storage container a template argument:

    #include <memory>
    #include <utility>
    #include <vector>

    using entry_t = std::pair<int, int>;
    using entry_allocator_t = std::allocator<entry_t>;

    // Storing entries
    template<class T, class A>
    using bucket_storage_t = std::vector<T, A>;

    int main()
    {
        bucket_storage_t<entry_t, entry_allocator_t> entries(10);
        for(int i = 0; i < 10; ++i)
            entries[i] = std::make_pair(i, i);
    }

In the case of the parallel hash map, you also need a storage type and allocator for the vector of vectors:

    #include <memory>
    #include <utility>
    #include <vector>

    // Allocate entries
    using entry_t = std::pair<int, int>;
    using entry_allocator_t = std::allocator<entry_t>;
    using bucket_list_allocator_t = std::allocator< std::vector<entry_t> >;

    // Storing entries
    template<class T, class A>
    using bucket_storage_t = std::vector<T, A>;

    // Storing list of buckets
    template<class T, class A>
    using bucket_list_storage_t = std::vector< bucket_storage_t<T, entry_allocator_t>, A>;

    int main()
    {
        bucket_list_storage_t<entry_t, bucket_list_allocator_t> bucket_list(10);
        // Resize each bucket, not the list of buckets itself
        for(int i = 0; i < 10; ++i)
            bucket_list[i].resize(100);
    }

But possibly you can also use a deque for this? |
It seems custom pointer types are not supported with the parallel hashmap:
https://github.com/greg7mdp/parallel-hashmap/blob/master/parallel_hashmap/phmap.h#L843-L846
Would this be difficult to add? I would like to use the parallel hashmap in a project that processes OpenStreetMap data. This involves loading several billion pieces of geometric data. We do this by loading the data into a boost unordered_map backed by a Boost Interprocess mmap file. This way, the data is hashed but backed by a file on the filesystem. This is required because converting the planet involves processing 60 GB of compressed data.
The loading process is currently completely single-threaded, but ideally we would like to load the node/way/relation stores from the planet OSM file in parallel. Parallel hashmap provides an interesting approach for this, but this approach would require using the Boost Interprocess offset pointer.
https://github.com/systemed/tilemaker
Is there a particular reason why custom pointer types are not allowed?