This repository was archived by the owner on Jul 7, 2020. It is now read-only.
It would work, but as a drawback you would have to query every chunk in the contains method, so query complexity increases by a factor of N (the number of chunks). The add method also becomes more involved.
Actually, I have a "solution" (I'm not a Java native, so excuse me in advance). There is a fork with a lot of copy-pasted code: its own BitSet addressed by long, plus a BloomFilter hierarchy. https://github.com/ajtkulov/stream-lib
It works, but I'm not sure it is idiomatic Java or good enough for a PR.
Why would complexity increase by N chunks? You should partition your data by hash: index_of_bloom = long_hash % number_of_chunks. Time complexity stays the same, and memory complexity is num_chunks * bloomfilter_memory.
The only theoretical issue is that (if you build the Bloom filter from dynamic data) a malicious actor could generate items that all hash into the same chunk. But that applies to all hash-based structures, and it's a theoretical concern rather than a practical one.
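The partitioning idea above can be sketched as follows. This is a minimal illustration, not code from the library: the class name, the SplitMix64-style mixing function, and the parameters are all assumptions made for the example. Each chunk is a small java.util.BitSet that stays within the int-index limit, and items are routed to a chunk by long_hash % number_of_chunks, so add and contains each touch exactly one chunk.

```java
import java.util.BitSet;

// Sketch (hypothetical, not the library's API): a Bloom filter partitioned
// into chunks, routing each item to one chunk by its 64-bit hash.
public class PartitionedBloomFilter {
    private final BitSet[] chunks;   // each chunk is one small filter
    private final int bitsPerChunk;
    private final int numHashes;

    public PartitionedBloomFilter(int numChunks, int bitsPerChunk, int numHashes) {
        this.chunks = new BitSet[numChunks];
        for (int i = 0; i < numChunks; i++) {
            chunks[i] = new BitSet(bitsPerChunk);
        }
        this.bitsPerChunk = bitsPerChunk;
        this.numHashes = numHashes;
    }

    // 64-bit mixer (SplitMix64 finalizer), used both for routing to a chunk
    // and for deriving the k bit positions inside that chunk.
    private static long mix(long x) {
        x += 0x9E3779B97F4A7C15L;
        x = (x ^ (x >>> 30)) * 0xBF58476D1CE4E5B9L;
        x = (x ^ (x >>> 27)) * 0x94D049BB133111EBL;
        return x ^ (x >>> 31);
    }

    public void add(String item) {
        long h = mix(item.hashCode());
        // index_of_bloom = long_hash % number_of_chunks (non-negative modulus)
        int chunk = (int) Math.floorMod(h, (long) chunks.length);
        for (int i = 0; i < numHashes; i++) {
            int bit = (int) Math.floorMod(mix(h + i), (long) bitsPerChunk);
            chunks[chunk].set(bit);
        }
    }

    public boolean mightContain(String item) {
        long h = mix(item.hashCode());
        int chunk = (int) Math.floorMod(h, (long) chunks.length);
        for (int i = 0; i < numHashes; i++) {
            int bit = (int) Math.floorMod(mix(h + i), (long) bitsPerChunk);
            if (!chunks[chunk].get(bit)) {
                return false;
            }
        }
        return true;
    }
}
```

Since the chunk index and the bit positions are all derived from the same hash, both operations stay O(k) regardless of the number of chunks; only total memory grows with num_chunks.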
In my live use case, we need a BloomFilter with a much larger capacity (about 4-5 GB of RAM, >32 billion bits).
The code depends on java.util.BitSet, whose constructor
public BitSet(int nbits)
takes an int, and whose getters/setters are all indexed by int, so it cannot address more than Integer.MAX_VALUE bits.
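To illustrate what lifting the int-index limit looks like, here is a minimal long-addressable bit set. This is a sketch of the general idea, not the fork's actual code: the class name and layout are assumptions. It is backed by a long[] word array, so a single instance can address up to about 2^37 bits (words.length * 64), well past BitSet's 2^31 limit; going further would require paging across several arrays.

```java
// Sketch (assumed design): a bit set indexed by long, backed by 64-bit words.
public class LongBitSet {
    private final long[] words;

    public LongBitSet(long nbits) {
        // one 64-bit word per 64 bits, rounded up
        words = new long[(int) ((nbits + 63) >>> 6)];
    }

    public void set(long index) {
        // index >>> 6 selects the word; index & 63 selects the bit within it
        words[(int) (index >>> 6)] |= 1L << (index & 63);
    }

    public boolean get(long index) {
        return (words[(int) (index >>> 6)] & (1L << (index & 63))) != 0;
    }
}
```

A Bloom filter built on top of this (or on the partitioned approach above) can then compute bit positions as long values directly, without the % Integer.MAX_VALUE workarounds the int-based BitSet forces.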