GitHub - tenzir/libfilter: High-speed Bloom filters and taffy filters for C, C++, and Java

This repository provides implementations of blocked Bloom filters and taffy filters for C, C++, and Java. Blocked filters use slightly more space than traditional Bloom filters but are much faster, while taffy filters can scale to any size, even if the final size is not known in advance.

Example usage in C:

#include <filter/block.h>

unsigned ndv = 1000000;
double fpp = 0.0065;
uint64_t hash = 0xfeedbadbee52b055;

libfilter_block filter;

unsigned bytes = libfilter_block_bytes_needed(ndv, fpp);
libfilter_block_init(bytes, &filter);
libfilter_block_add_hash(hash, &filter);
assert(libfilter_block_find_hash(hash, &filter));
libfilter_block_destruct(&filter);

in C++:

#include <filter/block.hpp>

unsigned ndv = 1000000;
double fpp = 0.0065;
uint64_t hash = 0xfeedbadbee52b055;

auto filter = filter::BlockFilter::CreateWithNdvFpp(ndv, fpp);
filter.InsertHash(hash);
assert(filter.FindHash(hash));

in Java

import com.github.jbapple.libfilter.BlockFilter;

int ndv = 1000000;
double fpp = 0.0065;
long hash = (((long) 0xfeedbadb) << 32) | (long) 0xee52b055;

BlockFilter = BlockFilter.CreateWithNdvFpp(ndv, fpp);
filter.AddHash64(hash);
assert filter.FindHash64(hash)

To install:

make
# probably needs a sudo:
make install
make java-world
mvn -f ./java/ install -Dmaven.test.skip=true
[optional: copy java/target/libfilter-*.jar to your classpath]

Block filters are most space-efficient at around 11.5 bits per distinct value, which produces a false positive probability of 0.65%. A traditional Bloom filter would only need 10.5 bits per distinct value to achieve that same false positive probability, while a cuckoo filter would need 10.7 bits per distinct value.

Name		Name	Last commit message	Last commit date
Latest commit History 88 Commits
.github/workflows		.github/workflows
c		c
cpp		cpp
doc		doc
java		java
.gitignore		.gitignore
.gitmodules		.gitmodules
BENCHMARKING.md		BENCHMARKING.md
LICENSE.txt		LICENSE.txt
Makefile		Makefile
README.md		README.md
ROADMAP.md		ROADMAP.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

About

Releases

Packages

Languages

License

tenzir/libfilter

Folders and files

Latest commit

History

Repository files navigation

About

Resources

License

Code of conduct

Security policy

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages