The paper at https://www.cs.cmu.edu/~binfan/papers/login_cuckoofilter.pdf mentions that a cuckoo filter has a maximum number of insertions for the same element:
“Inserting the same item kb+1 times will cause the insertion to fail. This is similar to counting Bloom filters where duplicate insertion causes counter overflow.” Do you know what kb+1 is in RedisBloom's cuckoo filter implementation? My understanding is that k is the number of hash functions and b is the number of slots in a single bucket — is that right?
Hello @zhujianf2000,
Great question.
For a plain cuckoo filter, the same element can be inserted at most 2 * b times, where b is the number of slots per bucket: the duplicate fingerprints fill both of the element's candidate buckets, and the next insertion blocks the filter. It can be blocked even earlier if the element collides with other elements.
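To see why duplicates hit this limit, here is a minimal Python sketch of a plain cuckoo filter (not RedisBloom's actual C code; `NUM_BUCKETS`, `BUCKET_SIZE`, and the hash choices are illustrative). With k = 2 candidate buckets of b = 4 slots each, inserting the same item 2b + 1 = 9 times fails on the last attempt — eviction cannot help, because every evicted copy lands back in the other full candidate bucket:

```python
import hashlib
import random

BUCKET_SIZE = 4    # b: slots per bucket
NUM_BUCKETS = 16   # power of two, so the XOR partial-key trick works
MAX_KICKS = 50     # give up after this many evictions

buckets = [[] for _ in range(NUM_BUCKETS)]

def _h(x):
    # Illustrative hash; RedisBloom uses its own hashing scheme.
    return int.from_bytes(hashlib.sha256(x.encode()).digest()[:8], "big")

def fingerprint(item):
    return (_h("fp:" + item) % 255) + 1  # 1..255, never zero

def alt_index(i, fp):
    # Odd offset guarantees the two candidate buckets differ,
    # and XOR makes the mapping an involution: alt(alt(i)) == i.
    off = (_h(str(fp)) % NUM_BUCKETS) | 1
    return (i ^ off) % NUM_BUCKETS

def insert(item):
    fp = fingerprint(item)
    i1 = _h(item) % NUM_BUCKETS
    i2 = alt_index(i1, fp)
    for i in (i1, i2):
        if len(buckets[i]) < BUCKET_SIZE:
            buckets[i].append(fp)
            return True
    # Both candidate buckets full: evict entries for up to MAX_KICKS rounds.
    i = random.choice((i1, i2))
    for _ in range(MAX_KICKS):
        victim = random.randrange(BUCKET_SIZE)
        fp, buckets[i][victim] = buckets[i][victim], fp
        i = alt_index(i, fp)
        if len(buckets[i]) < BUCKET_SIZE:
            buckets[i].append(fp)
            return True
    return False  # a plain (non-scalable) filter declares itself full

# 2b insertions of the same item succeed; the (2b + 1)-th fails.
results = [insert("same-item") for _ in range(2 * BUCKET_SIZE + 1)]
print(results)
```

The eviction loop is what distinguishes this failure from an ordinary full filter: with duplicates, every kicked-out copy has the same fingerprint, so it just ping-pongs between the two full candidate buckets until `MAX_KICKS` is exhausted.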
Please note that RedisBloom's cuckoo filter implementation is scalable: instead of blocking, the filter adds another sub-filter, so you can insert the same item any number of times until you run out of memory.
Does your use case demand this kind of behavior?
I tried `cf.add 1 1` and `cf.delete 1 1` multiple times and, surprisingly, found that it works normally in RedisBloom.
Then I checked the code (file cuckoo.c): in the function CuckooFilter_InsertFP, if it fails to insert an element, it adds one more filter to its filter list (CuckooFilter_Grow(filter), line 239) and then inserts the element into the new filter.
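That grow-on-failure flow can be sketched in Python as follows. This is a simplified illustration of the idea, not RedisBloom's C implementation: the class names are made up, the sub-filter is a toy plain cuckoo filter, and for brevity the wrapper only tries the newest sub-filter on insert:

```python
import hashlib
import random

def _h(x):
    # Illustrative hash; RedisBloom uses its own hashing scheme.
    return int.from_bytes(hashlib.sha256(x.encode()).digest()[:8], "big")

class CuckooFilter:
    """A toy plain cuckoo filter: 2 candidate buckets per item."""
    def __init__(self, num_buckets=16, bucket_size=4, max_kicks=50):
        self.n, self.b, self.max_kicks = num_buckets, bucket_size, max_kicks
        self.buckets = [[] for _ in range(num_buckets)]

    def _fp(self, item):
        return (_h("fp:" + item) % 255) + 1  # 1..255, never zero

    def _alt(self, i, fp):
        off = (_h(str(fp)) % self.n) | 1  # odd offset: buckets always differ
        return (i ^ off) % self.n

    def insert(self, item):
        fp = self._fp(item)
        i1 = _h(item) % self.n
        i2 = self._alt(i1, fp)
        for i in (i1, i2):
            if len(self.buckets[i]) < self.b:
                self.buckets[i].append(fp)
                return True
        # Both candidate buckets full: evict for up to max_kicks rounds.
        i = random.choice((i1, i2))
        for _ in range(self.max_kicks):
            v = random.randrange(self.b)
            fp, self.buckets[i][v] = self.buckets[i][v], fp
            i = self._alt(i, fp)
            if len(self.buckets[i]) < self.b:
                self.buckets[i].append(fp)
                return True
        return False  # this sub-filter is full

class ScalableCuckooFilter:
    """On insert failure, grow a new sub-filter instead of failing —
    the same idea as CuckooFilter_Grow in cuckoo.c."""
    def __init__(self):
        self.filters = [CuckooFilter()]

    def insert(self, item):
        if self.filters[-1].insert(item):
            return True
        self.filters.append(CuckooFilter())  # grow instead of blocking
        return self.filters[-1].insert(item)

scf = ScalableCuckooFilter()
for _ in range(100):
    assert scf.insert("same-item")  # never fails; the filter just grows
print(len(scf.filters))
```

Each toy sub-filter can hold 2b = 8 copies of the same item, so 100 duplicate inserts all succeed and the filter list grows — which matches the behavior you observed with repeated `cf.add`.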