Hi everybody,
Please give me an example of how to use the Count-Min Sketch datatype from RedisBloom
Thanks
Hi everybody,
Please give me an example of how to use the Count-Min Sketch datatype from RedisBloom
Thanks
Hello Pablo,
Let’s assume you want to measure the number of visitors that were directed to your website via a third party but you have millions of websites that might actually do that, you could end with many counters, each one of these also keeps the address of the origin website. That can be very expensive.
If on the other hand, you need only websites that provide significant traffic, maybe 1% of your traffic, CMS will be perfect.
CMS.INITBYPROB in RedisBloom takes two parameters.
After 1 million websites were added you can query for the count of an individual website. Any website with a count of up to 10,000 (1%) is just noise since the count is within the error rate you set. Beyond that, you can assume the count represents the real frequency and is overestimated by up to 1%.
the 0.1% probability means in this case, that 1 out of a thousand websites that are not heavy-hitter will get a count higher than our error rate (1%). As mentioned before, this is expected from probabilistic data structures.
I hope this helps,
Ariel