No problem. If there are any mistakes or a segment isn't clear, let me know.
Thanks for the write-up, Lucas. It was very intuitive and I learnt a lot.
I noticed that in the 'Counting Bloom Filters' section you used 5000 buckets to store the frequencies of 7000 non-unique words. How is that better than using 7000 buckets with a uniformly distributed hash function, which would maintain the frequencies perfectly? In a real-world implementation we would use an order of magnitude fewer buckets than words, since saving memory is the whole point.
Yeah, I should have given more thought to that number. I've updated the example to use N=300. Thanks.
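For anyone following along, here is a minimal Python sketch of the kind of counting filter being discussed, sized at N=300 buckets as in the updated example. The class name, the salted SHA-256 hashing, and the min-of-counters frequency estimate are illustrative assumptions, not necessarily how the post implements it.

```python
import hashlib

class CountingBloomFilter:
    """A counting Bloom filter: m small counters shared by all words,
    with k hash functions per word. Estimates are upper bounds on the
    true frequency, because unrelated words can collide on a counter."""

    def __init__(self, num_buckets=300, num_hashes=3):
        self.m = num_buckets
        self.k = num_hashes
        self.counters = [0] * num_buckets

    def _indexes(self, word):
        # Derive k bucket indexes by salting the word with the hash index.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{word}".encode()).hexdigest()
            yield int(digest, 16) % self.m

    def add(self, word):
        for idx in self._indexes(word):
            self.counters[idx] += 1

    def estimate(self, word):
        # The smallest of the word's k counters is the tightest upper
        # bound available on how many times the word was added.
        return min(self.counters[idx] for idx in self._indexes(word))


cbf = CountingBloomFilter(num_buckets=300, num_hashes=3)
for w in ["cat", "dog", "cat", "fish", "cat"]:
    cbf.add(w)
print(cbf.estimate("cat"))   # at least 3; exactly 3 unless collisions inflate it
```

The point of the exchange above is that this only beats an exact dictionary of counts when the number of buckets is much smaller than the number of distinct words; with 300 counters for thousands of words you trade some over-counting for a large memory saving.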
Just skimmed through it and it seems pretty interesting. I'll read it in more depth later.