Preferences

Alias tables are neat and not super well known. We used to have an interview question around sampling from a weighted distribution (typical answer: prefix sum -> binary search) and I don’t think anyone produced this. I like the explanation in that blog. The way it was explained to me was first ‘imagine drawing a bar chart and throwing a dart at it, retrying if you miss. This simulates the distribution but runs in expected linear time’. Then you can describe how to chop up the bars to fit in the rectangle you would get if all weights were equal. Proof that the greedy algorithm works is reasonably straightforward.

I'm not actually sure this makes for a good interview question. Doesn't it mostly just test whether you've heard of the alias method?

Btw, a slightly related question:

Supposed you have a really long text file, how would you randomly sample a line? Such that all lines in the text file have the exactly same probability. Ideally, you want to do this without spending O(size of file) time preprocessing.

(I don't think this is a good interview question, but it is an interesting question.)

One way: sample random characters until you randomly hit a newline. That's the newline at the end of your line.

It’s a retired question (so I’m not really disagreeing that it wasn’t very good), and no one was expected to get the alias tables (if they did, just ask for updateable weights) and in fact there isn’t even much point in telling people about them as they can then get the impression they failed the interview. The point is more to get some kind of binary search and understanding of probability.

The Monte Carlo method you propose probably works for files where there are many short lines but totally fails in the degenerate case of one very long line. It also may not really work that well in practice because most of the cost of reading a random byte is reading a big block from disk, and you could likely scan such a block in ram faster than you could do the random read of the block from disk.

This item has no comments currently.

Keyboard Shortcuts

Story Lists

j
Next story
k
Previous story
Shift+j
Last story
Shift+k
First story
o Enter
Go to story URL
c
Go to comments
u
Go to author

Navigation

Shift+t
Go to top stories
Shift+n
Go to new stories
Shift+b
Go to best stories
Shift+a
Go to Ask HN
Shift+s
Go to Show HN

Miscellaneous

?
Show this modal