1) You can make only one sequential pass through the file.
2) You may not store any additional information like a table of offsets.
I would like to receive the answers from you all. After receiving answers from you i will post the answer.
PS: Seems this question also asked in Microsoft Inverview.
--------------------------------------------------------------------
Solution: ( Taken from a book, if anyone know best solution, please post comment )
The basic technique is to pick survivors, and recalculate as you go along. This is so com-putationally inefficient that it's easy to overlook. You open the file and save the first string. At this point you have one candidate string with a 100% probability of picking it. While remembering the first string, you read the next string. You now have two candidates, each with a 50% probability. Pick one of them,save it, and discard the other. Read the next string, and choose between that and the saved string,weighting your choice 33% toward the new string and 67% toward the survivor (which represents the winner of the previous two-way selection). Save the new survivor.
Keep on doing this throughout the entire file, at each step reading string N, and choosing between that(with probability 1/N) and the previous survivor, with probability (N - 1)/N. When you reach the endof the file, the current survivor is the string picked at random!
No comments:
Post a Comment