Posted by: atri | November 17, 2009

## Hint on Q2 in HW9

In the extra questions session today I mentioned something to Anthony and Matt, which I thought should share with all of you.

The binary search algorithm can be phrased as a divide and conquer algorithm: i.e. express the algorithm as a recursive algorithm and then try to come up with a  recurrence for its time complexity. Then analyze the recurrence to get the familiar $O(\log n)$ running time,

## Responses

1. There is no special property of the databases as far as I can see that would enable me to do any less than query half the total data from the databases to find the middle number.

• Hi John,

I claim that the way you can access the databases it is as good as the database given you as a sorted array. Let me elaborate on my claim. What does it mean when I say you are given a sorted (in ascending order) array $A$? Well, it means that if you access $A[k]$, you will get the $k$th smallest value in the array. Now let’s see what the problem says: “In a single query, you can specify a value $k$ to one of the databases, and the chosen database will return the $k$th smallest value that it contains.” See the similarity?

There is a difference: in the sorted array you need constant time to access the $k$th smallest number, where in this problem, you need one query to do so. Now in an “actual implementation,” each query might take more than constant time to execute but that is not your concern. The problem specifically asks you to minimize the number of queries. As I point out in the homework, though running time is very important it is not the only thing that we care about while designing an algorithm and this (as well as Q3 in the homework) is a good example.

Let me know if you have any further questions on this.