What is the minimum budget to index all pages if you assume


Assignment

You are in charge of the Genghis ('We execute fast') search engine. You are designing your server cluster to handle 500 million hits a day and 10 billion pages of indexed data. Each machine costs $1000, and can store 10 million pages and respond to 200 queries per second (against these pages).

1. If you were given a budget of $500,000 for purchasing machines, and were required to index all 10 billion pages, could you do it?

2. What is the minimum budget to index all pages? If you assume that each query can be answered by looking at data in just one (10 million page) partition, and that queries are distributed across partitions, what peak load (in number of queries per second) can such a cluster handle?

3. How would your answer to the previous question change if each query, on average, accessed two partitions?

4. What is the minimum budget required to handle the desired load of 500 million hits per day if all queries are on single partitions? Assume that queries are uniformly distributed with respect to time of day.

5. How would your answer to the previous question change if the number of queries per day went up to 5 billion hits per day? How would it change if the number of pages went up to 100 billion?

6. Assume that each query accesses just one partition, that queries are uniformly distributed across partitions, but that at any given time the peak load on a partition is up to 10 times the average load. What is the minimum budget for purchasing machines in this scenario?

7. Take the cost for machines from the previous question and multiply it by 10 to reflect the costs of maintenance, administration, network bandwidth, etc. This amount is your annual cost of operation. Assume that you charge advertisers 2 cents per page. What fraction of your inventory (i.e., the total number of pages that you serve over the course of a year) do you have to sell in order to make a profit?
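
The arithmetic behind these questions can be sketched in a few lines of Python. This is one possible way to set up the calculation, not an official solution. The function names are hypothetical, and the sketch assumes that a partition is served by one machine, that extra load is absorbed by adding whole replicas of a partition, that a day has 86,400 seconds, and that a year has 365 days. It also reads "2 cents per page" as 2 cents per page served.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400; assumes load is spread evenly over the day


def machines_for_storage(total_pages, pages_per_machine):
    """Number of machines (partitions) needed just to hold the index."""
    # Integer ceiling division: round up to a whole machine.
    return -(-total_pages // pages_per_machine)


def cluster_peak_qps(num_machines, qps_per_machine, partitions_per_query=1):
    """Queries per second the cluster can absorb when load is spread evenly.

    Each query consumes one lookup on every partition it touches, so the
    total lookup capacity is divided by the partitions touched per query.
    """
    return num_machines * qps_per_machine / partitions_per_query


def machines_for_load(total_pages, hits_per_day, pages_per_machine,
                      qps_per_machine, partitions_per_query=1, peak_factor=1):
    """Machines needed to both store the index and serve the query load.

    Assumption: a partition that receives more lookups than one machine
    can serve is copied onto additional machines (replicas).
    """
    partitions = machines_for_storage(total_pages, pages_per_machine)
    avg_qps = hits_per_day / SECONDS_PER_DAY
    # Peak lookups per second arriving at a single partition.
    per_partition_qps = avg_qps * partitions_per_query / partitions * peak_factor
    replicas = 1
    while replicas * qps_per_machine < per_partition_qps:
        replicas += 1
    return partitions * replicas


def break_even_fraction(annual_cost, hits_per_day, price_per_page):
    """Fraction of yearly page views that must be sold to cover annual_cost."""
    yearly_revenue_if_all_sold = hits_per_day * 365 * price_per_page
    return annual_cost / yearly_revenue_if_all_sold


if __name__ == "__main__":
    pages, per_machine, qps, cost = 10_000_000_000, 10_000_000, 200, 1000
    n = machines_for_storage(pages, per_machine)
    print("machines to index all pages:", n, "budget: $", n * cost)
    print("peak qps, one partition per query:", cluster_peak_qps(n, qps))
    print("peak qps, two partitions per query:", cluster_peak_qps(n, qps, 2))
    m = machines_for_load(pages, 500_000_000, per_machine, qps, peak_factor=10)
    print("machines with 10x peak load:", m)
    print("break-even fraction:", break_even_fraction(m * cost * 10, 500_000_000, 0.02))
```

Changing the arguments (for example, 5 billion hits per day or 100 billion pages) shows whether storage or query load is the binding constraint in each scenario.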
