Design map reduce algorithms to take a very large file of


Q1. Prove or disprove each of the following relational algebra identities. These means that if they are true of all relations R and S of the same arity, then give a formal proof, otherwise provide a counterexample of two relations R and S of the same arity for which the identity fails.

(a) π1,2(R) U π1,2(S) = π1,2(R U S)

(b) π1,2(R) ∩ π1,2(S) = π1,2(R ∩ S)

(c) π1,2 (R \ S) = π1,2(R) \ π1,2(S)

(d) σ1=a(R \ S) = σ1=a(R)\ σ1=a(S)

Q2. Consider the relational schema about drinkers, beers, and bars, from the lecture slides "Relational algebra and SQL," slides nos. 31 and 32.

Express the following queries in (i) relational algebra, and in (ii) SQL

(a) Print the bars that serve a beer that drinker Bill likes.

(b) Print the drinkers that frequent at least one bar that serves a beer they like.

(c) Print the drinkers that frequent only bars that serve some beer they like. (Assume each drinker likes at least one beer and frequents at least one bar.)

(d) Print the drinkers that frequent no bar that serves a beer that they like.

Q3. Design Map Reduce algorithms to take a very large file of integers and produce as output:

(a) The largest integer.

(b) The average of all integers.

(c) The same set of integers, but with each integer appearing only' once.

(d) The count of the number of distinct integers in the input.

Q4. In the form of relational algebra implemented in SQL, relations are not sets, but bags; that is, tuples are allowed to appear more than once. There are extended definitions of union, intersection, and difference for bags, which we shall define below. Write Map Reduce algorithms for computing the following operations on bags R and S:

(a) Bag Union, defined to be the bag of tuples in which tuple t app ears the sum of the numbers of times it appears in R and S.

(b) Bag Intersection, defined to be the bag of tuples in which tuple t appears the minimum of the numbers of times it appears in R and S.

(c) Bag Difference, defined to be the bag of tuples in which the number of times a tuple t appears is equal to the number of times it appears in R minus the number of times it appears in S. A tuple that appears more times in S than in R does not appear in the difference.

Q5. The Hamming distance between a pair of bit-strings of the same length is the number of bits in which they differ.

Design a map-reduce algorithm that takes as input a (huge) set of bit-strings of length b, and outputs all pairs of strings that are at most distance d from each other.

Request for Solution File

Ask an Expert for Answer!!
Data Structure & Algorithms: Design map reduce algorithms to take a very large file of
Reference No:- TGS01605176

Expected delivery within 24 Hours