List all the minimum s-t cuts in the flow network pictured


Exercises

1. (a) List all the minimum s-t cuts in the flow network pictured in Fig- ure 7.24. The capacity of each edge appears as a label next to the edge.

708_Figure1.jpg

Figure 7.24 What are the minimum s-t cuts in this flow network?

(b) What is the minimum capacity of an s-t cut in the flow network in Figure 7.25? Again, the capacity of each edge appears as a label next to the edge.

683_Figure2.jpg

Figure 7.25 What is the min- imum capacity of an s-t cut in this flow network?

2. Figure 7.26 shows a flow network on which an s-t flow has been computed. The capacity of each edge appears as a label next to the edge, and the numbers in boxes give the amount of flow sent on each edge. (Edges without boxed numbers-specifically, the four edges of capacity 3-have no flow being sent on them.)

2476_Figure3.jpg

Figure 7.26 What is the value of the depicted flow? Is it a maximum flow? What is the minimum cut?

(a) What is the value of this flow? Is this a maximum (s,t) flow in this graph?

(b) Find a minimum s-t cut in the flow network pictured in Figure 7.26, and also say what its capacity is.

3. Figure 7.27 shows a flow network on which an s-t flow has been computed. The capacity of each edge appears as a label next to the edge, and the numbers in boxes give the amount of flow sent on each edge. (Edges without boxed numbers have no flow being sent on them.)

1626_Figure4.jpg

Figure 7.27 What is the value of the depicted flow? Is it a maximum flow? What is the minimum cut?

(a) What is the value of this flow? Is this a maximum (s,t) flow in this graph?


(b) Find a minimum s-t cut in the flow network pictured in Figure 7.27, and also say what its capacity is.

4. Decide whether you think the following statement is true or false. If it is true, give a short explanation. If it is false, give a counterexample.

Let G be an arbitrary flow network, with a source s, a sink t, and a positive integer capacity ce on every edge e. If f is a maximum s-t flow in G, then f saturates every edge out of s with flow (i.e., for all edges e out of s, we have f (e) = ce).

5. Decide whether you think the following statement is true or false. If it is true, give a short explanation. If it is false, give a counterexample.

Let G be an arbitrary flow network, with a source s, a sink t, and a positive integer capacity ce on every edge e; and let (A, B) be a mimimum s-t cut with respect to these capacities {ce : e ∈ E}. Now suppose we add 1 to every capacity; then (A, B) is still a minimum s-t cut with respect to these new capacities
{1 + ce : e ∈ E}.

6. Suppose you're a consultant for the Ergonomic Architecture Commission, and they come to you with the following problem.

They're really concerned about designing houses that are "user- friendly," and they've been having a lot of trouble with the setup of light fixtures and switches in newly designed houses. Consider, for example, a one-floor house with n light fixtures and n locations for light switches mounted in the wall. You'd like to be able to wire up one switch to control each light fixture, in such a way that a person at the switch can see the light fixture being controlled.

2231_Figure5.jpg

Figure 7.28 The floor plan in (a) is ergonomic, because we can wire switches to fixtures in such a way that each fixture is visible from the switch that controls it. (This can be done by wiring switch 1 to a, switch 2 to b, and switch 3 to c.) The floor plan in (b) is not ergonomic, because no such wiring is possible.

Sometimes this is possible and sometimes it isn't. Consider the two simple floor plans for houses in Figure 7.28. There are three light fixtures (labeled a, b, c) and three switches (labeled 1, 2, 3). It is possible to wire switches to fixtures in Figure 7.28(a) so that every switch has a line of sight to the fixture, but this is not possible in Figure 7.28(b).

Let's call a floor plan, together with n light fixture locations and n switch locations, ergonomic if it's possible to wire one switch to each fixture so that every fixture is visible from the switch that controls it. A floor plan will be represented by a set of m horizontal or vertical line segments in the plane (the walls), where the ith wall has endpoints (x'i, y'i), (xr, yr). Each of the n switches and each of the n fixtures is given by its coordinates in the plane. A fixture is visible from a switch if the line segment joining them does not cross any of the walls.

Give an algorithm to decide if a given floor plan is ergonomic. The running time should be polynomial in m and n. You may assume that you have a subroutine with O(1) running time that takes two line segments as input and decides whether or not they cross in the plane.

7. Consider a set of mobile computing clients in a certain town who each need to be connected to one of several possible base stations. We'll suppose there are n clients, with the position of each client specified by its (x, y) coordinates in the plane. There are also k base stations; the position of each of these is specified by (x, y) coordinates as well.

For each client, we wish to connect it to exactly one of the base stations. Our choice of connections is constrained in the following ways.

There is a range parameter r-a client can only be connected to a base station that is within distance r. There is also a load parameter L-no more than L clients can be connected to any single base station.

Your goal is to design a polynomial-time algorithm for the following problem. Given the positions of a set of clients and a set of base stations, as well as the range and load parameters, decide whether every client can be connected simultaneously to a base station, subject to the range and load conditions in the previous paragraph.

8. Statistically, the arrival of spring typically results in increased accidents and increased need for emergency medical treatment, which often re- quires blood transfusions. Consider the problem faced by a hospital that is trying to evaluate whether its blood supply is sufficient.

The basic rule for blood donation is the following. A person's own blood supply has certain antigens present (we can think of antigens as a kind of molecular signature); and a person cannot receive blood with a particular antigen if their own blood does not have this antigen present. Concretely, this principle underpins the division of blood into four types: A, B, AB, and O. Blood of type A has the A antigen, blood of type B has the B antigen, blood of type AB has both, and blood of type O has neither. Thus, patients with type A can receive only blood types A or O in a transfusion, patients with type B can receive only B or O, patients with type O can receive only O, and patients with type AB can receive any of the four types.

(a) Let sO, sA, sB, and sAB denote the supply in whole units of the different blood types on hand. Assume that the hospital knows the projected demand for each blood type dO, dA, dB, and dAB for the coming week. Give a polynomial-time algorithm to evaluate if the blood on hand would suffice for the projected need.

(b) Consider the following example. Over the next week, they expect to need at most 100 units of blood. The typical distribution of blood types in U.S. patients is roughly 45 percent type O, 42 percent type A, 10 percent type B, and 3 percent type AB. The hospital wants to know if the blood supply it has on hand would be enough if 100 patients arrive with the expected type distribution. There is a total of 105 units of blood on hand. The table below gives these demands, and the supply on hand.

blood type

supply

demand

O

50

45

A

36

42

B

11

8

AB

8

3

Is the 105 units of blood on hand enough to satisfy the 100 units of demand? Find an allocation that satisfies the maximum possible number of patients. Use an argument based on a minimum-capacity cut to show why not all patients can receive blood. Also, provide an explanation for this fact that would be understandable to the clinic administrators, who have not taken a course on algorithms. (So, for example, this explanation should not involve the words flow, cut , or graph in the sense we use them in this book.)

9. Network flow issues come up in dealing with natural disasters and other crises, since major unexpected events often require the movement and evacuation of large numbers of people in a short amount of time.

Consider the following scenario. Due to large-scale flooding in a re- gion, paramedics have identified a set of n injured people distributed across the region who need to be rushed to hospitals. There are k hos- pitals in the region, and each of the n people needs to be brought to a hospital that is within a half-hour's driving time of their current location (so different people will have different options for hospitals, depending on where they are right now).

At the same time, one doesn't want to overload any one of the hospitals by sending too many patients its way. The paramedics are in touch by cell phone, and they want to collectively work out whether they can choose a hospital for each of the injured people in such a way that the load on the hospitals is balanced: Each hospital receives at most Jn/k] people.

Give a polynomial-time algorithm that takes the given information about the people's locations and determines whether this is possible.

10. Suppose you are given a directed graph G = (V , E), with a positive integer capacity ce on each edge e, a source s ∈ V, and a sink t ∈ V. You are also given a maximum s-t flow in G, defined by a flow value fe on each edge e. The flow f is acyclic: There is no cycle in G on which all edges carry positive flow. The flow f is also integer-valued.

Now suppose we pick a specific edge e* ∈ E and reduce its capacity by 1 unit. Show how to find a maximum flow in the resulting capacitated graph in time O(m + n), where m is the number of edges in G and n is the number of nodes.

11. Your friends have written a very fast piece of maximum-flow code based on repeatedly finding augmenting paths as in Section 7.1. However, after you've looked at a bit of output from it, you realize that it's not always finding a flow of maximum value. The bug turns out to be pretty easy to find; your friends hadn't really gotten into the whole backward-edge thing when writing the code, and so their implementation builds a variant of the residual graph that only includes the forward edges. In other words, it searches for s-t paths in a graph G˜f consisting only of edges e for which f (e)< ce, and it terminates when there is no augmenting path consisting entirely of such edges. We'll call this the Forward-Edge-Only Algorithm. (Note that we do not try to prescribe how this algorithm chooses its forward-edge paths; it may choose them in any fashion it wants, provided that it terminates only when there are no forward-edge paths.)

It's hard to convince your friends they need to reimplement the code. In addition to its blazing speed, they claim, in fact, that it never returns a flow whose value is less than a fixed fraction of optimal. Do you believe this? The crux of their claim can be made precise in the following statement.

There is an absolute constant b > 1 (independent of the particular input flow network), so that on every instance of the Maximum-Flow Problem, the Forward-Edge-Only Algorithm is guaranteed to find a flow of value at least 1/b times the maximum-flow value (regardless of how it chooses its forward-edge paths).

Decide whether you think this statement is true or false, and give a proof of either the statement or its negation.

12. Consider the following problem. You are given a flow network with unit- capacity edges: It consists of a directed graph G = (V , E), a source s ∈ V, and a sink t ∈ V; and ce = 1 for every e ∈ E. You are also given a parameter k.

The goal is to delete k edges so as to reduce the maximum s-t flow in G by as much as possible. In other words, you should find a set of edges F ⊆ E so that |F |= k and the maximum s-t flow in Gr = (V , E - F) is as small as possible subject to this.
Give a polynomial-time algorithm to solve this problem.

13. In a standard s-t Maximum-Flow Problem, we assume edges have capaci- ties, and there is no limit on how much flow is allowed to pass through a node. In this problem, we consider the variant of the Maximum-Flow and Minimum-Cut problems with node capacities.

Let G = (V , E) be a directed graph, with source s ∈ V, sink t ∈ V, and nonnegative node capacities {cv ≥ 0} for each v ∈ V. Given a flow f in this graph, the flow though a node v is defined as f in(v). We say that a flow is feasible if it satisfies the usual flow-conservation constraints and the node-capacity constraints: f in(v) ≤ cv for all nodes.

Give a polynomial-time algorithm to find an s-t maximum flow in such a node-capacitated network. Define an s-t cut for node-capacitated networks, and show that the analogue of the Max-Flow Min-Cut Theorem holds true.

14. We define the Escape Problem as follows. We are given a directed graph G = (V , E) (picture a network of roads). A certain collection of nodes X ⊂ V are designated as populated nodes, and a certain other collection S ⊂ V are designated as safe nodes. (Assume that X and S are disjoint.) In case of an emergency, we want evacuation routes from the populated nodes to the safe nodes. A set of evacuation routes is defined as a set of paths in G so that (i) each node in X is the tail of one path, (ii) the last node on each path lies in S, and (iii) the paths do not share any edges. Such a set of paths gives a way for the occupants of the populated nodes to "escape" to S, without overly congesting any edge in G.

(a) Given G, X, and S, show how to decide in polynomial time whether such a set of evacuation routes exists.

(b) Suppose we have exactly the same problem as in (a), but we want to enforce an even stronger version of the "no congestion" condition (iii). Thus we change (iii) to say "the paths do not share any nodes."

With this new condition, show how to decide in polynomial time whether such a set of evacuation routes exists.

Also, provide an example with the same G, X, and S, in which the answer is yes to the question in (a) but no to the question in (b).

15. Suppose you and your friend Alanis live, together with n - 2 other people, at a popular off-campus cooperative apartment, the Upson Collective. Over the next n nights, each of you is supposed to cook dinner for the co-op exactly once, so that someone cooks on each of the nights.

Of course, everyone has scheduling conflicts with some of the nights (e.g., exams, concerts, etc.), so deciding who should cook on which night becomes a tricky task. For concreteness, let's label the people

{p1, ... , pn},

the nights

{d1, ... , dn};

and for person pi, there's a set of nights Si ⊂ {d1, ... , dn} when they are not able to cook.

A feasible dinner schedule is an assignment of each person in the co- op to a different night, so that each person cooks on exactly one night, there is someone cooking on each night, and if pi cooks on night dj, then dj /∈ Si.

(a) Describe a bipartite graph G so that G has a perfect matching if and only if there is a feasible dinner schedule for the co-op.

(b) Your friend Alanis takes on the task of trying to construct a feasible dinner schedule. After great effort, she constructs what she claims is a feasible schedule and then heads off to class for the day.

Unfortunately, when you look at the schedule she created, you notice a big problem. n - 2 of the people at the co-op are assigned to different nights on which they are available: no problem there. But for the other two people, pi and pj, and the other two days, dk and d4, you discover that she has accidentally assigned both pi and pj to cook on night dk, and assigned no one to cook on night d4.

You want to fix Alanis's mistake but without having to recom- pute everything from scratch. Show that it's possible, using her "al- most correct" schedule, to decide in only O(n2) time whether there exists a feasible dinner schedule for the co-op. (If one exists, you should also output it.)

16. Back in the euphoric early days of the Web, people liked to claim that much of the enormous potential in a company like Yahoo! was in the "eyeballs"-the simple fact that millions of people look at its pages every day. Further, by convincing people to register personal data with the site, a site like Yahoo! can show each user an extremely targeted advertisement whenever he or she visits the site, in a way that TV networks or magazines couldn't hope to match. So if a user has told Yahoo! that he or she is a 20-year-old computer science major from Cornell University, the site can present a banner ad for apartments in Ithaca, New York; on the other hand, if he or she is a 50-year-old investment banker from Greenwich, Connecticut, the site can display a banner ad pitching Lincoln Town Cars instead.

But deciding on which ads to show to which people involves some serious computation behind the scenes. Suppose that the managers of a popular Web site have identified k distinct demographic groups

G1, G2, ... , Gk. (These groups can overlap; for example, G1 can be equal to all residents of New York State, and G2 can be equal to all people with a degree in computer science.) The site has contracts with m different advertisers, to show a certain number of copies of their ads to users of the site. Here's what the contract with the ith advertiser looks like.

. For a subset Xi ⊆ {G1, ... , Gk} of the demographic groups, advertiser i wants its ads shown only to users who belong to at least one of the demographic groups in the set Xi.

. For a number ri, advertiser i wants its ads shown to at least ri users each minute.

Now consider the problem of designing a good advertising policy - a way to show a single ad to each user of the site. Suppose at a given minute, there are n users visiting the site. Because we have registration information on each of these users, we know that user j (for j = 1, 2, ... , n) belongs to a subset Uj ⊆ {G1, ... , Gk} of the demographic groups. The problem is: Is there a way to show a single ad to each user so that the site's contracts with each of the m advertisers is satisfied for this minute? (That is, for each i = 1, 2, ... , m, can at least ri of the n users, each belonging to at least one demographic group in Xi, be shown an ad provided by advertiser i?)

Give an efficient algorithm to decide if this is possible, and if so, to actually choose an ad to show each user.

17. You've been called in to help some network administrators diagnose the extent of a failure in their network. The network is designed to carry traffic from a designated source node s to a designated target node t, so we will model the network as a directed graph G = (V , E), in which the capacity of each edge is 1 and in which each node lies on at least one path from s to t.

Now, when everything is running smoothly in the network, the max- imum s-t flow in G has value k. However, the current situation (and the reason you're here) is that an attacker has destroyed some of the edges in the network, so that there is now no path from s to t using the remaining (surviving) edges. For reasons that we won't go into here, they believe the attacker has destroyed only k edges, the minimum number needed to separate s from t (i.e., the size of a minimum s-t cut); and we'll assume they're correct in believing this.

The network administrators are running a monitoring tool on node s, which has the following behavior. If you issue the command ping(v), for a given node v, it will tell you whether there is currently a path from s to v. (So ping(t) reports that no path currently exists; on the other hand, ping(s) always reports a path from s to itself.) Since it's not practical to go out and inspect every edge of the network, they'd like to determine the extent of the failure using this monitoring tool, through judicious use of the ping command.

So here's the problem you face: Give an algorithm that issues a sequence of ping commands to various nodes in the network and then reports the full set of nodes that are not currently reachable from s. You could do this by pinging every node in the network, of course, but you'd like to do it using many fewer pings (given the assumption that only k edges have been deleted). In issuing this sequence, your algorithm is allowed to decide which node to ping next based on the outcome of earlier ping operations.

Give an algorithm that accomplishes this task using only O(k log n) pings.

2350_Figure6.jpg

Figure 7.29 An instance of Coverage Expansion.

18. We consider the Bipartite Matching Problem on a bipartite graph G = (V , E). As usual, we say that V is partitioned into sets X and Y, and each edge has one end in X and the other in Y.

If M is a matching in G, we say that a node y ∈ Y is covered by M if y is an end of one of the edges in M.

(a) Consider the following problem. We are given G and a matching M in G. For a given number k, we want to decide if there is a matching Mr in G so that

(i) Mr has k more edges than M does, and

(ii) every node y ∈ Y that is covered by M is also covered by Mr.

We call this the Coverage Expansion Problem, with input G, M, and k. and we will say that Mr is a solution to the instance.

Give a polynomial-time algorithm that takes an instance of Cov- erage Expansion and either returns a solution Mr or reports (correctly) that there is no solution. (You should include an analysis of the run- ning time and a brief proof of why it is correct.)

Note: You may wish to also look at part (b) to help in thinking about this.

Example. Consider Figure 7.29, and suppose M is the matching con- sisting of the edge (x1, y2). Suppose we are asked the above question with k = 1.

Then the answer to this instance of Coverage Expansion is yes. We can let Mr be the matching consisting (for example) of the two edges (x1, y2) and (x2, y4); Mr has one more edge than M, and y2 is still covered by Mr.

(b) Give an example of an instance of Coverage Expansion, specified by G, M, and k, so that the following situation happens.

The instance has a solution; but in any solution Mr, the edges of M do not form a subset of the edges of Mr.

(c) Let G be a bipartite graph, and let M be any matching in G. Consider the following two quantities.
- K1 is the size of the largest matching Mr so that every node y that is covered by M is also covered by Mr.
- K2 is the size of the largest matching Mrr in G.

Clearly K1 ≤ K2, since K2 is obtained by considering all possible match- ings in G.

Prove that in fact K1 = K2; that is, we can obtain a maximum matching even if we're constrained to cover all the nodes covered by our initial matching M.

19. You've periodically helped the medical consulting firm Doctors Without Weekends on various hospital scheduling issues, and they've just come to you with a new problem. For each of the next n days, the hospital has determined the number of doctors they want on hand; thus, on day i, they have a requirement that exactly pi doctors be present.

There are k doctors, and each is asked to provide a list of days on which he or she is willing to work. Thus doctor j provides a set Lj of days on which he or she is willing to work.

The system produced by the consulting firm should take these lists and try to return to each doctor j a list Lr with the following properties.

(A) Lj' is a subset of Lj, so that doctor j only works on days he or she finds acceptable.

(B) If we consider the whole set of lists L'1 , ... , L'j , it causes exactly pi doctors to be present on day i, for i = 1, 2, ... , n.

(a) Describe a polynomial-time algorithm that implements this system. Specifically, give a polynomial-time algorithm that takes the num- bers p1, p2, ... , pn, and the lists L1, ... , Lk, and does one of the fol- lowing two things.

- Return lists L'1 , L'2 , ... , L'k satisfying properties (A) and (B); or

- Report (correctly) that there is no set of lists L'1 , L'2 , ... , L'k that satisfies both properties (A) and (B).

(b) The hospital finds that the doctors tend to submit lists that are much too restrictive, and so it often happens that the system reports (cor- rectly, but unfortunately) that no acceptable set of lists L'1 , L'2 , ... , L'k exists.

Thus the hospital relaxes the requirements as follows. They add a new parameter c > 0, and the system now should try to return to each doctor j a list Lr with the following properties.

(A*) Lr contains at most c days that do not appear on the list Lj.

(B) (Same as before) If we consider the whole set of lists L'1 , ... , L'k, it causes exactly pi doctors to be present on day i, for i = 1, 2, ... , n.

Describe a polynomial-time algorithm that implements this re- vised system. It should take the numbers p1, p2, ... , pn, the lists L1, ... , Lk, and the parameter c > 0, and do one of the following two things.

- Return lists L'1 , L'2 , ... , L'k satisfying properties (A∗) and (B); or

- Report (correctly) that there is no set of lists L'1 , L'2 , ... , L'k that satisfies both properties (A∗) and (B).

20. Your friends are involved in a large-scale atmospheric science experi- ment. They need to get good measurements on a set S of n different conditions in the atmosphere (such as the ozone level at various places), and they have a set of m balloons that they plan to send up to make these measurements. Each balloon can make at most two measurements.

Unfortunately, not all balloons are capable of measuring all condi- tions, so for each balloon i = 1, ... , m, they have a set Si of conditions that balloon i can measure. Finally, to make the results more reliable, they plan to take each measurement from at least k different balloons. (Note that a single balloon should not measure the same condition twice.) They are having trouble figuring out which conditions to measure on which balloon.

Example. Suppose that k = 2, there are n = 4 conditions labeled c1, c2, c3, c4, and there are m = 4 balloons that can measure conditions, subject to the limitation that S1 = S2 = {c1, c2, c3}, and S3 = S4 = {c1, c3, c4}. Then one possible way to make sure that each condition is measured at least k = 2 times is to have
. balloon 1 measure conditions c1, c2,
. balloon 2 measure conditions c2, c3,
. balloon 3 measure conditions c3, c4, and
. balloon 4 measure conditions c1, c4.

(a) Give a polynomial-time algorithm that takes the input to an instance of this problem (the n conditions, the sets Si for each of the m balloons, and the parameter k) and decides whether there is a way to measure each condition by k different balloons, while each balloon only measures at most two conditions.

(b) You show your friends a solution computed by your algorithm from (a), and to your surprise they reply, "This won't do at all-one of the conditions is only being measured by balloons from a single subcon- tractor." You hadn't heard anything about subcontractors before; it turns out there's an extra wrinkle they forgot to mention.

Each of the balloons is produced by one of three different sub- contractors involved in the experiment. A requirement of the exper- iment is that there be no condition for which all k measurements come from balloons produced by a single subcontractor.

For example, suppose balloon 1 comes from the first subcon- tractor, balloons 2 and 3 come from the second subcontractor, and balloon 4 comes from the third subcontractor. Then our previous so- lution no longer works, as both of the measurements for condition c3 were done by balloons from the second subcontractor. However, we could use balloons 1 and 2 to each measure conditions c1, c2, and use balloons 3 and 4 to each measure conditions c3, c4.

Explain how to modify your polynomial-time algorithm for part - into a new algorithm that decides whether there exists a solution satisfying all the conditions from (a), plus the new requirement about subcontractors.

21. You're helping to organize a class on campus that has decided to give all its students wireless laptops for the semester. Thus there is a collection of n wireless laptops; there is also have a collection of n wireless access points, to which a laptop can connect when it is in range.

The laptops are currently scattered across campus; laptop 4 is within range of a set S4 of access points. We will assume that each laptop is within range of at least one access point (so the sets S4 are nonempty); we will also assume that every access point p has at least one laptop within range of it.

To make sure that all the wireless connectivity software is working correctly, you need to try having laptops make contact with access points in such a way that each laptop and each access point is involved in at least one connection. Thus we will say that a test set T is a collection of ordered pairs of the form (4, p), for a laptop 4 and access point p, with the properties that

(i) If (4, p) ∈ T, then 4 is within range of p (i.e., p ∈ S4).

(ii) Each laptop appears in at least one ordered pair in T.

(iii) Each access point appears in at least one ordered pair in T.

This way, by trying out all the connections specified by the pairs in T, we can be sure that each laptop and each access point have correctly functioning software.

The problem is: Given the sets S4 for each laptop (i.e., which laptops are within range of which access points), and a number k, decide whether there is a test set of size at most k.

Example. Suppose that n = 3; laptop 1 is within range of access points 1 and 2; laptop 2 is within range of access point 2; and laptop 3 is within range of access points 2 and 3. Then the set of pairs

(laptop 1, access point 1), (laptop 2, access point 2),

(laptop 3, access point 3)

would form a test set of size 3.

(a) Give an example of an instance of this problem for which there is no test set of size n. (Recall that we assume each laptop is within range of at least one access point, and each access point p has at least one laptop within range of it.)

(b) Give a polynomial-time algorithm that takes the input to an instance of this problem (including the parameter k) and decides whether there is a test set of size at most k.

22. Let M be an n × n matrix with each entry equal to either 0 or 1. Let mij denote the entry in row i and column j. A diagonal entry is one of the form mii for some i.

Swapping rows i and j of the matrix M denotes the following action: we swap the values mik and mjk for k = 1, 2, ... , n. Swapping two columns is defined analogously.

We say that M is rearrangeable if it is possible to swap some of the pairs of rows and some of the pairs of columns (in any sequence) so that, after all the swapping, all the diagonal entries of M are equal to 1.

(a) Give an example of a matrix M that is not rearrangeable, but for which at least one entry in each row and each column is equal to 1.

(b) Give a polynomial-time algorithm that determines whether a matrix M with 0-1 entries is rearrangeable.

23. Suppose you're looking at a flow network G with source s and sink t, and you want to be able to express something like the following intuitive no- tion: Some nodes are clearly on the "source side" of the main bottlenecks; some nodes are clearly on the "sink side" of the main bottlenecks; and some nodes are in the middle. However, G can have many minimum cuts, so we have to be careful in how we try making this idea precise.

Here's one way to divide the nodes of G into three categories of this sort.
. We say a node v is upstream if, for all minimum s-t cuts (A, B), we have v ∈ A-that is, v lies on the source side of every minimum cut.
. We say a node v is downstream if, for all minimum s-t cuts (A, B), we have v ∈ B-that is, v lies on the sink side of every minimum cut.
. We say a node v is central if it is neither upstream nor downstream; there is at least one minimum s-t cut (A, B) for which v ∈ A, and at least one minimum s-t cut (Ar, Br) for which v ∈ Br.

Give an algorithm that takes a flow network G and classifies each of its nodes as being upstream, downstream, or central. The running time of your algorithm should be within a constant factor of the time required to compute a single maximum flow.

24. Let G = (V , E) be a directed graph, with source s ∈ V, sink t ∈ V, and nonnegative edge capacities {ce}. Give a polynomial-time algorithm to decide whether G has a unique minimum s-t cut (i.e., an s-t of capacity strictly less than that of all other s-t cuts).

25. Suppose you live in a big apartment with a lot of friends. Over the course of a year, there are many occasions when one of you pays for an expense shared by some subset of the apartment, with the expectation that everything will get balanced out fairly at the end of the year. For example, one of you may pay the whole phone bill in a given month, another will occasionally make communal grocery runs to the nearby organic food emporium, and a third might sometimes use a credit card to cover the whole bill at the local Italian-Indian restaurant, Little Idli.

In any case, it's now the end of the year and time to settle up. There are n people in the apartment; and for each ordered pair (i, j) there's an amount aij ≥ 0 that i owes j, accumulated over the course of the year. We will require that for any two people i and j, at least one of the quantities aij or aji is equal to 0. This can be easily made to happen as follows: If it turns out that i owes j a positive amount x, and j owes i a positive amount y < x, then we will subtract off y from both sides and declare aij = x - y while aji = 0. In terms of all these quantities, we now define the imbalance of a person i to be the sum of the amounts that i is owed by everyone else, minus the sum of the amounts that i owes everyone else. (Note that an imbalance can be positive, negative, or zero.)

In order to restore all imbalances to 0, so that everyone departs on good terms, certain people will write checks to others; in other words, for certain ordered pairs (i, j), i will write a check to j for an amount bij > 0.

We will say that a set of checks constitutes a reconciliation if, for each person i, the total value of the checks received by i, minus the total value of the checks written by i, is equal to the imbalance of i. Finally, you and your friends feel it is bad form for i to write j a check if i did not actually owe j money, so we say that a reconciliation is consistent if, whenever i writes a check to j, it is the case that aij > 0.
Show that, for any set of amounts aij, there is always a consistent reconciliation in which at most n - 1 checks get written, by giving a polynomial-time algorithm to compute such a reconciliation.

26. You can tell that cellular phones are at work in rural communities, from the giant microwave towers you sometimes see sprouting out of corn fields and cow pastures. Let's consider a very simplified model of a cellular phone network in a sparsely populated area.
We are given the locations of n base stations, specified as points b1, ..., bn in the plane. We are also given the locations of n cellular phones, specified as points p1, ... , pn in the plane. Finally, we are given a range parameter 6> 0. We call the set of cell phones fully connected if it is possible to assign each phone to a base station in such a way that

. Each phone is assigned to a different base station, and

. If a phone at pi is assigned to a base station at bj, then the straight-line distance between the points pi and bj is at most 6.

Suppose that the owner of the cell phone at point p1 decides to go for a drive, traveling continuously for a total of z units of distance due east. As this cell phone moves, we may have to update the assignment of phones to base stations (possibly several times) in order to keep the set of phones fully connected.

Give a polynomial-time algorithm to decide whether it is possible to keep the set of phones fully connected at all times during the travel of this one cell phone. (You should assume that all other phones remain sta- tionary during this travel.) If it is possible, you should report a sequence of assignments of phones to base stations that will be sufficient in order to maintain full connectivity; if it is not possible, you should report a point on the traveling phone's path at which full connectivity cannot be maintained.

You should try to make your algorithm run in O(n3) time if possible.

Example. Suppose we have phones at p1 = (0, 0) and p2 = (2, 1); we have base stations at b1 = (1, 1) and b2 = (3, 1); and 6 = 2. Now consider the case in which the phone at p1 moves due east a distance of 4 units, ending at (4, 0). Then it is possible to keep the phones fully connected during this motion: We begin by assigning p1 to b1 and p2 to b2, and we reassign p1 to b2 and p2 to b1 during the motion (for example, when p1 passes the point (2, 0)).

27. Some of your friends with jobs out West decide they really need some extra time each day to sit in front of their laptops, and the morning commute from Woodside to Palo Alto seems like the only option. So they decide to carpool to work.

Unfortunately, they all hate to drive, so they want to make sure that any carpool arrangement they agree upon is fair and doesn't overload any individual with too much driving. Some sort of simple round-robin scheme is out, because none of them goes to work every day, and so the subset of them in the car varies from day to day.

Here's one way to define fairness. Let the people be labeled S = {p1, ... , pk}. We say that the total driving obligation of pj over a set of days is the expected number of times that pj would have driven, had a driver been chosen uniformly at random from among the people going to work each day. More concretely, suppose the carpool plan lasts for d days, and on the ith day a subset Si ⊆ S of the people go to work. Then the above definition of the total driving obligation 6j for pj can be written as j∈ i |Si| . Ideally, we'd like to require that p drives at most 6 times; unfortunately, 6j may not be an integer.

So let's say that a driving schedule is a choice of a driver for each day-that is, a sequence pi1, pi2, ... , pid with pit ∈ St-and that a fair driving schedule is one in which each pj is chosen as the driver on at most J6j] days.

(a) Prove that for any sequence of sets S1, ... , Sd, there exists a fair driving schedule.

(b) Give an algorithm to compute a fair driving schedule with running time polynomial in k and d.

28. A group of students has decided to add some features to Cornell's on-line Course Management System (CMS), to handle aspects of course planning that are not currently covered by the software. They're beginning with a module that helps schedule office hours at the start of the semester.

Their initial prototype works as follows. The office hour schedule will be the same from one week to the next, so it's enough to focus on the scheduling problem for a single week. The course administrator enters a collection of nonoverlapping one-hour time intervals I1, I2, ... , Ik when it would be possible for teaching assistants (TAs) to hold office hours; the eventual office-hour schedule will consist of a subset of some, but generally not all, of these time slots. Then each of the TAs enters his or her weekly schedule, showing the times when he or she would be available to hold office hours.

Finally, the course administrator specifies, for parameters a, b, and c, that they would like each TA to hold between a and b office hours per week, and they would like a total of exactly c office hours to be held over the course of the week.

The problem, then, is how to assign each TA to some of the office- hour time slots, so that each TA is available for each of his or her office- hour slots, and so that the right number of office hours gets held. (There should be only one TA at each office hour.)
Example. Suppose there are five possible time slots for office hours:

I1 = Mon 3-4 P.M.; I2 = Tue 1-2 P.M.; I3 = Wed 10-11 A.M.; I4 = Wed 3-4 P.M.; and I5 = Thu 10-11 A.M..

There are two TAs; the first would be able to hold office hours at any time on Monday or Wednesday afternoons, and the second would be able to hold office hours at any time on Tuesday, Wednesday, or Thursday. (In general, TA availability might be more complicated to specify than this, but we're keeping this example simple.) Finally, each TA should hold between a = 1 and b = 2 office hours, and we want exactly c = 3 office hours per week total.

One possible solution would be to have the first TA hold office hours in time slot I1, and the second TA to hold office hours in time slots I2 and I5.

(a) Give a polynomial-time algorithm that takes the input to an instance of this problem (the time slots, the TA schedules, and the parameters a, b, and c) and does one of the following two things:

- Constructs a valid schedule for office hours, specifying which
TA will cover which time slots, or

- Reports (correctly) that there is no valid way to schedule office hours.

(b) This office-hour scheduling feature becomes very popular, and so course staffs begin to demand more. In particular, they observe that it's good to have a greater density of office hours closer to the due date of a homework assignment.

So what they want to be able to do is to specify an office-hour density parameter for each day of the week: The number di specifies that they want to have at least di office hours on a given day i of the week.

For example, suppose that in our previous example, we add the constraint that we want at least one office hour on Wednesday and at least one office hour on Thursday. Then the previous solution does not work; but there is a possible solution in which we have the first TA hold office hours in time slot I1, and the second TA hold office hours in time slots I3 and I5. (Another solution would be to have the first TA hold office hours in time slots I1 and I4, and the second TA hold office hours in time slot I5.)

Give a polynomial-time algorithm that computes office-hour schedules under this more complex set of constraints. The algo- rithm should either construct a schedule or report (correctly) that none exists.

29. Some of your friends have recently graduated and started a small com- pany, which they are currently running out of their parents' garages in Santa Clara. They're in the process of porting all their software from an old system to a new, revved-up system; and they're facing the following problem.

They have a collection of n software applications, {1, 2, ... , n}, run- ning on their old system; and they'd like to port some of these to the new system. If they move application i to the new system, they expect a net (monetary) benefit of bi ≥ 0. The different software applications interact with one another; if applications i and j have extensive interaction, then the company will incur an expense if they move one of i or j to the new system but not both; let's denote this expense by xij ≥ 0.

So, if the situation were really this simple, your friends would just port all n applications, achieving a total benefit of .i bi. Unfortunately,
there's a problem. . . .

Due to small but fundamental incompatibilities between the two systems, there's no way to port application 1 to the new system; it will have to remain on the old system. Nevertheless, it might still pay off to port some of the other applications, accruing the associated benefit and incurring the expense of the interaction between applications on different systems.

So this is the question they pose to you: Which of the remaining applications, if any, should be moved? Give a polynomial-time algorithm to find a set S ⊆ {2, 3, ... , n} for which the sum of the benefits minus the expenses of moving the applications in S to the new system is maximized.

30. Consider a variation on the previous problem. In the new scenario, any application can potentially be moved, but now some of the benefits bi for moving to the new system are in fact negative: If bi < 0, then it is preferable (by an amount quantified in bi) to keep i on the old system. Again, give a polynomial-time algorithm to find a set S ⊆ {1, 2, ... , n} for which the sum of the benefits minus the expenses of moving the applications in S to the new system is maximized.

31. Some of your friends are interning at the small high-tech company Web- Exodus. A running joke among the employees there is that the back room has less space devoted to high-end servers than it does to empty boxes of computer equipment, piled up in case something needs to be shipped back to the supplier for maintainence.

A few days ago, a large shipment of computer monitors arrived, each in its own large box; and since there are many different kinds of monitors in the shipment, the boxes do not all have the same dimensions. A bunch of people spent some time in the morning trying to figure out how to store all these things, realizing of course that less space would be taken up if some of the boxes could be nested inside others.

Suppose each box i is a rectangular parallelepiped with side lengths equal to (i1, i2, i3); and suppose each side length is strictly between half a meter and one meter. Geometrically, you know what it means for one box to nest inside another: It's possible if you can rotate the smaller so that it fits inside the larger in each dimension. Formally, we can say that box i with dimensions (i1, i2, i3) nests inside box j with dimensions (j1, j2, j3) if there is a permutation a, b, c of the dimensions {1, 2, 3} so that ia < j1, and ib < j2, and ic < j3. Of course, nesting is recursive: If i nests in j, and j nests in k, then by putting i inside j inside k, only box k is visible. We say that a nesting arrangement for a set of n boxes is a sequence of operations in which a box i is put inside another box j in which it nests; and if there were already boxes nested inside i, then these end up inside j as well. (Also notice the following: Since the side lengths of i are more than half a meter each, and since the side lengths of j are less than a meter each, box i will take up more than half of each dimension of j, and so after i is put inside j, nothing else can be put inside j.) We say that a box k is visible in a nesting arrangement if the sequence of operations does not result in its ever being put inside another box.

Here is the problem faced by the people at WebExodus: Since only the visible boxes are taking up any space, how should a nesting arrangement be chosen so as to minimize the number of visible boxes?

Give a polynomial-time algorithm to solve this problem.

Example. Suppose there are three boxes with dimensions (.6, .6, .6), (.75, .75, .75), and (.9, .7, .7). The first box can be put into either of the second or third boxes; but in any nesting arrangement, both the second and third boxes will be visible. So the minimum possible number of vis- ible boxes is two, and one solution that achieves this is to nest the first box inside the second.

32. Given a graph G = (V , E), and a natural number k, we can define a relation G , k G , k -→G , k on pairs of vertices of G as follows. If x, y ∈ V, we say that x -→G , k y if there exist k mutually edge-disjoint paths from x to y in G. Is it true that for every G and every k ≥ 0, the relation -→ is transitive?

That is, is it always the case that if x -→G , k y and y -→ G , k z, then we have x -→G , k z? Give a proof or a counterexample.

33. Let G = (V , E) be a directed graph, and suppose that for each node v, the number of edges into v is equal to the number of edges out of v. That is, for all v,

|{(u, v) : (u, v) ∈ E}|= |{(v, w) : (v, w) ∈ E}|.

Let x, y be two nodes of G, and suppose that there exist k mutually edge- disjoint paths from x to y. Under these conditions, does it follow that there exist k mutually edge-disjoint paths from y to x? Give a proof or a counterexample with explanation.

34. Ad hoc networks, made up of low-powered wireless devices, have been proposed for situations like natural disasters in which the coordinators of a rescue effort might want to monitor conditions in a hard-to-reach area. The idea is that a large collection of these wireless devices could be dropped into such an area from an airplane and then configured into a functioning network.

Note that we're talking about (a) relatively inexpensive devices that are (b) being dropped from an airplane into (c) dangerous territory; and for the combination of reasons (a), (b), and (c), it becomes necessary to include provisions for dealing with the failure of a reasonable number of the nodes.

We'd like it to be the case that if one of the devices v detects that it is in danger of failing, it should transmit a representation of its current state to some other device in the network. Each device has a limited transmitting range-say it can communicate with other devices that lie within d meters of it. Moreover, since we don't want it to try transmitting its state to a device that has already failed, we should include some redundancy: A device v should have a set of k other devices that it can potentially contact, each within d meters of it. We'll call this a back-up set for device v.

(a) Suppose you're given a set of n wireless devices, with positions represented by an (x, y) coordinate pair for each. Design an algorithm that determines whether it is possible to choose a back-up set for each device (i.e., k other devices, each within d meters), with the further property that, for some parameter b, no device appears in the back-up set of more than b other devices. The algorithm should output the back-up sets themselves, provided they can be found.

(b) The idea that, for each pair of devices v and w, there's a strict dichotomy between being "in range" or "out of range" is a simplified abstraction. More accurately, there's a power decay function f (•) that specifies, for a pair of devices at distance δ, the signal strength f (δ) that they'll be able to achieve on their wireless connection. (We'll assume that f (δ) decreases with increasing δ.)

We might want to build this into our notion of back-up sets as follows: among the k devices in the back-up set of v, there should be at least one that can be reached with very high signal strength, at least one other that can be reached with moderately high signal strength, and so forth. More concretely, we have values p1 ≥ p2 ≥ ... pk, so that if the back-up set for v consists of devices at distances d1 ≤ d2 ≤ ... dk, then we should have f (dj) ≥ pj for each j.

Give an algorithm that determines whether it is possible to choose a back-up set for each device subject to this more detailed condition, still requiring that no device should appear in the back-up set of more than b other devices. Again, the algorithm should output the back-up sets themselves, provided they can be found.

35. You're designing an interactive image segmentation tool that works as follows. You start with the image segmentation setup described in Section 7.10, with n pixels, a set of neighboring pairs, and parameters {ai}, {bi}, and {pij}. We will make two assumptions about this instance. First, we will suppose that each of the parameters {ai}, {bi}, and {pij} is a nonnegative integer between 0 and d, for some number d. Second, we will suppose that the neighbor relation among the pixels has the property that each pixel is a neighbor of at most four other pixels (so in the resulting graph, there are at most four edges out of each node).

You first perform an initial segmentation (A0, B0) so as to maximize the quantity q(A0, B0). Now, this might result in certain pixels being assigned to the background when the user knows that they ought to be in the foreground. So, when presented with the segmentation, the user has the option of mouse-clicking on a particular pixel v1, thereby bringing it to the foreground. But the tool should not simply bring this pixel into the foreground; rather, it should compute a segmentation (A1, B1) that maximizes the quantity q(A1, B1) subject to the condition that v1 is in the foreground. (In practice, this is useful for the following kind of operation: In segmenting a photo of a group of people, perhaps someone is holding a bag that has been accidentally labeled as part of the background. By clicking on a single pixel belonging to the bag, and recomputing an optimal segmentation subject to the new condition, the whole bag will often become part of the foreground.)

In fact, the system should allow the user to perform a sequence of such mouse-clicks v1, v2, ... , vt; and after mouse-click vi, the sys- tem should produce a segmentation (Ai, Bi) that maximizes the quantity q(Ai, Bi) subject to the condition that all of v1, v2, ... , vi are in the fore- ground.

Give an algorithm that performs these operations so that the initial segmentation is computed within a constant factor of the time for a single maximum flow, and then the interaction with the user is handled in O(dn) time per mouse-click.

(Note: Solved Exercise 1 from this chapter is a useful primitive for doing this. Also, the symmetric operation of forcing a pixel to belong to the background can be handled by analogous means, but you do not have to work this out here.)

36. We now consider a different variation of the image segmentation problem in Section 7.10. We will develop a solution to an image labeling problem, where the goal is to label each pixel with a rough estimate of its distance from the camera (rather than the simple foreground/background labeling used in the text). The possible labels for each pixel will be 0, 1, 2, ... , M for some integer M.

Let G = (V , E) denote the graph whose nodes are pixels, and edges indicate neighboring pairs of pixels. A labeling of the pixels is a partition of V into sets A0, A1, ... , AM , where Ak is the set of pixels that is labeled with distance k for k = 0, ... , M. We will seek a labeling of minimum cost ; the cost will come from two types of terms. By analogy with the fore- ground/background segmentation problem, we will have an assignment cost : for each pixel i and label k, the cost ai,k is the cost of assigning label k to pixel i. Next, if two neighboring pixels (i, j) ∈ E are assigned different labels, there will be a separation cost. In Section 7.10, we used a sepa- ration penalty pij. In our current problem, the separation cost will also depend on how far the two pixels are separated; specifically, it will be proportional to the difference in value between their two labels.
Thus the overall cost qr of a labeling is defined as follows:

746_Figure7.jpg

Figure 7.30 The set of nodes corresponding to a single pixel i in Exercise 36 (shown together with the source s and sink t).

872_Figure8.jpg

 

The goal of this problem is to develop a polynomial-time algorithm that finds the optimal labeling given the graph G and the penalty pa- rameters ai,k and pij. The algorithm will be based on constructing a flow network, and we will start you off on designing the algorithm by providing a portion of the construction.

The flow network will have a source s and a sink t. In addition, for each pixel i ∈ V we will have nodes vi,k in the flow network for k = 1, ..., M, as shown in Figure 7.30. (M = 5 in the example in the figure.)

For notational convenience, the nodes vi,0 and vi,M+1 will refer to s and t, respectively, for any choice of i ∈ V.

We now add edges (vi,k, vi,k+1) with capacity ai,k for k = 0, ... , M; and edges (vi,k+1, vi,k) in the opposite direction with very large capacity L. We will refer to this collection of nodes and edges as the chain associated with pixel i.

Notice that if we make this very large capacity L large enough, then there will be no minimum cut (A, B) so that an edge of capacity L leaves the set A. (How large do we have to make it for this to happen?). Hence, for any minimum cut (A, B), and each pixel i, there will be exactly one low- capacity edge in the chain associated with i that leaves the set A. (You should check that if there were two such edges, then a large-capacity edge would also have to leave the set A.)

Finally, here's the question: Use the nodes and edges defined so far to complete the construction of a flow network with the property that a minimum-cost labeling can be efficiently computed from a minimum s-t cut. You should prove that your construction has the desired property, and show how to recover the minimum-cost labeling from the cut.

37. In a standard minimum s-t cut problem, we assume that all capacities are nonnegative; allowing an arbitrary set of positive and negative capacities results in a problem that is computationally much more difficult. How- ever, as we'll see here, it is possible to relax the nonnegativity requirement a little and still have a problem that can be solved in polynomial time.

Let G = (V , E) be a directed graph, with source s ∈ V, sink t ∈ V, and edge capacities {ce}. Suppose that for every edge e that has neither s nor t as an endpoint, we have ce ≥ 0. Thus ce can be negative for edges e that have at least one end equal to either s or t. Give a polynomial-time algorithm to find an s-t cut of minimum value in such a graph. (Despite the new nonnegativity requirements, we still define the value of an s-t cut (A, B) to be the sum of the capacities of all edges e for which the tail of e is in A and the head of e is in B.)

38. You're working with a large database of employee records. For the pur- poses of this question, we'll picture the database as a two-dimensional table T with a set R of m rows and a set C of n columns; the rows corre- spond to individual employees, and the columns correspond to different attributes.
To take a simple example, we may have four columns labeled
name, phone number, start date, managerrs name

and a table with five employees as shown here.

name

phone number

start date

manager's name

Alanis

3-4563

6/13/95

Chelsea

Chelsea

3-2341

1/20/93

Lou

Elrond

3-2345

12/19/01

Chelsea

Hal

3-9000

1/12/97

Chelsea

Raj

3-3453

7/1/96

Chelsea

Given a subset S of the columns, we can obtain a new, smaller table by keeping only the entries that involve columns from S. We will call this new table the projection of T onto S, and denote it by T[S]. For example, if S = {name, start date}, then the projection T[S] would be the table consisting of just the first and third columns.

There's a different operation on tables that is also useful, which is to permute the columns. Given a permutation p of the columns, we can obtain a new table of the same size as T by simply reordering the columns according to p. We will call this new table the permutation of T by p, and denote it by Tp.

All of this comes into play for your particular application, as follows. You have k different subsets of the columns S1, S2, ... , Sk that you're going to be working with a lot, so you'd like to have them available in a readily accessible format. One choice would be to store the k projections T[S1], T[S2], ... , T[Sk], but this would take up a lot of space. In considering alternatives to this, you learn that you may not need to explicitly project onto each subset, because the underlying database system can deal with a subset of the columns particularly efficiently if (in some order) the members of the subset constitute a prefix of the columns in left-to-right order. So, in our example, the subsets {name, phone number} and {name, start date, phone number,} constitute prefixes (they're the first two and first three columns from the left, respectively); and as such, they can be processed much more efficiently in this table than a subset such as
{name, start date}, which does not constitute a prefix. (Again, note that a given subset Si does not come with a specified order, and so we are interested in whether there is some order under which it forms a prefix of the columns.)

So here's the question: Given a parameter 4< k, can you find 4 per- mutations of the columns p1, p2, ... , p4 so that for every one of the given subsets Si (for i = 1, 2, ... , k), it's the case that the columns in Si consti- tute a prefix of at least one of the permuted tables Tp1, Tp2, ... , Tp4? We'll say that such a set of permutations constitutes a valid solution to the problem; if a valid solution exists, it means you only need to store the 4 permuted tables rather than all k projections. Give a polynomial-time algorithm to solve this problem; for instances on which there is a valid solution, your algorithm should return an appropriate set of 4 permuta- tions.

Example. Suppose the table is as above, the given subsets are
S1 = {name, phone number},
S2 = {name, start date},
S3 = {name, managerrs name, start date},
and 4 = 2. Then there is a valid solution to the instance, and it could be achieved by the two permutations
p1 = {name, phone number, start date, managerrs name},
p2 = {name, start date, managerrs name, phone number}.
This way, S1 constitutes a prefix of the permuted table Tp1, and both S2
and S3 constitute prefixes of the permuted table Tp2.

39. You are consulting for an environmental statistics firm. They collect statistics and publish the collected data in a book. The statistics are about populations of different regions in the world and are recorded in multiples of one million. Examples of such statistics would look like the following table.

Country

A

B

C

Total

grown-up men

11.998

9.083

2.919

24.000

grown-up women

12.983

10.872

3.145

27.000

children

1.019

2.045

0.936

4.000

Total

26.000

22.000

7.000

55.000

We will assume here for simplicity that our data is such that all row and column sums are integers. The Census Rounding Problem is to round all data to integers without changing any row or column sum. Each fractional number can be rounded either up or down. For example, a good rounding for our table data would be as follows.

Country

A

B

C

Total

grown-up men

11.000

10.000

3.000

24.000

grown-up women

13.000

10.000

4.000

27.000

children

2.000

2.000

0.000

4.000

Total

26.000

22.000

7.000

55.000

(a) Consider first the special case when all data are between 0 and 1. So you have a matrix of fractional numbers between 0 and 1, and your problem is to round each fraction that is between 0 and 1 to either 0 or 1 without changing the row or column sums. Use a flow computation to check if the desired rounding is possible.

(b) Consider the Census Rounding Problem as defined above, where row and column sums are integers, and you want to round each fractional number α to either [α] or Jα]. Use a flow computation to check if the desired rounding is possible.

(c) Prove that the rounding we are looking for in (a) and (b) always exists.

40. In a lot of numerical computations, we can ask about the "stability" or "robustness" of the answer. This kind of question can be asked for combinatorial problems as well; here's one way of phrasing the question for the Minimum Spanning Tree Problem.

Suppose you are given a graph G = (V , E), with a cost ce on each edge e. We view the costs as quantities that have been measured experimentally, subject to possible errors in measurement. Thus, the minimum spanning tree one computes for G may not in fact be the "real" minimum spanning tree.

Given error parameters ε> 0 and k > 0, and a specific edge er = (u, v), you would like to be able to make a claim of the following form.

(∗) Even if the cost of each edge were to be changed by at most ε (either increased or decreased), and the costs of k of the edges other than er were further changed to arbitrarily different values, the edge er would still not belong to any minimum spanning tree of G.

Such a property provides a type of guarantee that er is not likely to belong to the minimum spanning tree, even assuming significant measurement error.

Give a polynomial-time algorithm that takes G, er, ε, and k, and decides whether or not property (∗) holds for er.

41. Suppose you're managing a collection of processors and must schedule a sequence of jobs over time.

The jobs have the following characteristics. Each job j has an arrival time aj when it is first available for processing, a length 4j which indicates how much processing time it needs, and a deadline dj by which it must be finished. (We'll assume 0 < 4j ≤ dj - aj.) Each job can be run on any of the processors, but only on one at a time; it can also be preempted and resumed from where it left off (possibly after a delay) on another processor.

Moreover, the collection of processors is not entirely static either: You have an overall pool of k possible processors; but for each processor i, there is an interval of time [ti, tr] during which it is available; it is unavailable at all other times.

Given all this data about job requirements and processor availability, you'd like to decide whether the jobs can all be completed or not. Give a polynomial-time algorithm that either produces a schedule completing all jobs by their deadlines or reports (correctly) that no such schedule exists. You may assume that all the parameters associated with the problem are integers.

Example. Suppose we have two jobs J1 and J2. J1 arrives at time 0, is due at time 4, and has length 3. J2 arrives at time 1, is due at time 3, and has length 2. We also have two processors P1 and P2. P1 is available between times 0 and 4; P2 is available between times 2 and 3. In this case, there is a schedule that gets both jobs done.

. At time 0, we start job J1 on processor P1.

. At time 1, we preempt J1 to start J2 on P1.
. At time 2, we resume J1 on P2. (J2 continues processing on P1.)
. At time 3, J2 completes by its deadline. P2 ceases to be available, so we move J1 back to P1 to finish its remaining one unit of processing there.
. At time 4, J1 completes its processing on P1.

Notice that there is no solution that does not involve preemption and moving of jobs.

42. Give a polynomial-time algorithm for the following minimization ana- logue of the Maximum-Flow Problem. You are given a directed graph G = (V , E), with a source s ∈ V and sink t ∈ V, and numbers (capacities) 4(v, w) for each edge (v, w) ∈ E. We define a flow f , and the value of a flow, as usual, requiring that all nodes except s and t satisfy flow conserva- tion. However, the given numbers are lower bounds on edge flow-that is, they require that f (v, w) ≥ 4(v, w) for every edge (v, w) ∈ E, and there is no upper bound on flow values on edges.

(a) Give a polynomial-time algorithm that finds a feasible flow of mini- mum possible value.

(b) Prove an analogue of the Max-Flow Min-Cut Theorem for this problem (i.e., does min-flow = max-cut?).

43. You are trying to solve a circulation problem, but it is not feasible. The problem has demands, but no capacity limits on the edges. More formally, there is a graph G = (V , E), and demands dv for each node v (satisfying v∈V dv = 0), and the problem is to decide if there is a flow f such that f (e) ≥ 0 and f in(v) - f out(v) = dv for all nodes v ∈ V. Note that this problem can be solved via the circulation algorithm from Section 7.7 by setting ce = +∞ for all edges e ∈ E. (Alternately, it is enough to set ce to be an extremely large number for each edge-say, larger than the total of all positive demands dv in the graph.)

You want to fix up the graph to make the problem feasible, so it would be very useful to know why the problem is not feasible as it stands now. On a closer look, you see that there is a subset U of nodes such that

there is no edge into , and yet . v∈U dv > 0. You quickly realize that the existence of such a set immediately implies that the flow cannot exist:

The set U has a positive total demand, and so needs incoming flow, and yet U has no edges into it. In trying to evaluate how far the problem is from being solvable, you wonder how big the demand of a set with no incoming edges can be.

Give a polynomial-time algorithm to find a subset S ⊂ V of nodes such that there is no edge into and for which . d ∈is as large as possible subject to this condition.

44. Suppose we are given a directed network G = (V , E) with a root node r and a set of terminals T ⊆ V. We'd like to disconnect many terminals from r, while cutting relatively few edges.

We make this trade-off precise as follows. For a set of edges F ⊆ E, let q(F) denote the number of nodes v ∈ T such that there is no r-v path in the subgraph (V , E - F). Give a polynomial-time algorithm to find a set F of edges that maximizes the quantity q(F) - |F |. (Note that setting F equal to the empty set is an option.)

45. Consider the following definition. We are given a set of n countries that are engaged in trade with one another. For each country i, we have the value si of its budget surplus; this number may be positive or negative, with a negative number indicating a deficit. For each pair of countries i, j, we have the total value eij of all exports from i to j; this number is always nonnegative. We say that a subset S of the countries is free-standing if the sum of the budget surpluses of the countries in S, minus the total value of all exports from countries in S to countries not in S, is nonnegative.

Give a polynomial-time algorithm that takes this data for a set of n countries and decides whether it contains a nonempty free-standing subset that is not equal to the full set.

46. In sociology, one often studies a graph G in which nodes represent people and edges represent those who are friends with each other. Let's assume for purposes of this question that friendship is symmetric, so we can consider an undirected graph.

Now suppose we want to study this graph G, looking for a "close-knit" group of people. One way to formalize this notion would be as follows. For a subset S of nodes, let e(S) denote the number of edges in S-that is, the number of edges that have both ends in S. We define the cohesiveness of S as e(S)/|S|. A natural thing to search for would be a set S of people achieving the maximum cohesiveness.

(a) Give a polynomial-time algorithm that takes a rational number α and determines whether there exists a set S with cohesiveness at least α.

(b) Give a polynomial-time algorithm to find a set S of nodes with maximum cohesiveness.

47. The goal of this problem is to suggest variants of the Preflow-Push Algorithm that speed up the practical running time without ruining its worst-case complexity. Recall that the algorithm maintains the invariant that h(v) ≤ h(w) + 1 for all edges (v, w) in the residual graph of the current preflow. We proved that if f is a flow (not just a preflow) with this invariant, then it is a maximum flow. Heights were monotone increasing, and the running-time analysis depended on bounding the number of times nodes can increase their heights. Practical experience shows that the algorithm is almost always much faster than suggested by the worst case, and that the practical bottleneck of the algorithm is relabeling nodes (and not the nonsaturating pushes that lead to the worst case in the theoretical analysis). The goal of the problems below is to decrease the number of relabelings by increasing heights faster than one by one. Assume you have a graph G with n nodes, m edges, capacities c, source s, and sink t.

(a) The Preflow-Push Algorithm, as described in Section 7.4, starts by setting the flow equal to the capacity ce on all edges e leaving the source, setting the flow to 0 on all other edges, setting h(s) = n, and setting h(v) = 0 for all other nodes v ∈ V. Give an O(m) procedure for initializing node heights that is better than the one we constructed in Section 7.4. Your method should set the height of each node v to be as high as possible given the initial flow.

(b) In this part we will add a new step, called gap relabeling, to Preflow- Push, which will increase the labels of lots of nodes by more than one at a time. Consider a preflow f and heights h satisfying the invariant. A gap in the heights is an integer 0 < h < n so that no node has height exactly h. Assume h is a gap value, and let A be the set of nodes v with heights n > h(v)> h. Gap relabeling is the process of changing the heights of all nodes in A so they are equal to n. Prove that the Preflow-Push Algorithm with gap relabeling is a valid max- flow algorithm. Note that the only new thing that you need to prove is that gap relabeling preserves the invariant above, that h(v) ≤ h(w) + 1 for all edges (v, w) in the residual graph.

(c) In Section 7.4 we proved that h(v) ≤ 2n - 1 throughout the algorithm. Here we will have a variant that has h(v) ≤ n throughout. The idea is that we "freeze" all nodes when they get to height n; that is, nodes at height n are no longer considered active, and hence are not used for push and relabel. This way, at the end of the algorithm we have a preflow and height function that satisfies the invariant above, and so that all excess is at height n. Let B be the set of nodes v so that there is a path from v to t in the residual graph of the current preflow. Let A = V -B. Prove that at the end of the algorithm, (A, B) is a minimum- capacity s-t cut.

(d) The algorithm in part (c) computes a minimum s-t cut but fails to find a maximum flow (as it ends with a preflow that has excesses). Give an algorithm that takes the preflow f at the end of the algorithm of part (c) and converts it into a maximum flow in at most O(mn) time. (Hint: Consider nodes with excess, and try to send the excess back to s using only edges that the flow came on.)

48. In Section 7.4 we considered the Preflow-Push Algorithm, and discussed one particular selection rule for considering vertices. Here we will explore a different selection rule. We will also consider variants of the algorithm that terminate early (and find a cut that is close to the minimum possible).

(a) Let f be any preflow. As f is not necessarily a valid flow, it is possible that the value f out(s) is much higher than the maximum-flow value in G. Show, however, that f in(t) is a lower bound on the maximum-flow value.

(b) Consider a preflow f and a compatible labeling h. Recall that the set A = {v : There is an s-v path in the residual graph Gf }, and B = V -A defines an s-t cut for any preflow f that has a compatible labeling h.

Show that the capacity of the cut (A, B) is equal to c(A, B) = .v∈B ef(v).

Combining (a) and (b) allows the algorithm to terminate early and return (A, B) as an approximately minimum-capacity cut, assuming c(A, B) - f in(t) is sufficiently small. Next we consider an implementa- tion that will work on decreasing this value by trying to push flow out of nodes that have a lot of excess.

(c) The scaling version of the Preflow-Push Algorithm maintains a scal- ing parameter 6. We set 6 initially to be a large power of 2. The algorithm at each step selects a node with excess at least 6 with as small a height as possible. When no nodes (other than t) have ex- cess at least 6, we divide 6 by 2, and continue. Note that this is a valid implementation of the generic Preflow-Push Algorithm. The algorithm runs in phases. A single phase continues as long as 6 is unchanged. Note that 6 starts out at the largest capacity, and the algorithm terminates when 6 = 1. So there are at most O(log C) scal- ing phases. Show how to implement this variant of the algorithm so that the running time can be bounded by O(mn + n log C + K) if the algorithm has K nonsaturating push operations.

(d) Show that the number of nonsaturating push operations in the above algorithm is at most O(n2 log C). Recall that O(log C) bounds the num- ber of scaling phases. To bound the number of nonsaturating push operations in a single scaling phase, consider the potential function .v∈V h(v)ef(v)/6. What is the effect of a nonsaturating push on 8? Which operation(s) can make 8 increase?

49. Consider an assignment problem where we have a set of n stations that can provide service, and there is a set of k requests for service. Say, for example, that the stations are cell towers and the requests are cell phones. Each request can be served by a given set of stations. The problem so far can be represented by a bipartite graph G: one side is the stations, the other the customers, and there is an edge (x, y) between customer x and station y if customer x can be served from station y. Assume that each station can serve at most one customer. Using a max-flow computation, we can decide whether or not all customers can be served, or can get an assignment of a subset of customers to stations maximizing the number of served customers.

Here we consider a version of the problem with an additional compli- cation: Each customer offers a different amount of money for the service. Let U be the set of customers, and assume that customer x ∈ U is willing to pay vx ≥ 0 for being served. Now the goal is to find a subset X ⊂ U max- imizing . v∈to stations.

such that there is an assignment of the customers in X

Consider the following greedy approach. We process customers in order of decreasing value (breaking ties arbitrarily). When considering customer x the algorithm will either "promise" service to x or reject x in the following greedy fashion. Let X be the set of customers that so far have been promised service. We add x to the set X if and only if there is a way to assign X ∪ {x} to servers, and we reject x otherwise. Note that rejected customers will not be considered later. (This is viewed as an advantage: If we need to reject a high-paying customer, at least we can tell him/her early.) However, we do not assign accepted customers to servers in a greedy fashion: we only fix the assignment after the set of accepted customers is fixed. Does this greedy approach produce an optimal set of customers? Prove that it does, or provide a counterexample.

50. Consider the following scheduling problem. There are m machines, each of which can process jobs, one job at a time. The problem is to assign jobs to machines (each job needs to be assigned to exactly one machine) and order the jobs on machines so as to minimize a cost function.

The machines run at different speeds, but jobs are identical in their processing needs. More formally, each machine i has a parameter 4i, and each job requires 4i time if assigned to machine i.

There are n jobs. Jobs have identical processing needs but different levels of urgency. For each job j, we are given a cost function cj(t) that is the cost of completing job j at time t. We assume that the costs are nonnegative, and monotone in t.

A schedule consists of an assignment of jobs to machines, and on each machine the schedule gives the order in which the jobs are done. The job assigned to machine i as the first job will complete at time 4i, the second job at time 24i and so on. For a schedule S, let tS(j) denote the completion time of job j in this schedule. The cost of the schedule is cost(S) = . c (t (j)).

Give a polynomial-time algorithm to find a schedule of minimum cost.

51. Some friends of yours have grown tired of the game "Six Degrees of Kevin Bacon" (after all, they ask, isn't it just breadth-first search?) and decide to invent a game with a little more punch, algorithmically speaking. Here's how it works.

You start with a set X of n actresses and a set Y of n actors, and two players P0 and P1. Player P0 names an actress x1 ∈ X, player P1 names an actor y1 who has appeared in a movie with x1, player P0 names an actress x2 who has appeared in a movie with y1, and so on. Thus, P0 and P1 collectively generate a sequence x1, y1, x2, y2, ... such that each actor/actress in the sequence has costarred with the actress/actor immediately preceding. A player Pi (i = 0, 1) loses when it is Pi's turn to move, and he/she cannot name a member of his/her set who hasn't been named before.

Suppose you are given a specific pair of such sets X and Y , with complete information on who has appeared in a movie with whom. A strat- egy for Pi, in our setting, is an algorithm that takes a current sequence x1, y1, x2, y2, ... and generates a legal next move for Pi (assuming it's Pi's turn to move). Give a polynomial-time algorithm that decides which of the two players can force a win, in a particular instance of this game.

Request for Solution File

Ask an Expert for Answer!!
Data Structure & Algorithms: List all the minimum s-t cuts in the flow network pictured
Reference No:- TGS01630551

Expected delivery within 24 Hours