Chapter XV
Linear Programming

A linear program consists of an objective function and a number of linear constraints. The goal is to optimize (minimize or maximize) the objective function under the constraints. Though highly restrictive in that all the functions involved are linear, linear programming has a wide range of practical as well as theoretical applications. Most optimization problems we have considered so far can be cast as the solution to a linear program, sometimes with the added constraint that the solution must be integer, which makes the problem much harder.

XV.1 Examples

XV.1.1 Diet Problem

Dieter's Problem. A dieter wants to minimize the cost of food that provides the minimal nutritional requirements. The following are the parameters of the problem:

  m ... number of foods
  n ... number of nutrients
  c_j ... cost per gram of food j
  b_i ... requirement of nutrient i
  a_ij ... grams of nutrient i per gram of food j

and the variables in the problem are:

  x_j ... grams of food j taken.

The problem is formulated as a linear program as follows:

$$\begin{aligned}
\min\quad & \sum_{j=1}^{m} c_j x_j \\
\text{subj}\quad & \sum_{j=1}^{m} a_{ij} x_j \ge b_i \quad \text{for } i = 1, \dots, n \\
& x_j \ge 0 \quad \text{for } j = 1, \dots, m.
\end{aligned}$$

Pill Maker's Problem. A pill maker produces pills for each of the nutrients and wants to price them so as to maximize the profit while staying competitive with the different foods. In addition to the previous parameters, the problem has the variables:

  y_i ... price per gram of nutrient i

and is also formulated as a linear program:

$$\begin{aligned}
\max\quad & \sum_{i=1}^{n} b_i y_i \\
\text{subj}\quad & \sum_{i=1}^{n} a_{ij} y_i \le c_j \quad \text{for } j = 1, \dots, m \\
& y_i \ge 0 \quad \text{for } i = 1, \dots, n.
\end{aligned}$$

We will see later that these two problems are duals of each other. Linear programming duality shows that both have the same optimal value (if it exists).

XV.1.2 Network Flow

We consider the max-flow problem in a (directed) network $G = (V, E)$, with source and sink $s, t \in V$ and edge capacities $c_e$ for $e \in E$. Let $Out(v)$ and $In(v)$ denote the outgoing and incoming edges of $v$, and let $f_e$ denote the flow through edge $e$. The problem is formulated as a linear program as follows:

$$\begin{aligned}
\max\quad & \sum_{e \in Out(s)} f_e \\
\text{subj}\quad & \sum_{e \in Out(v)} f_e - \sum_{e \in In(v)} f_e = 0 \quad \text{for } v \in V - \{s, t\} \\
& f_e \le c_e \quad \text{for } e \in E \\
& f_e \ge 0.
\end{aligned}$$

XV.1.3 Separability in the Plane

Let $P = \{p_1, p_2, \dots, p_m\}$ and $Q = \{q_1, q_2, \dots, q_n\}$ be two sets of points in the plane. We would like to know whether there is a line $\ell$ such that $P$ and $Q$ are on opposite sides of $\ell$; to make it more interesting, let's say that points on $\ell$ are not allowed. This problem can be formulated as a linear program. Let $p_{i,1}, p_{i,2}$ and $q_{i,1}, q_{i,2}$ be the coordinates of $p_i$ and $q_i$. First, we may assume that $\ell$ is not vertical (parallel to the $x_2$-axis), for that case can be solved easily (it is a one-dimensional problem), and so let $\ell$ be described by the equation $x_2 = \alpha x_1 + \beta$. We also distinguish two cases: $P$ above $\ell$ and $Q$ below, or vice versa. We deal with the first case; the other is symmetric. So what we want is

$$p_{i,2} > \alpha p_{i,1} + \beta \quad \text{for } i = 1, \dots, m$$
$$q_{i,2} < \alpha q_{i,1} + \beta \quad \text{for } i = 1, \dots, n.$$

To obtain nonstrict inequalities, we introduce $\varepsilon \ge 0$:

$$p_{i,2} \ge \varepsilon + \alpha p_{i,1} + \beta \quad \text{for } i = 1, \dots, m$$
$$q_{i,2} \le -\varepsilon + \alpha q_{i,1} + \beta \quad \text{for } i = 1, \dots, n.$$

As objective function we take: $\max \varepsilon$. If the LP solution has $\varepsilon = 0$, or the LP has no solution, then the sets are not strictly separable.

If we insist on obtaining a linear program with only nonnegative variables, then we can set $\alpha = \alpha^+ - \alpha^-$ where $\alpha^+, \alpha^- \ge 0$, and similarly for $\beta$ (a standard trick to change a free variable into nonnegative ones). Note that this is a problem with just 3 variables (or 5). There are special algorithms that are very efficient when this is the case. In fact, there is a linear-time algorithm (with the constant depending on the dimension). We'll see this later.
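As an illustration (ours, not part of the original notes), the separability LP can be handed to an off-the-shelf solver. The sketch below uses scipy.optimize.linprog and handles only the case treated above ($P$ above $\ell$, $Q$ below); the symmetric case swaps $P$ and $Q$, and the vertical-line case is ignored. The function name and the cap $\varepsilon \le 1$ are our choices; the cap keeps the LP bounded without affecting the $\varepsilon > 0$ decision, since any feasible $\varepsilon > 0$ remains feasible when scaled down.

```python
import numpy as np
from scipy.optimize import linprog

def strictly_separable(P, Q):
    """Decide whether point sets P and Q (lists of (x1, x2) pairs) are
    strictly separable by a non-vertical line x2 = alpha*x1 + beta,
    with P above the line and Q below it.
    LP variables: [alpha, beta, eps]; we maximize eps by minimizing -eps."""
    A_ub, b_ub = [], []
    for p1, p2 in P:   # p2 >= eps + alpha*p1 + beta  <=>  alpha*p1 + beta + eps <= p2
        A_ub.append([p1, 1.0, 1.0]); b_ub.append(p2)
    for q1, q2 in Q:   # q2 <= -eps + alpha*q1 + beta  <=>  -alpha*q1 - beta + eps <= -q2
        A_ub.append([-q1, -1.0, 1.0]); b_ub.append(-q2)
    # alpha and beta are free; eps is capped at 1 so the LP stays bounded.
    res = linprog(c=[0.0, 0.0, -1.0], A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None), (None, None), (0.0, 1.0)])
    return res.success and res.x[2] > 1e-9

print(strictly_separable(P=[(0, 1), (1, 2)], Q=[(0, -1), (1, 0)]))  # True
```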
XV.2 The Simplex Algorithm

[…]

Now we have $s_1 = s_2 = 0$ and $x_1 = 2$, $x_2 = 3$, $z = 5$, which corresponds to vertex ③ in the figure. We see now that there is no advantage in increasing either of the variables $s_1, s_2$. The fact that the equations can be rewritten in this form (both coefficients of the independent variables are negative) implies that the objective function is at most 5, and so $z$ achieves its maximum under the constraints at vertex ③.

XV.2.2 General Formulation

We consider the LP

$$\max\ c^T x \quad \text{subj}\ Ax = b,\ x \ge 0,$$

where $n \ge m$, $c$ is an $n \times 1$ column vector, $A$ is an $m \times n$ matrix, $b$ is an $m \times 1$ column vector, all with entries in $\mathbb{R}$, and $x$ is an $n \times 1$ vector of variables. (If the LP has inequalities, they can be turned into equalities by using slack variables; also, free or negative variables can be changed into nonnegative ones by $x = x^+ - x^-$ or $x' = -x$, respectively. So it is without loss of generality that we take the LP in this form.) Their entries are denoted $c_j$, $a_{ij}$, $b_i$ and $x_j$, respectively. We assume that $\operatorname{rank}(A) = m$; otherwise linearly dependent rows could be removed.

For $S \subseteq \{1, 2, \dots, n\}$, let $A_S$ denote the $m \times |S|$ matrix whose columns are the columns of $A$ whose indices are in $S$, keeping the order; similarly $x_S$ and $c_S$. Let $B \subseteq \{1, \dots, n\}$ with $|B| = m$ correspond to linearly independent columns of $A$ (such a $B$ exists because $A$ has rank $m$), and let $N = \{1, \dots, n\} - B$. Such a set of columns is called a basis for the LP. Then we can rewrite the constraints as

$$A_B x_B + A_N x_N = b.$$

Since $A_B$ has rank $m$, it is invertible, so we can write

$$x_B = A_B^{-1} b - A_B^{-1} A_N x_N.$$

The variables on the left are the basic variables, while those on the right are the nonbasic ones. Similarly, the equation for the objective function can be rewritten as $z = c_B^T x_B + c_N^T x_N$. Substituting the expression above for $x_B$ in terms of $x_N$ into the expression for $z$, we get

$$z = c_B^T A_B^{-1} b + (c_N^T - c_B^T A_B^{-1} A_N) x_N.$$

If we set $x_N = 0$, then we get $x_B = \bar{b} \equiv A_B^{-1} b$ and $z = \bar{z} \equiv c_B^T A_B^{-1} b$. This type of solution, $x_N = 0$, $x_B = \bar{b}$, corresponding to a basis, is called a basic solution. If $x_B \ge 0$ then it is a basic feasible solution. It is a vertex of the polyhedron of feasible points, though we are not proving it here. Then we can write, with $\bar{A} = -A_B^{-1} A_N$ and $\bar{c}^T = c_N^T - c_B^T A_B^{-1} A_N$,

$$x_B = \bar{b} + \bar{A} x_N$$
$$z = \bar{z} + \bar{c}^T x_N.$$

This describes $x_B$ and $z$ as functions of $x_N$. $\bar{c}$ is called the vector of reduced costs of the nonbasic variables. The constraints written in this form let us formulate a condition for a basic feasible solution to be optimal.

Theorem 1. A basic feasible solution ($x_B = \bar{b}$, $x_N = 0$) is optimal iff $\bar{c} \le 0$ (the reduced costs are nonpositive).

Let's write these equations more explicitly:

$$x_k = \bar{b}_k + \sum_{j \in N} \bar{a}_{kj} x_j \quad k \in B$$
$$z = \bar{z} + \sum_{j \in N} \bar{c}_j x_j$$

where $\bar{b}_k$, $\bar{a}_{kj}$, $\bar{c}_j$ are the entries of $\bar{b}$, $\bar{A}$, $\bar{c}$. These equations are called the dictionary for this basic (feasible) solution.
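The quantities $\bar{b}$, $\bar{A}$, $\bar{c}$, $\bar{z}$ can be computed in a few lines of numpy. The following is a minimal sketch (the helper name and variable names are ours); it assumes the columns selected by `basis` are linearly independent.

```python
import numpy as np

def dictionary(A, b, c, basis):
    """Dictionary of  max c.x  s.t.  Ax = b, x >= 0  at the given basis
    (a list of m column indices with A_B invertible). Inputs are numpy
    float arrays. Returns (b_bar, A_bar, c_bar, z_bar, nonbasis)."""
    m, n = A.shape
    nonbasis = [j for j in range(n) if j not in basis]
    A_B, A_N = A[:, basis], A[:, nonbasis]
    A_B_inv = np.linalg.inv(A_B)
    b_bar = A_B_inv @ b                                # x_B when x_N = 0
    A_bar = -A_B_inv @ A_N                             # x_B = b_bar + A_bar x_N
    c_bar = c[nonbasis] - c[basis] @ A_B_inv @ A_N     # reduced costs
    z_bar = c[basis] @ b_bar                           # objective value
    return b_bar, A_bar, c_bar, z_bar, nonbasis
```

By Theorem 1, the basis is optimal when every entry of c_bar is nonpositive, and it is feasible when b_bar is nonnegative componentwise.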
Simplex Algorithm: Phase II

The simplex algorithm starts with a basic feasible solution and iteratively pivots from one feasible solution to another, until it reaches one where the optimality criterion in the theorem above holds. A pivoting step in the simplex algorithm proceeds as follows:

1. If $\bar{c}_j \le 0$ for all $j \in N$, then the current basic solution is optimal (terminate).
2. Otherwise, let $j \in N$ be such that $\bar{c}_j > 0$:
   2.1 If $\bar{a}_{kj} \ge 0$ for all $k \in B$, then the LP is unbounded (terminate).
   2.2 Otherwise, let
   $$\Delta = \min\left\{ \frac{\bar{b}_i}{|\bar{a}_{ij}|} : i \in B,\ \bar{a}_{ij} < 0 \right\}$$
   and let $i$ be an index for which $\Delta = \bar{b}_i/|\bar{a}_{ij}|$ (note that $\bar{b}_i \ge 0$, so $\Delta \ge 0$; furthermore, under "generic" conditions all basic variables are nonzero and so $\Delta > 0$).
3. The new set of basic variables is $B' \leftarrow B - \{i\} \cup \{j\}$ and $N' \leftarrow N - \{j\} \cup \{i\}$. The new dictionary is obtained by solving for $x_j$:
   $$x_j = \frac{1}{\bar{a}_{ij}}\left( x_i - \bar{b}_i - \sum_{l \in N - \{j\}} \bar{a}_{il} x_l \right) = \frac{-\bar{b}_i}{\bar{a}_{ij}} + \frac{1}{\bar{a}_{ij}} x_i + \sum_{l \in N - \{j\}} \frac{-\bar{a}_{il}}{\bar{a}_{ij}} x_l$$
   and substituting this expression for $x_j$ into the equations for the other $x_k$, $k \in B$:
   $$x_k = \bar{b}_k + \sum_{l \in N} \bar{a}_{kl} x_l = \left( \bar{b}_k - \frac{\bar{a}_{kj}\bar{b}_i}{\bar{a}_{ij}} \right) + \frac{\bar{a}_{kj}}{\bar{a}_{ij}} x_i + \sum_{l \in N - \{j\}} \left( \bar{a}_{kl} - \frac{\bar{a}_{kj}\bar{a}_{il}}{\bar{a}_{ij}} \right) x_l,$$
   and also into the expression for $z$:
   $$z = \bar{z} + \sum_{l \in N} \bar{c}_l x_l = \left( \bar{z} - \frac{\bar{c}_j \bar{b}_i}{\bar{a}_{ij}} \right) + \frac{\bar{c}_j}{\bar{a}_{ij}} x_i + \sum_{l \in N - \{j\}} \left( \bar{c}_l - \frac{\bar{c}_j \bar{a}_{il}}{\bar{a}_{ij}} \right) x_l.$$

In summary, the new dictionary is:

$$x_j = \frac{-\bar{b}_i}{\bar{a}_{ij}} + \frac{1}{\bar{a}_{ij}} x_i + \sum_{l \in N - \{j\}} \frac{-\bar{a}_{il}}{\bar{a}_{ij}} x_l$$
$$x_k = \left( \bar{b}_k - \frac{\bar{a}_{kj}\bar{b}_i}{\bar{a}_{ij}} \right) + \frac{\bar{a}_{kj}}{\bar{a}_{ij}} x_i + \sum_{l \in N - \{j\}} \left( \bar{a}_{kl} - \frac{\bar{a}_{kj}\bar{a}_{il}}{\bar{a}_{ij}} \right) x_l \quad \text{for } k \in B - \{i\}$$
$$z = \left( \bar{z} - \frac{\bar{c}_j \bar{b}_i}{\bar{a}_{ij}} \right) + \frac{\bar{c}_j}{\bar{a}_{ij}} x_i + \sum_{l \in N - \{j\}} \left( \bar{c}_l - \frac{\bar{c}_j \bar{a}_{il}}{\bar{a}_{ij}} \right) x_l$$

Theorem 2. If $\Delta > 0$ in each pivoting step, then the simplex algorithm terminates with an optimal basic feasible solution.

Proof. Since $\Delta > 0$, the objective value increases by $\bar{c}_j \Delta > 0$ in each step, so no basis is repeated; and since the number of bases is at most $\binom{n}{m}$, the algorithm halts after at most that many pivoting steps.

Cycling and Perturbation

In a generic situation, all basic variables are nonzero and so $\Delta > 0$. Unfortunately, nongeneric or degenerate situations are possible, in which basic variables may be zero and then $\Delta = 0$. If this happens and the choice of the new basic variable is arbitrary, then a basis can be repeated, the algorithm may cycle, and termination cannot be guaranteed. Fortunately, degenerate situations can be removed by perturbations.¹ There are several known perturbation methods or rules. One is to replace $b$ (in $Ax = b$) with $b + \vec{\varepsilon}$, where $\vec{\varepsilon}_j = \varepsilon^{j+1}$ and $\varepsilon > 0$ is arbitrarily small. This perturbation is implemented symbolically: $\varepsilon$ is not substituted with an actual value; rather, a lexicographic rule is derived. Specifically, the following rule is derived (we don't verify it here) to choose among $k, k'$ for both of which the value $\Delta$ above is achieved:

- find the minimum $l$ for which $\bar{a}_{kl}/|\bar{a}_{kj}| \ne \bar{a}_{k'l}/|\bar{a}_{k'j}|$ and prefer the one for which this value is smallest.

¹ Think of a set of lines in the plane. In the generic situation, no more than two lines intersect at a point. In a degenerate situation, more than two lines can intersect at a point. However, if each line is "perturbed" by a rotation through a very small random angle around a random point, then in the perturbed set of lines, no more than two intersect at a point with probability 1.
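The pivoting formulas above translate directly into code. Here is a minimal sketch of one Phase II pivot acting on the output of the hypothetical `dictionary` helper from before; it enters the first improving column and, as just discussed, implements no anti-cycling rule, so it may cycle on degenerate inputs.

```python
import numpy as np

def pivot_step(b_bar, A_bar, c_bar, z_bar, basis, nonbasis):
    """One Phase II pivot on the dictionary
        x_B = b_bar + A_bar x_N,   z = z_bar + c_bar . x_N.
    `basis`/`nonbasis` are lists of variable indices, updated in place.
    Returns 'optimal', 'unbounded', or the new (b_bar, A_bar, c_bar, z_bar)."""
    eps = 1e-12
    entering = [j for j in range(len(nonbasis)) if c_bar[j] > eps]
    if not entering:                      # step 1: all reduced costs <= 0
        return "optimal"
    j = entering[0]                       # any j with c_bar_j > 0 will do
    rows = [k for k in range(len(basis)) if A_bar[k, j] < -eps]
    if not rows:                          # step 2.1: no negative entry
        return "unbounded"
    # Step 2.2: ratio test; i achieves Delta = min b_bar_i / |a_bar_ij|.
    i = min(rows, key=lambda k: b_bar[k] / -A_bar[k, j])
    a_ij = A_bar[i, j]
    # Step 3: rewrite the dictionary with x_j basic and x_i nonbasic.
    new_b = b_bar - (A_bar[:, j] / a_ij) * b_bar[i]
    new_A = A_bar - np.outer(A_bar[:, j], A_bar[i, :]) / a_ij
    new_A[:, j] = A_bar[:, j] / a_ij      # column of the new nonbasic x_i
    new_A[i, :] = -A_bar[i, :] / a_ij     # row of the new basic x_j
    new_A[i, j] = 1.0 / a_ij
    new_b[i] = -b_bar[i] / a_ij
    new_c = c_bar - (c_bar[j] / a_ij) * A_bar[i, :]
    new_c[j] = c_bar[j] / a_ij
    new_z = z_bar - c_bar[j] * b_bar[i] / a_ij
    basis[i], nonbasis[j] = nonbasis[j], basis[i]
    return new_b, new_A, new_c, new_z
```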
[…] To keep the LP bounded we add four bounding constraints $-M \le x_1 \le M$ and $-M \le x_2 \le M$, where $M > 0$ is arbitrarily large (a value for $M$ can be estimated from the coefficients of the constraints or, better, $M$ can be handled "symbolically," just by answering appropriately all the predicates in which it is involved; the details are left as an exercise...). To achieve uniqueness, we specify that we want the lexicographically smallest² feasible point that minimizes the objective function. We consider an ordering $h_1, h_2, \dots, h_n$ of the constraints (halfspaces) and add them in sequence, updating the LP solution as they are added.

More precisely, let $H_i$ be the union of the four bounding constraints and $h_1, \dots, h_i$. The algorithm is based on the following observation:

Observation 3. If $LP(H_{i-1}, f)$ is infeasible, so is $LP(H_i, f)$. Otherwise, let $v_{i-1}$ be the solution of $LP(H_{i-1}, f)$. (i) If $v_{i-1}$ satisfies $h_i$, then $v_i = v_{i-1}$ is also the solution for $H_i$. (ii) If $v_{i-1}$ does not satisfy $h_i$, then $v_i$ can be determined as the solution to $LP(H_{i-1} \cap \ell, f)$, where $\ell$ is the bounding line of $h_i$ (in particular, if $LP(H_{i-1} \cap \ell, f)$ is infeasible, so is $LP(H, f)$).

Proof. (i) is clear. For (ii), let $C_i = P(H_i)$. If $v_i$ is not on $\ell$, consider the line segment $s = v_{i-1} v_i$. $s$ crosses $\ell$, since $v_{i-1}$ is on one side and $v_i$ on the other. Let $x$ be the intersection point; note that $x$ is feasible for $H_i$ (it lies in $C_{i-1}$ and on $\ell$). Clearly $f(v_i) \ge f(v_{i-1})$. If $f(v_i) > f(v_{i-1})$ then, because $f$ is linear, $f(v_i) > f(x)$, a contradiction. If $f(v_i) = f(v_{i-1})$ then $f(v_i) = f(x)$ and $x$ is lexicographically smaller than $v_i$, also a contradiction.

[Figure: the feasible region $C_i$, the new constraint $h_i$, and the points $v_{i-1}$ and $v_i$.]

Summary. The following box summarizes the algorithm:

Incremental-2D-LP({h_1, h_2, ..., h_n}, f)
  let v_0 be the optimum for the four bounding constraints
  for i = 1 to n do
    if v_{i-1} ∈ h_i then v_i ← v_{i-1}
    else v_i ← LP({h_1, ..., h_{i-1}} ∩ ℓ, f) where ℓ is the bounding line of h_i
      if v_i = infeasible then return infeasible
  return v_n

² In the lexicographic ordering, $(x_1, x_2)$ is smaller than $(y_1, y_2)$ if either $x_1 < y_1$, or $x_1 = y_1$ and $x_2 < y_2$.

Analysis. Solving a 1-dimensional linear program takes linear time. So in the $i$-th step, either $v_{i-1} \in h_i$ and the running time is $O(1)$, or $v_{i-1} \notin h_i$ and the running time is $O(i)$. A worst-case input sequence in which the latter always happens is possible, and so the worst-case running time is $O(n^2)$. On the other hand, for a fixed input, consider a sequence which is a random permutation: what is the probability of the latter case happening in step $i$? It is convenient to use backward analysis again. Fix the subset $\{h_1, \dots, h_i\}$ of the first $i$ constraints (but not the sequence!). The probability that the expensive case happens is equal to the probability that one of the constraints that determine $v_i$ is $h_i$, the last constraint added. That is at most $2/i$, because $v_i$ is determined by at most two constraints. Since this holds for any fixed subset $\{h_1, \dots, h_i\}$, it also holds unconditionally. Therefore the expected running time is

$$\sum_{i=1}^{n} \left( O(1) + O(i) \cdot \frac{2}{i} \right) = O(n).$$

XV.3.2 General Case

The previous algorithm can be generalized to arbitrary dimension $d$ (the number of variables). As before, $H$ is a set of constraining halfspaces. We prefer to formulate it as a recursive algorithm rather than as an incremental one. Again, to keep it simple, we handle the problem of always forcing a bounded solution by adding the set of constraints $H_0$ containing the halfspaces $-M \le x_i \le +M$.

Random-LP(H, B)
1. if |H| = 0 or |B| = d then
2.   compute and return the optimal solution with constraints B ∪ H_0
3. choose h ∈ H uniformly at random
4. v ← Random-LP(H − {h}, B)
5. if v ∉ h then v ← Random-LP(H − {h}, B ∪ {h})
6. return v

The algorithm is first called with $B = \emptyset$. Defining constraints are then added to $B$ as they are found. The proof of correctness is essentially the same as for the incremental 2-d algorithm.
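For concreteness, here is a self-contained sketch of the planar algorithm with a random insertion order (our simplified rendering, not the notes' code): it minimizes $c \cdot (x_1, x_2)$ over halfplanes $a_1 x_1 + a_2 x_2 \le b$ inside the bounding box $[-M, M]^2$, breaks ties by objective value only rather than lexicographically, and assumes every halfplane has a nonzero normal.

```python
import random

def solve_1d(constraints, c, M):
    """Minimize c*x subject to a*x <= b for each (a, b), within [-M, M].
    Returns the optimal x, or None if infeasible."""
    lo, hi = -M, M
    for a, b in constraints:
        if a > 0:
            hi = min(hi, b / a)
        elif a < 0:
            lo = max(lo, b / a)
        elif b < 0:
            return None                   # 0*x <= b with b < 0: infeasible
    if lo > hi:
        return None
    return lo if c >= 0 else hi

def incremental_lp_2d(halfplanes, c, M=1e6):
    """Minimize c . (x1, x2) s.t. a1*x1 + a2*x2 <= b for each (a1, a2, b),
    inside [-M, M]^2, inserting the constraints in random order."""
    hs = list(halfplanes)
    random.shuffle(hs)
    box = [(1, 0, M), (-1, 0, M), (0, 1, M), (0, -1, M)]
    v = (-M if c[0] > 0 else M, -M if c[1] > 0 else M)   # v_0: box optimum
    for i, (a1, a2, b) in enumerate(hs):
        if a1 * v[0] + a2 * v[1] <= b + 1e-9:
            continue                      # v_{i-1} satisfies h_i: keep it
        one_d = []                        # reduce to a 1-d LP on the line
        if abs(a2) > abs(a1):             # parametrize by x1: x2 = (b - a1*x1)/a2
            for u1, u2, ub in hs[:i] + box:
                one_d.append((u1 - u2 * a1 / a2, ub - u2 * b / a2))
            t = solve_1d(one_d, c[0] - c[1] * a1 / a2, M)
            if t is None:
                return None               # the whole LP is infeasible
            v = (t, (b - a1 * t) / a2)
        else:                             # parametrize by x2: x1 = (b - a2*x2)/a1
            for u1, u2, ub in hs[:i] + box:
                one_d.append((u2 - u1 * a2 / a1, ub - u1 * b / a1))
            t = solve_1d(one_d, c[1] - c[0] * a2 / a1, M)
            if t is None:
                return None
            v = ((b - a2 * t) / a1, t)
    return v

# minimize x1 + x2 subject to x1 >= 1 and x2 >= 2:
print(incremental_lp_2d([(-1, 0, -1), (0, -1, -2)], (1, 1)))  # ~ (1.0, 2.0)
```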
The expected running time satisfies the recurrence (where $m = |H|$ at a recursive call and $d' = d - |B|$):

$$T(m, d') \le 1 + T(m-1, d') + \frac{d'}{m} \cdot T(m-1, d'-1)$$

for $d' \ge 1$, $m > 0$, and $T(0, d') = 1$, $T(m, 0) = 1$ for all $d'$ and $m$. We have used that the optimum solution is determined by $d'$ constraints ($d$ in total, but $|B|$ of them are already in $B$), and so the probability of choosing a defining constraint in step 3 is at most $d'/m$. It can be verified by induction (but it is a bit tricky; one has to choose the right induction hypothesis) that $T(m, d') = O(d'!\,m)$; at the top level this is $O(d!\,n)$.

Exercise. Make sure that you understand what is going on, at least for $d = 0, 1, 2$ (in which case it is essentially the same incremental algorithm).

The handling of unboundedness, which we have done with $H_0$, can be done better by first determining $d$ constraints that bound the LP, or finding that the LP is unbounded. This is omitted. But...

Exercise. Think about the 2-d case.

XV.4 Duality

Duality plays an important role in the theory of linear programming. It associates to an LP (the primal LP, PLP) another LP (the dual LP, DLP) which provides upper bounds (in the case of maximization) on the value of the PLP. We consider first a PLP of the particular form used for the simplex algorithm, because then we can readily use some byproducts of the description of the simplex algorithm. The general case can be derived similarly and is summarized below. So consider

PLP: $\max\ c^T x$ subj $Ax = b$, $x \ge 0$
DLP: $\min\ y^T b$ subj $y^T A \ge c^T$, $y$ free

Theorem 4 (Weak Duality). Let $x$ and $y$ be feasible solutions of PLP and DLP, respectively. Then $c^T x \le y^T b$. Also, (i) if optimal solutions $x^*$ and $y^*$ exist, then $c^T x^* \le y^{*T} b$; (ii) if PLP is unbounded then DLP is infeasible, and if DLP is unbounded then PLP is infeasible.

Proof. Multiplying $Ax = b$ on the left by $y^T$, and $y^T A \ge c^T$ on the right by $x \ge 0$, we get $c^T x \le y^T A x = y^T b$. (i) follows by taking the max on the left and the min on the right. (ii) follows because if DLP is feasible then $c^T x \le y^T b$ implies that PLP is bounded, and similarly in the other direction.
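To see duality on concrete numbers, the sketch below (an invented 2-food, 2-nutrient instance; names are ours) solves the dieter's LP of XV.1.1 and the corresponding pill maker's LP with scipy.optimize.linprog. The two optimal values coincide, as the strong duality theorem stated next asserts.

```python
import numpy as np
from scipy.optimize import linprog

a = np.array([[1.0, 2.0],     # a[i][j]: grams of nutrient i per gram of food j
              [3.0, 1.0]])
cost = np.array([2.0, 3.0])   # c_j: cost per gram of food j
req = np.array([4.0, 6.0])    # b_i: requirement of nutrient i

# Dieter (primal):   min cost.x  s.t.  a x >= req, x >= 0.
diet = linprog(c=cost, A_ub=-a, b_ub=-req)    # flip signs to get <= form
# Pill maker (dual): max req.y   s.t.  a^T y <= cost, y >= 0.
pill = linprog(c=-req, A_ub=a.T, b_ub=cost)   # maximize by minimizing -req.y

print(diet.fun, -pill.fun)    # both 6.8: equal optimal values
```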
Theorem 5 (Strong Duality). If PLP is feasible and has a finite optimum, then DLP is feasible and has a finite optimum. Furthermore, if $x^*$ and $y^*$ are optimal solutions for PLP and DLP, then $c^T x^* = y^{*T} b$.

Proof. From the development of the simplex algorithm, we have that for a basis $B$:

$$x_B = A_B^{-1} b - A_B^{-1} A_N x_N$$
$$z = c_B^T A_B^{-1} b + (c_N^T - c_B^T A_B^{-1} A_N) x_N.$$

For an optimal basic solution $x^*$, in particular, $z^* = c^T x^* = c_B^T A_B^{-1} b$ and $c_N^T - c_B^T A_B^{-1} A_N \le 0$.

[…]

XV.5.2 Minimum s-t-Cut Problem

We consider the same context as for the flow problem. A cut is a set of edges determined by a partition $S$ and $T = V - S$ of the vertices with $s \in S$, $t \in T$: the cut $C$ consists of the edges $(u, v)$ with $u \in S$ and $v \in T$. The cost of $C$ is $\sum_{e \in C} c_e$, and we are interested in a cut of minimum cost. An alternative way to view a cut is simply as a set of edges that separates $s$ and $t$: any $s$-$t$-path must go through an edge in $C$. The two definitions are not the same in general, but for minimum cuts they coincide (except that a min-cut in the second definition may have extra edges with cost 0).

A trivial ILP formulation of the min-cut problem can be written in terms of the alternative definition:

$$\begin{aligned}
\min\quad & \sum_{e \in E} c_e y_e \\
\text{subj}\quad & \sum_{e \in p} y_e \ge 1 \quad \text{for each } s\text{-}t\text{-path } p \text{ in } G \\
& y_e \in \{0, 1\} \quad \text{for each } e \in E,
\end{aligned}$$

where $y_e$ plays the role of an indicator variable: $y_e = 1$ iff $e \in C$. This formulation is not good because it has a very large number of constraints (the number of $s$-$t$-paths may be exponential).³ An alternative formulation of polynomial size uses additional variables $\pi_v \in \{0, 1\}$ for the vertices, with the meaning: $\pi_v = 0$ iff $v$ can be reached from $s$ without using an edge in $C$.

³ However, one interesting byproduct of the ellipsoid algorithm is that it is possible to handle this type of LP in polynomial time! (In some implicit way: of course, you wouldn't want to even write down the constraints.)

The ILP is:

$$\begin{aligned}
\min\quad & \sum_{e \in E} c_e y_e \\
\text{subj}\quad & \pi_u - \pi_v + y_e \ge 0 \quad \text{for each } e = (u, v) \in E \\
& y_e \in \{0, 1\} \quad \text{for each } e \in E \\
& \pi_u \in \{0, 1\} \quad \text{for each } u \in V \\
& \pi_s = 0 \\
& \pi_t = 1.
\end{aligned}$$

Claim 7. Given assignments to $y_e$, $e \in E$, there are assignments $\pi_u$, $u \in V$, such that the $y_e$'s and $\pi_u$'s form a feasible solution to the ILP above iff the $y_e$'s are the indicators of a cut (that is, $\{e \in E : y_e = 1\}$ is a cut).

Proof. Let's check first that if $C$ is a cut with indicators $y_e$, $e \in E$, then there is also an assignment to the $\pi_u$'s so that the $y_e$'s and $\pi_u$'s together form a feasible solution of the ILP. Assign $\pi_u = 0$ to the vertices reachable from $s$ through edges not in $C$, and $\pi_u = 1$ to any other vertex (that is, not so reachable). The constraint $\pi_u - \pi_v + y_e \ge 0$ trivially holds if $\pi_u \ge \pi_v$, so consider $\pi_u < \pi_v$, which implies $\pi_u = 0$ and $\pi_v = 1$. Then there is a path from $s$ to $u$ avoiding $C$, but no such path from $s$ to $v$. This means that $e = (u, v) \in C$. So $y_e = 1$ and the constraint holds.

Now we consider the other direction. Let $y_e$, $e \in E$, and $\pi_u$, $u \in V$, be a feasible solution of the ILP. Let $C$ be the set of edges $e = (u, v)$ such that $y_e = 1$, and consider an $s$-$t$-path $p$. The sum $\sum_{e=(u,v) \in p} (\pi_v - \pi_u)$ telescopes to $\pi_t - \pi_s = 1$, so for some edge $e = (u, v)$ on $p$, $\pi_v - \pi_u \ge 1$ must hold, and so by the ILP constraint $y_e \ge \pi_v - \pi_u \ge 1$. So $y_e = 1$ and $e \in C$; that is, every $s$-$t$-path goes through $C$.

So we have an ILP very similar to that for the dual of the max-flow LP, except for the 0/1 constraints on the variables. We can argue that these constraints can be dropped and the solutions $y_e$ are still the indicators of a min-cut:

(i) First, the 0/1 constraints can be relaxed to nonnegative integers. Because of the minimization, the optimum will still be achieved when the variables have 0/1 values.

(ii) Second, the integrality constraints can be relaxed to nonnegative reals, and still the basic solutions are integral (there may be nonbasic solutions that are noninteger, though). This is based on the fact that the matrix of constraints is totally unimodular. This is explained later.

Thus we get that the dual of the max-flow LP is indeed an LP for the min-cut problem. The statement of the max-flow min-cut theorem is just an instance of strong LP duality.

XV.6 Complementary Slackness

Consider the following PLP and DLP pair:

PLP: $\max\ c^T x$ subj $Ax \le b$, $x \ge 0$
DLP: $\min\ y^T b$ subj $y^T A \ge c^T$, $y \ge 0$

Theorem 8 (Complementary Slackness). If $x^*$ is an optimal solution of PLP and $y^*$ is an optimal solution of DLP, then

$$(y^{*T} A - c^T)\, x^* = 0 \quad\text{and}\quad y^{*T} (b - A x^*) = 0.$$

In particular, if $x_i^* > 0$ then $y^{*T} A_i = c_i$, where $A_i$ is the $i$-th column of $A$ (if a primal variable is nonzero, then the corresponding dual constraint is tight); and if $y_j^* > 0$ then $b_j = A^j x^*$, where $A^j$ is the $j$-th row of $A$ (if a dual variable is nonzero, then the corresponding primal constraint is tight).

Proof. From weak duality, we have $c^T x^* \le y^{*T} A x^* \le y^{*T} b$. Since by strong duality $c^T x^* = y^{*T} b$, the two inequalities in the sequence must be equalities: $y^{*T} b = y^{*T} A x^* = c^T x^*$, and so the stated equations hold. Considering each component, we see that for the first equality to hold, if $y_j^* > 0$ then $b_j = A^j x^*$; and for the second equality to hold, if $x_i^* > 0$ then $y^{*T} A_i = c_i$.
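Continuing the invented diet instance from the duality sketch in XV.4, complementary slackness can be checked numerically. That pair is in inequality form with the primal a minimization, so the conditions mirror Theorem 8: a positive food amount $x_j$ forces the $j$-th dual constraint to be tight, and a positive nutrient price $y_i$ forces the $i$-th primal constraint to be tight.

```python
# x and y are the optimal solutions from the duality sketch in XV.4.
x, y = diet.x, pill.x
print(np.allclose((cost - a.T @ y) * x, 0.0))  # x_j > 0 => dual tight: True
print(np.allclose((req - a @ x) * y, 0.0))     # y_i > 0 => primal tight: True
```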
Application to Max-Flow, Min-Cut. Exercise. Verify that complementary slackness is just the statement that, in corresponding optimal primal and dual solutions, the minimum-cut edges are precisely the edges saturated by the max flow.

XV.7 Integer Linear Programming

XV.7.1 NP-Hardness of ILP

The integer linear programming (ILP) problem has the following form:

$$\max\ c^T x \quad \text{subj}\ Ax \le b,\ x \in \mathbb{Z}^n,$$

where $c$, $A$, $b$ have integer entries. The feasibility problem (whether there are solutions to the constraints) is already NP-complete. The fact that it is in NP is not entirely trivial, because one needs to show that if the constraints have a solution, then there is such a solution of size polynomial in the size of the input. Here size involves how many bits are needed to represent all the numbers involved, and we omit proving this here.

Lemma 9. The ILP decision problem is NP-hard.

Proof. We use a reduction from 3-SAT. Let $\varphi = C_1 \wedge C_2 \wedge \dots \wedge C_m$ be a 3-CNF formula, where each $C_i = \ell_{i,1} \vee \ell_{i,2} \vee \ell_{i,3}$ and each literal $\ell_{i,j}$ is either some $x_k$ or $\neg x_k$. An ILP $\Phi$ is constructed from $\varphi$ as follows. $\Phi$ has an integer variable $z_i$ for each boolean variable $x_i$ in $\varphi$. For a literal $\ell$, $f(\ell)$ is defined by the substitutions

$$f(\ell) = \begin{cases} z_i & \text{if } \ell = x_i \\ 1 - z_i & \text{if } \ell = \neg x_i \end{cases}$$

and for a clause $C = \ell_1 \vee \ell_2 \vee \ell_3$, $f(C) = f(\ell_1) + f(\ell_2) + f(\ell_3)$. The ILP has the following constraints:

$$\begin{aligned}
& f(C_j) \ge 1 \quad \text{for } j = 1, \dots, m \\
& z_i \le 1 \quad \text{for all } i \\
& z_i \ge 0 \quad \text{for all } i \\
& z_i \in \mathbb{Z} \quad \text{for all } i.
\end{aligned}$$

It is very easy to check that $\varphi$ is satisfiable iff $\Phi$ is feasible.

XV.7.2 Unimodularity

For the network flow problem, we know that when the capacities are integral, there is an integral optimal solution. This is a property of many linear programs: their basic solutions are integral, and so integrality constraints may be dropped. An $m \times n$ matrix $A$ with $m \le n$ is called unimodular if every $m \times m$ submatrix of $A$ has determinant $+1$, $0$, or $-1$. If $A$ is unimodular and $B$ is an invertible $m \times m$ submatrix of $A$, then $B^{-1}$ is integral (this follows from Cramer's rule for the inverse).
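As a quick sanity check of the definition, here is a small brute-force sketch (ours; exponential, for tiny matrices only, since there are $\binom{n}{m}$ submatrices), illustrated on the reduced vertex-edge incidence matrix of a three-vertex network, the kind of matrix behind the integrality of flow LPs.

```python
import numpy as np
from itertools import combinations

def is_unimodular(A):
    """Check that every m x m submatrix of the m x n matrix A (m <= n)
    has determinant in {-1, 0, +1}. Brute force over all column subsets."""
    m, n = A.shape
    return all(round(np.linalg.det(A[:, cols])) in (-1, 0, 1)
               for cols in combinations(range(n), m))

# Incidence matrix (row for t dropped) of the network with edges
# s->u, u->t, s->t; rows s, u; columns su, ut, st.
A = np.array([[1.0, 0.0, 1.0],
              [-1.0, 1.0, 0.0]])
print(is_unimodular(A))   # True
```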