Conjugate Gradient Methods for Solving Linear Systems and Optimization Problems

These slides present the conjugate gradient method, an iterative algorithm used to solve linear systems of equations and optimization problems. They cover topics such as the Taylor expansion, the golden section search, and the method of conjugate directions, explain the importance of eigenvalues and eigenvectors in the context of iterative methods, and provide a summary of the algorithm's convergence analysis.

Conjugate Gradient Methods

Optimization Problems
• The solution of equations can be formulated as an optimization problem, e.g., density functional theory in electronic structure, the conformation of proteins, etc.
• Minimization with constraints arises in operations research (linear programming, optimality conditions in management science, the traveling salesman problem, etc.).

Bracketing and Search in 1D
• To bracket a minimum means that for given a < b < c we have f(b) < f(a) and f(b) < f(c). There is then a minimum in the interval (a, c).

How accurately can we locate a minimum?
• Let b be a minimum of the function f(x). Taylor expanding around b, we have
  f(x) ≈ f(b) + (1/2) f''(b) (x - b)^2
• The best we can do is to stop when the quadratic correction term reaches machine epsilon ε relative to the function value, so
  |x - b| / |b| ≈ √ε · √( 2 |f(b)| / (b^2 f''(b)) )

Golden Section Search
• Choose x such that the ratio of the intervals [a, b] to [b, c] is the same as that of [a, x] to [x, b]. Remove [a, x] if f(x) > f(b), or remove [b, c] if f(x) < f(b); a short Python sketch follows this slide.
• The asymptotic limit of the ratio is the golden mean, (√5 - 1)/2 ≈ 0.61803.
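To make the procedure concrete, here is a minimal Python sketch of the golden section search described above. It assumes a bracketing triple a < b < c with f(b) < f(a) and f(b) < f(c) is already available; the function name golden_section, the tolerance tol, and the example function are illustrative choices, not part of the original slides.

```python
import math

def golden_section(f, a, b, c, tol=1e-8):
    """Locate a minimum of f given a bracket a < b < c with f(b) < f(a), f(b) < f(c)."""
    w = (3.0 - math.sqrt(5.0)) / 2.0      # 1 - golden mean, approximately 0.38197
    while (c - a) > tol:                   # absolute tolerance, kept simple for the sketch
        # Place the trial point x inside the larger of the two sub-intervals.
        if (b - a) > (c - b):
            x = b - w * (b - a)
            if f(x) < f(b):
                c, b = b, x                # keep the bracket (a, x, b)
            else:
                a = x                      # keep the bracket (x, b, c)
        else:
            x = b + w * (c - b)
            if f(x) < f(b):
                a, b = b, x                # keep the bracket (b, x, c)
            else:
                c = x                      # keep the bracket (a, b, x)
    return b

# Example: minimum of (x - 1.5)^2, bracketed by (0, 1, 3)
print(golden_section(lambda x: (x - 1.5) ** 2, 0.0, 1.0, 3.0))
```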
Local Properties near a Minimum
• Let P be some point of interest, taken as the origin x = 0. Taylor expansion gives
  f(x) = f(P) + Σ_i (∂f/∂x_i) x_i + (1/2) Σ_{i,j} (∂²f/∂x_i∂x_j) x_i x_j + ... ≈ c - b^T x + (1/2) x^T A x
  (the superscript T denotes the transpose of a matrix).
• Minimizing f is the same as solving the equation A x = b.

Search along Coordinate Directions
• Search for the minimum along the x direction, followed by a search along the y direction, and so on. Such a method takes a very large number of steps to converge. (In the figure, the curved loops represent contours f(x, y) = const.)

Conjugate Condition
• n_1^T A n_2 = 0
• Make a linear coordinate transformation such that the contours become circular and the (search) vectors are orthogonal.

Conjugate-gradient-type methods and key references:
• CG – Conjugate Gradient (CG) Method: M. R. Hestenes & E. Stiefel, 1952
• BiCG – Biconjugate Gradient (BiCG) Method: C. Lanczos, 1952; D. A. H. Jacobs, 1981; C. F. Smith et al., 1990; R. Barrett et al., 1994
• CGS – Conjugate Gradient Squared (CGS) Method (MATLAB function): P. Sonneveld, 1989
• GMRES – Generalized Minimal Residual (GMRES) Method: Y. Saad & M. H. Schultz, 1986; R. Barrett et al., 1994; Y. Saad, 1996
• QMR – Quasi-Minimal-Residual (QMR) Method: R. Freund & N. Nachtigal, 1990; N. Nachtigal, 1991; R. Barrett et al., 1994; Y. Saad, 1996

cgs – Conjugate Gradients Squared method (MATLAB)
Syntax:
  x = cgs(A,b)
  x = cgs(A,b,tol)
  x = cgs(A,b,tol,maxit)
  x = cgs(A,b,tol,maxit,M)
  x = cgs(A,b,tol,maxit,M1,M2)
  x = cgs(A,b,tol,maxit,M1,M2,x0)
  x = cgs(afun,b,tol,maxit,m1fun,m2fun,x0,p1,p2,...)
  [x,flag] = cgs(A,b,...)
  [x,flag,relres] = cgs(A,b,...)
  [x,flag,relres,iter] = cgs(A,b,...)
  [x,flag,relres,iter,resvec] = cgs(A,b,...)

Conjugate Gradient Method
1. Introduction, Notation and Basic Terms
2. Eigenvalues and Eigenvectors
3. The Method of Steepest Descent
4. Convergence Analysis of the Method of Steepest Descent
5. The Method of Conjugate Directions
6. The Method of Conjugate Gradients
7. Convergence Analysis of the Conjugate Gradient Method
8. Complexity of the Conjugate Gradient Method
9. Preconditioning Techniques
10. Conjugate Gradient Type Algorithms for Non-Symmetric Matrices (CGS, Bi-CGSTAB Method)

1. Introduction, Notation and Basic Terms (Cont'd)
• A matrix is a rectangular array of numbers, called elements.
• The transpose of an m×n matrix A is the n×m matrix A^T with elements a_ij^T = a_ji.
• Two square n×n matrices A and B are similar if there is a nonsingular matrix S such that B = S^-1 A S.
• Matrices having a single row are called row vectors; matrices having a single column are called column vectors.
  Row vector: a = [a_1, a_2, ..., a_n];  column vector: a = (a_1, a_2, ..., a_n)
• The inner product of two vectors is written as x^T y = Σ_{i=1}^{n} x_i y_i.

1. Introduction, Notation and Basic Terms (Cont'd)
• A matrix is called symmetric if a_ij = a_ji.
• A matrix is positive definite if, for every nonzero vector x, x^T A x > 0.
• Basic matrix identities: (AB)^T = B^T A^T and (AB)^-1 = B^-1 A^-1.
• A quadratic form is a scalar, quadratic function of a vector with the form
  f(x) = (1/2) x^T A x - b^T x + c
• The gradient of a quadratic form is
  f'(x) = [∂f/∂x_1, ..., ∂f/∂x_n]^T = (1/2) A^T x + (1/2) A x - b,
  which reduces to A x - b when A is symmetric.

1. Introduction, Notation and Basic Terms (Cont'd)
• Various quadratic forms for an arbitrary 2×2 matrix (figure): a positive-definite matrix, a negative-definite matrix, a singular positive-indefinite matrix, and an indefinite matrix.

2. Eigenvalues and Eigenvectors (Cont'd)
• Examples (figure) for |λ| < 1 and |λ| > 1 (λ = -0.5, λ = 2 and λ_1 = 0.7, λ_2 = -2).
• The spectral radius of a matrix is ρ(B) = max_i |λ_i|.

2. Eigenvalues and Eigenvectors (Cont'd)
• Example: the Jacobi method for the solution of A x = b
  - Split the matrix A such that A = D + E.
  - Instead of solving the system A x = b, one solves the modified system x = B x + z, where B = -D^-1 E and z = D^-1 b.
  - The iterative method is x_(i+1) = B x_(i) + z or, in terms of the error vector, e_(i+1) = B e_(i).
  - For our particular example, the eigenvalues and eigenvectors of the matrix A are
    λ_1 = 7, v_1 = [1, 2]^T;  λ_2 = 2, v_2 = [-2, 1]^T
  - The eigenvalues and eigenvectors of the matrix B are
    λ_1 = -√2/3 ≈ -0.47, v_1 = [√2, 1]^T;  λ_2 = √2/3 ≈ 0.47, v_2 = [-√2, 1]^T

2. Eigenvalues and Eigenvectors (Cont'd)
• Graphical representation of the convergence of the Jacobi method (figure): the eigenvectors of B together with their eigenvalues; the iteration starts at [-2, -2]^T and converges to [2, -2]^T; the error vectors e_(0), e_(1), e_(2) are shown.

3. The Method of Steepest Descent (Cont'd)
• Geometrical representation (figure, panels (e) and (f)).

3. The Method of Steepest Descent (Cont'd)
• The algorithm:
  r_(i) = b - A x_(i)
  α_(i) = (r_(i)^T r_(i)) / (r_(i)^T A r_(i))
  x_(i+1) = x_(i) + α_(i) r_(i)   (equivalently, e_(i+1) = e_(i) + α_(i) r_(i))
  Two matrix-vector multiplications per iteration are required.
• To avoid one matrix-vector multiplication, one uses the recurrence
  r_(i+1) = r_(i) - α_(i) A r_(i)
  The disadvantage of using this recurrence is that the residual sequence is determined without any feedback from the value of x_(i), so that round-off errors may cause x_(i) to converge to some point near x.

3. The Method of Steepest Descent (Cont'd)
• Summary of the algorithm:
  r_(i) = b - A x_(i)
  α_(i) = (r_(i)^T r_(i)) / (r_(i)^T A r_(i))
  x_(i+1) = x_(i) + α_(i) r_(i)
• To avoid one matrix-vector multiplication, one uses instead r_(i+1) = r_(i) - α_(i) A r_(i).
• Efficient implementation of the method of steepest descent (a NumPy sketch follows this slide):
  i = 0
  r = b - A x
  δ = r^T r
  δ_0 = δ
  while i < i_max and δ > ε^2 δ_0:
      q = A r
      α = δ / (r^T q)
      x = x + α r
      if i is divisible by 50:
          r = b - A x        (recompute the residual exactly)
      else:
          r = r - α q
      δ = r^T r
      i = i + 1
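The efficient steepest-descent loop above translates almost line by line into NumPy. The sketch below assumes A is symmetric positive definite; the names steepest_descent, i_max and eps are illustrative. The 2×2 test system is the one used elsewhere in these slides (A = [[3, 2], [2, 6]], b = [2, -8], solution [2, -2]).

```python
import numpy as np

def steepest_descent(A, b, x0, i_max=1000, eps=1e-8):
    """Method of steepest descent for A x = b, A symmetric positive definite."""
    x = x0.copy()
    r = b - A @ x
    delta = r @ r
    delta0 = delta
    i = 0
    while i < i_max and delta > eps**2 * delta0:
        q = A @ r
        alpha = delta / (r @ q)
        x = x + alpha * r
        if i % 50 == 0:
            r = b - A @ x          # periodically recompute the exact residual
        else:
            r = r - alpha * q      # cheap recurrence: one matrix-vector product per step
        delta = r @ r
        i += 1
    return x

A = np.array([[3.0, 2.0], [2.0, 6.0]])
b = np.array([2.0, -8.0])
print(steepest_descent(A, b, np.zeros(2)))   # expected: approximately [2, -2]
```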
4. Convergence Analysis of the Method of Steepest Descent (Cont'd)
• The influence of the spectral condition number κ on the convergence (figure).
• Worst-case scenario: μ = ±κ.

Steepest Descent Problem
• The gradient at the minimum of a line search is orthogonal to the direction of that search, so the steepest descent algorithm tends to make right-angle turns, taking many steps down a long, narrow potential well. Too many steps are needed to reach even a simple minimum.

Steepest Descent
• Search in the direction with the largest decrease, i.e., n = -∇f.
• A contour line (surface) of constant f is perpendicular to n, because df = dx · ∇f = 0.
• The current search direction n and the next search direction n' are orthogonal, because at the minimizing λ we have
  y'(λ) = d f(P + λn)/dλ = n^T ∇f|_(P+λn) = 0,  hence  n^T n' = 0.

5. The Method of Conjugate Directions (Cont'd)
• The new requirement is now that e_(i+1) be A-orthogonal to d_(i). This is equivalent to finding the minimum point along the search direction d_(i):
  d/dα f(x_(i+1)) = 0  ⇒  f'(x_(i+1))^T d_(i) = 0  ⇒  -r_(i+1)^T d_(i) = 0  ⇒  d_(i)^T A e_(i+1) = 0
• Setting d_(i)^T A (e_(i) + α_(i) d_(i)) = 0 gives
  α_(i) = (d_(i)^T r_(i)) / (d_(i)^T A d_(i))
• If the search vectors were the residuals, this formula would be identical to the method of steepest descent.

5. The Method of Conjugate Directions (Cont'd)
• Proof that this process computes x in n steps:
  - Express the error term as a linear combination of the search directions,
    e_(0) = Σ_{j=0}^{n-1} δ_(j) d_(j)
  - Use the fact that the search directions are A-orthogonal.

5. The Method of Conjugate Directions (Cont'd)
• Calculation of the A-orthogonal search directions by a conjugate Gram-Schmidt process:
  1. Take a set of n linearly independent vectors u_0, u_1, ..., u_{n-1}.
  2. Set d_(0) = u_0.
  3. For i > 0, take u_i and subtract all the components from it that are not A-orthogonal to the previous search directions:
     d_(i) = u_i + Σ_{j=0}^{i-1} β_ij d_(j),   β_ij = - (u_i^T A d_(j)) / (d_(j)^T A d_(j))

6. The Method of Conjugate Gradients
• The method of conjugate gradients is simply the method of conjugate directions where the search directions are constructed by conjugation of the residuals, i.e., u_i = r_(i).
• This allows us to simplify the calculation of the new search direction, because
  β_ij = (1/α_(i-1)) (r_(i)^T r_(i)) / (d_(i-1)^T A d_(i-1)) = (r_(i)^T r_(i)) / (r_(i-1)^T r_(i-1))  if j = i-1,
  β_ij = 0  if j < i-1.
• The new search direction is determined as a linear combination of the previous search direction and the new residual:
  d_(i+1) = r_(i+1) + β_(i+1) d_(i)

6. The Method of Conjugate Gradients (Cont'd)
• Efficient implementation (a runnable NumPy sketch follows Section 7 below):
  i = 0
  r = b - A x
  d = r
  δ_new = r^T r
  δ_0 = δ_new
  while i < i_max and δ_new > ε^2 δ_0:
      q = A d
      α = δ_new / (d^T q)
      x = x + α d
      if i is divisible by 50:
          r = b - A x        (recompute the residual exactly)
      else:
          r = r - α q
      δ_old = δ_new
      δ_new = r^T r
      β = δ_new / δ_old
      d = r + β d
      i = i + 1

7. Convergence Analysis of the CG Method
• If the algorithm is performed in exact arithmetic, the exact solution is obtained in at most n steps.
• When finite-precision arithmetic is used, rounding errors lead to a gradual loss of orthogonality among the residuals, and the finite-termination property of the method is lost.
• If the matrix A has only m distinct eigenvalues, then CG will converge in at most m iterations.
• An error bound on the CG method can be obtained in terms of the A-norm; after k iterations,
  ||e_(k)||_A ≤ 2 ( (√κ - 1) / (√κ + 1) )^k ||e_(0)||_A
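Below is the NumPy sketch referenced in Section 6: a direct transcription of the efficient CG implementation, assuming A is symmetric positive definite. The function name conjugate_gradient and the parameters i_max and eps are illustrative.

```python
import numpy as np

def conjugate_gradient(A, b, x0, i_max=1000, eps=1e-8):
    """Conjugate gradient method for A x = b, A symmetric positive definite."""
    x = x0.copy()
    r = b - A @ x
    d = r.copy()
    delta_new = r @ r
    delta0 = delta_new
    i = 0
    while i < i_max and delta_new > eps**2 * delta0:
        q = A @ d
        alpha = delta_new / (d @ q)
        x = x + alpha * d
        if i % 50 == 0:
            r = b - A @ x          # periodically recompute the exact residual
        else:
            r = r - alpha * q
        delta_old = delta_new
        delta_new = r @ r
        beta = delta_new / delta_old
        d = r + beta * d           # new search direction
        i += 1
    return x

# The same 2x2 example: the exact solution [2, -2] is reached in at most n = 2 iterations.
A = np.array([[3.0, 2.0], [2.0, 6.0]])
b = np.array([2.0, -8.0])
print(conjugate_gradient(A, b, np.zeros(2)))
```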
Conjugate gradient iteration
• One matrix-vector multiplication per iteration
• Two vector dot products per iteration
• Four n-vectors of working storage
  x_0 = 0, r_0 = b, d_0 = r_0
  for k = 1, 2, 3, ...
      α_k = (r_{k-1}^T r_{k-1}) / (d_{k-1}^T A d_{k-1})     step length
      x_k = x_{k-1} + α_k d_{k-1}                           approximate solution
      r_k = r_{k-1} - α_k A d_{k-1}                         residual
      β_k = (r_k^T r_k) / (r_{k-1}^T r_{k-1})               improvement
      d_k = r_k + β_k d_{k-1}                               search direction

Complexity of linear solvers
Time to solve the model problem (Poisson's equation) on a regular mesh with n unknowns (mesh side n^(1/2) in 2D, n^(1/3) in 3D):

  Method                 | 2D                       | 3D
  -----------------------|--------------------------|-------------------------
  Sparse Cholesky        | O(n^1.5)                 | O(n^2)
  CG, exact arithmetic   | O(n^2)                   | O(n^2)
  CG, no preconditioner  | O(n^1.5)                 | O(n^1.33)
  CG, modified IC        | O(n^1.25)                | O(n^1.17)
  CG, support trees      | O(n^1.20) -> O(n^(1+))   | O(n^1.75) -> O(n^1.31)
  Multigrid              | O(n)                     | O(n)

9. Preconditioning Techniques
References:
• J.A. Meijerink and H.A. van der Vorst, "An iterative solution method for linear systems of which the coefficient matrix is a symmetric M-matrix," Mathematics of Computation, Vol. 31, No. 137, pp. 148-162 (1977).
• J.A. Meijerink and H.A. van der Vorst, "Guidelines for the usage of incomplete decompositions in solving sets of linear equations as they occur in practical problems," Journal of Computational Physics, Vol. 44, pp. 134-155 (1981).
• D.S. Kershaw, "The incomplete Cholesky-conjugate gradient method for the iterative solution of systems of linear equations," Journal of Computational Physics, Vol. 26, pp. 43-65 (1978).
• H.A. van der Vorst, "High performance preconditioning," SIAM Journal on Scientific and Statistical Computing, Vol. 10, No. 6, pp. 1174-1185 (1989).

9. Preconditioning Techniques (Cont'd)
There are several types of preconditioners that can be used:
(a) Preconditioners based on a splitting of the matrix A, i.e., A = M - N
(b) Complete or incomplete factorization of A, e.g., A = LL^T + E
(c) Approximation of M ≈ A^-1
(d) Reordering of the equations and/or unknowns, e.g., domain decomposition

9. Preconditioning Techniques (Cont'd)
(a) Preconditioning based on splittings of A
• Diagonal preconditioning: M = D = diag{a_11, a_22, ..., a_nn}
• Tridiagonal preconditioning
• Block-diagonal preconditioning
• Example (the 2×2 system used throughout these slides; a Jacobi-preconditioned CG sketch follows at the end of this section):
  A = [ 3  2 ; 2  6 ],  b = [ 2, -8 ]^T,  c = 0,  with solution x = [ 2, -2 ]^T

9. Preconditioning Techniques (Cont'd)
(b) Preconditioning based on a factorization of A
• Cholesky factorization (for a symmetric positive-definite matrix) as a direct method:
  A = L L^T,   x = L^-T (L^-1 b)
  where
  L_ii = √( A_ii - Σ_{k=1}^{i-1} L_ik^2 ),
  L_ji = ( A_ji - Σ_{k=1}^{i-1} L_jk L_ik ) / L_ii,   j = i+1, ..., n,   i = 1, 2, ..., n
• Modified Cholesky factorization: A = L D L^T

9. Preconditioning Techniques (Cont'd)
(d) Domain decomposition preconditioning (figure: the region is split into Domain I and Domain II, coupled through interface unknowns z)
• The system A x = b has the block form
  A = [ A_11  0     B_1 ;  0  A_22  B_2 ;  B_1^T  B_2^T  Q ],   x = [ x_1, x_2, z ]^T,   b = [ d_1, d_2, f ]^T
• The preconditioner has the factored form
  M = L [ M_11  0  0 ;  0  M_22  0 ;  0  0  S ] L^T,   L = [ I  0  0 ;  0  I  0 ;  B_1^T M_11^-1  B_2^T M_22^-1  I ]
  or, assembled as a single block matrix,
  M = [ M_11  0  B_1 ;  0  M_22  B_2 ;  B_1^T  B_2^T  S* ],   S* = S + B_1^T M_11^-1 B_1 + B_2^T M_22^-1 B_2
  (so that the two forms agree).
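As referenced in item (a) above, here is a minimal NumPy sketch of preconditioned CG using the diagonal (Jacobi) preconditioner M = diag{a_11, ..., a_nn}; applying M^-1 is then just an element-wise division. The function name pcg_jacobi and the parameters i_max and eps are illustrative; swapping in a different preconditioner only changes the two lines that apply M^-1.

```python
import numpy as np

def pcg_jacobi(A, b, x0, i_max=1000, eps=1e-8):
    """Preconditioned CG with M = diag(A) (Jacobi / diagonal preconditioning)."""
    Minv = 1.0 / np.diag(A)             # applying M^{-1} is an element-wise scaling
    x = x0.copy()
    r = b - A @ x
    d = Minv * r                        # d_0 = M^{-1} r_0
    delta_new = r @ d                   # delta = r^T M^{-1} r
    delta0 = delta_new
    i = 0
    while i < i_max and delta_new > eps**2 * delta0:
        q = A @ d
        alpha = delta_new / (d @ q)
        x = x + alpha * d
        r = r - alpha * q
        s = Minv * r                    # s = M^{-1} r
        delta_old = delta_new
        delta_new = r @ s
        beta = delta_new / delta_old
        d = s + beta * d
        i += 1
    return x

A = np.array([[3.0, 2.0], [2.0, 6.0]])
b = np.array([2.0, -8.0])
print(pcg_jacobi(A, b, np.zeros(2)))    # expected: approximately [2, -2]
```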
10. Conjugate Gradient Type Algorithms for Nonsymmetric Matrices
• The Bi-Conjugate Gradient (BCG) algorithm was proposed by Lanczos in 1952.
• It solves not only the original system A x = b but also the dual linear system A^T x* = b*.
• Each step of this algorithm requires a matrix-by-vector product with both A and A^T.
• The search direction p_j* does not contribute to the solution directly.
• The algorithm:
  i = 0
  r = b - A x;   r* = r;   p = r;   p* = r*
  ρ_new = r*^T r
  ρ_0 = ρ_new
  while i < i_max and ρ_new > ε^2 ρ_0:
      q = A p
      q* = A^T p*
      α = ρ_new / (p*^T q)
      x = x + α p
      r = r - α q
      r* = r* - α q*
      ρ_old = ρ_new
      ρ_new = r*^T r
      β = ρ_new / ρ_old
      p = r + β p
      p* = r* + β p*
      i = i + 1

10. Conjugate Gradient Type Algorithms for Nonsymmetric Matrices (Cont'd)
• The Bi-Conjugate Gradient Stabilized (Bi-CGSTAB) algorithm was developed by van der Vorst in 1992.
• Rather than producing iterates whose residual vectors are of the form
  r_(j) = φ_j(A)^2 r_(0)
  (as CGS does), it produces iterates with residual vectors of the form
  r_(j) = ψ_j(A) φ_j(A) r_(0),   p_(j) = ψ_j(A) π_j(A) r_(0)
• The algorithm:
  i = 0
  r = b - A x;   r_0* arbitrary (with r_0*^T r ≠ 0);   p = r
  ρ_new = r_0*^T r
  ρ_0 = ρ_new
  while i < i_max and ρ_new > ε^2 ρ_0:
      α = ρ_new / (r_0*^T A p)
      s = r - α A p
      ω = ( (A s)^T s ) / ( (A s)^T (A s) )
      x = x + α p + ω s
      r = s - ω A s
      ρ_old = ρ_new
      ρ_new = r_0*^T r
      β = (ρ_new / ρ_old) (α / ω)
      p = r + β ( p - ω A p )
      i = i + 1
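For nonsymmetric systems, ready-made implementations of several of the methods above are available. As a rough SciPy counterpart to the MATLAB cgs syntax shown earlier, the sketch below calls bicg, bicgstab, cgs and gmres from scipy.sparse.linalg on a small nonsymmetric test matrix. The test matrix and the minimal call signature solver(A, b) are illustrative; option names (tolerances, restarts) vary between SciPy versions and are omitted here.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import bicg, bicgstab, cgs, gmres

# A small nonsymmetric, diagonally dominant tridiagonal test matrix (illustrative only).
n = 100
main = 4.0 * np.ones(n)
lower = -1.0 * np.ones(n - 1)
upper = -2.0 * np.ones(n - 1)
A = sp.diags([lower, main, upper], offsets=[-1, 0, 1], format="csr")
b = np.ones(n)

for solver in (bicg, bicgstab, cgs, gmres):
    x, info = solver(A, b)
    # info == 0 signals convergence; the residual norm shows the achieved accuracy.
    print(solver.__name__, info, np.linalg.norm(b - A @ x))
```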