Mathematical Prerequisites for Econ. 7005, "Microeconomic Theory I," and Econ. 7007, "Macroeconomic Theory I," at the University of Utah

by Gabriel A. Lozada

The contents of Sections 1–6 below are required for basic consumer and producer theory, which is usually taught at the very beginning of Econ. 7005. The contents of Section 7 are not required until the topic of uncertainty is reached. Sections 1–4 are a shortened version of a more complete treatment described in lecture notes I wrote for Econ. 7001; those notes are available upon request.

The terms and notation needed to understand these notes include, for Section 1: letters of the Greek alphabet commonly used in mathematics; matrix, matrix determinant, symmetric matrix, sufficient condition, ⇒, necessary condition, ⇐, necessary and sufficient condition, ⇔, "iff" ("if and only if"), converse, contrapositive, Cⁿ function, "f : Rⁿ → R¹," gradient vector ∇f(x), Hessian matrix ∇²f(x), positive definite, positive semidefinite, negative definite, negative semidefinite, convex function, strictly convex function, concave function, strictly concave function, quasiconcavity, quasiconvexity, convex set, open set, and closed set. In Section 2: local minimum, local maximum. In Section 3: binding and nonbinding inequality constraints, strict local minimum, strict local maximum. In Section 4: global minimum, global maximum, unique global minimum, unique global maximum. In Section 5: endogenous variables, exogenous variables, dependent variables, independent variables, structural form of a system of equations, differential of a function of multiple variables, matrix inverse, Cramer's Rule. In Section 6: no additional new terms or notation. In Section 7: probability of an event, "{x : f(x) = 6}," "Σᵢ xᵢ," "∫ f(x) dx," "fair" random process, intervals of the real line [a, b], (a, b), [a, b), and (a, b].

1. Convexity, Quadratic Forms, and Minors

Let A denote a matrix. It does not have to be square. A "minor of A of order r" is obtained by deleting all but r rows and r columns of A, then taking the determinant of the resulting r × r matrix.

Now let A denote a square matrix. A "principal minor of A of order r" is obtained by deleting all but r rows and the corresponding r columns of A, then taking the determinant of the resulting r × r matrix. (For example, if you keep the first, third, and fourth rows, then you have to keep the first, third, and fourth columns.) A principal minor of A of order r is denoted by ∆r of A.

Again let A denote a square matrix. A "leading principal minor of A of order r" is obtained by deleting all but the first r rows and the first r columns of A, then taking the determinant of the resulting r × r matrix. A leading principal minor of A of order r is denoted by Dr of A. A square matrix of dimension n × n has only 1 leading principal minor of order r for r = 1, …, n.

Example. Suppose

$$A = \begin{pmatrix} 1 & 5 & 9 & 13 \\ 2 & 6 & 10 & 14 \\ 3 & 7 & 11 & 15 \\ 4 & 8 & 12 & 16 \end{pmatrix}.$$

This matrix is not symmetric. Usually one is interested in the minors only of symmetric matrices, but there is nothing wrong with finding the minors of this non-symmetric matrix.

• The leading principal minor of order 1 of A is D₁ = |1|. There are four principal minors of order 1 of A; they are the ∆₁'s: |1| = D₁, |6|, |11|, and |16|. There are sixteen minors of A of order 1.

• The leading principal minor of order 2 of A is
$$D_2 = \begin{vmatrix} 1 & 5 \\ 2 & 6 \end{vmatrix}.$$
There are six principal minors of order 2 of A; they are the ∆₂'s:
$$\begin{vmatrix} 1 & 5 \\ 2 & 6 \end{vmatrix} = D_2 \ \text{(from rows and columns 1 and 2)}, \quad
\begin{vmatrix} 1 & 9 \\ 3 & 11 \end{vmatrix} \ \text{(from rows and columns 1 and 3)}, \quad
\begin{vmatrix} 1 & 13 \\ 4 & 16 \end{vmatrix} \ \text{(from rows and columns 1 and 4)},$$
$$\begin{vmatrix} 6 & 10 \\ 7 & 11 \end{vmatrix} \ \text{(from rows and columns 2 and 3)}, \quad
\begin{vmatrix} 6 & 14 \\ 8 & 16 \end{vmatrix} \ \text{(from rows and columns 2 and 4)}, \quad
\begin{vmatrix} 11 & 15 \\ 12 & 16 \end{vmatrix} \ \text{(from rows and columns 3 and 4)}.$$
There are thirty-six minors of A of order 2.

• The leading principal minor of order 3 of A is
$$D_3 = \begin{vmatrix} 1 & 5 & 9 \\ 2 & 6 & 10 \\ 3 & 7 & 11 \end{vmatrix}.$$
There are four principal minors of order 3 of A; they are the ∆₃'s:
$$\begin{vmatrix} 1 & 5 & 9 \\ 2 & 6 & 10 \\ 3 & 7 & 11 \end{vmatrix} = D_3 \ \text{(from rows and columns 1, 2, and 3)}, \quad
\begin{vmatrix} 1 & 5 & 13 \\ 2 & 6 & 14 \\ 4 & 8 & 16 \end{vmatrix} \ \text{(from rows and columns 1, 2, and 4)}, \ \ldots$$

[…]
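These definitions are mechanical enough to check by computer. The short Python/NumPy sketch below is only an illustration (it is not part of the notes, and the function names are ad hoc); it enumerates the minors, principal minors, and leading principal minors of the example matrix A above.

    from itertools import combinations
    import numpy as np

    A = np.array([[ 1,  5,  9, 13],
                  [ 2,  6, 10, 14],
                  [ 3,  7, 11, 15],
                  [ 4,  8, 12, 16]], dtype=float)

    def minors(M, r):
        # All minors of order r: keep any r rows and any r columns, take the determinant.
        n_rows, n_cols = M.shape
        return [np.linalg.det(M[np.ix_(rows, cols)])
                for rows in combinations(range(n_rows), r)
                for cols in combinations(range(n_cols), r)]

    def principal_minors(M, r):
        # Principal minors (the Delta_r's): keep the *same* r rows and columns.
        n = M.shape[0]
        return [np.linalg.det(M[np.ix_(idx, idx)]) for idx in combinations(range(n), r)]

    def leading_principal_minor(M, r):
        # The leading principal minor D_r: keep the first r rows and first r columns.
        return np.linalg.det(M[:r, :r])

    print(len(minors(A, 1)), len(principal_minors(A, 1)))   # 16 minors, 4 principal minors of order 1
    print(len(minors(A, 2)), len(principal_minors(A, 2)))   # 36 minors, 6 principal minors of order 2
    print([round(leading_principal_minor(A, r), 6) for r in (1, 2, 3)])   # D1 = 1, D2 = -4, D3 = 0 (up to rounding)

For an n × n matrix there are C(n, r)² minors and C(n, r) principal minors of order r, which matches the counts quoted above for the 4 × 4 matrix A: sixteen and four for order 1, thirty-six and six for order 2, and sixteen and four for order 3.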
2. First-Order Conditions

Proposition 3. Suppose that f, h₁, …, h_j, and g₁, …, g_k are C¹ functions of n variables. Suppose that x* ∈ Rⁿ is a local minimum of f(x) on the constraint set defined by the j equalities and k inequalities

$$h_1(x) = c_1,\ \ldots,\ h_j(x) = c_j, \tag{7}$$
$$g_1(x) \ge b_1,\ \ldots,\ g_k(x) \ge b_k. \tag{8}$$

Form the Lagrangian

$$\mathcal{L}(x,\lambda,\mu) = f(x) + \sum_{i=1}^{j} \lambda_i\,[c_i - h_i(x)] + \sum_{i=1}^{k} \mu_i\,[b_i - g_i(x)]. \tag{9}$$

Then (under certain conditions I omit here) there exist multipliers λ* and µ* such that:

1. ∂L(x*, λ*, µ*)/∂λᵢ = 0 for all i = 1, …, j. This is equivalent to: hᵢ(x*) = cᵢ for all i = 1, …, j.
2. ∂L(x*, λ*, µ*)/∂xᵢ = 0 for all i = 1, …, n.
3. µᵢ* ≥ 0, gᵢ(x*) − bᵢ ≥ 0, and µᵢ*[gᵢ(x*) − bᵢ] = 0 for all i = 1, …, k.

These three conditions are often called the Kuhn-Tucker conditions. The last condition is sometimes called the "complementary slackness condition."

[Proposition 3′: For a maximum, change (8) to:

$$g_1(x) \le b_1,\ \ldots,\ g_k(x) \le b_k. \tag{8′}$$

Then the Lagrangian is formed in the same way. Condition 1 is unchanged. Condition 2 is unchanged. Condition 3 becomes: µᵢ* ≥ 0, gᵢ(x*) − bᵢ ≤ 0, and µᵢ*[gᵢ(x*) − bᵢ] = 0 for all i = 1, …, k.]
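As a small illustration of Proposition 3 (this example is my own, not one from the notes), take: minimize f(x) = x₁² + x₂² subject to the single inequality constraint x₁ + x₂ ≥ 1, so j = 0, k = 1, and b = 1. Solving the Kuhn-Tucker conditions by hand gives x* = (1/2, 1/2) and µ* = 1; the sketch below just verifies the three conditions numerically at that candidate.

    import numpy as np

    # Illustrative problem (not from the notes): min x1^2 + x2^2  s.t.  x1 + x2 >= 1.
    def f(x):
        return x[0]**2 + x[1]**2

    def g(x):
        return x[0] + x[1]          # the constraint is g(x) >= b

    b = 1.0

    def grad_L(x, mu):
        # dL/dx_i = df/dx_i - mu * dg/dx_i, from the Lagrangian L = f(x) + mu*[b - g(x)] of (9)
        return np.array([2.0*x[0] - mu, 2.0*x[1] - mu])

    # Candidate obtained by solving the first-order conditions by hand.
    x_star, mu_star = np.array([0.5, 0.5]), 1.0

    print(np.allclose(grad_L(x_star, mu_star), 0.0))    # condition 2: stationarity in x
    print(mu_star >= 0.0 and g(x_star) - b >= 0.0)      # condition 3: sign conditions
    print(np.isclose(mu_star * (g(x_star) - b), 0.0))   # condition 3: complementary slackness
    print(f(x_star))                                    # constrained minimum value, 0.5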
3. Second-Order Conditions: Local

Let L be the Lagrangian of the optimization problem. In Section 2, I named the Lagrange multipliers "λ" if they were associated with one of the j equality constraints and "µ" if they were associated with one of the k inequality constraints. In this section: (a) ignore all the nonbinding inequality constraints at (x*, λ*, µ*); and (b) rename the Lagrange multipliers of the binding inequality constraints λ_{j+1}, λ_{j+2}, …, λ_m, where

m is the number of equality constraints plus the number of binding inequality constraints.

It is allowed to have m = 0; if m = 0 then there are no Lagrange multipliers. Denote the m binding Lagrange multipliers collectively by λ. Let there be n variables with respect to which the optimization is occurring; denote these variables collectively by x.

Let ∇²L be the following particular Hessian of the Lagrangian: first differentiate L with respect to all the Lagrange multipliers, then differentiate it with respect to the original variables x.

$$\nabla^2\mathcal{L} =
\begin{pmatrix}
\mathcal{L}_{\lambda_1\lambda_1} & \cdots & \mathcal{L}_{\lambda_1\lambda_m} & \mathcal{L}_{\lambda_1 x_1} & \cdots & \mathcal{L}_{\lambda_1 x_n} \\
\vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\
\mathcal{L}_{\lambda_m\lambda_1} & \cdots & \mathcal{L}_{\lambda_m\lambda_m} & \mathcal{L}_{\lambda_m x_1} & \cdots & \mathcal{L}_{\lambda_m x_n} \\
\mathcal{L}_{x_1\lambda_1} & \cdots & \mathcal{L}_{x_1\lambda_m} & \mathcal{L}_{x_1 x_1} & \cdots & \mathcal{L}_{x_1 x_n} \\
\vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\
\mathcal{L}_{x_n\lambda_1} & \cdots & \mathcal{L}_{x_n\lambda_m} & \mathcal{L}_{x_n x_1} & \cdots & \mathcal{L}_{x_n x_n}
\end{pmatrix}$$

If you do this right, ∇²L should have an m × m zero matrix in its upper left-hand corner:

$$\nabla^2\mathcal{L} =
\begin{pmatrix} \mathcal{L}_{\lambda\lambda} & \mathcal{L}_{\lambda x} \\ \mathcal{L}_{x\lambda} & \mathcal{L}_{xx} \end{pmatrix} =
\begin{pmatrix} 0 & \mathcal{L}_{\lambda x} \\ \mathcal{L}_{\lambda x}^{T} & \mathcal{L}_{xx} \end{pmatrix}$$

where L_{λx} is an m × n matrix and where a 'T' superscript denotes the transpose. One has the following result:

Proposition 4. A sufficient condition for the point (x*, λ*) identified in Proposition 3 to be a strict local minimum is that (−1)^m has the same sign as all of the following when they are evaluated at (x*, λ*): D_{2m+1} of ∇²L, D_{2m+2} of ∇²L, …, D_{m+n} of ∇²L. If m = 0, this is equivalent to the condition that ∇²L (which in such a case equals ∇²f(x)) be positive definite, which occurs iff f(x) is strictly convex.

[Proposition 4′: Similarly, one will have a strict local maximum if, when they are evaluated at (x*, λ*), the following alternate in sign beginning with the sign of (−1)^{m+1}: D_{2m+1} of ∇²L, D_{2m+2} of ∇²L, …, D_{m+n} of ∇²L.]

There is a second-order necessary condition for a minimum, also:

Proposition 5. For i < m, define "∆̂ᵢ of ∇²L" to be the subset of "∆ᵢ of ∇²L" formed by only considering those "∆ᵢ of ∇²L" which retain (parts of) the first m rows and first m columns of ∇²L. (If m = 0, there is no difference between the ∆'s and the ∆̂'s.) Then a necessary condition for the point (x*, λ*) identified in Proposition 3 to be a local minimum is that "(−1)^m or zero" have the same sign as all of the following when they are evaluated at (x*, λ*): ∆̂_{2m+1} of ∇²L, ∆̂_{2m+2} of ∇²L, …, ∆̂_{m+n} of ∇²L.

The typical procedure is to check Proposition 4 first. If Proposition 4 doesn't apply because one of the signs was strictly the same as (−1)^{m+1}, then Proposition 5 tells you that (x*, λ*) is not a local minimum point. (This is because each Dᵢ ∈ ∆̂ᵢ.)

[Proposition 5′: The version of Proposition 5 for a local maximum requires that the following, if they are evaluated at (x*, λ*), alternate in sign beginning with the sign of "(−1)^{m+1} or zero" (then having the sign of "(−1)^{m+2} or zero" and so forth): ∆̂_{2m+1} of ∇²L, ∆̂_{2m+2} of ∇²L, …, ∆̂_{m+n} of ∇²L.]
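To make Proposition 4 concrete, here is a small worked check (again my own example, not one from the notes): minimize x² + y² subject to x + y = 1, so m = 1 equality constraint and n = 2 variables. The candidate from the first-order conditions is x* = y* = 1/2 with λ* = 1, and the only determinant to test is D_{2m+1} = D₃ of ∇²L, which should share the sign of (−1)¹ = −1 for a strict local minimum.

    import numpy as np

    # L(x, y, lam) = x^2 + y^2 + lam*(1 - x - y); candidate x* = y* = 1/2, lam* = 1.
    # Rows and columns are ordered (lam, x, y): multipliers first, as in the text.
    H = np.array([[ 0.0, -1.0, -1.0],    # L_ll, L_lx, L_ly  (note the m x m zero block)
                  [-1.0,  2.0,  0.0],    # L_xl, L_xx, L_xy
                  [-1.0,  0.0,  2.0]])   # L_yl, L_yx, L_yy

    m, n = 1, 2
    for r in range(2*m + 1, m + n + 1):              # here only r = 3
        D_r = np.linalg.det(H[:r, :r])
        same_sign = np.sign(D_r) == (-1)**m
        print(r, round(D_r, 6), same_sign)           # 3 -4.0 True  ->  strict local minimum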
4. Second-Order Conditions: Global

Let (x*, λ*) be a point identified in Proposition 3. Let j be the number of equality constraints and k be the number of inequality constraints.

1. If j = k = 0 (an unconstrained problem) and f(x) is convex, then x* is a global minimum point of f in S. (The converse also holds.) Furthermore, if j = k = 0 and f(x) is strictly convex, then x* is the unique global minimum point of f in S. (The converse also holds.)

2. If k = 0 (only equality constraints) and L(x) is convex in x, then x* is a global constrained minimum point of f. Furthermore, if k = 0 and L(x) is strictly convex in x, then x* is the unique global constrained minimum point of f.

[…]

…where L* is L evaluated at (x*, c). Consider the special case of (15) in which f(x, c) does not depend on c and in which the constraints take the form

$$h_1(x) = c_1,\ \ldots,\ h_j(x) = c_j. \tag{7}$$

Then (17) implies that

$$\frac{\partial f^*}{\partial c_i} = \frac{\partial \mathcal{L}^*}{\partial c_i} = \lambda_i. \tag{18}$$

This is often used to give an interpretation of λᵢ.

7. Probability Theory

All the probability theory that is required for this course is an understanding of how to compute the expected value of a discrete or continuous random variable. A superficial understanding will suffice, but some students might be interested in a more careful treatment, which I give below. However, I still will not be giving a fully satisfactory treatment, because that requires measure theory, Borel sets, and other advanced mathematics; such a treatment is given for example in Chapter 1 of Malliaris and Brock's 1982 textbook "Stochastic Methods in Economics and Finance."

Let the set of possible outcomes of an uncertain event be called the "sample space" and be denoted by Ω. We will first suppose that the number of elements in Ω is finite or countably infinite. With each element ω ∈ Ω associate a real number X(ω). For example, if Ω is a deck of playing cards and each ω is one card, then X(ω) might be 1 when ω is the 2 of Hearts, 10 when ω is the Jack of Hearts, 14 when ω is the 2 of Spades, and so forth. The function X : Ω → R is called a "discrete random variable."

Let Pr(ω) denote the probability that ω occurs. Let the function f(x) : R → [0, 1] be defined by

f(x) = Pr{ω : X(ω) = x}.

The function f is called the "probability distribution" of the discrete random variable X. One has

$$\sum_{x \in \mathbf{R}} f(x) = 1.$$

The "expected value" of the random variable (also called the "mean" of the random variable or the "average" of the random variable) is defined to be

$$E(X) = \sum_{x \in \mathbf{R}} x\, f(x).$$

For example, consider the outcome of a roll of a die. The set of outcomes, in no particular order, is Ω = {3, 1, 5, 4, 6, 2}. Let the "first" outcome be ω₁ = 3, the "second" outcome be ω₂ = 1, and so forth, so the sixth outcome is ω₆ = 2. Define the random variable X(ω) in the following way: X(ω₁) = 3² = x₁, X(ω₂) = 1² = x₂, …, X(ω₆) = 2² = x₆. If in addition the die is fair (so all the outcomes occur with probability 1/6), then the expected value of X is

$$\sum_{i=1}^{6} x_i\, f(x_i) = 3^2\cdot\tfrac{1}{6} + 1^2\cdot\tfrac{1}{6} + 5^2\cdot\tfrac{1}{6} + 4^2\cdot\tfrac{1}{6} + 6^2\cdot\tfrac{1}{6} + 2^2\cdot\tfrac{1}{6} = \tfrac{1}{6}\,(9 + 1 + 25 + 16 + 36 + 4) = 91/6 = 15\tfrac{1}{6}.$$

For another example, again consider the outcome of a roll of a die. This time write the set of outcomes as Ω = {1, 2, 3, 4, 5, 6}. Let the "first" outcome be ω₁ = 1, the "second" outcome be ω₂ = 2, and so forth, so the sixth outcome is ω₆ = 6. Define the random variable Y(ω) in the following way: Y(ω₁) = 1 = y₁, Y(ω₂) = 2 = y₂, …, Y(ω₆) = 6 = y₆. If in addition the die is fair (so all the outcomes occur with probability 1/6), then the expected value of Y is

$$\sum_{i=1}^{6} y_i\, f(y_i) = 1\cdot\tfrac{1}{6} + 2\cdot\tfrac{1}{6} + \cdots + 6\cdot\tfrac{1}{6} = \tfrac{1}{6}\,(1 + 2 + 3 + 4 + 5 + 6) = 21/6 = 3.5.$$

This completes our treatment of the case when the number of elements in Ω is finite or countably infinite.
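The two die computations are easy to reproduce by machine. The sketch below (purely illustrative, not part of the notes) uses exact fractions so that the answers 91/6 and 21/6 = 7/2 appear directly.

    from fractions import Fraction

    p = Fraction(1, 6)                                  # probability of each face of a fair die
    omegas = [3, 1, 5, 4, 6, 2]                         # the first ordering of the outcomes

    E_X = sum(Fraction(w**2) * p for w in omegas)       # X(omega) = omega squared
    E_Y = sum(Fraction(w) * p for w in range(1, 7))     # Y(omega) = omega

    print(E_X, float(E_X))                              # 91/6 15.1666...
    print(E_Y, float(E_Y))                              # 7/2 3.5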
Now suppose instead that the number of elements in Ω is uncountably infinite. Furthermore, suppose that to each element ω ∈ Ω we can associate a real number X(ω). For example, if ω is the color of paint in a paint can which we find together with many other paint cans in an abandoned building, then Ω is the set of all possible colors in the abandoned cans, and if red is one's favorite color, then X(ω) might be the grams of red pigment contained in the first abandoned paint can. The function X : Ω → R is called a "continuous random variable."

Let the function F(x) : R → [0, 1] be defined by

F(x) = Pr{ω : X(ω) ≤ x}.

The function F is called the "cumulative probability density function," or CDF, of the continuous random variable X. (In the example, CDF(x) is the probability that the paint can will have less than or equal to x grams of red pigment.) One has F(∞) = 1. The function

$$f(x) = \frac{dF(x)}{dx}$$

is called the "probability density function," or PDF, of the continuous random variable X. One has

$$\int_{-\infty}^{\infty} f(x)\, dx = 1.$$

The probability that the value of X is between a and b (where a ≠ b) is ∫ₐᵇ f(x) dx. The probability that the value of X is exactly equal to a is not given by ∫ₐᵃ f(x) dx = 0, because then X could never take on any value. Instead, the frequency with which the value of X is exactly equal to any particular value "a" goes to zero in the limit as the number of draws from the distribution goes to infinity.

The "expected value" of the random variable (also called the "mean" of the random variable or the "average" of the random variable) is defined to be

$$E(X) = \int_{-\infty}^{\infty} x\, f(x)\, dx.$$

For example, if Ω = [0, 12] for the outcome of the spin of a fair arrow centered on the face of a clock, if ω is defined to be the number that the arrow points to on the clock face, and if X(ω) is defined to equal ω (so X(3) = 3), then the CDF of the arrow is 0.25 at x = 3, 0.75 at x = 9, and in general is equal to x/12. The PDF in this example is f(x) = 1/12.
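The clock-arrow example can also be checked numerically. The sketch below (my own illustration, assuming SciPy is available; it is not part of the notes) treats the PDF as f(x) = 1/12 on [0, 12] and 0 elsewhere, so E(X) = ∫₀¹² x/12 dx = 6, and also evaluates the CDF at 3 and 9.

    from scipy.integrate import quad

    f = lambda x: 1.0 / 12.0                        # PDF of the clock arrow on [0, 12]
    F = lambda x: x / 12.0                          # CDF from the text

    total, _ = quad(f, 0.0, 12.0)                   # integral of the PDF over [0, 12]; should be 1
    E_X, _   = quad(lambda x: x * f(x), 0.0, 12.0)  # expected value; should be 6

    print(round(total, 6), round(E_X, 6))           # 1.0 6.0
    print(F(3.0), F(9.0))                           # 0.25 0.75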