Econometrics Cheat Sheet
By Marcelo Moreno - King Juan Carlos University
The Econometrics Cheat Sheet Project

Basic concepts

Definitions
Econometrics - a social science discipline whose objective is to quantify the relationships between economic agents, test economic theories, and evaluate and implement government and business policies.
Econometric model - a simplified representation of reality used to explain economic phenomena.
Ceteris paribus - all other relevant factors remain constant.

Data types
Cross section - data taken at a given moment in time, a static photo. Order does not matter.
Time series - observations of variables across time. Order does matter.
Panel data - a time series for each observation of a cross section.
Pooled cross sections - combines cross sections from different time periods.

Phases of an econometric model
1. Specification.
2. Estimation.
3. Validation.
4. Utilization.

Regression analysis
Studies and predicts the mean value of a variable (the dependent variable, y) on the basis of fixed values of other variables (the independent variables, the x's). In econometrics it is common to use Ordinary Least Squares (OLS) for regression analysis.

Correlation analysis
Correlation analysis does not distinguish between dependent and independent variables.
- Simple correlation measures the degree of linear association between two variables:
  r = Cov(x, y) / (σx · σy) = Σi (xi − x̄)(yi − ȳ) / √[ Σi (xi − x̄)² · Σi (yi − ȳ)² ]
- Partial correlation measures the degree of linear association between two variables while controlling for a third.

Assumptions and properties

Econometric model assumptions
Under these assumptions, the OLS estimator presents good properties. Gauss-Markov assumptions:
1. Parameter linearity (and weak dependence in time series). y must be a linear function of the β's.
2. Random sampling. The sample has been randomly taken from the population. (Cross section only.)
3. No perfect collinearity.
   - No independent variable is constant: Var(xj) ≠ 0 for all j = 1, ..., k.
   - There is no exact linear relation between the independent variables.
4. Conditional mean zero and zero correlation.
   a. There are no systematic errors: E(u | x1, ..., xk) = E(u) = 0 → strong exogeneity (a implies b).
   b. There are no relevant variables left out of the model: Cov(xj, u) = 0 for all j = 1, ..., k → weak exogeneity.
5. Homoscedasticity. The variability of the residuals is the same for all levels of x: Var(u | x1, ..., xk) = σu².
6. No autocorrelation. Residuals do not contain information about any other residuals: Corr(ut, us | x1, ..., xk) = 0 for all t ≠ s.
7. Normality. Residuals are independent and identically distributed: u ∼ N(0, σu²).
8. Data size. The number of observations available must be greater than the (k + 1) parameters to estimate. (Already satisfied in asymptotic settings.)
A short simulation sketch of a data-generating process satisfying these assumptions follows below.
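The following sketch is not part of the original cheat sheet; it is an illustrative Python snippet with made-up parameter values and variable names. It simulates data from a process that satisfies the assumptions above (non-constant regressor, exogenous homoscedastic normal errors, random sampling) and checks that the OLS slope, computed with the simple-regression formula β̂1 = Cov(x, y) / Var(x) given further down, is centered on the true β1 across replications, previewing the unbiasedness property discussed next.

```python
# Illustrative sketch only: simulate y = b0 + b1*x + u under the Gauss-Markov
# assumptions and check that the OLS slope estimate is centered on the true b1.
import numpy as np

rng = np.random.default_rng(42)
b0, b1, sigma_u = 2.0, 0.5, 1.0      # true parameters (assumed values)
n, reps = 200, 5_000                 # sample size and Monte Carlo replications

slopes = np.empty(reps)
for r in range(reps):
    x = rng.uniform(0, 10, n)        # non-constant regressor: Var(x) != 0
    u = rng.normal(0, sigma_u, n)    # E(u|x) = 0, constant variance, no autocorrelation
    y = b0 + b1 * x + u
    # OLS slope for the simple regression: beta1_hat = Cov(x, y) / Var(x)
    slopes[r] = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)

print(f"true b1 = {b1}, mean of estimates = {slopes.mean():.4f}")  # close to 0.5
```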
Asymptotic properties of OLS
Under the econometric model assumptions and the Central Limit Theorem (CLT):
- Hold 1 to 4a: OLS is unbiased: E(β̂j) = βj.
- Hold 1 to 4: OLS is consistent: plim(β̂j) = βj. (Under 1 to 4b, leaving 4a out, i.e. only weak exogeneity: biased but consistent.)
- Hold 1 to 5: asymptotic normality of OLS (then 7 is necessarily satisfied): u is asymptotically N(0, σu²).
- Hold 1 to 6: unbiased estimate of σu²: E(σ̂u²) = σu².
- Hold 1 to 6: OLS is BLUE (Best Linear Unbiased Estimator), i.e. efficient.
- Hold 1 to 7: hypothesis testing and confidence intervals can be done reliably.

Ordinary Least Squares
Objective - minimize the Sum of Squared Residuals (SSR):
  min Σi ûi², where ûi = yi − ŷi

Simple regression model
Equation: yi = β0 + β1 xi + ui
Estimation: ŷi = β̂0 + β̂1 xi
where:
  β̂0 = ȳ − β̂1 x̄
  β̂1 = Cov(x, y) / Var(x)

Multiple regression model
Equation: yi = β0 + β1 x1i + · · · + βk xki + ui
Estimation: ŷi = β̂0 + β̂1 x1i + · · · + β̂k xki
where:
  β̂0 = ȳ − β̂1 x̄1 − · · · − β̂k x̄k
  β̂j = Cov(y, resid xj) / Var(resid xj)
Matrix form: β̂ = (XᵀX)⁻¹ Xᵀy

Interpretation of coefficients
  Model        Dependent   Independent   β1 interpretation
  Level-level  y           x             Δy = β1 Δx
  Level-log    y           log(x)        Δy ≈ (β1/100) %Δx
  Log-level    log(y)      x             %Δy ≈ (100 β1) Δx
  Log-log      log(y)      log(x)        %Δy ≈ β1 %Δx
  Quadratic    y           x + x²        Δy = (β1 + 2 β2 x) Δx

Error measurements
Sum of Squared Residuals: SSR = Σi ûi² = Σi (yi − ŷi)²
Explained Sum of Squares: SSE = Σi (ŷi − ȳ)²
Total Sum of Squares: SST = SSE + SSR = Σi (yi − ȳ)²
Standard Error of the Regression: σ̂u = √( SSR / (n − k − 1) )
Standard Error of the β̂'s: se(β̂) = √( σ̂u² · (XᵀX)⁻¹ )
Mean Squared Error: MSE = Σi (yi − ŷi)² / n
Mean Absolute Error: MAE = Σi |yi − ŷi| / n
Mean Percentage Error: MPE = ( Σi |ûi / yi| / n ) · 100

R-squared
A measure of goodness of fit, of how well the regression fits the data:
  R² = SSE / SST = 1 − SSR / SST
- Measures the percentage of the variation of y that is linearly explained by the variations of the x's.
- Takes values between 0 (no linear explanation of the variations of y) and 1 (total explanation of the variations of y).
When the number of regressors increases, R² increases as well, whether or not the new variables are relevant. To solve this problem, there is an adjusted R-squared (corrected by degrees of freedom):
  R̄² = 1 − [(n − 1) / (n − k − 1)] · SSR/SST = 1 − [(n − 1) / (n − k − 1)] · (1 − R²)
For big sample sizes: R̄² ≈ R².

Hypothesis testing

Definitions
A hypothesis test is a rule designed to decide, from a sample, whether or not there is evidence to reject a hypothesis made about one or more population parameters.
Elements of a hypothesis test:
- Null hypothesis (H0) - the hypothesis to be tested.
- Alternative hypothesis (H1) - the hypothesis that cannot be rejected when the null hypothesis is rejected.
- Test statistic - a random variable whose probability distribution is known under the null hypothesis.
- Critical value - the value against which the test statistic is compared to determine whether the null hypothesis is rejected or not. It marks the frontier between the acceptance and rejection regions of the null hypothesis.
- Significance level (α) - the probability of rejecting the null hypothesis when it is true (Type I Error). It is chosen by whoever conducts the test. Commonly 0.10, 0.05 or 0.01.
- p-value - the highest significance level at which the null hypothesis (H0) cannot be rejected. The rule is: if the p-value is less than α, there is evidence to reject the null hypothesis at that α (evidence to accept the alternative hypothesis).

Individual tests
Test whether a parameter is significantly different from a given value, ϑ.
  H0: βj = ϑ
  H1: βj ≠ ϑ
Under H0: t = (β̂j − ϑ) / se(β̂j) ∼ t(n−k−1)
If |t| > |t(n−k−1, α/2)|, there is evidence to reject H0.
A numeric sketch combining the matrix OLS formulas, the error measures and an individual t test follows below.
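The sketch below is not from the original cheat sheet; the data, sample size and coefficient values are made up. It puts the matrix formula β̂ = (XᵀX)⁻¹Xᵀy, the error measures, R², the adjusted R̄² and the individual significance test together with NumPy and SciPy.

```python
# Illustrative sketch: matrix OLS, standard errors, R-squared and an
# individual significance test, using made-up data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, k = 100, 2                              # observations and regressors
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 1.0 + 0.8 * x1 - 0.3 * x2 + rng.normal(scale=0.5, size=n)

X = np.column_stack([np.ones(n), x1, x2])  # add intercept column
XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y               # beta_hat = (X'X)^-1 X'y

y_hat = X @ beta_hat
u_hat = y - y_hat
SSR = np.sum(u_hat**2)
SST = np.sum((y - y.mean())**2)
R2 = 1 - SSR / SST
R2_adj = 1 - (n - 1) / (n - k - 1) * (1 - R2)

sigma2_u = SSR / (n - k - 1)               # unbiased estimate of sigma_u^2
se_beta = np.sqrt(np.diag(sigma2_u * XtX_inv))

# Individual significance test for beta_1 (H0: beta_1 = 0, two-sided, alpha = 0.05)
t_stat = beta_hat[1] / se_beta[1]
t_crit = stats.t.ppf(1 - 0.05 / 2, df=n - k - 1)
print(f"beta_hat = {beta_hat.round(3)}, R2 = {R2:.3f}, adj R2 = {R2_adj:.3f}")
print(f"t = {t_stat:.2f}, critical value = {t_crit:.2f}, reject H0: {abs(t_stat) > t_crit}")
```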
Individual significance test - tests whether a parameter is significantly different from zero.
  H0: βj = 0
  H1: βj ≠ 0
Under H0: t = β̂j / se(β̂j) ∼ t(n−k−1)
If |t| > |t(n−k−1, α/2)|, there is evidence to reject H0.

The F test
Simultaneously tests multiple (linear) hypotheses about the parameters. It makes use of a non-restricted model and a restricted model:
- Non-restricted model - the model on which we want to test the hypotheses.
- Restricted model - the model on which the hypotheses that we want to test have been imposed.
Then, looking at the errors, there are:
- SSR_UR - the SSR of the non-restricted model.
- SSR_R - the SSR of the restricted model.
Under H0: F = [(SSR_R − SSR_UR) / SSR_UR] · [(n − k − 1) / q] ∼ F(q, n−k−1)
where k is the number of parameters of the non-restricted model and q is the number of linear hypotheses tested.
If F > F(q, n−k−1), there is evidence to reject H0.

Global significance test - tests whether all the parameters associated with the x's are simultaneously equal to zero.
  H0: β1 = β2 = · · · = βk = 0
  H1: β1 ≠ 0 and/or β2 ≠ 0 ... and/or βk ≠ 0
In this case, the formula for the F statistic simplifies. Under H0:
  F = [R² / (1 − R²)] · [(n − k − 1) / k] ∼ F(k, n−k−1)
If F > F(k, n−k−1), there is evidence to reject H0.

Confidence intervals
The confidence interval at the (1 − α) confidence level can be calculated as:
  β̂j ∓ t(n−k−1, α/2) · se(β̂j)

Dummy variables
Dummy (or binary) variables are used for qualitative information like sex, marital status, country, etc.
- They take the value 1 in a given category and 0 in the rest.
- They are used to analyze and model structural changes in the model parameters.
If a qualitative variable has m categories, we only have to include (m − 1) dummy variables.

Structural change
Structural change refers to changes in the values of the parameters of the econometric model produced by the effect of different sub-populations. Structural change can be included in the model through dummy variables.
The location of the dummy variable (D) matters:
- On the intercept (additive effect) - represents the mean difference between the values produced by the structural change:
  y = β0 + δ1 D + β1 x1 + u
- On the slope (multiplicative effect) - represents the effect (slope) difference between the values produced by the structural change:
  y = β0 + β1 x1 + δ1 D · x1 + u
Chow's structural test - used when we want to analyze the existence of structural changes in all the model parameters. It is a particular expression of the F test, where the null hypothesis is H0: no structural change (all δ = 0). A numeric sketch of this restricted-vs-unrestricted comparison appears at the end of this sheet.

Changes of scale
Changes in the measurement units of the variables:
- In the endogenous variable, y* = y · λ - affects all model parameters: βj* = βj · λ for all j = 0, 1, ..., k.
- In an exogenous variable, xj* = xj · λ - only affects the parameter linked to that exogenous variable: βj* = βj / λ.
- The same scale change on the endogenous and the exogenous variables - only affects the intercept: β0* = β0 · λ.

Changes of origin
Changes in the measurement origin of the variables (endogenous or exogenous), e.g. y* = y + λ - only affect the model's intercept: β0* = β0 + λ.
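The closing sketch below is illustrative only (made-up data, hypothetical dummy and coefficient names). It applies the restricted-vs-unrestricted F statistic to a Chow-style null of no structural change, i.e. that the intercept and slope dummy terms are jointly zero.

```python
# Illustrative sketch: F test of "no structural change" (all dummy terms = 0),
# comparing a restricted and an unrestricted model on made-up data.
import numpy as np
from scipy import stats

def ols_ssr(X, y):
    """Return the sum of squared residuals of an OLS fit."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta
    return np.sum(resid**2)

rng = np.random.default_rng(1)
n = 120
x = rng.uniform(0, 10, n)
D = (np.arange(n) >= n // 2).astype(float)          # dummy: second sub-population
# Data-generating process with a structural change in both intercept and slope
y = 1.0 + 0.5 * x + 2.0 * D + 0.4 * D * x + rng.normal(scale=1.0, size=n)

X_r = np.column_stack([np.ones(n), x])              # restricted: no dummy terms
X_ur = np.column_stack([np.ones(n), x, D, D * x])   # unrestricted
SSR_r, SSR_ur = ols_ssr(X_r, y), ols_ssr(X_ur, y)

q = 2                                  # restrictions tested: delta_0 = delta_1 = 0
k = X_ur.shape[1] - 1                  # regressors in the unrestricted model
F = (SSR_r - SSR_ur) / SSR_ur * (n - k - 1) / q
F_crit = stats.f.ppf(0.95, q, n - k - 1)            # alpha = 0.05
print(f"F = {F:.2f}, critical value = {F_crit:.2f}, reject H0: {F > F_crit}")
```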
3.3-en - github.com/marcelomijas/econometrics-cheatsheet - CC-BY-4.0 license