Review of Hypothesis Testing
EDMS646, Dr. Jiao, Spring 2009

Review of Hypothesis Testing
Quantitative Research Methods II
EDMS 646 Section 0101
4:15 to 7:00pm
Spring 2009

Outline
• Summary of EDMS 645
• Sampling distribution in hypothesis testing
• Define α, 1-α, β, 1-β under the framework of sampling distribution
• z test (one-sample)
• t test (one-sample)
• z test (two-sample)
• t test (two-sample)
• Hypothesis testing for correlation
• Chi-square tests
• Assumptions for different hypothesis tests
• Test of homogeneity of variance
• Sum of squares in simple linear regression

Descriptive and inferential statistics
• Descriptive statistics
  – Central tendency
  – Variation
  – Correlation
• If you observe a difference between two group means, the observed difference could be due to
  – Non-random sample
  – Sampling variation
  – True difference
• Inferential statistics
  – Make inferences about the population based on sample statistics
• Hypothesis testing

Hypothesis testing
• One-sample case for the mean
  – One-sample z-tests
  – One-sample t-tests
• Two-sample case for dependent means
  – Dependent-groups t-tests
• Two-sample case for independent means
  – Independent-groups t-tests
    • With pooled variance estimate
    • With separate variance estimate
• One-sample case for correlation
• Non-parametric hypothesis testing
  – Chi-square tests for goodness-of-fit
  – Chi-square tests for independence

The five steps in testing statistical hypotheses
1. State a null hypothesis to be tested.
2. Specify α (the degree of risk of making a Type I error).
3. Obtain the test statistic.
4. Obtain one of the three decision-making criteria:
   a. Confidence interval
   b. Critical value
   c. p-value
5. Make a decision regarding H0: reject or do not reject.

How to make a decision using different statistics?
• Retain the null if the p-value is greater than your alpha level, or |z| is smaller than the critical z value, or the CI contains the hypothesized parameter value
  – if α = .05 and p = .10,
  – z-crit = 1.96 and z = 1.86,
  – 95% CI = [92.85, 123.15], where H0: µ = 95
• Reject the null if the p-value is equal to or less than your alpha level, or |z| is larger than the critical z value, or the CI does not contain the hypothesized parameter value
  – if α = .05 and p = .001,
  – z-crit = 1.96 and z = 2.53,
  – 95% CI = [96.85, 123.15], where H0: µ = 95
If the population variance is known, use the z-statistic.

One-Sample t-test
• When σ is unknown, assume a t sampling distribution rather than the normal distribution.
• The t distribution is still symmetrical, but slightly flatter than the normal distribution, with heavier tails. This tendency is more prominent with smaller n. When n is large, the sampling distribution is almost identical to the normal distribution.

Hypothesis testing: One Sample for the Mean
If the population variance $\sigma^2$ is known, use the z-test:
$$\sigma_{\bar{X}} = \sqrt{\frac{\sigma^2}{n}}, \qquad z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}}$$
If the population variance is unknown, use the t-test with the sample variance $s^2$:
$$s_{\bar{X}} = \sqrt{\frac{s^2}{n}}, \qquad t = \frac{\bar{X} - \mu}{s_{\bar{X}}}$$

Hypothesis testing and confidence interval
Known population variance
  hypothesis testing: $z = \dfrac{\bar{X} - \mu}{\sigma_{\bar{X}}}$
  CI: $\bar{X} \pm z_{crit}\,\sigma_{\bar{X}}$
Unknown population variance, known sample variance
  hypothesis testing: $t_{obs} = \dfrac{\bar{X} - \mu}{s_{\bar{X}}}$
  CI: $\bar{X} \pm t_{crit}\,s_{\bar{X}}$

t-table
• Critical value to look up depends on
  – α-level
  – 1- or 2-tailed test
  – Degrees of freedom, df = n-1
• As df increases, t → z

Use SPSS to run dependent two-sample t-test

Use SPSS to run dependent sample t-test: Output

Paired Samples Statistics
                    Mean      N    Std. Deviation   Std. Error Mean
Pair 1  pretest     85.3000   10   10.30696         3.25935
        posttest    81.5000   10   10.55409         3.33750

Paired Samples Correlations
                              N    Correlation   Sig.
Pair 1  pretest & posttest    10   .899          .000

Paired Samples Test (Paired Differences)
Pair 1  pretest - posttest
  Mean                                        3.80000
  Std. Deviation                              4.68568
  Std. Error Mean                             1.48174
  95% Confidence Interval of the Difference   Lower .44807, Upper 7.15193
  t                                           2.565
  df                                          9
  Sig. (2-tailed)                             .030

Independent groups z test: known population variance
• Tests the difference between two population means for two independent groups
Null hypothesis: H0: μ1 = μ2 or μ1 - μ2 = 0
Alternative hypothesis: HA: μ1 ≠ μ2 or μ1 - μ2 ≠ 0
$$\sigma_{\bar{Y}_1 - \bar{Y}_2} = \sqrt{\sigma^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$
$$z = \frac{(\bar{Y}_1 - \bar{Y}_2) - (\mu_1 - \mu_2)}{\sigma_{\bar{Y}_1 - \bar{Y}_2}}$$
Critical value to look up based on Table C.1
  α-level = 0.05
  2-tailed test
  Critical value = 1.96

Independent Samples with Unknown Population Variance: t-test
Assumptions for independent t test
1. Independent observations
2. Normally distributed scores in the populations
   • The t-test is robust to violation of this assumption
3. Homogeneous variance in the populations
   • Violation of this assumption creates serious problems

Assumptions of homogeneity of variance
• Pooling variance under the assumption:
  – the two groups come from different populations and those populations have the same variance
• If n1 = n2, can ignore the test of variance homogeneity and pool
  – When sample sizes are equal, the t-test is robust to violations of the variance homogeneity assumption
• If the bigger sample has the bigger variance, the t-test is conservative: fewer Type I errors (loses power)
• If the smaller sample has the bigger variance, the t-test makes too many Type I errors

Assumption of homogeneity of variance
Sample size equal?
  Yes → Do pooled variance t-test
  No → Test homogeneity using Levene's test. Homogeneity?
    Yes → Do pooled variance t-test
    No → Do unpooled variance t-test
Interpret results

Test assumption of homogeneity of variance
• F-test
$$F = \frac{S^2_{big}}{S^2_{small}} \geq 1$$
  If this gets much bigger than 1, we may infer $\sigma_1^2 \neq \sigma_2^2$
• Levene's test
  – Created because F is sensitive to normality violations
  – Check SPSS output
  – Levene's p < 0.05, then reject H0
  – Levene's p > 0.05, then retain H0

If the homogeneity assumption is violated, use
$$s_{\bar{Y}_1 - \bar{Y}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
$$df = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}}$$
$$t = \frac{(\bar{Y}_1 - \bar{Y}_2) - (\mu_1 - \mu_2)}{s_{\bar{Y}_1 - \bar{Y}_2}}$$

SPSS run of independent samples t-test

Independent samples t-test sequence
1. Test the homogeneity of variance assumption
   Check the F-test or Levene's p-value (SPSS output)
2. If it is OK, use the normal t-test (pooled variance)
   "Equal variances assumed" t-test p-value: top row
3. If it is not OK, do the adjusted t-test (unpooled variances)
   "Equal variances not assumed" t-test p-value: bottom row

Construct the confidence interval
$$CI_{90} = z_r \pm 1.645\,(s_{z_r}) = 0.709 \pm 1.645(0.192) = 0.709 \pm 0.316 = (0.393,\ 1.025)$$
Convert $z_r$ to r using Table C.6:
$$CI_{90} = (0.374,\ 0.773)$$
The test statistic z = -0.344 did not exceed the critical value of 1.645.

Test: H0: ρ = 0
The sampling distribution when ρ = 0 is a t-distribution with n-2 degrees of freedom
State the hypothesis
  H0: ρ = 0
  HA: ρ ≠ 0
Compute the test statistic
$$t = \frac{r\sqrt{n - 2}}{\sqrt{1 - r^2}}$$
Critical value:
  Degrees of freedom: n-2
  Based on Table C.3

Test: H0: ρ = 0 using critical value Table C.7
For a given df, if the absolute value of r equals or exceeds the tabled value for a given α level, the null hypothesis is rejected.
Example:
  computed r = -0.428, n = 62
  critical value from Table C.7 = 0.25 at the α = 0.05 level using a two-tailed test
  the null is rejected

Parametric vs. nonparametric tests
• Distributions:
  – z
  – t
    • z is a t with infinite degrees of freedom
• Assumptions for using these two distributions
  – Normality
  – Homogeneity of variances
• Parametric tests: require strong assumptions
• Nonparametric tests: used when the data do not meet these assumptions

χ² Distribution
• Analysis of nominal/categorical data
• A family of distributions
  – Each a function of the degrees of freedom associated with the number of categories in the data
  – A single df determines the chi-square distribution
• The sampling distributions of chi-square are positively skewed
• But as the df increases, the respective sampling distribution approaches symmetry
• All chi-square values are non-negative, ranging from 0 to infinity

Chi-square distribution: probability density function
[Figure: chi-square probability density function]

Example
• M&M's
  – Brown
  – Red
  – Yellow
  – Blue
  – Orange
  – Green
• In a big vat of M&M's, there are equal numbers of different colors, all mixed up randomly. When they fill a bag, will each bag have equal numbers of different colors?
  – No, but they shouldn't deviate too far

Chi-square tests of independence (homogeneity)
• More than one variable, each with more than one category
H0: independent in the population
HA: not independent in the population

Chi-square tests of independence (homogeneity): examples
• Is there a relationship between academic status and the origin of the car a person drives, or do those things appear independent?
  – A sample of 20 professors and 40 undergrads
  – 30 Europe, 15 Asia, 15 USA
• Is there a difference between tenured and nontenured faculty members regarding support of the consulting policy?
  – A sample of 200
  – 105 tenured professors and 95 nontenured professors
  – 168 support the policy and 32 do not support the policy

Computing expected frequency:
$$E = \frac{(\text{row marginal}) \times (\text{column marginal})}{\text{Total}} = P(\text{row}) \times P(\text{column}) \times n$$

             professor               student                  Total   P(row)
Europe       E = 0.5*0.333*60 = 10   E = 0.5*0.666*60 = 20    30      0.5
Asia         E = 0.25*0.333*60 = 5   E = 0.25*0.666*60 = 10   15      0.25
USA          E = 0.25*0.333*60 = 5   E = 0.25*0.666*60 = 10   15      0.25
Total        20                      40                       60
P(column)    0.333                   0.666

Choose a decision criterion
Table C.4
  α = 0.05
  df = (rows-1)(columns-1) = (2-1)*(2-1) = 1
  Critical value: 3.841
  Observed value: 0.722

SPSS run of chi-square test:
  Academic: 1 = professor, 2 = student
  Car: 1 = Europe, 2 = Asia, 3 = USA

Assumptions for two independent samples t test
1. Independent observations both within and between treatment groups
   • Not robust
2. Normally distributed scores within each treatment group
   • Robust when sample size is large
3. Homogeneity of variance
   • Robust when n1 = n2
   • When the larger group has the larger variance (n1 > n2 with σ1² > σ2², or n1 < n2 with σ1² < σ2²), the larger variance receives more weight, the estimated standard error is larger, and the null is harder to reject. If the null is true, this is not really a problem; but if the null is false, it seriously affects power.
   • When the larger group has the smaller variance (n1 > n2 with σ1² < σ2², or n1 < n2 with σ1² > σ2²), the larger variance receives less weight, the estimated standard error is smaller, and the null is easier to reject. If the null is false, this is not really a problem (more power); but if the null is true, it seriously inflates the chance of a Type I error.
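The weighting argument under assumption 3 can be checked numerically. The sketch below is only an illustration, not part of the course materials: it assumes the standard pooled-variance estimate (each sample variance weighted by its n-1), and the function names and summary statistics are hypothetical. It compares the pooled and unpooled t statistics when the larger group has the larger variance and when it has the smaller variance.

```python
import math


def pooled_t(m1, s1_sq, n1, m2, s2_sq, n2):
    """Pooled-variance t statistic and df for H0: mu1 - mu2 = 0.

    Assumes the standard pooled estimate, which weights each sample
    variance by its degrees of freedom (n - 1).
    """
    df = n1 + n2 - 2
    sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / df
    se = math.sqrt(sp_sq * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df


def unpooled_t(m1, s1_sq, n1, m2, s2_sq, n2):
    """Separate-variance t statistic with the adjusted df."""
    v1, v2 = s1_sq / n1, s2_sq / n2
    se = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return (m1 - m2) / se, df


if __name__ == "__main__":
    # Hypothetical summary statistics: (mean, variance, n) for each group.
    cases = {
        "larger group, larger variance": (52.0, 100.0, 40, 48.0, 25.0, 10),
        "larger group, smaller variance": (52.0, 100.0, 10, 48.0, 25.0, 40),
    }
    for label, stats in cases.items():
        t_p, df_p = pooled_t(*stats)
        t_u, df_u = unpooled_t(*stats)
        print(f"{label}: pooled t = {t_p:.3f} (df = {df_p}), "
              f"unpooled t = {t_u:.3f} (df = {df_u:.1f})")
```

In the first case the pooled t is smaller in magnitude than the unpooled t (harder to reject); in the second it is larger (easier to reject). With equal n the two standard errors coincide.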

Assumptions for t test for presence of a linear relation/correlation
1. Independent pairs of observations
2. Conditional distributions of Y given X are normally distributed
3. Homoscedasticity (equal conditional variance of Y across values of X)

Assumptions for chi-square test for goodness-of-fit
1. Every observation falls into one cell and only one cell
2. Independence among observations
3. Expected frequency for every cell is not smaller than 5

Assumptions for chi-square test for independence
1. Every observation falls into one cell and only one cell
2. Independence among observations
3. Expected frequency for every cell is not smaller than 5

Test assumption of homogeneity of variance
• F-test
  – Not robust to violation of normality of the dependent variable Y within each treatment group, except for very large samples
  – Not robust to violation of the independence assumption within and between treatment groups
$$F = \frac{S^2_{big}}{S^2_{small}} \geq 1$$
  If this gets much bigger than 1, we may infer $\sigma_1^2 \neq \sigma_2^2$
• Obtain a decision criterion (Table C.5, p. 645)
  α
  df for numerator = n_big - 1
  df for denominator = n_small - 1
  F-critical value:

Test assumption of homogeneity of variance
• Levene's test (robust to violation of normality)
  – Created because F is sensitive to normality violations
  – Transform the raw data: compute the absolute deviations of group 1 scores from the group 1 mean and the absolute deviations of group 2 scores from the group 2 mean
  – Conduct a t-test of the group difference in terms of the absolute deviations
  – Levene's p < 0.05, then reject H0
  – Levene's p > 0.05, then retain H0
  – Check SPSS output
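
The Levene procedure described above (absolute deviations from each group mean, then a t-test on those deviations) can be sketched as follows. This is a minimal illustration under stated assumptions: it uses a pooled-variance t-test on the deviations, the function names are hypothetical, and the scores are made up. Software output may report the result in a different form, so treat this as a sketch of the idea rather than a replica of any package.

```python
import math
from statistics import mean, variance


def abs_deviations(scores):
    """Absolute deviation of each score from its own group mean."""
    m = mean(scores)
    return [abs(x - m) for x in scores]


def levene_t(group1, group2):
    """t statistic and df comparing the mean absolute deviations.

    Assumption: a pooled-variance independent-groups t-test is run on
    the transformed scores, as the slide's description suggests.
    """
    d1, d2 = abs_deviations(group1), abs_deviations(group2)
    n1, n2 = len(d1), len(d2)
    df = n1 + n2 - 2
    sp_sq = ((n1 - 1) * variance(d1) + (n2 - 1) * variance(d2)) / df
    se = math.sqrt(sp_sq * (1 / n1 + 1 / n2))
    return (mean(d1) - mean(d2)) / se, df


if __name__ == "__main__":
    # Hypothetical scores: same mean, visibly different spread.
    group_a = [10, 12, 11, 13, 9, 11]
    group_b = [2, 20, 8, 15, 5, 16]
    t_stat, df = levene_t(group_a, group_b)
    print(f"t = {t_stat:.3f}, df = {df}")
```

The resulting t is compared with the critical t for df = n1 + n2 - 2, as in any independent-groups t-test; a large absolute value suggests that the two groups differ in spread.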