1 Sampling Distributions and Hypothesis Testing
Advanced Psychological Statistics I
Psychology 502
September 13, 2007

2 Overview
- Questions?
- What is a sampling distribution?
  - Using the CLT
- Testing statistical hypotheses
- Example test
- Confidence intervals
- Type II errors
- Start the t-ratio

3 Sampling Distributions

4 Central Limit Theorem
- Population with mean $\mu$ and variance $\sigma^2$
- The sampling distribution of the mean approaches a normal distribution as N grows
  - Always
  - Regardless of distribution in the population
- Furthermore
  - The mean of the sampling distribution is $\mu$
  - The variance is $\sigma^2/N$
- When N is large, the sampling distribution is extremely close to a perfect normal

9 Basis for Decision
- If the null hypothesis is true, certain other things follow
  - Such as?
  - Parameters for the sampling distribution of the mean
- That enables what?
  - Probabilistic statements about means
  - How?
    - Sampling distribution is normal with known parameters
    - We know a lot about normal distributions

10 Normal Distributions
- The standard normal N(0,1) is a special case of a normal distribution
- All other normals have the same fundamental probability distribution, but
  - Different location (mean)
  - Different dispersion (variance or std. dev.)
- Any probabilistic statement made about the standard normal can be generalized to other normal distributions
  - How?
- Thus, based on the standard normal, we can make probabilistic statements about means based on the sampling distribution of the mean

11 Hypothesis Testing Example
- Form some statistical hypothesis
  - Such as, $\mu = 0$
- Two ways to go:
  - Compute region of rejection and compare sample statistic to that
  - Compute probability of sample statistic and compare that to alpha
  - These are equivalent!
- Then, we either reject the null or we don't
- Let's walk through an example, using both methods

12 Problem Statement (everything fictional)
- A headphone manufacturer (call them Y) claims that their new headphones are better than company H's competing headphones
  - And thus more expensive
- Standard headphone rating system on a scale from 1 to 20 has a known standard deviation of 5.2
- Collect two samples, ask them to rate the headphones
  - Headphone Y: N = 25, M = 17.0
  - Headphone H: N = 15, M = 14.3
  - Company Y advertises that they're better because the average score is 19% higher (basically true, 18.88%)
- But is the claim of “better” credible?
  - Why might it not be?

13 Method A: Compute Region of Rejection
- Pick an alpha level
  - One-tailed or two-tailed?
  - What does that mean and why do we care?
- What's the null hypothesis?
  - Score = mean of Y - mean of H
  - $\mu \le 0$
- What's the appropriate z for our alpha level?
- What do we do with that z?
  - Convert it to a score on the relevant distribution
  - What's the relevant distribution?
    - Sampling distribution for the difference in means
    - Why?

14 Sampling Distribution
- That sampling distribution is normal
  - Mean of zero (why?)
  - Standard deviation?

$$\sigma_{M_{diff}} = \sqrt{\frac{\sigma_1^2}{N_1} + \frac{\sigma_2^2}{N_2}} = \sqrt{\frac{5.2^2}{25} + \frac{5.2^2}{15}} = 1.698$$

- Our critical value for z was 1.65
  - Need to map this onto our sampling distribution

$$z_M = \frac{x - \mu}{\sigma_{M_{diff}}}$$

$$x = z_M \sigma_{M_{diff}} + \mu = 1.65(1.698) + 0 = 2.802$$

- Actual difference in means is 2.7
- What do we conclude?

19 Confidence Interval

$$p(\mu - z_{\alpha/2}\sigma_M \le \bar{x} \le \mu + z_{\alpha/2}\sigma_M) = 1 - \alpha$$

$$p(-z_{\alpha/2}\sigma_M \le \mu - \bar{x} \le z_{\alpha/2}\sigma_M) = 1 - \alpha$$

$$p(\bar{x} - z_{\alpha/2}\sigma_M \le \mu \le \bar{x} + z_{\alpha/2}\sigma_M) = 1 - \alpha$$

- Thus, if we have a sample mean, we can make a probabilistic statement about the location of the population mean!

20 Confidence Intervals
- If you want to be 95% sure that the population mean is within a range, set that range to be the sample mean ± 1.96 standard errors
  - This is called the “95% confidence interval”
- From our example ($\sigma$
= 10, N = 9, sample mean = 45)
  - $\sigma_M = 10/3 = 3.333$
  - $z_{\alpha/2} = 1.96$
  - 95% confidence interval is 45 ± (1.96)(3.333)
  - 38.47 to 51.53
- Question: What three things determine the width of the confidence interval?

21 Confidence Intervals
- Three determiners of confidence interval width:
  - Alpha level
  - N
  - Standard deviation in the population
- Assumptions
  - Population variance is known
    - When N is very large, can use the sample variance as estimate
    - There is also a way to estimate this when the population variance is not known and N is not large
  - Sampling distribution is normal
    - This will always be true when the population distribution is normal or N is “large enough”

22 Hypothesis Testing: When the Null Is True
- $\mu = 100$
- Possible test outcomes
  - Reject the null
  - Fail to reject
- Rejecting the null happens when?
- Failing to reject happens when?
- Can we reject the null when it is actually true?
- How often should this happen?
- This is called a “Type I error”
- $\alpha$ = p(Type I error)

23 Hypothesis Testing: When the Null Is False
- $\mu \ne 100$
- Possible outcomes
  - Reject the null
  - Fail to reject
- Can we fail to reject the null when the null is actually false?
- This is called a “Type II error”
  - p(Type II error) = $\beta$

24 States of the World
- Type I error is like a false alarm
- Type II error is like a miss
- Want your rate for both of these to be low!

              Decision
Truth     | Reject null                | Fail to reject
H0 true   | Type I error ($\alpha$)    | Correct fail to reject ($1 - \alpha$)
H1 true   | Hit (power) ($1 - \beta$)  | Type II error ($\beta$)

29 Changing Sample Size
[Figure: sampling distributions under H0 and H1]
- What changes when N goes up?
  - SEM (the standard error of the mean)
  - How does it change?
  - Gets smaller
- What does that do to the critical value?
  - Makes it smaller (in raw-score units, the cutoff moves closer to the H0 mean)
- Therefore
  - $\beta$ gets smaller
  - No cost in $\alpha$!
- Problems?

30 Power
- $1 - \beta$ has a special name, “power”
- What is it?
  - Probability that the null is rejected when the null is false
- Why would this be important?
  - Actually can make meaningful statements about the probability of the null hypothesis
  - If power is low, why do the study?
- The problem
  - Frequently can't actually compute power
  - Why?
- We'll spend lots more time on this later

31 The Problem
- The sampling distribution of the mean tends to approach a normal distribution
- By transforming to the standard normal, we can do some very useful things:
  - Hypothesis tests of means
  - Construct confidence intervals
- There is a limitation here, though, and it's somewhat severe. What is it?
  - Need to know $\sigma$
- In practice, we rarely know $\sigma$

32 The Solution
- We need to estimate $\sigma$
- We can, with s, the sample standard deviation
- A critical insight:
  - For any statistic that has a normal sampling distribution with mean zero, we can form the following ratio:

$$\frac{\text{statistic}}{\text{estimated standard error of the statistic}}$$

- Called the “t-ratio” or “t-statistic”
- How that helps:
  - The sampling distribution of that statistic is well-understood!

33 Forming the t
- For instance, we can form a t-ratio for the mean (under a null hypothesis of $\mu = 0$):

$$t = \frac{\bar{x}}{s_M}$$

- What is $s_M$?
  - Estimated standard error of the mean
  - Simple formula:

$$s_M = \frac{s}{\sqrt{N}}$$

  - “s” is simply the sample standard deviation, the square root of the unbiased sample estimate of the variance (N - 1 in the denominator)

34 Sampling Distribution of t
- The sampling distribution of t is well understood
- There is more than one t distribution
- t distributions are identified by the degrees of freedom (df)
  - Degrees of freedom arise from the process of estimating the variance
- Because we know the sampling distributions for t, we can make probabilistic statements about particular values of t
  - Allows us to test hypotheses and form confidence intervals
- Information about the t distribution is in the back of your textbook (p. 682)
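
The arithmetic in these notes can be checked numerically. The Python sketch below is an illustration only, not part of the original lecture: it recomputes the headphone example both ways (region of rejection and probability of the sample statistic), the 95% confidence interval example, and a t-ratio for the mean. All function names are hypothetical, the three-value sample passed to `t_ratio` is made up, and a one-tailed alpha of .05 is assumed to match the critical z of 1.65. Because the code uses the unrounded z (about 1.645), its cutoff comes out slightly below the 2.802 on slide 14.

```python
import math
from statistics import NormalDist, mean, stdev

STD_NORMAL = NormalDist()  # the standard normal, N(0, 1)


def se_mean(sigma, n):
    """Standard error of the mean when the population sigma is known."""
    return sigma / math.sqrt(n)


def se_diff(sigma1, n1, sigma2, n2):
    """Standard error of the difference between two independent means."""
    return math.sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)


def critical_difference(se, alpha, mu0=0.0):
    """Method A: one-tailed cutoff on the sampling distribution, x = z * se + mu0."""
    return STD_NORMAL.inv_cdf(1 - alpha) * se + mu0


def upper_tail_p(observed, se, mu0=0.0):
    """Method B: probability of a result at least this large if the null is true."""
    return 1 - STD_NORMAL.cdf((observed - mu0) / se)


def confidence_interval(xbar, sigma, n, level=0.95):
    """Interval xbar +/- z_(alpha/2) * sigma_M, with sigma known."""
    z = STD_NORMAL.inv_cdf(1 - (1 - level) / 2)
    half_width = z * se_mean(sigma, n)
    return xbar - half_width, xbar + half_width


def t_ratio(sample):
    """t = mean / (s / sqrt(N)) for a null of mu = 0; stdev() divides by N - 1."""
    return mean(sample) / (stdev(sample) / math.sqrt(len(sample)))


if __name__ == "__main__":
    alpha = 0.05                    # assumed one-tailed alpha
    se = se_diff(5.2, 25, 5.2, 15)  # headphone example, sigma = 5.2 in both groups
    diff = 17.0 - 14.3              # observed difference in means
    cutoff = critical_difference(se, alpha)
    p = upper_tail_p(diff, se)
    print(f"SE of difference: {se:.3f}")
    print(f"Method A: cutoff {cutoff:.3f}, reject null: {diff > cutoff}")
    print(f"Method B: p = {p:.4f}, reject null: {p < alpha}")
    low, high = confidence_interval(45, 10, 9)
    print(f"95% CI: {low:.2f} to {high:.2f}")
    print(f"t for made-up sample [2, 4, 6]: {t_ratio([2, 4, 6]):.3f}")
```

Under these assumptions the two methods agree: the observed difference of 2.7 falls just short of the cutoff, and the one-tailed probability is about .056, just above .05, so the null is not rejected either way. The confidence interval reproduces the 38.47 to 51.53 on slide 20.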