Inference for Population Mean: T-Tests and Confidence Intervals - Prof. S. Kalaycioglu, Exams of Statistics

An introduction to inference about a population mean using t-tests and confidence intervals. It covers conditions for inference, the t distribution, one-sample and matched pairs t procedures, and robustness of t procedures. It includes examples and calculations using data from a study on sweetness loss due to storage and a study on the effect of moderate red wine consumption on polyphenol blood levels.

Typology: Exams

Pre 2010

Uploaded on 08/31/2009

koofers-user-t36
Inference for a population mean
BPS chapter 18
© 2006 W.H. Freeman and Company

Objectives (BPS chapter 18): Inference about a population mean
- Conditions for inference
- The t distribution
- The one-sample t confidence interval
- Using technology
- Matched pairs t procedures
- Robustness of t procedures

When σ is unknown

The sample standard deviation s provides an estimate of the population standard deviation σ.

When the sample size is large, the sample is likely to contain elements representative of the whole population. Then s is a good estimate of σ.

But when the sample size is small, the sample contains only a few individuals. Then s is a less reliable estimate of σ.

[Figure: population distribution compared with a large sample and a small sample]

Standard deviation s and standard error of the mean s/√n

For a sample of size n, the sample standard deviation s is:

$$ s = \sqrt{\frac{1}{n-1} \sum (x_i - \bar{x})^2} $$

n − 1 is the "degrees of freedom."

The value s/√n is called the standard error of the mean (SEM). Scientists often present their sample results as the mean ± SEM.

Example: A medical study examined the effect of a new medication on the seated systolic blood pressure. The results, presented as mean ± SEM for 25 patients, are 113.5 ± 8.9. What is the standard deviation s of the sample data?

SEM = s/√n <=> s = SEM × √n
s = 8.9 × √25 = 44.5

The t distributions

We test null and alternative hypotheses with one sample of size n from a normal population N(μ, σ):
- When σ is known, the sampling distribution of the sample mean is normal, N(μ, σ/√n).
- When σ is estimated from the sample standard deviation s, the sampling distribution follows a t distribution t(μ, s/√n) with n − 1 degrees of freedom.

The value s/√n is the standard error of the mean, or SEM.

Table C

When σ is known, we use the normal distribution and the standardized z-value.
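As a check on the blood-pressure example earlier in this section, the relationship between s and the SEM can be sketched in a few lines of Python. This is only an illustration: the helper names `sd_from_sem` and `sem_from_sd` are made up for this sketch and do not come from the slides.

```python
import math


def sd_from_sem(sem: float, n: int) -> float:
    """Recover the sample standard deviation s from SEM = s / sqrt(n)."""
    return sem * math.sqrt(n)


def sem_from_sd(s: float, n: int) -> float:
    """Standard error of the mean for a sample of size n."""
    return s / math.sqrt(n)


# Blood-pressure example from the slides: mean ± SEM = 113.5 ± 8.9, n = 25.
s = sd_from_sem(8.9, 25)
print(f"s = {s:.1f}")

# Converting back should return the reported SEM.
print(f"SEM = {sem_from_sd(s, 25):.1f}")
```

The point of the round trip is that a reported "mean ± SEM" hides a much larger spread in the raw data: here the SEM is 8.9, but the individual measurements have a standard deviation five times as large.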
When σ is unknown, we use the sample standard deviation and a t distribution with n − 1 degrees of freedom (df). Table C shows the z-values and t-values corresponding to landmark P-values and confidence levels.

$$ t = \frac{\bar{x} - \mu}{s/\sqrt{n}} $$

Table A vs. Table C

Table A gives the area to the LEFT of hundreds of z-values. It should only be used for normal distributions.
(…)
Table C also gives the middle area under a t or normal distribution between the negative and positive values of t or z.
(…)

Confidence intervals

Reminder: The confidence interval is a range of values with a confidence level C. C is the probability that the method produces an interval containing the true population parameter.

We have a set of data from a population with both μ and σ unknown. We use x̄ to estimate μ, and s to estimate σ, using a t distribution (df = n − 1).

Practical use of t: t*
- C is the area under the t (df = n − 1) curve between −t* and t*.
- We find t* in the line of Table C for df = n − 1 and confidence level C.
- The margin of error m is:

$$ m = t^* \frac{s}{\sqrt{n}} $$

[Figure: t curve with central area C between −t* and t*; the interval extends a distance m on either side of x̄]

Excel

Menu: Tools/DataAnalysis: select "Descriptive statistics"

PercentChange
Mean                       5.5
Standard Error             0.838981
Median                     5.5
Mode                       #N/A
Standard Deviation         2.516943
Sample Variance            6.335
Kurtosis                   0.010884
Skewness                   -0.7054
Range                      7.7
Minimum                    0.7
Maximum                    8.4
Sum                        49.5
Count                      9
Confidence Level(95.0%)    1.934695

In this output, "Standard Error" is s/√n and "Confidence Level(95.0%)" is the margin of error m.

Warning: Do not use the function =CONFIDENCE(alpha, stdev, size). It assumes a normal sampling distribution and uses z* instead of t*.

The t-test

As in the previous chapter, a test of hypotheses requires a few steps:
1. Stating the null and alternative hypotheses (H0 versus Ha)
2. Deciding on a one-sided or two-sided test
3. Choosing a significance level α
4. Calculating t and its degrees of freedom
5. Finding the area under the curve with Table C
6. Stating the P-value and interpreting the result

Review: test of significance

The P-value is the probability, if H0 is true, of randomly drawing a sample like the one obtained, or more extreme, in the direction of Ha.

The P-value is calculated as the corresponding area under the curve, one-tailed or two-tailed depending on Ha, for the test statistic:

$$ t = \frac{\bar{x} - \mu}{s/\sqrt{n}} $$

[Figure: P-value areas for a one-sided (one-tailed) and a two-sided (two-tailed) test]

Sweetening colas (continued)

Is there evidence that storage results in sweetness loss for the new cola recipe at the 0.05 level of significance (α = 5%)?

Taster    Sweetness loss
1          2.0
2          0.4
3          0.7
4          2.0
5         -0.4
6          2.2
7         -1.3
8          1.2
9          1.1
10         2.3
___________________________
Average               1.02
Standard deviation    1.196

H0: μ = 0 versus Ha: μ > 0 (one-sided test)

$$ t = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{1.02 - 0}{1.196/\sqrt{10}} = 2.70, \qquad df = n - 1 = 9 $$

- The critical value is tα = 1.833. Since t > tα, the result is significant.
- 2.398 < t = 2.70 < 2.821, thus 0.02 > p > 0.01. Since p < α, the result is significant.

The t-test has a significant P-value. We reject H0. There is a significant loss of sweetness, on average, following storage.

Sweetening colas (continued)

In Excel, you can obtain the precise P-value once you have calculated t: use the function TDIST(t, df, tails). For example, "=tdist(2.7, 9, 1)" gives 0.01226.

Minitab

[Minitab output not reproduced]

Red wine, in moderation (continued)

Does moderate red wine consumption increase the average blood level of polyphenols in healthy men?

Sample average = 5.5; s = 2.517; t = (5.5 − 0)/(2.517/√9) ≈ 6.556

H0: μ = 0 versus Ha: μ > 0 (one-sided test)

From Table C, df = 8: t > 5.041 and therefore p < 0.0005.

The P-value is very small (well below 1%), and thus the result is very significant. Moderate red wine consumption significantly increases the average polyphenol blood levels of healthy men.

Important: This test does not say how large the increase is, or what the impact on men's health is.
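The one-sample t calculations above (cola and red wine) can be reproduced with a short Python sketch that uses only the standard library. Two things here are assumptions of this sketch rather than part of the slides: the one-sided P-value is approximated by Simpson's-rule integration of the Student t density (a stand-in for Excel's TDIST or a statistics package), and the critical value t* = 2.306 (df = 8, C = 95%) is a standard table value that the slides do not quote. The function names `t_pdf` and `upper_tail` are made up for illustration.

```python
import math
import statistics


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def upper_tail(t: float, df: int, steps: int = 2000) -> float:
    """Approximate P(T >= t) with Simpson's rule on [0, |t|]; steps must be even."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    tail = 0.5 - total * h / 3  # area beyond |t| on one side
    return tail if t >= 0 else 1 - tail


# Cola data from the slides: sweetness loss for 10 tasters.
loss = [2.0, 0.4, 0.7, 2.0, -0.4, 2.2, -1.3, 1.2, 1.1, 2.3]
n = len(loss)
x_bar = statistics.mean(loss)
s = statistics.stdev(loss)                 # divides by n - 1
t_stat = (x_bar - 0) / (s / math.sqrt(n))  # H0: mu = 0
df = n - 1
p_one_sided = upper_tail(t_stat, df)       # Ha: mu > 0
print(f"cola: mean={x_bar:.2f} s={s:.3f} t={t_stat:.2f} df={df} P={p_one_sided:.4f}")

# Red wine summary statistics from the Excel output (n = 9).
s_wine, n_wine = 2.516943, 9
t_wine = (5.5 - 0) / (s_wine / math.sqrt(n_wine))
p_wine = upper_tail(t_wine, n_wine - 1)
t_star = 2.306                             # assumed table value, df = 8, C = 95%
margin = t_star * s_wine / math.sqrt(n_wine)
print(f"wine: t={t_wine:.3f} P={p_wine:.6f} margin of error={margin:.4f}")
```

The margin of error computed this way should agree closely with the "Confidence Level(95.0%)" entry in the Excel output, which is one way to confirm that Excel's descriptive statistics use t* rather than z*.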
[Figure note for the red wine test: the test statistic would be off the chart to the right.]

Sweetening colas (revisited)

The sweetness loss due to storage was evaluated by 10 professional tasters (comparing the sweetness before and after storage):

Taster    Sweetness loss
1          2.0
2          0.4
3          0.7
4          2.0
5         -0.4
6          2.2
7         -1.3
8          1.2
9          1.1
10         2.3

We want to test if storage results in a loss of sweetness, thus H0: μ = 0 versus Ha: μ > 0.

Although the text did not mention it explicitly, this is a pre-/post-test design, and the variable is the difference in cola sweetness before and after storage.

A matched pairs test of significance is indeed just like a one-sample test.

Does lack of caffeine increase depression?

Individuals diagnosed as caffeine-dependent are deprived of all caffeine-rich foods and assigned to receive daily pills. At one time the pills contain caffeine, and at another time they contain a placebo. Depression was assessed in both periods.

There are two data points for each subject, but we will only look at the difference. The sample distribution appears appropriate for a t-test.

Subject    Depression with caffeine    Depression with placebo    Placebo − Caffeine
1           5                          16                          11
2           5                          23                          18
3           4                           5                           1
4           3                           7                           4
5           8                          14                           6
6           5                          24                          19
7           0                           6                           6
8           0                           3                           3
9           2                          15                          13
10         11                          12                           1
11          1                           0                          -1

[Figure: distribution of the 11 "difference" data points (Placebo − Caffeine)]

Does lack of caffeine increase depression?

For each individual in the sample, we have calculated a difference in depression score (placebo minus caffeine).

There were 11 "difference" points, thus df = n − 1 = 10. We calculate that x̄ = 7.36; s = 6.92.

H0: μdifference = 0; Ha: μdifference > 0

$$ t = \frac{\bar{x} - 0}{s/\sqrt{n}} = \frac{7.36}{6.92/\sqrt{11}} = 3.53 $$

For df = 10, 3.169 < t = 3.53 < 3.581, therefore 0.005 > p > 0.0025.

Caffeine deprivation causes a significant increase in depression.
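The matched pairs calculation for the caffeine study can be sketched as a one-sample t statistic computed on the differences. This is a minimal illustration using the data from the table above; the variable names are chosen for this sketch, and the two Table C critical values are the ones quoted in the slides for df = 10.

```python
import math
import statistics

# Depression scores per subject, from the slides.
caffeine = [5, 5, 4, 3, 8, 5, 0, 0, 2, 11, 1]
placebo = [16, 23, 5, 7, 14, 24, 6, 3, 15, 12, 0]

# A matched pairs test reduces each pair to a single difference.
diffs = [p - c for p, c in zip(placebo, caffeine)]

n = len(diffs)
d_bar = statistics.mean(diffs)
s = statistics.stdev(diffs)                # divides by n - 1
t_stat = (d_bar - 0) / (s / math.sqrt(n))  # H0: mu_difference = 0
df = n - 1

# Table C values quoted in the slides for df = 10:
# one-sided P = 0.005 at t = 3.169 and P = 0.0025 at t = 3.581.
t_at_p005, t_at_p0025 = 3.169, 3.581
in_bracket = t_at_p005 < t_stat < t_at_p0025

print(f"mean diff={d_bar:.2f} s={s:.2f} t={t_stat:.2f} df={df}")
print("0.0025 < P < 0.005" if in_bracket else "t falls outside the quoted bracket")
```

Working on the differences rather than the two columns separately is what makes this a matched pairs procedure: the between-subject variation cancels out, and only the within-subject change is tested.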