Statistical Inference: Confidence Intervals and Hypothesis Testing, Study notes of Biology

An overview of statistical inference, focusing on confidence intervals and hypothesis testing. It covers point estimates, interval estimates, confidence levels, and the use of the standard normal and Student's t distributions to calculate confidence intervals. It also discusses significance levels and types of errors in hypothesis testing.

Typology: Study notes

Pre 2010

Uploaded on 07/23/2009

koofers-user-qne
Review of statistics 2
Yang Dai
Bioe594 UIC

References:
• StatPrimer: http://www.sjsu.edu/faculty/gerstman/StatPrimer
• Biostatistics: A Foundation for Analysis in the Health Sciences, 7th ed., W. Daniel, 1999. John Wiley.

Statistical inference
• Statistical inference: the procedure by which we reach a conclusion about a population on the basis of the information contained in a sample drawn from that population. Its two main areas are estimation and hypothesis testing.
• Point estimate: a single numerical value used to estimate the corresponding population parameter.
• Interval estimate: two numerical values defining a range of values that, with a specified degree of confidence, we feel includes the parameter being estimated.
• Choosing an appropriate estimator. Estimators are usually presented as formulas, e.g. $\bar{x} = \sum x_i / n$.
• An estimator, say $T$, of the parameter $\theta$ is said to be an unbiased estimator of $\theta$ if $E(T) = \theta$.

Example
• A physical therapist wished to estimate, with 99% confidence, the mean maximal strength of a particular muscle in a certain group of individuals. He is willing to assume that strength scores are approximately normally distributed with a variance of 144.
• A sample of 15 subjects who participated in the experiment yielded a mean of 84.3.
• Since $1 - \alpha = .99$, we have $\alpha = .01$, and the z value corresponding to 0.99 is $z_{1-\alpha/2} = z_{.995} = 2.58$:
  > qnorm(.995,0,1)
  2.575829
  This is the confidence coefficient.
• The standard error is $\mathrm{SEM} = \sqrt{144/15} = 3.0984$.
• The 99% confidence interval for $\mu$ is $84.3 \pm 2.58(3.0984)$, i.e. (76.3, 92.3).

Student t distribution
• The confidence interval formulas are of the form (point estimate) ± (confidence coefficient) × (standard error).
• When $\sigma$ is known, the coefficient for this interval comes directly from the standard normal distribution. What happens when $\sigma$ is not known?
• Student's t distribution is similar to the standard normal distribution in that it is symmetrical and unimodal.
• However, t distributions have slightly broader tails than the normal distribution.
• Student's t distribution is a family of distributions, with each family member indexed by a parameter known as the degrees of freedom. A t distribution with infinite degrees of freedom is a standard normal (z) distribution. t percentiles are found in a t-table.

Confidence interval for $\mu$ when $\sigma$ is not known
• Using the t distribution allows the computation of a confidence interval for the population mean when the population standard deviation is not known. In such instances, we use the sample standard deviation, $s$, as an estimate of $\sigma$ and calculate a $(1-\alpha)100\%$ confidence interval for $\mu$ with
  $\bar{x} \pm (t_{n-1,\,1-\alpha/2})(\mathrm{SEM}), \quad \mathrm{SEM} = s/\sqrt{n}$
• In general, a $(1-\alpha)100\%$ confidence interval for $\mu$ will have a $(1-\alpha)100\%$ chance of capturing the population mean and a chance $\alpha$ of missing it.

Interval estimation of a population variance
• Percentiles of the chi-square distribution are given in a table.
  > qchisq(.95,9)
  16.92
• To obtain a $(1-\alpha)100\%$ confidence interval for $\sigma^2$, first obtain a $(1-\alpha)100\%$ confidence interval for $(n-1)s^2/\sigma^2$, using chi-square percentiles with $n-1$ degrees of freedom:
  $\chi^2_{\alpha/2} < \frac{(n-1)s^2}{\sigma^2} < \chi^2_{1-\alpha/2}$
• Solving for $\sigma^2$,
  $\frac{(n-1)s^2}{\chi^2_{1-\alpha/2}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{\alpha/2}}$
  is the $(1-\alpha)100\%$ confidence interval for $\sigma^2$.
• Taking square roots,
  $\sqrt{\frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}} < \sigma < \sqrt{\frac{(n-1)s^2}{\chi^2_{\alpha/2}}}$
  is the $(1-\alpha)100\%$ confidence interval for $\sigma$.

Confidence interval for the ratio of the variances of two normally distributed populations
• It is frequently of interest to compare two variances. One way to do it is to form their ratio, $\sigma_1^2/\sigma_2^2$. If the two variances are equal, their ratio will be equal to 1. We again have to rely on a sampling distribution, this time that of $(s_1^2/\sigma_1^2)/(s_2^2/\sigma_2^2)$.
• The assumptions are that $s_1^2$ and $s_2^2$ are computed from two independent samples of size $n_1$ and $n_2$, respectively, drawn from two normally distributed populations.
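Returning to the single-sample intervals above (mean with $\sigma$ known, mean with $\sigma$ unknown, and variance), the following is a minimal R sketch of those three formulas. The function names `z_interval`, `t_interval`, and `var_interval` are hypothetical helpers, not part of the original notes, and the vector `x` is an invented sample used only for illustration. The first call uses the physical-therapist numbers from the notes.

```r
# z interval for the mean when the population variance is known.
z_interval <- function(xbar, sigma2, n, conf = 0.99) {
  alpha <- 1 - conf
  sem <- sqrt(sigma2 / n)          # standard error of the mean
  z <- qnorm(1 - alpha / 2)        # confidence coefficient
  c(lower = xbar - z * sem, upper = xbar + z * sem)
}

# t interval for the mean when sigma is unknown (sample sd is used instead).
t_interval <- function(x, conf = 0.95) {
  n <- length(x)
  alpha <- 1 - conf
  sem <- sd(x) / sqrt(n)
  tq <- qt(1 - alpha / 2, df = n - 1)
  c(lower = mean(x) - tq * sem, upper = mean(x) + tq * sem)
}

# Chi-square interval for the population variance.
var_interval <- function(x, conf = 0.95) {
  n <- length(x)
  alpha <- 1 - conf
  ss <- (n - 1) * var(x)           # (n - 1) * s^2
  c(lower = ss / qchisq(1 - alpha / 2, df = n - 1),
    upper = ss / qchisq(alpha / 2, df = n - 1))
}

# Example from the notes: mean 84.3, variance 144, n = 15, 99% confidence.
therapist <- z_interval(84.3, 144, 15, conf = 0.99)
print(round(therapist, 1))

# Hypothetical sample, invented purely for illustration.
x <- c(82, 91, 77, 88, 95, 79, 84, 90, 86, 81)
print(t_interval(x))
print(var_interval(x))
print(sqrt(var_interval(x)))       # interval for sigma itself
```

The `z_interval` call uses the unrounded `qnorm` value rather than 2.58, so its endpoints agree with the notes' (76.3, 92.3) only after rounding. The result of `t_interval(x)` can be cross-checked against base R's `t.test(x)$conf.int`.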
The F distribution
• Under the assumptions, $(s_1^2/\sigma_1^2)/(s_2^2/\sigma_2^2)$ follows a distribution known as the F distribution, with $n_1 - 1$ numerator and $n_2 - 1$ denominator degrees of freedom.
[Figure: F distribution density curves, labeled (10,100) and (10,10)]
• To find the $(1-\alpha)100\%$ confidence interval for $\sigma_1^2/\sigma_2^2$, begin with
  $F_{\alpha/2} < \frac{s_1^2/\sigma_1^2}{s_2^2/\sigma_2^2} < F_{1-\alpha/2}$
• A $(1-\alpha)100\%$ confidence interval for $\sigma_1^2/\sigma_2^2$ is
  $\frac{s_1^2/s_2^2}{F_{1-\alpha/2}} < \frac{\sigma_1^2}{\sigma_2^2} < \frac{s_1^2/s_2^2}{F_{\alpha/2}}$

Test statistic
• The test statistic is computed from the sample.
• The decision to reject or not to reject the null hypothesis depends on the magnitude of the test statistic.
• General formula for a test statistic:
  test statistic = (relevant statistic - hypothesized parameter) / (standard error of the relevant statistic)
  For example, $z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$, where $\mu_0$ is a hypothesized value of a population mean.
• The distribution of the test statistic is key to statistical inference: $z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$ follows the standard normal distribution if the null hypothesis is true and the assumptions are met.

Decision rule
• All possible values that the test statistic can assume are divided into two groups:
  – the rejection region: values less likely to occur if the null hypothesis is true
  – the nonrejection region: values more likely to occur if the null hypothesis is true
• Reject the null hypothesis if the value of the test statistic computed from the sample is one of the values in the rejection region.
• Do not reject the null hypothesis if the value of the test statistic computed from the sample is one of the values in the nonrejection region.

Significance level
• The level of significance $\alpha$ is the probability of rejecting a true null hypothesis.
• The level of significance sets up the rejection and nonrejection regions.
• A computed value of the test statistic that falls in the rejection region is said to be significant.
• Since rejecting a true null hypothesis would constitute an error, it seems only reasonable to make $\alpha$ small.
• Frequently used values of $\alpha$ are .01, .05 and .10.

A single population mean: sampling from a normally distributed population, population variance known
• The test statistic for testing $H_0: \mu = \mu_0$ is $z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$.
• Example: Given a sample of 10 individuals randomly drawn from the population of interest, the sample mean is $\bar{x} = 27$. Can we conclude that the mean age of this population is different from 30 years?
• Assumption: the ages in the population are approximately normally distributed with variance $\sigma^2 = 20$.
• Hypotheses: $H_0: \mu = 30$ against $H_A: \mu \neq 30$. We can conclude that the mean age is different from 30 if the null hypothesis is rejected.
• The test statistic $z = \frac{\bar{x} - 30}{\sqrt{20/10}}$ follows the standard normal distribution when $H_0$ is true.

Example continued
• Let $\alpha = .05$, so $\alpha/2 = .025$ in each tail, with $z_{.025} = -1.96$ and $z_{.975} = 1.96$.
• Rejection region: $z < -1.96$ or $z > 1.96$. Nonrejection region: $-1.96 \le z \le 1.96$, which has probability 0.95 under $H_0$.
• The z value for the current sample is $z = \frac{27 - 30}{\sqrt{20/10}} = -2.12$, which falls in the rejection region, so we reject $H_0: \mu = 30$.
• Conclude that the mean age is not 30.
• P value: $P\{z < -2.12 \text{ or } z > 2.12\} = .0340$ from the table.

Testing by means of a confidence interval
• For the example, we can arrive at the same conclusion by using a $100(1-\alpha)$ percent confidence interval.
• The 95 percent confidence interval for $\mu$ is $27 \pm 1.96\sqrt{20/10}$, i.e. (24.2282, 29.7718).
• Since the interval does not include 30, we say 30 is not a candidate for the mean; therefore $\mu$ is not equal to 30, and $H_0$ is rejected.
• When testing a null hypothesis by means of a two-sided confidence interval, we reject $H_0$ at the $\alpha$ level when the hypothesized parameter is not contained within the $100(1-\alpha)$ percent confidence interval. If the hypothesized parameter is contained within the interval, $H_0$ cannot be rejected at the $\alpha$ level of significance.
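The mean-age example above can be checked with a short R sketch that computes the z statistic, the two-sided p value, and the matching confidence interval in one place. The function name `z_test` and its return fields are hypothetical, chosen for illustration; only the inputs (sample mean 27, hypothesized mean 30, variance 20, n = 10, $\alpha$ = .05) come from the notes.

```r
# Two-sided z test for a single mean with known population variance,
# together with the equivalent confidence-interval decision.
z_test <- function(xbar, mu0, sigma2, n, alpha = 0.05) {
  sem <- sqrt(sigma2 / n)              # standard error of the mean
  z <- (xbar - mu0) / sem              # test statistic
  crit <- qnorm(1 - alpha / 2)         # critical value, e.g. 1.96
  p_value <- 2 * pnorm(-abs(z))        # two-sided p value
  ci <- c(lower = xbar - crit * sem, upper = xbar + crit * sem)
  list(
    z = z,
    critical = crit,
    p_value = p_value,
    ci = ci,
    reject_by_z = abs(z) > crit,
    reject_by_ci = mu0 < ci[["lower"]] || mu0 > ci[["upper"]]
  )
}

res <- z_test(xbar = 27, mu0 = 30, sigma2 = 20, n = 10, alpha = 0.05)
print(round(res$z, 2))
print(res$p_value)
print(res$ci)
print(c(res$reject_by_z, res$reject_by_ci))
```

Because the sketch keeps z unrounded, its p value differs slightly in the last digits from the table value .0340, which is based on z rounded to 2.12. The two rejection flags always agree, which is the equivalence between the two-sided test and the confidence interval stated above.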