Two-Sample Hypothesis Testing: Comparing the Means of Two Variables, Study notes of Electronic Measurement and Instrumentation

These notes explain how to perform hypothesis tests to compare the means of two variables, xA and xB, using both paired and unpaired samples. They cover the procedures for calculating the critical t-statistic and p-value, setting the null and alternative hypotheses, and interpreting the results. They also discuss the differences between paired and unpaired samples and the various options for calculating degrees of freedom.

Typology: Study notes

2012/2013

Uploaded on 10/02/2013

sonu-kap

Two Samples Hypothesis Testing

Introduction
• In a previous learning module, we discussed how to perform hypothesis tests for a single variable x.
• Here, we extend the concept of hypothesis testing to the comparison of two variables xA and xB.

Two Samples Hypothesis Testing when n is the same for the two Samples

Two-tailed paired samples hypothesis test:
• In engineering analysis, we often want to test whether some modification to a system causes a statistically significant change to the system (the system is either improved or made worse).
• We conduct some experiments in which the sample mean $\bar{x}_A$ of sample A (without the modification) is indeed different from the sample mean $\bar{x}_B$ of sample B (with the modification). In other words, the modification appears to have led to a change, but is the change statistically significant?
• Here we discuss the simplest such statistical test: a test of whether one sample of data has a significantly different predicted population mean compared to a second sample of data, with the number of data points n being the same in the two samples.
• Statisticians refer to this case as a paired samples hypothesis test: each data point in sample A is paired with a corresponding data point in sample B, which is why n is the same in the two samples.
• The procedure is very similar to the single-sample hypothesis tests we have already discussed, except that we replace variable x by the difference between the two variables, δ = xB − xA.
• In a two-tailed paired-samples hypothesis test, we want to know whether there is a statistically significant change in the predicted population means of the two samples. We don't care whether the change is positive or negative in a two-tailed hypothesis test; we are concerned only with whether there is a change.
• From the definition of variable δ, we see that an appropriate null hypothesis is that the population mean of δ, denoted μ, is zero, i.e., there is no change in the population mean between the two samples (the least likely scenario). Thus, we set: [This is a two-tailed hypothesis test.]
  o Null hypothesis: Critical value is μ0 = 0; the least likely scenario is μ = μ0 (there is no statistically significant change in the population means). [This is the least likely scenario since $\bar{x}_A \neq \bar{x}_B$.]
  o Alternative hypothesis: (opposite of the null hypothesis), μ ≠ μ0. In other words, either μ < μ0 or μ > μ0 (there is a statistically significant change in the population means). [This is the most likely scenario since $\bar{x}_A \neq \bar{x}_B$.]
• The critical t-statistic is calculated as previously, but using the sample mean of δ instead of x, and the sample standard deviation of δ instead of x, i.e., $t = \dfrac{\bar{\delta} - \mu_0}{S_\delta / \sqrt{n}}$.
• The corresponding p-value is calculated as previously, based on the critical t-statistic. In this case we are considering a two-tailed hypothesis test. p is calculated in Excel using the function TDIST(ABS(t),df,2), where df is the number of degrees of freedom, df = n − 1, and the "2" specifies two tails.
• If Excel is not available, we can use tables; some modern calculators can also calculate the p-value.
• We formulate our conclusions (to 95% confidence level) based on the p-value:
  o If p < 0.05, we reject the null hypothesis because the least likely scenario (μ = μ0) has less than a 5% chance of being true. Thus, we can state confidently that there is a statistically significant change in the population mean of the variable, i.e., μA ≠ μB.
  o If 0.05 < p < 0.95, we cannot reject or accept the null hypothesis because the least likely scenario (μ = μ0) has more than a 5% chance of being true, but less than a 95% chance of being true. The results are therefore inconclusive, and we should conduct more tests.
  o If p > 0.95, we accept the null hypothesis because what we set as the least likely scenario (μ = μ0) turns out to have more than a 95% chance of being true. Thus, we can state confidently that there is no statistically significant change in the population mean of the variable, i.e., μA = μB.

One-tailed paired samples hypothesis test: [This is the more common one used in engineering analysis.]
• We assume here that our experiments yield $\bar{x}_B > \bar{x}_A$. In other words, the modification we made leads to an improvement in the mean between Sample A and Sample B. But is the improvement statistically significant?
• In a one-tailed paired-samples hypothesis test, we want to know whether there is a statistically significant improvement in the predicted population means of the two samples. From the definition of variable δ, we see that an appropriate null hypothesis is that the population mean of δ is negative, i.e., the modification caused the population mean to decrease between the two samples (the least likely scenario, since we are assuming here that our experiments show that $\bar{x}_B > \bar{x}_A$). Thus, we set: [This is a one-tailed hypothesis test.]
  o Null hypothesis: Critical value is μ0 = 0; the least likely scenario is μ < μ0 (the population mean has decreased due to the modification, or μB < μA). [This is the least likely scenario since $\bar{x}_B > \bar{x}_A$.]
  o Alternative hypothesis: μ > μ0. In other words, there is a statistically significant increase in the population means (μB > μA). [This is the most likely scenario since $\bar{x}_B > \bar{x}_A$.]
• The critical t-statistic is calculated exactly as above for the two-tailed test.
• The corresponding p-value is calculated based on the critical t-statistic. In this case we are considering a one-tailed hypothesis test, so p is calculated in Excel using the function TDIST(ABS(t),df,1), where the "1" specifies one tail. You can also use the tables if Excel is not available; do not multiply p by 2 for a one-tailed test.
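The paired-samples calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (`t_pdf`, `tdist`, `paired_t`) and the sample data are hypothetical, and `tdist` is a numerical stand-in for Excel's TDIST that integrates the Student's t density with Simpson's rule rather than calling Excel or a statistics library.

```python
import math
from statistics import mean, stdev


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coeff = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coeff = math.exp(log_coeff) / math.sqrt(df * math.pi)
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def tdist(t, df, tails, steps=2000):
    """Tail probability for t >= 0, analogous to Excel's TDIST(t, df, tails).

    Integrates the density from 0 to t with Simpson's rule (steps must be
    even), subtracts from 0.5 to get one tail, then scales by tails (1 or 2).
    """
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    one_tail = 0.5 - total * h / 3
    return tails * one_tail


def paired_t(sample_a, sample_b, mu0=0.0):
    """Return (t, df) for paired samples, using delta = x_B - x_A."""
    deltas = [b - a for a, b in zip(sample_a, sample_b)]
    n = len(deltas)
    # t = (mean of delta - mu0) / (S_delta / sqrt(n)), with df = n - 1
    t = (mean(deltas) - mu0) / (stdev(deltas) / math.sqrt(n))
    return t, n - 1


# Hypothetical measurements: A without the modification, B with it.
sample_a = [10.0, 12.0, 9.0, 11.0]
sample_b = [11.0, 14.0, 10.0, 13.0]

t, df = paired_t(sample_a, sample_b)
p_two_tailed = tdist(abs(t), df, 2)  # like TDIST(ABS(t),df,2)
p_one_tailed = tdist(abs(t), df, 1)  # like TDIST(ABS(t),df,1)
print(f"t = {t:.4f}, df = {df}")
print(f"two-tailed p = {p_two_tailed:.4f}, one-tailed p = {p_one_tailed:.4f}")
```

The one-tailed p-value is exactly half the two-tailed value for the same t and df, which is why the notes warn against multiplying by 2 in the one-tailed case.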
• For a one-tailed hypothesis test in which the null hypothesis is set to the least likely scenario, the p-value is limited in range from 0 to 0.5 (0% to 50%). Thus, we formulate our conclusions (to 95% confidence level) as follows:
  o If p < 0.05, we reject the null hypothesis because the least likely scenario (μB < μA) has less than a 5% chance of being true. Thus, we can state confidently that there is a statistically significant increase in the population mean of the variable, i.e., μB > μA.
  o If 0.05 < p < 0.50, we cannot reject or accept the null hypothesis because the least likely scenario (μB < μA) has more than a 5% chance of being true, but less than a 50% chance of being true. The results are therefore inconclusive, and we should conduct more tests.
• For 99% confidence, substitute 0.01 for 0.05 in the above criteria.
• Excel has a built-in macro in Data Analysis that performs this type of hypothesis test automatically. It is called t-Test: Paired Two Sample for Means.
• The procedure is best illustrated by example, which we will do in class.

Two Samples Hypothesis Testing when n is not the same for the two Samples

Two-tailed unpaired samples hypothesis test:
• Now consider the more general case in which the number of data points nA in sample A is not the same as the number of data points nB in sample B (e.g., nA = 10 and nB = 15).
• The analysis is similar to the simpler case above, except that we need to combine the two samples in some appropriate manner to calculate the t-statistic.
• Consider the following general case:
  o Sample A: Number of data points = nA, sample mean = $\bar{x}_A$, and sample standard deviation = SA.
  o Sample B: Number of data points = nB, sample mean = $\bar{x}_B$, and sample standard deviation = SB.
  o Our goal is to predict whether there is a statistically significant difference between μA (the population mean of sample A) and μB (the population mean of sample B).
• Statisticians refer to this kind of hypothesis test as hypothesis testing of two independent samples.
• As usual, we set the null hypothesis and alternative hypothesis:
  o Null hypothesis: There is no difference between the population means, i.e., μA = μB. [This is the least likely scenario since $\bar{x}_A \neq \bar{x}_B$.] [This is a two-tailed hypothesis test.]
  o Alternative hypothesis: μA ≠ μB. In other words, either μA > μB or μA < μB. [This is the most likely scenario since $\bar{x}_A \neq \bar{x}_B$.]
• The critical t-statistic is formed using a root-sum-of-the-squares approach, similar to the way we handled multiple uncertainties previously using RSS uncertainty analysis, namely, $t = \dfrac{\bar{x}_A - \bar{x}_B}{\sqrt{\dfrac{S_A^2}{n_A} + \dfrac{S_B^2}{n_B}}}$.
• The corresponding p-value is calculated as previously, based on the critical t-statistic. In this case we are considering a two-tailed hypothesis test. p is calculated in Excel using TDIST(ABS(t),df,2), where df is the number of degrees of freedom, and the "2" specifies two tails. Use the tables if Excel is not available.
• But what should we use as the value of df? There are several options, and statisticians seem to disagree on which is best.
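As a rough illustration of the t-statistic for two independent samples given above, the following Python sketch computes t from two samples of different sizes. The function name `unpaired_t` and the data are hypothetical. The sketch stops at the t-statistic on purpose: the p-value also needs df, and the options for choosing df are not listed in this part of the notes.

```python
import math
from statistics import mean, variance


def unpaired_t(sample_a, sample_b):
    """t = (mean_A - mean_B) / sqrt(S_A^2 / n_A + S_B^2 / n_B)."""
    n_a, n_b = len(sample_a), len(sample_b)
    # statistics.variance returns the sample variance S^2 (n - 1 in the denominator)
    rss = math.sqrt(variance(sample_a) / n_a + variance(sample_b) / n_b)
    return (mean(sample_a) - mean(sample_b)) / rss


# Hypothetical data with n_A = 3 and n_B = 4.
sample_a = [1.0, 2.0, 3.0]
sample_b = [2.0, 4.0, 6.0, 8.0]

t_unpaired = unpaired_t(sample_a, sample_b)
print(f"t = {t_unpaired:.4f}")
```

Because the null hypothesis here is two-tailed, only the magnitude of t matters for the p-value, which is why the Excel call uses ABS(t).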