Two-Sample Hypothesis Testing Lab: T-Tests and Variance Analysis, Lab Reports of Statistics

Instructions for performing two-sample hypothesis tests using R and its 't.test' function. The lab covers independent-samples t-tests for the difference in population means, one-sided and two-sided tests, and the test for homogeneity of variances. Students are expected to read the document, complete the lab exercises, and write out null and alternative hypotheses, significance levels, and conclusions.

Typology: Lab Reports

Pre 2010

Uploaded on 07/30/2009

koofers-user-pq1

Lab 3
STAT 3000
Two Sample Hypothesis Testing

• Importing Data (repeat):

First download the kudzu dataset and put it in a familiar directory (create one if you need to, or put it on the desktop). Start R (if it's not already started) and click on the 'File' drop-down menu at the top left. Choose 'Change Directory' and then browse to find the directory (or folder) where you stored the data file. Click on that directory and then click 'Ok'. Now R knows to look for the data file you will soon be referring to in the appropriate folder. In order to get that data into R, use the following commands:

> kudzu.df=read.csv("kudzu.csv")
> kudzu.df

Notice that the kudzu data are in two columns (with fairly long headings) and the first column contains only 20 numbers and then a sequence of "NA". That is because for the first sample, there are only 20 observations, whereas the second column or sample has a total of 25 observations. To make the dataset more manageable, rename the column headings to something shorter:

> names(kudzu.df)=c("without","with")
> kudzu.df

Notice that the names have now been changed. See page 384 of the textbook for a description of this dataset.

• Two Sample t-Tests for Independent Samples:

The built-in function called 't.test' can also perform calculations for hypothesis tests for the difference in population means. By default, the 't.test' function provides information about a two-sided hypothesis test for the situation where the null difference in means is zero. In order to change it you just have to specify a few more options in the command.
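The defaults described above can be checked directly on the object that 't.test' returns. The following is a minimal sketch only: 'demo.df' is a hypothetical stand-in for 'kudzu.df', and its values are simulated rather than taken from the textbook dataset. It mimics the layout described earlier (20 observations padded with "NA" in one column, 25 in the other) and shows that 't.test' drops the missing values on its own.

```r
# Simulated stand-in for kudzu.csv; these numbers are made up, not the textbook data.
set.seed(1)
demo.df <- data.frame(
  without = c(rnorm(20, mean = 40, sd = 3), rep(NA, 5)),  # 20 values padded with NA
  with    = rnorm(25, mean = 43, sd = 3)                  # 25 values
)

# Number of non-missing observations in each column
n.obs <- colSums(!is.na(demo.df))
n.obs

# Default call: only the two samples are supplied; NA values are dropped by t.test
default.fit <- t.test(x = demo.df$with, y = demo.df$without)
default.fit$alternative  # "two.sided"
default.fit$null.value   # null difference in means: 0
default.fit$method       # Welch test, because var.equal defaults to FALSE
```

Printing 'default.fit' itself gives the usual summary (t statistic, degrees of freedom, p-value, and confidence interval).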
For example, assuming the population variances are unequal, suppose we want a one-sided hypothesis test to determine whether the difference between the mean pulp yield that has been treated and the mean pulp yield that has not been treated is less than 5 (this implies the null hypothesis is that the difference in means is greater than or equal to 5):

> t.test(x=kudzu.df$with,y=kudzu.df$without, alt="less",mu=5,var.equal=FALSE)

Notice the ordering of the samples in the R command. This is important: if the order were switched, we would have to use 'mu=-5' and 'alt="greater"'. If we are using a 0.05 level of significance, do we have sufficient evidence to reject the null hypothesis given this information?

What if you wanted to test whether the population means are significantly different? Use the following command:

> t.test(x=kudzu.df$with,y=kudzu.df$without, var.equal=FALSE)

Notice that here the ordering of the samples could be switched with no effect on the test results, because the test is two-sided with a null difference equal to zero. Now would you reject the null hypothesis? Does the two-sided confidence interval for the difference in population means support this decision? Does assuming the population variances are equal ('var.equal=TRUE') change the p-value?

• Paired Test for Two Population Means:

When you have paired data you only have to change one option in the 't.test' function in R. For example, download the dataset called 'golfball.csv' and go through the usual steps to load it into R (change directory to where you put it, etc.):

> golf.df=read.csv("golfball.csv")
> names(golf.df)=c("golfer","old","new")
> golf.df
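The text above does not name the option that changes; in R's 't.test' the argument that requests a paired test is 'paired=TRUE'. The following is a minimal sketch under stated assumptions: 'golf.demo' is a hypothetical stand-in for 'golf.df', its column names follow the 'names()' call above, and its measurements are simulated rather than read from 'golfball.csv'. It also shows that the paired test matches a one-sample t-test on the within-golfer differences.

```r
# Simulated stand-in for golfball.csv; the measurements are made up.
set.seed(2)
golf.demo <- data.frame(golfer = 1:10, old = rnorm(10, mean = 250, sd = 10))
golf.demo$new <- golf.demo$old + rnorm(10, mean = 3, sd = 4)

# Paired test: each golfer's 'new' value is compared with the same golfer's 'old' value
paired.fit <- t.test(x = golf.demo$new, y = golf.demo$old, paired = TRUE)
paired.fit

# The same test written as a one-sample t-test on the within-golfer differences
diff.fit <- t.test(golf.demo$new - golf.demo$old, mu = 0)
all.equal(paired.fit$p.value, diff.fit$p.value)  # TRUE
```

As with the independent-samples test, 'alt' and 'mu' can be added to the paired call for a one-sided test or a nonzero null difference.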