Statistics Worksheet - Homework 6 - Confidence Intervals and Tests | BIOL 356, Assignments of Biology

Material Type: Assignment; Professor: Carrington; Class: FNDATNS IN ECOLOGY; Subject: Biology; University: University of Washington - Seattle; Term: Winter 2008;

Typology: Assignments

Pre 2010

Uploaded on 03/18/2009

Biology 356, Statistics worksheet
Homework 6: Confidence intervals and t-tests
Name _____________________ Due Tues. Feb 26, 2008
Section/TA ________________

Note: This worksheet was designed to introduce you to statistical concepts that may be unfamiliar, but are necessary for testing ecological (and other biological) hypotheses. What you will learn in this worksheet should make Field Class 5 much easier and should also improve your general capacity in study design and analysis. Do this assignment before finishing FC5!

Objectives:
1. Extend your ability to communicate variation in data. To this point, you have calculated variance, standard deviation, and standard error. In this worksheet, you add the concept of confidence intervals. These are particularly useful because they allow you to evaluate visually whether two groups are the “same” (that is, whether their confidence intervals overlap).
2. Develop the ability to test statistically if two groups are the “same” based on a t-test. Understand how to interpret p-values. Know when to apply paired vs. unpaired and 1-tailed vs. 2-tailed tests.

***Use the HW06stats_answersheet to respond to each numbered question***

Part 1: What are confidence intervals?

Confidence intervals are useful for evaluating how close your sample mean is to the true population mean. Conceptually, what is the difference between a sample mean and a true mean? _1._

Read through p. 360 in your Molles text on calculating a confidence interval. The confidence level indicates the chance that your confidence interval actually contains the true population mean; obviously, you would like this chance to be high, for instance, 95%. What two statistics are used to calculate the confidence interval? _2._

The level of significance (α) can be understood as the decimal value of the percent probability that the true population mean lies outside of the range of the confidence interval. It is essentially the complement of the confidence level: a 95% CI corresponds to α=0.05, indicating a 95% chance that the true mean is inside the CI and a 5% probability that the true mean is outside of the CI. To ensure that the confidence interval has a high probability of containing the population mean, ecologists (and biologists in general) set α to 0.05 or lower. Fill in the table describing the relationship between confidence intervals, α, and the chance that the CI contains the true population mean. _3._

As with standard error, the calculation of the probability (e.g. 95%) that the confidence interval contains the true mean is based on repeatedly sampling a population and calculating a confidence interval for each sample. For a 95% confidence interval, 95% of these (repeatedly sampled) confidence intervals will contain the true population mean.

Click on the following link to open the online statistics textbook: http://onlinestatbook.com/
From the top menu select Mode: standard, Chapter: 8 Estimation, Section: Confidence interval simulation. Click on the “show simulation” button to run the confidence interval simulation. In this pop-up window, first read the general instructions. Next, click on the “general instructions” tab and change it to “step-by-step instructions”.

What is the true population mean and standard deviation? _4._
Which confidence limits are wider? _5._
Which confidence limits are more likely to include the true population mean? _6._
How does increasing the sample size change the width of the confidence intervals? _7._
How does changing the sample size change the proportion of the time that the intervals contain the true population mean? _8._

To check your understanding (and your answers!), run through the questions in the light green box on the main page of the confidence interval simulation. The second time that you run through the questions, the program will tell you if you are correct and explain why.

Part 2: What is the role of the critical value of t for confidence intervals?

The critical value of t that is used to calculate confidence intervals (and other statistical tests such as t-tests) is generated from a curve called the t distribution. To understand how t-values are involved in the calculation of a confidence interval, click on the link for the online statistics textbook: http://onlinestatbook.com/
From the top menu select Mode: standard, Chapter: 8 Estimation, Section: t distribution. Scroll down to the bottom of this page and select the link for Online: Calculator: t distribution.

The t calculator in the pop-up window shows a graph of the t distribution and allows you to visualize how the critical value of t defines the confidence interval for a particular significance level (α). The t distribution is a symmetric curve centered on 0 (note: it is not a normal distribution; the t distribution has a slightly different shape). You enter the degrees of freedom (df) and the t-value that is appropriate for your desired level of significance (e.g. α=0.05). How is df related to your sample size (see Molles p. 387/360)? _9._

You can use Molles (p. 554) to select a t-value. For instance, what t-value is appropriate for sample size (n) = 10 and α=0.05? _10._
Enter these df and t values into the online simulation. How much of the area under the curve is shaded? _11._
Is this “shaded area” contained in one tail or both tails of the t distribution? _12._

The scenario you just entered corresponds to the example on p. 360 of Molles. The unshaded area represents 95% of the area under the curve and the shaded areas represent the other 5%.
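The link between df, α, the critical value of t, and the width of a confidence interval can also be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used by the online calculator or by Molles: it assumes df = n - 1 for a single sample, integrates the t density numerically with Simpson's rule, and searches for the critical value by bisection. The function names and the sample values are hypothetical.

```python
import math

def t_density(x, df):
    """Height of the t distribution curve at x for the given degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def unshaded_area(t, df, steps=2000):
    """Area under the curve between -t and +t (Simpson's rule, steps must be even)."""
    h = t / steps  # integrate from 0 to t, then double because the curve is symmetric
    total = t_density(0.0, df) + t_density(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    return 2 * total * h / 3

def critical_t(df, alpha=0.05):
    """Two-tailed critical t: the value whose two shaded tails together equal alpha."""
    lo, hi = 0.0, 1.0
    while unshaded_area(hi, df) < 1 - alpha:  # widen the bracket until it contains the answer
        hi *= 2
    for _ in range(50):  # bisection
        mid = (lo + hi) / 2
        if unshaded_area(mid, df) < 1 - alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def confidence_interval(sample, alpha=0.05):
    """Mean plus or minus (critical t times standard error), with df = n - 1."""
    n = len(sample)
    mean = sum(sample) / n
    variance = sum((x - mean) ** 2 for x in sample) / (n - 1)
    standard_error = math.sqrt(variance / n)
    half_width = critical_t(n - 1, alpha) * standard_error
    return mean - half_width, mean + half_width

# Hypothetical sample of n = 10 measurements (not data from the text).
sample = [52.0, 48.5, 55.1, 60.3, 47.9, 51.2, 58.4, 49.6, 53.3, 56.7]
print("critical t, df = 9, alpha = 0.05:", round(critical_t(9, 0.05), 3))
print("95% CI:", confidence_interval(sample, 0.05))
print("99% CI:", confidence_interval(sample, 0.01))
```

Running the sketch with different values of alpha and different sample sizes is one way to check the answers you read from the t table and the online calculator.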
Therefore, the 95% (α=0.05) point on the t distribution equals the critical value of t. The critical value of t is a cut-off point corresponding to a particular confidence interval (or α-level). Do larger critical values of t give you more or less confidence in containing the true population mean? _13._

Repeat the simulation on the t calculator for a 99% confidence interval for the mean body length of the loach minnow population in your text. In this case, what is df? α? critical t-value? _14._
Is the unshaded area wider or narrower than for α=0.05? _15._
Likewise, is a 99% confidence interval wider or narrower than a 95% confidence interval? _16._

Just above, you examined how confidence intervals change with α. Now let’s explore how they change with sample size (n). On the critical value of t table (p. 554), look only at the column of t values listed under α=0.05. As your sample size (and degrees of freedom) increase, does the critical value of t increase or decrease? _17._
Will this increase or decrease the width of the confidence interval that you calculate? _18._

A one-tailed t-test to differentiate between means is appropriate when you have a directional hypothesis. For example, your null hypothesis is that sample mean A is less than or equal to sample mean B. Therefore, you will reject the null hypothesis only if sample mean A is significantly larger than sample mean B. The following link summarizes one- and two-tailed tests to help you answer the following questions on this worksheet: http://onlinestatbook.com/chapter9/tailsC.html (If this link does not work, go to http://onlinestatbook.com/ and from the top menu bar select Mode: Condensed, Chapter: 9. Logic of hypothesis testing, Section: One and two-tailed tests.)

Is it statistically valid to decide to do a one-tailed t-test after you look at your data? _29._
Which type of t-test is more common in scientific research (including ecology)? _30._
When are one-tailed tests appropriate? _31._

The calculation of a one-tailed t-test is the same as a two-tailed t-test; however, as you will see in the next simulation, the critical value of t is different for a two-tailed test and a one-tailed test with the same level of significance. To visualize the difference between one- and two-tailed probabilities, click on the following link to open the t calculator from the online stat book website: http://onlinestatbook.com/calculators/t_calc.html

In this exercise you will compare the shaded areas (P-values) under the t-distribution graph for two- and one-tailed t-tests. Begin with a 2-tailed test (default). Use the example dataset for the N. acacia t-test on Molles p. 422-423. Enter the appropriate pooled df and critical t-value for α=0.05. Sketch the shaded area that you see on the t-calculator on the two-tailed t-distribution, and fill in the blank values. _32a._

Now change to a 1-tailed test: keep the pooled df and the critical t-value the same, and click on the button for a one-tailed test on the t-calculator. What is the new shaded area (P-value)? _33._

To re-calculate this one-tailed t-test with a level of significance of 0.05, return to the table of critical t-values in Molles and examine the entire row of values corresponding to your df. Enter these critical t-values into the t-calculator until you find one that produces a shaded area of 0.05, which is equal to a one-tailed t-test of α=0.05. Sketch the shaded area that you see on the t-calculator on the one-tailed t-distribution, and fill in the blank values. _32b._

In general, what are the relative values of α for a two-tailed vs. one-tailed t-test with the same critical value of t and the same degrees of freedom? _34._

Compare the shaded right-hand tail of the two-tailed distribution that you sketched in part 32 to the one-tailed distribution that you sketched. Is the shaded area in the right-hand tail of the one-tailed distribution larger or smaller than the shaded right-hand tail of the two-tailed distribution? _35._

The shaded area represents the probability of rejecting the null hypothesis. So, does the two-tailed or one-tailed t-test give you a greater probability of rejecting the null hypothesis in the right-hand direction? _36._

This illustrates why the one-tailed t-test has a greater power to reject a directional null hypothesis than a two-tailed t-test if both tests have the same level of significance.

Paired vs. unpaired

When your observations or measurements for your two samples are made at independent sites and none of the individuals in the two treatments are the same, you can compute an unpaired t-test. This is the kind of t-test that you learned to calculate in your text. However, when there is a correlation between the individual data points in your two samples, paired t-tests are necessary. In a paired experimental design the data from the two experimental treatments are not independent. Each data point for one treatment is correlated with one data point from the second treatment, such that they can be paired. [1] An example of this is when the two traits that you are comparing are measured from the same individuals or the two treatments in your experiment are applied to the same individuals. Another example is when two treatments are applied to either side of the same experimental plot. Using a paired experimental design is a way to control for variation between individuals or sites. The formula for a paired t-test is more complicated than for an unpaired t-test, so if you need to do this type of analysis you will need to look up the formula in a statistics textbook or online.

Is the t-test comparing the mean biomass of N. acacia in flooded and unflooded areas in your text (p. 422-423) one-tailed or two-tailed? _37._
Is it paired or unpaired? _38._

You are an ecologist studying predator-prey interactions in the marine intertidal zone. You are interested in the response of the whelk snail Nucella lamellosa to chemical cues from the red rock crab, Cancer productus. You want to know whether the snails change their behavior in response to the presence of the crab. You decide to quantify the snails’ behavioral response in terms of speed of travel. There are three effects that water-borne cues from the crab could have on a snail’s speed of travel: it could move faster, move slower, or not change its speed of travel. Depending on your hypothesis and experimental set-up, you will use a different type of t-test to compare the means of your data. For each of the following examples, circle the appropriate type of t-test (one-tailed vs. two-tailed, and unpaired vs. paired):

Your null hypothesis is that chemical cues from crabs have no effect on the speed of travel of the whelk snails and your alternative hypothesis is that chemical cues do have an effect on the speed of travel of snails. You measure the speeds of ten snails (one at a time) in plain seawater and then measure the speeds of ten other snails in seawater from a tank containing a crab. You calculate the mean speed for each sample (plain seawater and crab cue) and compare them with what type of t-test? _39._

Next you decide to use a different approach to test the same hypothesis. You take one snail, measure its crawling speed in seawater for two minutes, and then add water from a tank containing a crab and measure the speed of the snail in the presence of the crab cue for two minutes. You repeat this with twenty snails. What type of t-test would you use to analyze this data? _40._

You do a similar experiment with a new population of snails where the only ecologically relevant effect is if snails move faster in the presence of predator cues from crabs. So now your null hypothesis is that snails do not move faster in the presence of crabs, and your alternative hypothesis is that they do move faster. You use the same experimental design as above, where you measure the crawling speed of one snail before and after the addition of a predator cue from the crab. What t-test would you use to compare the sample means of your data? _41._

Now you use a different experimental design to test the hypothesis in question three, that snails move faster in the presence of predator cues from crabs. This time you compare the sample means from two treatments. In the control treatments you measure the speeds of ten snails in seawater (one at a time) and then measure the speed of ten other snails in seawater containing chemical cues from a crab. What t-test would you use to compare the sample means in this experiment? _42._

[1] Zar, J. H. 1999. Biostatistical Analysis, 4th edition. Upper Saddle River, NJ: Prentice Hall, pp. 161-162.

Part 6: Case study: Using a t-test to compare the effect of disturbance on the means of two populations at a disturbed and undisturbed site.

You are a wetlands ecologist and are interested in the effects of disturbance on the growth of the invasive reed canary grass, Phalaris arundinacea, which has been displacing native vegetation in local wetland areas. [2] You decide to test whether flooding has an effect on the above-ground biomass of Phalaris. You set up an experiment similar to the one that you will do in Field Class 5. You sample the above-ground biomass of Phalaris in ten randomly placed 0.25 m² quadrats in a wetland area that flooded last spring, and a wetland area that remained unflooded.
You take your samples back to the lab, dry and weigh them, and record the total dry biomass of Phalaris in each of your quadrats.

What is your null hypothesis for this experiment? _43._
What is your alternative hypothesis? _44._
What type of t-test is appropriate for this dataset? _45._

Below is your biomass data from each treatment:

Biomass (g) of Phalaris arundinacea
Flooded    Unflooded
61.6       39.3
64.6       26.3
55.6       32.4
45.2       21.5
50.6       60.3
70.5       24.3
67.7       36.4
57.5       47.4
66.5       33.2
42.3       57.2

Open the Excel spreadsheet for the t-test that you downloaded from the course website. Replace the N. acacia data with your new dataset. What are the pooled df, calculated t-value, and critical t-value for α=0.05? _46._ (Remember, you will have to determine pooled df based on sample size, then look up the appropriate critical t-value in Molles p. 554.)

Note: The sum of squares calculation can be disrupted by changes in sample size when you enter your new data. Double check that each data point in your sample has a corresponding calculation for the "difference between the mean and data point" in the same row. Also, make sure that the

[2] Kercher, S. M. and Zedler, J. B. 2004. Multiple disturbances accelerate invasion of reed canary grass (Phalaris arundinacea L.) in a mesocosm study. Oecologia 138: 455-464.
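As a cross-check on the spreadsheet, the unpaired and paired calculations discussed in this worksheet can be sketched in Python. This is a hedged illustration, not the course spreadsheet: it assumes the standard pooled-variance formula for the unpaired test (pooled df = n1 + n2 - 2), the standard difference-based formula for the paired test (df = n - 1), and that the biomass values above alternate Flooded, Unflooded in the order listed. The function names and the snail speeds are hypothetical.

```python
import math

def mean(values):
    return sum(values) / len(values)

def sum_of_squares(values):
    """Sum of squared differences between each data point and the sample mean."""
    m = mean(values)
    return sum((x - m) ** 2 for x in values)

def unpaired_t(sample_a, sample_b):
    """Pooled-variance t statistic and pooled df for two independent samples."""
    pooled_df = len(sample_a) + len(sample_b) - 2
    pooled_variance = (sum_of_squares(sample_a) + sum_of_squares(sample_b)) / pooled_df
    se_difference = math.sqrt(pooled_variance * (1 / len(sample_a) + 1 / len(sample_b)))
    return (mean(sample_a) - mean(sample_b)) / se_difference, pooled_df

def paired_t(sample_a, sample_b):
    """t statistic and df for paired data, computed from the pairwise differences."""
    if len(sample_a) != len(sample_b):
        raise ValueError("paired samples must have the same length")
    differences = [a - b for a, b in zip(sample_a, sample_b)]
    n = len(differences)
    sd = math.sqrt(sum_of_squares(differences) / (n - 1))
    return mean(differences) / (sd / math.sqrt(n)), n - 1

# Biomass (g) of Phalaris arundinacea, assuming the listed values alternate
# Flooded, Unflooded. The quadrats are independent, so the unpaired test applies.
flooded = [61.6, 64.6, 55.6, 45.2, 50.6, 70.5, 67.7, 57.5, 66.5, 42.3]
unflooded = [39.3, 26.3, 32.4, 21.5, 60.3, 24.3, 36.4, 47.4, 33.2, 57.2]
t_value, pooled_df = unpaired_t(flooded, unflooded)
print("unpaired: t =", round(t_value, 3), "pooled df =", pooled_df)

# Hypothetical crawling speeds for the same four snails, with and without crab cue.
with_cue = [2.0, 2.5, 4.5, 5.0]
plain_seawater = [1.0, 2.0, 3.0, 4.0]
t_paired, df_paired = paired_t(with_cue, plain_seawater)
print("paired: t =", round(t_paired, 3), "df =", df_paired)
```

The calculated t-value is then compared with the critical t-value for the same df from the table; the choice between one-tailed and two-tailed affects only which critical value you look up, not the t statistic itself.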