Data Analysis Lecture: Evaluating and Comparing Data Sets - Prof. Benjamin Kerr

Data Analysis E3: Lab Lecture

How do we evaluate data?
• Suppose you hear that a high-protein diet during puberty leads to increased height as an adult.
  - The mean height in a high-protein treatment was 5’11” and the mean height in a control treatment was 5’5”
  - What would you feed your kids? How do you gauge this?
• The New York Times has just published an exposé about sexism in graduate admissions in a famous department of mathematics.
  - While the number of male and female applicants was equal, the number of males admitted was greater.
  - Should a formal inquiry take place? How do you evaluate the data?
• Imagine you are a child, and your father tells you he will let you stay up late if a coin he flips comes up heads.
  - Suppose the coin comes up heads 25% of the time
  - Is your dad using a fair coin? How would you evaluate this?
[Figures: a table of number of flips versus number of “heads” (4 and 1, 12 and 3, 100 and 25); frequency plots labeled Control and High-protein; plots labeled ♀a, ♂a and ♀e, ♂e]

Data Analysis Lecture Outline
• Processing Data (using Excel)
• Visualizing Data (using Excel)
• Analyzing Data (using Excel)
  - Difference in means (t-test)
  - Difference in distributions (χ² test)

Handling Data
• After a laboratory experiment or time out in the field, you will have several data points.
• How should one process this (potentially voluminous) data?
  - A first step to dealing with your data is to organize it (spreadsheet programs, like Excel, can help)
  - A second step is to process your data
      1) Investigate portions of the data set
      2) Look at relevant descriptive statistics
      3) Transform data points in a well-defined way
      4) Combine data points in a well-defined way
  - A third step is to visualize your data
  - A fourth step is to subject your data to an appropriate statistical test
Massaging? Dressing-up?
[Figure: “Fitness Across Two Environments”, a chart of w(F,N) for the Tube and Dish competition arenas; slide callouts: “3 F colonies”, “4 N colonies”, “Focusing”]

Worksheet
• Go to our class website: http://faculty.washington.edu/kerrb/biol481/
• On the “Class Data” link, download the file labeled “Excel Practice Sheet”
• DEMO: Functions in Excel
• GOALS
  - Get comfortable using the fx line in Excel
  - Understand the “translational” properties of Excel
  - Learn to fix rows and columns (or both) with $
• Fill in the empty boxes under the “Simple Functions” tab in the “Excel Practice Sheet” file (you can check your work with your calculator).

Picture = Words × 1000
[Figures: “Grade Distribution” (pie chart of grades A to E); “Understanding the Black Box” (Accuracy of Prediction versus Number of Trials); “Colony Distribution from the Luria-Delbrück Experiment” (Frequency versus Colonies, Observed and Expected)]
• We are visual animals and can often see patterns when data is presented visually
• Examples:
  - A pie chart illustrates the distribution of values of a single variable
  - An X-Y plot illustrates the form of the relationship between two variables
  - Paired histograms illustrate the relationship between the distributions of two variables.
• The most appropriate picture will often depend on the data:
  - Categorical or quantitative?
  - Frequencies, counts or measurements?
  - Relationship between data points?

Worksheet
• DEMO: Graphing in Excel
• GOALS
  - Get comfortable using the “Chart Wizard”
  - Pick the right graphical representation for your data
  - Label your axes and add a title
• Graph both y = e^(rx) and y = e^(sx) on the same plot (you can check your work with your calculator if it graphs).
• Label your x axis “x” and your y axis “y” and title your graph “Exponential Growth”
• What happens when you change the value of r from 0.1 to −0.1?

Student’s t-test
William Sealy Gosset
• DEMO: Performing a t-test
  - Computing a p-value from a t-test
  - Distinguishing the different types of t-tests:
      Paired versus Unpaired data
      Equal versus Unequal variance
      One-tailed versus Two-tailed tests
• Gosset published a paper under the pseudonym “Student” that dealt with distinguishing differences between the means of small data sets.
• The t-test uses the statistics from two groups of data (means and standard deviations) to generate a third statistic (the t statistic).
• If the two groups of data come from populations with the same mean, the t statistic has a characteristic distribution itself (note the shape will depend on the sample sizes).
• If the computed t is extreme, a value that extreme would rarely arise if the two populations had equal means (the p-value from the test quantifies how rarely). By convention, the means are called significantly different if p < 0.05.
• Assumptions
  - Each datum is independent
  - Data is normally distributed

Worksheet
[Figure: experimental protocol for the TUBE and DISH competitions (volumes of 10 mL and 5 mL), sampled at t = 0 and t = 24, with serial dilutions (1/2 and 1/10 steps, from 10^-1 to 10^-6) and 50 µL or 100 µL platings of F and N, spread over Tuesday, Wednesday, and Thursday]
• Click on the “Tradeoff Data” tab.
From the colony counts, write functions that will give the cell counts Fb, Nb, Fe, Ne. Then write functions giving w(F,N).
• After finding average fitnesses (use the function “AVERAGE”), graph the average fitnesses from the TUBE and DISH environments. Label your graph.
• Perform an unpaired and a paired t-test on your data. Which test should you use for this data? What can you conclude?
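
The difference between the unpaired and paired tests in the last bullet can be sketched in Python. This is only an illustration under stated assumptions, not the worksheet's Excel procedure: the function names `unpaired_t` and `paired_t` and the fitness values are hypothetical, the unpaired version assumes equal variances, and only the t statistic and its degrees of freedom are computed (the p-value would still come from a t table or from Excel).

```python
from math import sqrt
from statistics import mean, stdev, variance


def unpaired_t(a, b):
    """t statistic for two independent samples, assuming equal variances.

    Returns (t, degrees_of_freedom).
    """
    na, nb = len(a), len(b)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, na + nb - 2


def paired_t(a, b):
    """t statistic for paired samples (a one-sample test on the differences).

    Returns (t, degrees_of_freedom).
    """
    if len(a) != len(b):
        raise ValueError("paired samples must have the same length")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    # Standard error of the mean difference uses the s.d. of the differences.
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Hypothetical fitness values, NOT the class data. Position i in each list is
# assumed to come from the same replicate measured in both environments.
tube = [1.10, 0.95, 1.20, 1.05]
dish = [0.90, 0.80, 1.00, 0.85]

t_unpaired, df_unpaired = unpaired_t(tube, dish)
t_paired, df_paired = paired_t(tube, dish)
print(f"unpaired: t = {t_unpaired:.2f}, df = {df_unpaired}")
print(f"paired:   t = {t_paired:.2f}, df = {df_paired}")
```

With these made-up numbers the paired statistic is far larger than the unpaired one (about 15 versus about 2.8), because pairing removes the replicate-to-replicate variation that both environments share. Pairing is only legitimate when each value in one group genuinely corresponds to one value in the other.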