Data Analysis Lecture: Evaluating and Comparing Data Sets - Prof. Benjamin Kerr, Study notes of Ecology and Environment

A lecture outline for a data analysis class on evaluating and comparing data sets. Topics include processing and organizing data in Excel, visualizing data with graphs and charts, analyzing data with t-tests and chi-square tests, and understanding the assumptions and limitations of these statistical tests. The lecture also includes exercises and demos that give students practical experience.

Typology: Study notes

Pre 2010

Uploaded on 03/18/2009

koofers-user-tza

Data Analysis E3: Lab Lecture

How do we evaluate data?
• Suppose you hear that a high-protein diet during puberty leads to increased height as an adult.
  - The mean height in a high-protein treatment was 5’11” and the mean height in a control treatment was 5’5”
  - What would you feed your kids? How do you gauge this?
• The New York Times has just done an exposé about sexism in graduate admissions in a famous department of mathematics.
  - While the number of male and female applicants was equal, the number of males admitted was greater.
  - Should a formal inquiry take place? How do you evaluate the data?
• When you were a child, your father told you he would let you stay up late if a coin he flipped came up heads.
  - Suppose the coin came up heads 25% of the time
  - Was your dad using a fair coin? How would you evaluate this?

[Figures: frequency plots comparing Control and High-protein treatments; panels labeled ♀a, ♂a and ♀e, ♂e; a table of coin flips with Number of flips = 4, 12, 100 and Number of “heads” = 1, 3, 25]

Data Analysis Lecture Outline
• Processing Data (using Excel)
• Visualizing Data (using Excel)
• Analyzing Data (using Excel)
  - Difference in means (t-test)
  - Difference in distributions (χ² test)

Handling Data
• After a laboratory experiment or time out in the field, you will have several data points.
• How should one process this (potentially voluminous) data?
  - A first step in dealing with your data is to organize it (spreadsheet programs, like Excel, can help)
  - A second step is to process your data
    1) Investigate portions of the data set
    2) Look at relevant descriptive statistics
    3) Transform data points in a well-defined way
    4) Combine data points in a well-defined way
  - A third step is to visualize your data
  - A fourth step is to subject your data to an appropriate statistical test

[Slide callouts: “Massaging?”, “Dressing-up?”, “Focusing”]
[Figure: “Fitness Across Two Environments”: w(F,N) on the y axis (0 to 1.5) against Competition Arena (Tube, Dish), with a * marker; illustrations labeled “3 F colonies” and “4 N colonies”]

Worksheet
• Go to our class website: http://faculty.washington.edu/kerrb/biol481/
• On the “Class Data” link, download the file labeled “Excel Practice Sheet”
• DEMO: Functions in Excel
• GOALS
  - Get comfortable using the fx line in Excel
  - Understand the “translational” properties of Excel
  - Learn to fix rows and columns (or both) with $
• Fill in the empty boxes under the “Simple Functions” tab in the “Excel Practice Sheet” file (you can check your work with your calculator).

Picture = Words × 1000
[Figures: “Grade Distribution” (categories A to E); “Understanding the Black Box” (Accuracy of Prediction against Number of Trials); “Colony Distribution from the Luria-Delbrück Experiment” (Frequency against Colonies, Observed and Expected)]
• We are visual animals and can often see patterns when data is presented visually
• Examples:
  - A pie chart illustrates the distribution of values of a single variable
  - An X-Y plot illustrates the form of the relationship between two variables
  - Paired histograms illustrate the relationship between the distributions of two variables.
• The most appropriate picture will often depend on the data:
  - Categorical or quantitative?
  - Frequencies, counts or measurements?
  - Relationship between data points?

Worksheet
• DEMO: Graphing in Excel
• GOALS
  - Get comfortable using the “Chart Wizard”
  - Pick the right graphical representation for your data
  - Label your axes and add a title
• Graph both y = e^{rx} and y = e^{sx} on the same plot (you can check your work with your calculator if it graphs).
• Label your x axis “x” and your y axis “y” and title your graph “Exponential Growth”
• What happens when you change the value of r from 0.1 to −0.1?

Student’s t-test
William Sealy Gosset
• DEMO: Performing a t-test
  - Computing a p-value from a t-test
  - Distinguishing the different types of t-tests:
    · Paired versus Unpaired data
    · Equal versus Unequal variance
    · One-tailed versus Two-tailed tests
• Gosset published a paper under the pseudonym “Student” that dealt with distinguishing the differences between means of small data sets.
• The t-test uses the statistics from two groups of data (means and s.d.) to generate a third statistic (the t statistic).
• If the two groups of data come from populations with the same mean, the t statistic has a characteristic distribution itself (note the shape will depend on the sample sizes).
• If the computed t is extreme, such a value would rarely arise from two populations with equal means. The p-value from the test quantifies this: it is the probability of a t at least this extreme when the means are equal. By convention, the means are called significantly different if p < 0.05.
• Assumptions
  - Each datum is independent
  - Data is normally distributed

[Figure: competition protocol diagram (TUESDAY, WEDNESDAY, THURSDAY) for TUBE I and DISH I, sampled at t = 0 and t = 24; labels show culture volumes (10 mL, 5 mL), serial dilutions from 10^-1 to 10^-6 in 1/10 steps (one series begins with a 1/2 step), and plating volumes (50 µL, 100 µL) for F and N]

Worksheet
• Click on the “Tradeoff Data” tab.
  From the colony counts, write functions that will give the cell counts Fb, Nb, Fe, Ne. Then write functions giving w(F,N).
• After finding average fitnesses (use the function “AVERAGE”), graph the average fitnesses from the TUBE and DISH environments. Label your graph.
• Perform an unpaired and a paired t-test on your data. Which test should you use for this data? What can you conclude?