Chapter 3
Displaying and Describing Categorical Data
Slide 3 - 1

In “Aula Virtual & Fotocopiadora” (telephones off; computers off)

Today’s Questions
• Once we have collected a large number of measurements, how can we summarize or describe those measurements most effectively by using visual techniques?
• Answer: Always draw a graph to begin with!
Slide 3 - 2

Frequency Polygon
• A visual representation of the information contained in a frequency table
• Align all possible values along the bottom of the graph (the x-axis)
• On the vertical line (the y-axis), place a point denoting the frequency of scores for each value
• Connect the points with lines
• (Typically add an extra value above and below the actual range of values)
Slide 3 - 5
[Figure: frequency polygon; x-axis: Stress Rating (0 to 10), y-axis: Frequency (0 to 30)]

Histograms
• Another way of visually representing the information contained in a frequency table
• Histograms are similar to bar charts; bars are used instead of connected points
• The bars typically cover “intervals” of values. The first bar here covers scores > 0 and < 1.
Slide 3 - 6
[Figure: histogram; x-axis: Stress Rating (0 to 10), y-axis: Frequency (0 to 30)]

Pie Charts and Nominal Data
• Pie charts are commonly used to represent the frequency of scores for nominal data
• Here, the chart shows the frequency of referents in a letter written by a subject in a psychological study.
• 70% of the pronouns are in reference to the writer; 10% are in reference to the person being written to.
Slide 3 - 7
[Figure: pie chart with slices Other 20%, Self 70%, Misc 10%]

Frequency Tables: Making Piles (cont.)
• A relative frequency table is similar, but gives the percentages (instead of counts) for each category:
[Table: Titanic death as a function of class]
Slide 3 - 10

Frequency Tables: Making Piles (cont.)
• Both types of tables show how cases are distributed across the categories.
• They describe the distribution of a categorical variable because they name the possible categories and tell how frequently each occurs.
Slide 3 - 11

Important point to take into account:
What’s Wrong With This Picture?: Problems
• You might think that a good way to show the Titanic data is with this display:
Slide 3 - 12

Bar Charts (cont.)
• A relative frequency bar chart displays the relative proportion of counts for each category.
• A relative frequency bar chart also stays true to the area principle.
• Replacing counts with percentages in the ship data:
Slide 3 - 15

Pie Charts
• When you are interested in parts of the whole, a pie chart might be your display of choice.
• Pie charts show the whole group of cases as a circle.
• They slice the circle into pieces whose size is proportional to the fraction of the whole in each category.
Slide 3 - 16

Contingency Tables
• A contingency table allows us to look at two categorical variables together.
• It shows how individuals are distributed along each variable, contingent on the value of the other variable.
– Example: we can examine the class of ticket and whether a person survived the Titanic:
Slide 3 - 17

Conditional Distributions
• A conditional distribution shows the distribution of one variable for just the individuals who satisfy some condition on another variable.
– The following is the conditional distribution of ticket Class, conditional on having survived:
Slide 3 - 20

Conditional Distributions (cont.)
– The following is the conditional distribution of ticket Class, conditional on having perished:
Slide 3 - 21

Conditional Distributions (cont.)
• The conditional distributions tell us that there is a difference in class for those who survived and those who perished.
• This is better shown with pie charts of the two distributions:
Slide 3 - 22
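
The tables described in these slides (frequency table, relative frequency table, contingency table, conditional distribution) can be sketched in a few lines of Python. This is only an illustrative sketch: the records below are invented for the example and are not the real Titanic counts, and the function names are hypothetical helpers, not part of any course material.

```python
from collections import Counter

# Hypothetical (ticket class, outcome) records, invented for illustration.
# These are NOT the real Titanic counts.
records = (
    [("First", "Survived")] * 6 + [("First", "Perished")] * 4
    + [("Second", "Survived")] * 3 + [("Second", "Perished")] * 7
    + [("Third", "Survived")] * 1 + [("Third", "Perished")] * 9
)


def frequency_table(values):
    """Count how many cases fall in each category."""
    return Counter(values)


def relative_frequency_table(values):
    """Give the percentage (instead of the count) for each category."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: 100 * n / total for category, n in counts.items()}


def contingency_table(pairs):
    """Count the cases for every combination of the two variables."""
    return Counter(pairs)


def conditional_distribution(pairs, outcome):
    """Distribution of ticket class for just the cases with this outcome."""
    classes = [cls for cls, out in pairs if out == outcome]
    return relative_frequency_table(classes)


class_counts = frequency_table(cls for cls, _ in records)
class_percent = relative_frequency_table([cls for cls, _ in records])
table = contingency_table(records)
survivors = conditional_distribution(records, "Survived")
perished = conditional_distribution(records, "Perished")

print("Counts by class:", dict(class_counts))
print("Class | Survived:", survivors)
print("Class | Perished:", perished)
```

In this made-up data every class has the same number of passengers, yet the two conditional distributions differ, which is the kind of difference the last slides point to.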