Docsity

Statistical Analysis of Discrete Variables in Archaeology: Chi-Square Hypothesis Testing, Lab Reports of World History

Introduces the concept of discrete variables in archaeology and the importance of using appropriate statistical procedures, such as the chi-square (X²) statistic, to test hypotheses about them. Includes an example of hypothesis testing using X² to compare handaxe and chopper frequencies in an archaeological assemblage, and discusses the role of statistics in archaeology and the significance of the X² statistic in evaluating observed frequencies against theoretical frequencies.

Typology: Lab Reports

Pre 2010

Uploaded on 08/19/2009

koofers-user-76i-1

Testing Research and Statistical Hypotheses

Introduction

In the last lab we analyzed metric artifact attributes such as thickness or width/thickness ratio. Those were continuous variables, which as you learned can be described in terms of the central tendency, such as mean length, and the variability about that mean, as expressed by the variance and standard deviation. But many archaeological data come to us as discrete variables, which need to be treated very differently. For example, among an assemblage of artifacts including handaxes, cleavers and scrapers there is no “mean artifact type.” In faunal assemblages there is no “mean animal species.” For these kinds of data we need to treat the individual categories, whether they are artifact types, raw material types or categories of ceramic decorations. These are discrete variables, and to test hypotheses about them we need to use appropriate statistical procedures. We will give you the opportunity to do this by applying discrete statistical analysis to the testing of hypotheses about differences in lithic artifact samples.

Background

Statistical analysis is very common in archaeology and other sciences, because it is the way we qualify our conclusions, rather than just claim that they are true. It is true that many scientific conclusions can be reached without statistics; indeed, many major contributions from science are observations. We therefore need to understand when our work needs statistical support and when it does not. For example, we do not need statistics to test the hypothesis that one group used pottery and another did not; we document this by the presence or absence of ceramics in their assemblages.
But a hypothesis concerning the cultural relationships between prehistoric groups based on the similarities and differences in their pottery styles may very well require statistical analysis, since this involves degrees of similarity, and a decision about what is “significantly” similar or different.

Statistics are most commonly applied when we are testing hypotheses about archaeological phenomena, so it is the archaeological hypotheses that drive our work. We perform statistical analyses in order to verify that our observations on archaeological data are valid and not just impressions or erroneous conclusions based on small samples. In most cases there are two kinds of hypotheses being tested. One, which we’ll call the research hypothesis, is based on the actual archaeological data and our understanding of how those data are related. So if we pose the hypothesis that the cultural relationships between two groups are accurately reflected by their pottery styles, we need to be confident that pottery can indeed be used to measure cultural similarities. We can support this by analogy to modern potters, and from the archaeological record itself. Once we have defined our research hypothesis, we can decide what kind of statistical hypothesis is appropriate. In the following exercises we will illustrate the sequence of testing a research hypothesis with a statistical hypothesis that is appropriate for the situation. But in the end, it is the archaeological issue that we are concerned with.

There are many statistical methods for analyzing discrete data. We will use the chi-square (X²) analysis. The X² statistic is commonly used in testing archaeological hypotheses because we deal with so many categorical variables, as mentioned above. These can involve single classifications or analysis of independence in two-way classifications (two categories compared at the same time).
In both, the X² statistic allows us to evaluate the observed frequency of certain categories against a theoretical frequency of those categories. In some cases we are simply deciding if variations in our data are random, and therefore of little meaning. For example, suppose that at my site I found 46 handaxes and 54 choppers, and I claim that the higher number of choppers is a significant difference, indicative of the cultural tradition that made those artifacts. You are not convinced, and ask me to test whether this difference could be due to chance. Let’s use this as an example of hypothesis testing in archaeology, supported by use of the X² statistic.

Testing for Randomness of Observations – One Variable

In this example, we are going to test the following null hypothesis:

Ho: Any differences between handaxe and chopper frequencies in my assemblage are due to chance.

Statistically, we need to compare the observed frequencies of these artifact types against the theoretical frequencies that would be expected if their frequencies were indeed random. If two categories occur with random frequencies, as in the tossing of a coin, then their theoretical frequencies are 50-50. Once we have determined the theoretical frequencies, we can proceed with testing the hypothesis using the X² statistic, which is calculated with the following equation:

$$X^2 = \sum_i \frac{(O_i - T_i)^2}{T_i}$$

where

∑    "the sum of"
O    observed frequency (count)
T    theoretical frequency (count)
i    label for categories (1 for handaxes, 2 for choppers)

Let’s consider this equation briefly, to make sure you understand what it’s about. First, note that the X² statistic is going to be calculated by completing the operations shown on the right side of the equation. These can be summarized as follows:

1. For the first category (handaxes) subtract the theoretical frequency from that which we have observed.
2. Then square this difference to make all differences a positive value. We need to do this because some differences are negative and some are positive, but we’re only interested in how large the difference is.
3. Next, divide the squared difference by the theoretical frequency. This expresses the squared difference relative to the size of the theoretical frequency. If this ratio is a large number it will [...]

[...] hypothesis is rejected. Thus, the likelihood is that the frequencies are not due to chance and the difference between the observed and theoretical frequencies of handaxes and choppers is statistically significant. If the X² value we calculate is less than the selected X² cut-off value, then the null hypothesis cannot be rejected and the likelihood is that the frequencies are due to chance and the difference between the observed and theoretical frequencies of handaxes and choppers is not statistically significant.

So now you know that the frequencies of handaxes and choppers in my sample could easily be due to chance, and the difference in their frequencies is not statistically significant. Therefore we reject my research hypothesis (that the higher number of choppers is meaningful), and your suspicions are now statistically affirmed.

We can also use X² analysis to compare the frequencies of categories between two archaeological samples, not to see if they are random, but to see if they are different or similar. This requires a slightly different approach, and one that you will use in your lab exercise to compare the artifact samples you will analyze.

Testing for Randomness of Observations – Multiple Variables

In this next example, we are going to examine differences across two variables. The mechanics of conducting a X² analysis are similar to the previous example. The main difference is in the calculation of the theoretical values and the degrees of freedom. In this example, we will examine whether there are differences in the raw materials used to make handaxes and choppers.
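
Since the mechanics carry over from the previous example, it may help to see the one-variable calculation written out first. The following is a minimal Python sketch, not part of the original handout. The function name `chi_square` and the variable names are hypothetical. The 50-50 theoretical split follows the coin-toss reasoning above; the degrees of freedom (number of categories minus one) and the 3.84 cut-off (P95 with 1 degree of freedom, taken from the percentile table) are assumptions about how the one-variable test is completed.

```python
def chi_square(observed, theoretical):
    """Return the sum over categories of (O - T)^2 / T."""
    return sum((o - t) ** 2 / t for o, t in zip(observed, theoretical))


# Observed counts from the example: 46 handaxes, 54 choppers.
observed = [46, 54]

# Theoretical counts if the two types were equally likely (50-50).
total = sum(observed)
theoretical = [total / len(observed)] * len(observed)

x2 = chi_square(observed, theoretical)

# Assumption: for a single classification, df = number of categories - 1.
df = len(observed) - 1

# Assumption: cut-off for P95 with df = 1, read from the percentile table.
CUTOFF_P95_DF1 = 3.84

reject_null = x2 > CUTOFF_P95_DF1
print(f"X2 = {x2:.2f}, df = {df}, reject Ho: {reject_null}")
```

Under these assumptions the statistic comes to 0.64, well below 3.84, which agrees with the conclusion that the handaxe and chopper frequencies could easily be due to chance.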
Here are the frequencies of raw materials for handaxes and choppers from a site in East Africa:

            Basalt   Rhyolite   Total
Handaxes      30        24        54
Choppers      18        37        55
Total         48        61       109

Overall, it appears that rhyolite was used somewhat more often than basalt for making these tools, but were these raw materials used in the same way for the two tool categories, or was there some preference by their makers? We can find out by testing the following null hypothesis:

Ho: Any differences between raw materials for handaxes and choppers are due to chance.

To calculate the theoretical values for our X² analysis, we use the following formula:

$$\frac{R_i \times C_j}{T}$$

where $R_i$ is the row total (i = 1 for handaxes and 2 for choppers), $C_j$ is the column total (j = 1 for basalt and 2 for rhyolite), and $T$ here is the total of all artifacts (109). Here are the theoretical frequencies:

Basalt handaxes      (54 x 48)/109 = 23.8
Basalt choppers      (55 x 48)/109 = 24.2
Rhyolite handaxes    (54 x 61)/109 = 30.2
Rhyolite choppers    (55 x 61)/109 = 30.8

Calculating the X² is the same as in the previous example:

$$X^2 = \sum_i \frac{(O_i - T_i)^2}{T_i}$$

            Basalt                     Rhyolite                   Total
Handaxes    (30-23.8)²/23.8 = 1.62     (24-30.2)²/30.2 = 1.27     2.89
Choppers    (18-24.2)²/24.2 = 1.59     (37-30.8)²/30.8 = 1.25     2.84
Total                                                             X² = 5.73

The degrees of freedom for multiple variables are calculated slightly differently than in the previous example. Because we now have more than one column, we use the following formula:

df = (r-1) x (c-1)

where r is the number of rows and c is the number of columns. In our example, there are two columns (basalt and rhyolite) and two rows (handaxes and choppers), so the degrees of freedom are calculated as follows: (2-1) x (2-1) = 1. If we use the probability level of 95% with 1 degree of freedom, the selected probability cut-off (P95) in the X² Table 2 is 3.84.
The X² value for our analysis (5.73) is greater than the cut-off value (3.84), so the null hypothesis that raw material selection for handaxes and choppers is due to chance is rejected, provided we are willing to make an error in this decision no more than 5% of the time. On the other hand, if we wanted to risk an error in testing our hypothesis no more than 1% of the time (P99), our X² is less than the cut-off of 6.63, and we would conclude that the differences in raw material selection are not significant. Thus, we get to decide the level of significance.

Percentiles of the X² Distribution

 df       P0.5       P1     P2.5       P5      P10      P90      P95    P97.5      P99    P99.5
  1   0.000039  0.00016  0.00098   0.0039   0.0158     2.71     3.84     5.02     6.63     7.88
  2     0.0100   0.0201   0.0506   0.1026   0.2107     4.61     5.99     7.38     9.21    10.60
  3     0.0717    0.115    0.216    0.352    0.584     6.25     7.81     9.35    11.34    12.84
  4      0.207    0.297    0.484    0.711    1.064     7.78     9.49    11.14    13.28    14.86
  5      0.412    0.554    0.831     1.15     1.61     9.24    11.07    12.83    15.09    16.75
  6      0.676    0.872     1.24     1.64     2.20    10.64    12.59    14.45    16.81    18.55
  7      0.989     1.24     1.69     2.17     2.83    12.02    14.07    16.01    18.48    20.28
  8       1.34     1.65     2.18     2.73     3.49    13.36    15.51    17.53    20.09    21.96
  9       1.73     2.09     2.70     3.33     4.17    14.68    16.92    19.02    21.67    23.59
 10       2.16     2.56     3.25     3.94     4.87    15.99    18.31    20.48    23.21    25.19
 11       2.60     3.05     3.82     4.57     5.58    17.28    19.68    21.92    24.73    26.76
 12       3.07     3.57     4.40     5.23     6.30    18.55    21.03    23.34    26.22    28.30
 13       3.57     4.11     5.01     5.89     7.04    19.81    22.36    24.74    27.69    29.82
 14       4.07     4.66     5.63     6.57     7.79    21.06    23.68    26.12    29.14    31.32
 15       4.60     5.23     6.26     7.26     8.55    22.31    25.00    27.49    30.58    32.80
 16       5.14     5.81     6.91     7.96     9.31    23.54    26.30    28.85    32.00    34.27
 18       6.26     7.01     8.23     9.39    10.86    25.99    28.87    31.53    34.81    37.16
 20       7.43     8.26     9.59    10.85    12.44    28.41    31.41    34.17    37.57    40.00
 24       9.89    10.86    12.40    13.85    15.66    33.20    36.42    39.36    42.98    45.56
 30      13.79    14.95    16.79    18.49    20.60    40.26    43.77    46.98    50.89    53.67
 40      20.71    22.16    24.43    26.51    29.05    51.81    55.76    59.34    63.69    66.77
 60      35.53    37.48    40.48    43.19    46.46    74.40    79.08    83.38    88.38    91.95
120      83.85    86.92    91.58    95.70   100.62   140.23   146.57   152.21   158.95   163.64
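
The two-way example (handaxes and choppers by raw material) can be reproduced with a short Python sketch. This is an illustration under stated assumptions rather than part of the original handout: the function names `expected_counts` and `chi_square_two_way` are hypothetical, and the cut-offs 3.84 (P95) and 6.63 (P99) for 1 degree of freedom are read from the percentile table.

```python
def expected_counts(table):
    """Theoretical frequency of each cell: (row total x column total) / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_two_way(table):
    """Return (X2, degrees of freedom, theoretical frequencies) for a two-way table."""
    expected = expected_counts(table)
    x2 = sum(
        (o - t) ** 2 / t
        for obs_row, exp_row in zip(table, expected)
        for o, t in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)  # (r - 1) x (c - 1)
    return x2, df, expected


# Rows: handaxes, choppers. Columns: basalt, rhyolite.
table = [[30, 24],
         [18, 37]]

x2, df, expected = chi_square_two_way(table)

# Assumption: cut-offs for df = 1 taken from the percentile table.
CUTOFF_P95 = 3.84
CUTOFF_P99 = 6.63

print("theoretical frequencies:", [[round(t, 1) for t in row] for row in expected])
print(f"X2 = {x2:.2f}, df = {df}")
print("reject Ho at P95:", x2 > CUTOFF_P95)
print("reject Ho at P99:", x2 > CUTOFF_P99)
```

Carried out without rounding the theoretical frequencies, the statistic comes to about 5.76 rather than 5.73. The small difference arises because the worked table rounds the theoretical frequencies to one decimal place, and the decision at both probability levels is unchanged.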