Statistics is the branch of mathematics that deals with collecting, analyzing, interpreting… (Study notes of Statistics)

1. Introduction to statistics 2. Methods of data collection and presentation 3. Measures of central tendency 4. Measures of dispersion 5. Probability 6. Probability distribution

Typology: Study notes

2023/2024

Uploaded on 04/25/2024

mahabub-abdureman 🇪🇹

Contents

CHAPTER ONE
1. Introduction
1.1 Definitions and classification of Statistics
1.2 Stages in Statistical Investigation
1.3 Definition of Some Terms
1.4 Applications, Uses and Limitations of Statistics
1.5 Scales of Measurement

CHAPTER TWO
2. Methods of Data Collection and Presentation
2.1 Methods of Data Collection
2.1.1 Sources of Data
2.2 Methods of Data Presentation
2.2.1 Introduction
2.2.2 Frequency Distribution
2.2.3 Diagrammatic and Graphical Presentation of Data
2.2.3.1 Diagrammatic display of data: Bar charts, Pie-chart, Cartograms
2.2.3.2 Graphical presentation of data: Histogram, Frequency Polygon, Ogive Curves

CHAPTER THREE
3. Measures of Central Tendency
3.1 Introduction
3.2 Objectives of Measures of Central Tendency
3.3 The Summation Notation (Σ)
3.4 Important Characteristics of Measures of Central Tendency
3.5 Types of Measures of Central Tendency
3.5.1 Arithmetic Mean
3.5.2 Median
3.5.3 The Mode
3.5.4 The Relationship of the Mean, Median and Mode
3.6 The Quantiles (Quartiles, Deciles, Percentiles)

CHAPTER FOUR
4. Measures of Dispersion (Variation)
4.1 Introduction
4.2 Absolute and Relative Measures of Dispersion
4.3 Types of Measures of Variation
4.3.1 The Range and Relative Range
4.3.2 The Quartile Deviation and Coefficient of Quartile Deviation
4.3.3 The Mean Deviation and Coefficient of Mean Deviation
4.3.4 The Variance, Standard Deviation and Coefficient of Variation
4.4 Standard Scores (Z-Scores)
4.5 Moments, Skewness and Kurtosis
4.5.1 Moments
4.5.2 Skewness
4.5.3 Kurtosis

CHAPTER FIVE
5. Elementary Probability
5.1 Introduction
5.2 Definitions of Some concepts of Probability Terms
5.3 Counting Rules
5.3.1 Addition Rule
5.3.2 Multiplication (Fundamental) Rule
5.3.3 Permutation Rule
5.3.4 Combination Rule
5.4 Approaches in Probability Definition
5.4.1 Subjective approach
5.4.2 Objective approach
5.5 Some Probability Rules
5.5 Conditional Probability and Independence

CHAPTER SIX
6. Probability Distribution
6.1 The Definition of Random Variable and Probability Distribution
6.1.1 Discrete Random Variable and Probability Distribution (pmf)

…

2. Inferential Statistics: consists of generalizing from samples to populations, performing estimations and hypothesis tests, determining relationships among variables, and making predictions. Statistical techniques based on probability theory are required.

Example 1.1: In the above example, if we predict the number of malaria patients in the year 1995 to be 9917, then our work belongs to the domain of inferential statistics.

Example 1.2: Suppose we want to have an idea about the percentage of illiterates in our country. We take a sample from the population and find the proportion of illiterates in the sample. This sample proportion, with the help of probability theory, enables us to make inferences about the population proportion. This study belongs to inferential statistics.
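The idea in Example 1.2 can be sketched in a few lines of Python. Everything in this sketch is an illustrative assumption rather than data from the notes: the population of 10,000 people, the 30% illiteracy rate, the sample size of 500 and the helper name `sample_proportion` are all hypothetical.

```python
import random


def sample_proportion(units):
    """Fraction of units flagged True (here: True means 'illiterate')."""
    return sum(units) / len(units)


# Hypothetical population: 10,000 people, 3,000 of whom are illiterate.
population = [True] * 3000 + [False] * 7000

# Draw a simple random sample without replacement.
rng = random.Random(42)  # fixed seed so the run is repeatable
sample = rng.sample(population, 500)

p_hat = sample_proportion(sample)       # statistic computed from the sample
p_true = sample_proportion(population)  # parameter, normally unknown

print(f"sample proportion     = {p_hat:.3f}")
print(f"population proportion = {p_true:.3f}")
```

In practice the population proportion is not known; it is computed here only to show that the sample proportion tends to land close to it.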
1.2 Stages in Statistical Investigation

Before we deal with statistical investigation, let us see what statistical data mean. Not every set of numerical data can be considered statistical data; to qualify, it must satisfy the following criteria:
• The data must be an aggregate of facts.
• They must be affected to a marked extent by a multiplicity of causes.
• They must be estimated according to reasonable standards of accuracy.
• The data must be collected in a systematic manner for a predefined purpose.
• The data should be placed in relation to each other.

A statistician should be involved at all stages of a statistical investigation: formulating the problem, and then collecting, organizing and classifying, presenting, analyzing and interpreting statistical data. Let us see each stage in detail.

I. Formulating the problem: research must start from a problem. At this stage the investigator must be sure to understand the problem and then formulate it in statistical terms. Clarify the objectives very carefully. Ask as many questions as necessary, because "An approximate answer to the right question is worth a great deal more than a precise answer to the wrong question" (the first golden rule of applied mathematics). Therefore, the first stage in any statistical investigation should be to:
• Get a clear understanding of the physical background to the situation under study;
• Clarify the objectives;
• Formulate the objective in statistical terms.

II. Proper collection of data: in order to draw valid conclusions, it is important to have 'good' data. Data are gathered with the aim of meeting predetermined objectives. In other words, the data must provide answers to problems. The data form the foundation of statistical analyses, and hence they must be carefully and accurately collected. In Section 2.1 we will see the methods of data collection.

III.
Organization and classification of data: in this stage the collected data are organized in a systematic manner. That means the data must be placed in relation to each other. The classification or sorting of data is, by itself, a kind of organization of data.

IV. Presentation of data: The purpose of putting the organized data in graphs, charts and tables is two-fold. First, it is a visual way to look at the data, see what happened and make interpretations. Second, it is usually the best way to show the data to others. Reading lots of numbers in the text puts people to sleep and does little to convey information.

V. Analysis of data: the process of looking at and summarizing data with the intent to extract useful information and develop conclusions. Data analysis is closely related to data mining, but data mining tends to focus on larger data sets, with less emphasis on making inferences, and often uses data that were originally collected for a different purpose. In this stage different types of inferential statistical methods are applied, for instance hypothesis tests such as the χ² test of association.

VI. Interpretation of data: interpretation means drawing valid conclusions from data, which form the basis of decision making. Correct interpretation requires a high degree of skill and experience.

Note that: analysis and interpretation of data are two sides of the same coin.

1.3 Definition of Some Terms

In this section, we define the terms that will be used most frequently. These are:

Data: The values (measurements or observations) that the variables can assume, or facts or figures from which conclusions can be drawn.

Data set: Facts or figures collected for a particular study. Each value in the data set is called a data value or datum.

Raw Data: Data sheets are where the data are originally recorded. Original data are called raw data. Data sheets are often hand drawn, but they can also be printouts from programs such as Microsoft Excel.
Population: The totality of all subjects with certain common characteristics that are being studied in a specified time and place.

Sample: A portion of a population selected using some sampling technique. A sample must be representative of the population, so it must be selected using one of the established sampling techniques.

Sampling: The process of selecting units (e.g., people, households, organizations) from a population of interest so that by studying the sample we may fairly generalize our results back to the population from which they were chosen.

Sample size: The number of elements or observations to be included in the sample.

Parameter: Any measure computed from the data of a population. Example: population mean ($\mu$) and population standard deviation ($\sigma$).

Statistic: Any measure computed from the sample. Example: sample mean ($\bar{x}$) and sample standard deviation ($s$).

Survey: A collection of quantitative information about members of a population when no special control is exercised over any of the factors influencing the variable of interest.

Sample survey: A survey that includes only a portion of the population.

Census: A collection of information about every member of a population.

A sample survey has the following advantages over a census:
• It saves time and cost
• It has greater accuracy
• It avoids wastage of material

Variable: A variable is a characteristic or attribute that can assume different values. Variables whose values are determined by chance are called random variables. Variables are often specified according to their type and intended use, and hence variables can be classified into two types, namely qualitative and quantitative variables.
• A quantitative variable is naturally measured as a number for which meaningful arithmetic operations make sense. Examples: height, age, crop yield, GPA, salary, temperature, area, air pollution index (measured in parts per million), etc.
• Qualitative variable: Any variable that is not quantitative is qualitative. Qualitative variables take a value that is one of several possible categories. As naturally measured, …

… is used more broadly and is more appropriately termed scales of measurement. Scales of measurement refer to the ways in which variables or numbers are defined and categorized. Each scale of measurement has certain properties, which in turn determine whether certain statistical analyses are appropriate. The four scales of measurement are nominal, ordinal, interval, and ratio.

1. Nominal Scales
Nominal scales possess the following properties:
• Level of measurement which classifies data into mutually exclusive, all-inclusive categories in which no order or ranking can be imposed on the data.
• No arithmetic or relational operation can be applied.
• No quantitative information is conveyed.
• Thus it only gives names or labels to the various categories.
Examples:
• Political party preference (Republican, Democrat, or Other)
• Sex (Male or Female)
• Marital status (married, single, widowed, divorced)
• Country code
• Regional differentiation of Ethiopia

2. Ordinal Scales
Ordinal scales are measurement systems that possess the following properties:
• Level of measurement which classifies data into categories that can be ranked; however, precise differences between the ranks do not exist.
• Arithmetic operations are not applicable, but relational operations are applicable.
• Ordering is the sole property of an ordinal scale.
Examples:
• Letter grades (A, B, C, D, F)
• Rating scales (Excellent, Very good, Good, Fair, Poor)
• Military status

3. Interval Scales
Interval scales are measurement systems that possess the following properties:
• Level of measurement which classifies data that can be ranked and for which differences are meaningful. However, there is no meaningful zero, so ratios are meaningless.
• All arithmetic operations except division are applicable.
• Relational operations are also possible.
Examples:
• IQ, temperature in °F

4. Ratio Scales
Ratio scales are measurement systems that possess the following properties:
• Level of measurement which classifies data that can be ranked, for which differences are meaningful, and for which there is a true zero. True ratios exist between the different units of measure.
• All arithmetic and relational operations are applicable.
Examples:
• Weight
• Height
• Number of students
• Age

Uses of levels of measurement
• They help you decide how to interpret the data from the variable.
• They help you decide what statistical analysis is appropriate for the values that were assigned. For example, if a measurement is nominal, then you know that you would never average the data values.

Exercise 1: Classify the following measurement systems into one of the four types of scales.
a) Your checking account number as a name for your account.
b) Your checking account balance as a measure of the amount of money you have in that account.
c) Your score on the first statistics test as a measure of your knowledge of statistics.
d) A response to the statement "Abortion is a woman's right" where "Strongly Disagree" = 1, "Disagree" = 2, "No Opinion" = 3, "Agree" = 4, and "Strongly Agree" = 5, as a measure of attitude toward abortion.
e) Times for swimmers to complete a 50-meter race.
f) Months of the year: Meskerm, Tikimit…
g) Socioeconomic status of a family when classified as low, middle and upper class.
h) Blood type of individuals: A, B, AB and O.
i) Pollen counts provided as numbers between 1 and 10, where 1 implies there is almost no pollen and 10 that it is rampant, but for which the values do not represent an actual count of grains of pollen.
j) Region numbers of Ethiopia.
k) The number of students in a college.
l) The net wages of a group of workers.
m) The height of the men in a town.

…

• Interviewers can observe the surroundings and can use nonverbal communication and visual aids.
• The interviewer can help the respondent if he/she has difficulty in understanding the questions.
• The respondent is likely to answer all the questions alone.
Disadvantages:
• Cost is high
• Interviewer bias is also high
• An untrained interviewer may distort the meaning of the questions

b) Telephone Interview
Advantages:
• It is less expensive in time and money than face-to-face interviews
• Relatively high response rate
• It reaches people who would not open their doors to an interviewer, but might be willing to talk on the telephone
Disadvantages:
• It is unrepresentative of groups which do not have telephones
• Unlisted telephone numbers are excluded from the study
• The respondent may be substituted by another person

3. The use of documentary sources
• Extracting information from existing resources.
• It is much less expensive than the other two sources.
• It is difficult to get the information needed when records are compiled in an unstandardized manner.
Examples: hospital records, professional institutes, official statistics, etc.

Editing of Data: After collecting the data, either from a primary or a secondary source, the next step is its editing. Editing means examining the collected data to discover any errors and mistakes before presenting it. It has to be decided beforehand what degree of accuracy is wanted and what extent of error can be tolerated in the inquiry. The editing of secondary data is simpler than that of primary data.

2.2 Methods of Data Presentation

2.2.1 Introduction

This topic introduces tabular and graphical methods commonly used to summarize both qualitative and quantitative data. Tabular and graphical summaries of data can be found in annual reports, newspaper articles and research studies. Everyone is exposed to these types of presentations, so it is important to understand how they are prepared and how they should be interpreted.
Modern statistical software packages provide extensive capabilities for summarizing data and preparing graphical presentations. MINITAB, SPSS, STATA and R are four packages that are widely available.

Tabulation of Data: The process of placing classified data into tabular form is known as tabulation. A table is a systematic arrangement of statistical data in rows and columns. Rows are horizontal arrangements whereas columns are vertical arrangements.

2.2.2 Frequency Distribution

A frequency distribution is the organization of raw data in table form, using classes and frequencies. There are three basic types of frequency distributions, and there are specific procedures for constructing each type. The three types are categorical, ungrouped and grouped frequency distributions.

The reasons for constructing a frequency distribution are as follows:
• To organize the data in a meaningful, intelligible way.
• To enable the reader to determine the nature or shape of the distribution.
• To facilitate computational procedures for measures of average and spread.
• To enable the researcher to draw charts and graphs for the presentation of data.
• To enable the reader to make comparisons between different data sets.

Some of the basic terms most frequently used when dealing with frequency distributions are the following:
• Lower class limits are the smallest numbers that can belong to the different classes.
• Upper class limits are the largest numbers that can belong to the different classes.
• Class boundaries are the numbers used to separate classes, but without the gaps created by class limits.
• Class midpoints are the midpoints of the classes. Each class midpoint can be found by adding the lower class limit to the upper class limit and dividing the sum by 2.
• Class width is the difference between two consecutive lower class limits or two consecutive lower class boundaries.
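The terms above can be made concrete with a short Python sketch. The function name `class_summary` and the class limits used are illustrative assumptions, and the sketch assumes whole-number data, so each class boundary lies half a unit beyond the corresponding class limit.

```python
def class_summary(lower_limits, upper_limits, unit=1):
    """Return class boundaries, class midpoints and class width.

    `unit` is the assumed unit of measurement (1 for whole-number data).
    """
    half = unit / 2
    pairs = list(zip(lower_limits, upper_limits))
    # Boundaries close the gaps between the upper limit of one class
    # and the lower limit of the next.
    boundaries = [(lo - half, hi + half) for lo, hi in pairs]
    # Midpoint = (lower class limit + upper class limit) / 2
    midpoints = [(lo + hi) / 2 for lo, hi in pairs]
    # Width = difference between two consecutive lower class limits
    width = lower_limits[1] - lower_limits[0]
    return boundaries, midpoints, width


# Hypothetical classes 6-11, 12-17, ..., 36-41
lower = [6, 12, 18, 24, 30, 36]
upper = [11, 17, 23, 29, 35, 41]

boundaries, midpoints, width = class_summary(lower, upper)
print("boundaries:", boundaries)
print("midpoints: ", midpoints)
print("width:     ", width)
```

Note that the width computed from consecutive lower class boundaries (11.5 − 5.5) agrees with the width computed from consecutive lower class limits (12 − 6).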
2.2.2.1 Categorical Frequency Distribution

The categorical frequency distribution is used for data which can be placed in specific categories, such as nominal or ordinal level data. For example, data such as political affiliation, religious affiliation, or major field of study would use a categorical frequency distribution.

The major components of a categorical frequency distribution are class, tally and frequency. Moreover, even though percentage is not normally a part of a frequency distribution, it will be added since it is used in certain types of graphical presentations, such as pie graphs.

Steps for constructing a categorical frequency distribution
1. Identify that the data are on a nominal or ordinal scale of measurement.
2. Make a table as shown below:

| A: Class | B: Tally | C: Frequency | D: Percent |

3. Put the distinct values of the data set in column A.
4. Tally the data and place the result in column B.
5. Count the tallies and place the results in column C.
6. Find the percentage of values in each class by using the formula $\% = \frac{f}{n} \times 100\%$, where $f$ is the frequency of the class and $n$ is the total number of values.

Example 2.1: Twenty-five army inductees were given a blood test to determine their blood type. The data set is given as follows:

A B B AB O
O O B AB B
B B O A O
A O O O AB
AB A O B A

Construct a frequency distribution for the data.

…

Note that: round the answer up to the nearest whole number if there is a remainder. For instance, 4.7 ≈ 5 and 4.12 ≈ 5.
5. Select the starting point as the lowest class limit. This is usually the lowest score (observation). Add the width to that score to get the lower class limit of the next class. Keep adding until you reach the desired number of classes ($K$) calculated in step 3.
6. Find the upper class limits: subtract the unit of measurement ($U$) from the lower class limit of the second class to get the upper limit of the first class. Then add the width to each upper class limit to get all upper class limits.
Unit of measurement: the gap between a value and the next possible value. For instance, for the data 28, 23, 52 the unit of measurement is one: take one datum arbitrarily, say 23; the next possible value is 24, so $U = 24 - 23 = 1$. If the data are 24.12, 30, 21.2, give priority to the datum with the most decimal places. Take 24.12; the next possible value is 24.13. Therefore $U = 24.13 - 24.12 = 0.01$.
Note that: $U = 1$ is the maximum value of the unit of measurement and is the value used when we have no clue about the data.
7. Find the class boundaries: $\text{Lower Class Boundary} = \text{Lower Class Limit} - \frac{U}{2}$ and $\text{Upper Class Boundary} = \text{Upper Class Limit} + \frac{U}{2}$. In short, $LCB = LCL - \frac{U}{2}$ and $UCB = UCL + \frac{U}{2}$.
8. Tally the data and write the numerical values of the tallies in the frequency column.
9. Find the cumulative frequency. There are two types of cumulative frequency, namely less than cumulative frequency and more than cumulative frequency. Less than cumulative frequency is obtained by successively adding the frequencies of all the previous classes, including the class against which it is written; accumulation starts from the lowest class and proceeds to the highest. More than cumulative frequency is obtained by accumulating the frequencies starting from the highest class and proceeding to the lowest.

For example, the following frequency distribution table gives the marks obtained by 40 students:

The above table shows how to find less than cumulative frequency and the table shown below shows how to find more than cumulative frequency.

Example 2.3: Consider the following set of data and construct the grouped frequency distribution.

11 29 6 33 14 21 18 17 22 38
31 22 27 19 22 23 26 39 34 27

Steps
1. Highest value = 39, lowest value = 6
2. $R = 39 - 6 = 33$
3. $K = 1 + 3.32 \log 20 = 5.32 \approx 6$
4. $W = \frac{R}{K} = \frac{33}{6} = 5.5 \approx 6$
5. Select the starting point. Take the minimum, which is 6, then add the width 6 to it repeatedly to get the lower class limits of the following classes: 6, 12, 18, 24, 30, 36
6. Upper class limit.
Since the unit of measurement is one, 12 − 1 = 11, so 11 is the UCL of the first class. Therefore 6-11 is the first class.

Class limits: 6-11, 12-17, 18-23, 24-29, 30-35, 36-41

7. Find the class boundaries using the formula in step 7: $LCB_1 = LCL_1 - 0.5$ and $UCB_1 = UCL_1 + 0.5$.

Class boundaries: 5.5-11.5, 11.5-17.5, 17.5-23.5, 23.5-29.5, 29.5-35.5, 35.5-41.5

8. Steps 9 and 10

2.2.2.4 Relative Frequency Distribution

An important variation of the basic frequency distribution uses relative frequencies, which are found by dividing each class frequency by the total of all frequencies. A relative frequency distribution includes the same class limits as a frequency distribution, but relative frequencies are used instead of actual frequencies. The relative frequencies are sometimes expressed as percentages.

$$\text{Relative Frequency} = \frac{\text{Class Frequency}}{\text{Sum of all frequencies}}$$

A relative frequency distribution enables us to understand the distribution of the data and to compare different sets of data.

2.2.3 Diagrammatic and Graphical Presentation of Data

We have discussed the techniques of classification and tabulation that help us organize the collected data in a meaningful fashion. However, this way of presenting statistical data does not always prove interesting to a layman. Too many figures are often confusing and fail to convey the message effectively. One of the most effective and interesting alternative ways of presenting statistical data is through diagrams and graphs. There are several ways in which statistical data may be displayed pictorially, such as different types of graphs and diagrams.

General steps in constructing graphs
1. Draw and label the x and y axes.
2. Choose a suitable scale for the frequencies or cumulative frequencies and label it on the y axis.

…

b) Multiple bar charts are used when two or more sets of inter-related data are to be represented (a multiple bar diagram facilitates comparison between more than one phenomenon).
The technique of the simple bar chart is used to draw this diagram, but the difference is that we use different shades, colors, or dots to distinguish between the different phenomena.

Example 2.6: Draw a multiple bar chart to represent the imports and exports of Canada (values in $) for the years 1991 to 1995.

| Years | 1991 | 1992 | 1993 | 1994 | 1995 |
| Imports | 7930 | 8850 | 9780 | 11720 | 12150 |
| Exports | 4260 | 5225 | 6150 | 7340 | 8145 |

c) Stratified (stacked or component) bar charts are used to represent data in which the total magnitude is divided into different parts or components. In this diagram, we first draw a simple bar for each class, taking the total magnitude in that class, and then divide these simple bars into parts in the ratio of the various components. This type of diagram shows the variation in the different components within each class as well as between different classes. A sub-divided bar diagram is also known as a component bar chart or stacked chart.

Example 2.7: The table below shows the quantity, in hundreds of kilograms, of wheat, barley and oats produced on a certain farm during the years 1991 to 1994. Draw a stratified bar chart.

| Years | 1991 | 1992 | 1993 | 1994 |
| Wheat | 34 | 43 | 43 | 45 |
| Barley | 18 | 14 | 16 | 13 |
| Oats | 27 | 24 | 27 | 34 |

Solution: To make the component bar chart, first of all we have to find the total production for each year.

| Years | 1991 | 1992 | 1993 | 1994 |
| Wheat | 34 | 43 | 43 | 45 |
| Barley | 18 | 14 | 16 | 13 |
| Oats | 27 | 24 | 27 | 34 |
| Total | 79 | 81 | 86 | 92 |

The required diagram is given below:

2.2.3.2 Graphical presentation of data: Histogram, Frequency Polygon, Ogive Curves

Statistical graphs can be used to describe a data set or to analyze it. Graphs are also useful for getting the audience's attention in a publication or a spoken presentation. They can be used to discuss an issue, reinforce a critical point, or summarize a data set. They can also be used to discover a trend or pattern in a situation over a period of time. The three most commonly used graphs in research are:
i. The histogram.
ii. The frequency polygon.
iii.
The cumulative frequency graph, or ogive (pronounced o-jive).

i. Histogram
A histogram is a special type of bar chart in which the horizontal scale represents classes of data values and the vertical scale represents frequencies. The heights of the bars correspond to the frequency values, and the bars are drawn adjacent to each other (without gaps). We can construct a histogram after we have completed a frequency distribution table for a data set.

Example 2.8: Take the data in Example 2.3.

A relative frequency histogram has the same shape and horizontal ($x$-axis) scale as a histogram, but the vertical ($y$-axis) scale is marked with relative frequencies instead of actual frequencies.

ii. Frequency Polygon
A frequency polygon uses line segments connecting points located directly above the class midpoint values. The heights of the points correspond to the class frequencies, and the line segments are extended to the left and right so that the graph begins and ends on the horizontal axis, at the positions where the previous and next class midpoints would be located.

Example 2.9: Take the data in Example 2.3.

[Figures: histogram with class boundaries (5.5 to 41.5) on the horizontal axis and frequency on the vertical axis; frequency polygon with class midpoints on the horizontal axis and frequency on the vertical axis]

…

3. $\sum_{i=1}^{n} (a + b x_i) = na + b \sum_{i=1}^{n} x_i$
4. $\sum_{i=1}^{n} (x_i \pm y_i) = \sum_{i=1}^{n} x_i \pm \sum_{i=1}^{n} y_i$
5. $\sum_{i=1}^{n} x_i y_i \neq \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i$

Example 3.1: Given $\sum_{i=1}^{7} x_i = 20$, $\sum_{i=1}^{7} y_i = 30$, $\sum_{i=1}^{7} x_i^2 = 420$ and $\sum_{i=1}^{7} y_i^2 = 280$, find:
i/ $\sum_{i=1}^{7} (6x_i + 4y_i) = 6\sum_{i=1}^{7} x_i + 4\sum_{i=1}^{7} y_i = 6(20) + 4(30) = 240$
ii/ $3\sum_{i=1}^{7} x_i^2 - 2\sum_{i=1}^{7} y_i^2 = 3(420) - 2(280) = 700$

3.4 Important Characteristics of Measures of Central Tendency

A typical average should possess the following properties:
• It should be rigidly defined, exist and be unique.
• It should be based on all observations under investigation.
• It should be affected as little as possible by extreme observations.
• It should be capable of further algebraic treatment.
• It should be affected as little as possible by sampling fluctuations.
• It should be easy to calculate and simple to understand.

3.5 Types of Measures of Central Tendency
Measures of central tendency give us information about the location of the center of the distribution of data values. A single value that approximately describes the characteristics of the entire mass of data is called a measure of central tendency. In this unit we briefly discuss the three measures of central tendency: mean, median and mode. The following types of measures of central tendency are each suitable for a particular type of data:
• Mean
  - Arithmetic mean
  - Weighted arithmetic mean
  - Combined mean
  - Geometric mean
  - Harmonic mean
• Median
• Mode or modal value

3.5.1 Arithmetic Mean
The arithmetic mean is defined as the sum of the measurements of the items divided by the total number of items. It is usually denoted by $\bar{x}$.

Arithmetic mean for individual series
Suppose $x_1, x_2, \dots, x_n$ are observed values in a sample of size $n$ from a population of size $N$, $n < N$. Then the arithmetic mean of the sample, denoted by $\bar{x}$, is given by
$\bar{x} = \dfrac{x_1 + x_2 + \dots + x_n}{n} = \dfrac{\sum_{i=1}^{n} x_i}{n}$
If we take the entire population, the mean is denoted by $\mu$ and is given by
$\mu = \dfrac{X_1 + X_2 + \dots + X_N}{N} = \dfrac{\sum_{i=1}^{N} X_i}{N}$
where $N$ stands for the total number of observations in the population.

Example 3.2: The data represent the number of days off per year for a sample of individuals selected from nine different countries. Find the mean.
20, 26, 40, 36, 23, 42, 35, 24, 30
Solution: The sample values are 20, 26, 40, 36, 23, 42, 35, 24, and 30.
$\bar{x} = \dfrac{\sum_{i=1}^{n} x_i}{n} = \dfrac{20+26+40+36+23+42+35+24+30}{9} = \dfrac{276}{9} \approx 30.7$ days
Hence, the mean number of days off is 30.7 days.

Arithmetic mean for discrete data arranged in a frequency distribution
When the numbers $x_1, x_2, \dots, x_k$ occur with frequencies $f_1, f_2, \dots, f_k$, respectively, the mean can be expressed in a more compact form as: $\bar{x}$
$= \dfrac{x_1 f_1 + x_2 f_2 + \dots + x_k f_k}{f_1 + f_2 + \dots + f_k} = \dfrac{\sum_{i=1}^{k} x_i f_i}{\sum_{i=1}^{k} f_i}$

Example 3.3: Calculate the arithmetic mean of the sample of numbers of students in 10 classes:
50 42 48 60 58 54 50 42 50 42
$\bar{x} = \dfrac{\sum_{i=1}^{n} x_i}{n} = \dfrac{50+42+48+60+58+54+50+42+50+42}{10} = \dfrac{496}{10} = 49.6 \approx 50$
In this case there are three 42's, one 48, three 50's, one 54, one 58 and one 60. The number of times each value occurs is called its frequency, usually denoted by $f$. This information can be written in a table as follows.

Value, $x_i$        42   48   50   54   58   60
Frequency, $f_i$     3    1    3    1    1    1
$x_i f_i$          126   48  150   54   58   60

The formula for the arithmetic mean for data of this type is
$\bar{x} = \dfrac{x_1 f_1 + x_2 f_2 + \dots + x_k f_k}{f_1 + f_2 + \dots + f_k} = \dfrac{\sum_{i=1}^{k} x_i f_i}{\sum_{i=1}^{k} f_i}$
In this case we have:
$\bar{x} = \dfrac{42(3) + 48(1) + 50(3) + 54(1) + 58(1) + 60(1)}{3+1+3+1+1+1} = \dfrac{126+48+150+54+58+60}{10} = \dfrac{496}{10} = 49.6 \approx 50$
The mean number of students in the ten classes is 49.6, or about 50.

Arithmetic mean for a grouped continuous frequency distribution
If the data are given in the form of a continuous frequency distribution, the sample mean can be computed as
$\bar{x} = \dfrac{\sum_{i=1}^{k} x_i f_i}{\sum_{i=1}^{k} f_i} = \dfrac{x_1 f_1 + x_2 f_2 + \dots + x_k f_k}{f_1 + f_2 + \dots + f_k}$
where $x_i$ is the class mark of the $i$th class, $i = 1, 2, \dots, k$; $f_i$ is the frequency of the $i$th class; and $k$ is the number of classes. Note that $\sum_{i=1}^{k} f_i = n$ = the total number of observations.

Example 3.4: The following frequency table gives the height (in inches) of 100 students in a college. Calculate the mean.

Class Interval (CI)  60-62  62-64  64-66  66-68  68-70  70-72  Total
Frequency (f)            5     18     42     20      8      7    100

Solution: The formula to be used for the mean is
$\bar{x} = \dfrac{\sum_{i=1}^{k} x_i f_i}{\sum_{i=1}^{k} f_i}$
For convenience, let us calculate the required values and arrange them in a table.
Class Interval (CI)  60-62  62-64  64-66  66-68  68-70  70-72  Total
Frequency (f)            5     18     42     20      8      7    100
Mid-point ($x_i$)       61     63     65     67     69     71
$f_i x_i$              305   1134   2730   1340    552    497   6558

Substituting these values with $\sum_{i=1}^{6} f_i = 100$, we get
$\bar{x} = \dfrac{6558}{100} = 65.58$ inches.

Example 3.6: In the previous year there were two sections taking the Statistics course. At the end of the semester, the two sections had average marks of 70 and 78. There were 45 and 50 students in the two sections, respectively. Find the mean mark for all the students.
Solution:
$\bar{x}_c = \dfrac{\bar{x}_1 n_1 + \bar{x}_2 n_2 + \dots + \bar{x}_k n_k}{n_1 + n_2 + \dots + n_k} = \dfrac{\bar{x}_1 n_1 + \bar{x}_2 n_2}{n_1 + n_2} = \dfrac{70(45) + 78(50)}{45 + 50} = \dfrac{7050}{95} = 74.21$
The combined mean for all the students is 74.21.

Geometric Mean
The geometric mean, like the arithmetic mean, is a calculated average. It is used when the observed values are measured as ratios, percentages, proportions, indices or growth rates.

Geometric mean for individual series: The geometric mean (G.M.) of an individual series of positive numbers ($> 0$) $x_1, x_2, \dots, x_n$ is defined as the $n$th root of their product:
$G.M. = \sqrt[n]{x_1 x_2 \cdots x_n} = \operatorname{antilog}\left(\dfrac{1}{n}\sum_{i=1}^{n} \log x_i\right)$

Example 3.7: Find the G.M. of a) 3 and 12, b) 2, 4 and 8.
Solution: a) $G.M. = \sqrt{3 \times 12} = \sqrt{36} = 6$; b) $G.M. = \sqrt[3]{2 \times 4 \times 8} = \sqrt[3]{64} = 4$

Properties of the geometric mean
• It is less affected by extreme values. For example, for x = 2, 5, 8, 72, compare the arithmetic mean and the geometric mean.
• It takes each and every observation into consideration.
• If the value of any one observation is zero, the geometric mean becomes zero.

Geometric mean for discrete data arranged in a frequency distribution: When the numbers $x_1, x_2, \dots, x_m$ occur with frequencies $f_1, f_2, \dots, f_m$, respectively, the geometric mean is obtained by
$G.M. = \sqrt[n]{x_1^{f_1} x_2^{f_2} \cdots x_m^{f_m}} = \operatorname{antilog}\left(\dfrac{1}{n}\sum_{i=1}^{m} f_i \log x_i\right)$
where $n$ is the sum of $f_i$ over all $i$.

Example 3.8: Compute the geometric mean of the following values: 3, 3, 4, 4, 4, 5, 6 and 6.
Solution:

Values     3  4  5  6
Frequency  2  3  1  2

$G.M. = \sqrt[8]{3^2 \times 4^3 \times 5^1 \times 6^2} = 4.236$
The geometric mean for the given data is 4.236.
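The combined mean and geometric mean computations above can be checked with a short script. This is only a sketch: the function names `combined_mean` and `geometric_mean` are illustrative choices, not part of the notes, and the geometric mean is computed through natural logarithms (the "antilog" form of the formula) rather than by taking an nth root directly.

```python
import math


def combined_mean(means, sizes):
    """Combined mean: group means weighted by their group sizes."""
    return sum(m * n for m, n in zip(means, sizes)) / sum(sizes)


def geometric_mean(values, freqs=None):
    """Geometric mean as antilog of the (frequency-weighted) mean of logs."""
    if freqs is None:
        freqs = [1] * len(values)          # individual series: every f_i = 1
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean is defined here for positive values only")
    n = sum(freqs)
    mean_log = sum(f * math.log(v) for v, f in zip(values, freqs)) / n
    return math.exp(mean_log)


# Example 3.6: two sections with means 70 and 78 and sizes 45 and 50
print(round(combined_mean([70, 78], [45, 50]), 2))

# Example 3.7 a) and b)
print(round(geometric_mean([3, 12]), 3))
print(round(geometric_mean([2, 4, 8]), 3))

# Example 3.8: values 3, 4, 5, 6 with frequencies 2, 3, 1, 2
print(round(geometric_mean([3, 4, 5, 6], [2, 3, 1, 2]), 3))
```

Working with logarithms avoids very large intermediate products when the series is long, which is the same reason the notes give the antilog form alongside the root form.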
Geometric mean for a continuous grouped frequency distribution: The above formula can also be used when the frequency distribution is grouped and continuous; the class marks of the class intervals are taken as $x_i$.

Harmonic Mean
The harmonic mean is a suitable measure of central tendency when the data pertain to speed, rate and time. The harmonic mean of $n$ values is defined as $n$ divided by the sum of their reciprocals.

Harmonic mean for individual series: If $x_1, x_2, \dots, x_n$ are $n$ observations, the harmonic mean is given by
$H.M. = \dfrac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \dots + \frac{1}{x_n}}$

Example 3.9: A car travels 25 miles at 25 mph, 25 miles at 50 mph, and 25 miles at 75 mph. Find the average speed (the harmonic mean of the three speeds).
Solution:
$H.M. = \dfrac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \dots + \frac{1}{x_n}} = \dfrac{3}{\frac{1}{25} + \frac{1}{50} + \frac{1}{75}} = 40.9$ mph

Harmonic mean for discrete data arranged in a frequency distribution: If the data are arranged in the form of a frequency distribution,
$H.M. = \dfrac{n}{\frac{f_1}{x_1} + \frac{f_2}{x_2} + \dots + \frac{f_m}{x_m}}$, where $n = \sum_{k=1}^{m} f_k$

Harmonic mean for a continuous grouped frequency distribution: When the frequency distribution is grouped and continuous, the class marks of the class intervals are taken as $x_i$ and the above formula is used as
$H.M. = \dfrac{n}{\sum_{i=1}^{m} \frac{f_i}{x_i}}$, where $n = \sum_{k=1}^{m} f_k$ and $x_i$ is the class mark of the $i$th class.

Properties of the harmonic mean
• It is unique for a given set of data.
• It takes each and every observation into consideration.
• It is difficult to calculate and understand.
• It is an appropriate measure of central tendency in situations where the data are in terms of time, speed or rate.

Relations among different means
i. If all the observations are positive, the three means are related by $\bar{x} \geq GM \geq HM$.
ii. For two observations, $\sqrt{\bar{x} \cdot HM} = GM$.
iii. $\bar{x} = GM = HM$ if all observations are positive and have equal value.

3.5.2 Median
The median is, as its name indicates, the middle-most value in the arrangement, which divides the data into two equal parts.
It is obtained by arranging the data in increasing or decreasing order of magnitude, and it is denoted by $\tilde{x}$.

Median for individual series
We arrange the sample in ascending order of the variable of interest. The median is then the middle value (if the sample size $n$ is odd) or the average of the two middle values (if the sample size $n$ is even). For an individual series the median is obtained by
a/ $\tilde{x} = \left(\dfrac{n+1}{2}\right)^{\text{th}}$ value if $n$ is odd, and
b/ $\tilde{x} = \dfrac{\left(\frac{n}{2}\right)^{\text{th}}\text{ value} + \left(\frac{n}{2}+1\right)^{\text{th}}\text{ value}}{2}$ if $n$ is even.

Example 3.10: Find the median for the following data.
a/ -5 15 10 5 0 2 1 4 6 and 8
b/ 5 2 2 3 1 8 4
Solution:
a/ The data in ascending order are: -5 0 1 2 4 5 6 8 10 15. Here $n = 10$, which is even, so the two middle values are the 5th and 6th observations. The median is
$\tilde{x} = \dfrac{\left(\frac{10}{2}\right)^{\text{th}}\text{ value} + \left(\frac{10}{2}+1\right)^{\text{th}}\text{ value}}{2} = \dfrac{5^{\text{th}}\text{ value} + 6^{\text{th}}\text{ value}}{2} = \dfrac{4+5}{2} = 4.5$
b/ The data in ascending order are: 1 2 2 3 4 5 8. Here $n = 7$, which is odd, so the middle value is the 4th observation. The median is 3.
Note: The median is easy to calculate for small samples and is not affected by an "outlier".

Median for discrete data arranged in a frequency distribution: In this case also, the median is obtained by the above formula. After arranging the values in an increasing order find the smallest

After locating this class, the mode is interpolated using:
$\text{Mode} = L + \dfrac{\Delta_1}{\Delta_1 + \Delta_2} \times w$
where $L$ = the lower class boundary of the modal class; $\Delta_1 = f_{mod} - f_1$; $\Delta_2 = f_{mod} - f_2$; $w$ = the common class width; $f_1$ = frequency of the class immediately preceding the modal class; $f_2$ = frequency of the class immediately succeeding the modal class; and $f_{mod}$ = frequency of the modal class.

Example 3.13: Calculate the mode for the frequency distribution of the data of Example 3.11.
Solution: By inspection, the mode lies in the third class, where $L = 10.5$, $f_{mod} = 12$, $f_1 = 8$, $f_2 = 6$, $w = 5$. Using the formula, the mode is:
$\text{Mode} = L + \dfrac{\Delta_1}{\Delta_1 + \Delta_2} \times w = 10.5 + \dfrac{(12-8) \times 5}{(12-8) + (12-6)} = 10.5 + \dfrac{20}{10} = 12.5$

Merits of mode
• Mode is not affected by extreme values.
• We can change the size of the observations without changing the mode.
• It can be computed for all levels of data, i.e., ratio, interval, ordinal or nominal.

Demerits of mode
• It may not exist in the series, and if it exists it may not be unique.
• It does not take every value into consideration.

3.5.4 The Relationship of the Mean, Median and Mode
Comparing the mean, median, and the mode
• If the data are skewed, avoid the mean.
• If there is a large gap around the middle, avoid the median.
• A measure is a resistant measure if its value is not affected by an outlier or an extreme data value.
• The mean is not a resistant measure of central tendency, because it is influenced by extreme data values or outliers.
• The median is resistant to the influence of extreme data values or outliers; its value does not respond strongly to changes in a few extreme data values, regardless of how large the change may be.
• The mode has an advantage over both the mean and the median when the data are categorical, since it is not possible to calculate the mean or median for this type of data. Also, the mode usually indicates the location within a large distribution where the data values are concentrated. However, the mode cannot always be calculated: if all the data values in a distribution are different, the distribution has no mode.
• In the case of a symmetrical distribution, the mean, median and mode coincide; that is, mean = median = mode. However, for a moderately asymmetrical (nonsymmetrical) distribution, the mean and the mode lie at the two ends, the median lies between them, and they satisfy the following important empirical relationship:
Mean - Mode = 3(Mean - Median)

Example 3.14: In a moderately asymmetrical distribution, the mean and the mode are 30 and 42 respectively. What is the median of the distribution?
Solution: Solving the empirical relationship for the median gives
Median = (2 Mean + Mode)/3 = (2*30 + 42)/3 = 34
Hence the median of the distribution is 34.
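The interpolation formula for the mode of grouped data and the empirical relationship between mean, median and mode can be sketched in a few lines. The function names below (`grouped_mode`, `median_from_mean_and_mode`) are hypothetical helpers introduced only for illustration; the inputs are the values quoted in Examples 3.13 and 3.14.

```python
def grouped_mode(lower_boundary, f_mod, f_prev, f_next, width):
    """Mode = L + d1 / (d1 + d2) * w for a grouped frequency distribution."""
    d1 = f_mod - f_prev   # modal frequency minus frequency of preceding class
    d2 = f_mod - f_next   # modal frequency minus frequency of succeeding class
    return lower_boundary + d1 / (d1 + d2) * width


def median_from_mean_and_mode(mean, mode):
    """Solve Mean - Mode = 3(Mean - Median) for the median.

    This is only an empirical approximation for moderately skewed data.
    """
    return (2 * mean + mode) / 3


# Example 3.13: L = 10.5, f_mod = 12, f1 = 8, f2 = 6, w = 5
print(grouped_mode(10.5, 12, 8, 6, 5))       # 12.5

# Example 3.14: mean = 30, mode = 42
print(median_from_mean_and_mode(30, 42))     # 34.0
```

Rearranging the relationship step by step: Mean - Mode = 3 Mean - 3 Median, so 3 Median = 2 Mean + Mode, which is why the divisor is 3.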
Which of the Three Measures is the "Best"?
At this stage, one may ask which of these three measures of central tendency is the best. There is no simple answer to this question, because the three measures are based upon different concepts. The arithmetic mean is the sum of the values divided by the total number of observations in the series. The median is the value of the middle observation, and the mode is the value around which the observations tend to concentrate. As such, the use of a particular measure will largely depend on the purpose of the study and the nature of the data. For example, when we are interested in knowing consumers' preferences for different brands of television sets or kinds of advertising, the choice should go in favor of the mode. The use of the mean and median would not be proper. However, the median can sometimes be used in the case of qualitative data when such data can be arranged in ascending or descending order.

Let us take another example. Suppose we invite applications for a certain vacancy in our company and a large number of candidates apply for the post. We are now interested in knowing which age or age group has the largest concentration of applicants. Here, obviously, the mode will be the most appropriate choice. The arithmetic mean may not be appropriate, as it may be influenced by some extreme values.

3.6 The Quantiles (Quartiles, Deciles, Percentiles)
The median is the value of the middle item, which divides the data into two equal parts and is found by arranging the data in increasing or decreasing order of magnitude. Quantiles are measures which divide a given set of data into approximately equal subdivisions, and they are obtained by the same procedure as the median. They are averages of position (non-central tendency). Some of these are quartiles, deciles and percentiles.

Quartiles: values which divide the data set into approximately four equal parts, denoted by $Q_1$, $Q_2$ and $Q_3$.
The first quartile ($Q_1$) is also called the lower quartile and the third quartile ($Q_3$) is the upper quartile. The second quartile ($Q_2$) is the median.

• Quartiles for individual series: Let $x_1, x_2, \dots, x_n$ be $n$ ordered observations. The $i$th quartile ($Q_i$) is the value of the item at the $[i(n+1)/4]$th position, $i = 1, 2, 3$. That is, after arranging the data in ascending order, $Q_1$, $Q_2$ and $Q_3$ are obtained by:
$Q_1 = \left(\dfrac{1(n+1)}{4}\right)^{\text{th}}$ value, $Q_2 = \left(\dfrac{2(n+1)}{4}\right)^{\text{th}}$ value and $Q_3 = \left(\dfrac{3(n+1)}{4}\right)^{\text{th}}$ value.

• Quartiles for discrete data arranged in a frequency distribution: In this case also, we follow the same procedure as for the median. That is, we construct the less-than cumulative frequency distribution and apply the formula for the quartiles of an individual series.

• Quartiles in continuous data: For continuous data, use the following formula:
$Q_i = L + \dfrac{w}{f_{Q_i}}\left(\dfrac{in}{4} - CF\right)$
where $i = 1, 2, 3$, and $L$, $w$, $f_{Q_i}$ and $CF$ are defined in the same way as for the median; i.e.
$Q_1 = L + \dfrac{w}{f_{Q_1}}\left(\dfrac{n}{4} - CF\right)$, $Q_2 = L + \dfrac{w}{f_{Q_2}}\left(\dfrac{2n}{4} - CF\right)$ and $Q_3 = L + \dfrac{w}{f_{Q_3}}\left(\dfrac{3n}{4} - CF\right)$
The class in question is the one that includes the $(in/4)$th value. That is, the class with the smallest cumulative frequency greater than or equal to $in/4$ is the class of the $i$th quartile.

Deciles: values dividing the data into approximately ten equal parts, denoted by $D_1, D_2, \dots, D_9$.

• Deciles for individual series:

$P_{40} = \left(\dfrac{40(n+1)}{100}\right)^{\text{th}}$ value $= \left(\dfrac{40(219+1)}{100}\right)^{\text{th}}$ value = 88th value = 14
$P_{90} = \left(\dfrac{90(n+1)}{100}\right)^{\text{th}}$ value $= \left(\dfrac{90(219+1)}{100}\right)^{\text{th}}$ value = 198th value = 16

Example 3.16: The marks of 50 students, out of 85, are given below. Based on the data find $Q_1$, $D_4$ and $P_7$.

Marks   46-50  51-55  56-60  61-65  66-70  71-75  76-80
$f_i$       4      8     15      5      9      5      4

Solution: First find the class boundaries and the cumulative frequency distribution.

Marks            46-50      51-55      56-60      61-65      66-70      71-75      76-80
Class boundary   45.5-50.5  50.5-55.5  55.5-60.5  60.5-65.5  65.5-70.5  70.5-75.5  75.5-80.5
$f_i$            4          8          15         5          9          5          4
Cum. frequency   4          12         27         32         41         46         50

$Q_1$: the measure of the $(n/4)$th value = 12.5th value, which lies in the group 55.5 - 60.5.
$Q_1 = L + \dfrac{w}{f_{Q_1}}\left(\dfrac{n}{4} - CF\right) = 55.5 + \dfrac{5}{15}(12.5 - 12) = 55.7$
$D_4$: the measure of the $(4n/10)$th value = 20th value, which lies in the group 55.5 - 60.5.
$D_4 = L + \dfrac{w}{f_{D_4}}\left(\dfrac{4n}{10} - CF\right) = 55.5 + \dfrac{5}{15}(20 - 12) = 58.2$
$P_7$: the measure of the $(7n/100)$th value = 3.5th value, which lies in the group 45.5 - 50.5.
$P_7 = L + \dfrac{w}{f_{P_7}}\left(\dfrac{7n}{100} - CF\right) = 45.5 + \dfrac{5}{4}(3.5 - 0) = 49.875$

Exercise 3
1. Calculate the median, quartiles, 8th decile, and 75th percentile for the following data. Show that the value of the 75th percentile is the same as that of $Q_3$.

Lifetime (C.M)    50  100  150  200  250  300  350  400
No. of batteries   6    8   13   20    9    6    3    2

2. The following data represent the number of offences for various robberies in a town per given day. Compute the mean, median and mode.

No. of robberies  26  34  30  15  10  32  12  25   7
No. of days       13  19  12  30  14   8  19  20   3

3. Calculate $Q_1$, $Q_2$, $Q_3$, $D_5$, $D_8$, and $P_{90}$ for the following table.

Temperature (°F)  50-59  60-69  70-79  80-89  90-99
Days                  2      8     20      4      1

4. The following data represent the pulse rates (beats per minute) of nine students: 76 60 60 81 72 80 80 68 and 73. Calculate the mean, mode and the third quartile.

5. The number of births in a hospital is given below. Find the average number of births per day and the mode.

Days           Monday  Tuesday  Wednesday  Thursday  Friday  Saturday  Sunday
No. of births      50       60         52        55      62        30      40

6. From the table given below find the mode and the 5th decile.

Size       11-15  16-20  21-25  26-30  31-35  36-40  41-45  46-50
Frequency      7     10     13     26     35     22     11      5

7. If the arithmetic mean of two items is 5 and the G.M. is 4, find their H.M.

8. The following frequency distribution represents the magnitude of earthquakes. Compute the median and verify that it is equal to the second quartile, and find the 72nd percentile.

Magnitude  0-0.9  1-1.9  2-2.9  3-3.9  4-4.9  5-5.9  6-6.9  7-7.9
Frequency     20     50     45     30     10      8      6      1

CHAPTER FOUR
4.
Measures of Dispersion (Variation)

4.1 Introduction
Just as central tendency can be measured by a number in the form of an average, the amount of variation (dispersion, spread, or scatter) among the values in a data set can also be measured. The measures of central tendency show that the major part of the values in the data set concentrates around a central value, called the average, with the remaining values scattered (distributed) on either side of that value. But these measures do not reveal how the values are dispersed (spread or scattered) on each side of the central value. The dispersion of values is indicated by the extent to which the values tend to spread over an interval rather than cluster closely around an average.

The term dispersion is generally used in two senses. Firstly, dispersion refers to the variation of the items among themselves. If the value of all the items of a series is the same, there is no variation among the items of the series. Secondly, dispersion refers to the variation of the items around an average. If the difference between the values of the items and the average is large, the dispersion is high; if the difference between the values of the items and the average is small, the dispersion is low. Thus, dispersion is defined as the scatter or spread of the individual items in a given series.

After studying this chapter, you should be able to:
• Explain the meaning of measures of dispersion.
• Compare two or more sets of data using relative measures of dispersion.
• Apply the Z-score to find the relative standing of values.
• Explain measures of skewness and kurtosis.

Objectives of measuring variation:
• To judge the reliability of measures of central tendency.
• To control variability itself.
• To compare two or more groups of numbers in terms of their variability.
• To make further statistical analysis.
4.2 Absolute and Relative Measures of Dispersion
Absolute measures of dispersion: An absolute measure is expressed in the same statistical unit in which the original data are given, such as kilograms, tonnes, etc. These measures are suitable for comparing the variability in two distributions whose variables are expressed in the same units and have the same average size. They are not suitable for comparing the variability in two distributions whose variables are expressed in different units.

• It is not based on all observations of the series.
• It can't be calculated in the case of an open-ended distribution.
• It is affected by sampling fluctuation.
• It is affected by extreme values in the series.

4.3.2 The Quartile Deviation and Coefficient of Quartile Deviation
The inter-quartile range and the quartile deviation are other measures of dispersion. The difference between the upper quartile ($Q_3$) and the lower quartile ($Q_1$) is called the inter-quartile range. Symbolically,
Inter-Quartile Range $(IQR) = Q_3 - Q_1$
The inter-quartile range covers the dispersion of the middle 50% of the items of the series.

The quartile deviation, also called the semi-inter-quartile range, is half of the difference between the upper and lower quartiles; that is, half of the inter-quartile range. Its formula is
Quartile Deviation $(QD) = \dfrac{Q_3 - Q_1}{2}$
The relative measure of quartile deviation, also called the coefficient of quartile deviation (CQD), is defined as:
$CQD = \dfrac{Q_3 - Q_1}{Q_3 + Q_1}$

Example 4.3: Find the inter-quartile range, quartile deviation and coefficient of quartile deviation for the following data.
28, 18, 20, 24, 27, 30, 15
Solution: First arrange the data in ascending order.
15, 18, 20, 24, 27, 28, 30
$Q_1$ = size of the $\left(\dfrac{n+1}{4}\right)^{\text{th}}$ item = size of the $\left(\dfrac{7+1}{4}\right)^{\text{th}}$ item = size of the 2nd item = 18
$Q_3$ = size of the $3\left(\dfrac{n+1}{4}\right)^{\text{th}}$ item = size of the $3\left(\dfrac{7+1}{4}\right)^{\text{th}}$ item = size of the 6th item = 28
$IQR = Q_3 - Q_1 = 28 - 18 = 10$
$QD = \dfrac{Q_3 - Q_1}{2} = \dfrac{28 - 18}{2} = 5$
$CQD = \dfrac{Q_3 - Q_1}{Q_3 + Q_1} = \dfrac{28 - 18}{28 + 18} = 0.217$

Example 4.4: Find the inter-quartile range, quartile deviation and coefficient of quartile deviation for the following data.

Marks            2   3   4   5   6   7   8   9
No. of students  10  11  12  13   5  12   7   5

Solution:

Marks            2   3   4   5   6   7   8   9
No. of students  10  11  12  13   5  12   7   5
CF               10  21  33  46  51  63  70  75 = N

$Q_1 = \left(\dfrac{N+1}{4}\right)^{\text{th}}$ item $= \dfrac{75+1}{4}$ = 19th item = 3
$Q_3 = 3\left(\dfrac{N+1}{4}\right)^{\text{th}}$ item $= 3\left(\dfrac{75+1}{4}\right)$ = 57th item = 7
$IQR = Q_3 - Q_1 = 7 - 3 = 4$
$QD = \dfrac{Q_3 - Q_1}{2} = \dfrac{7 - 3}{2} = 2$
$CQD = \dfrac{Q_3 - Q_1}{Q_3 + Q_1} = \dfrac{7 - 3}{7 + 3} = 0.4$
Remark: QD and CQD include only the middle 50% of the observations.

Merits of QD
• It is well-defined, easy to compute and simple to understand.
• It helps in studying the middle 50% of the items in the series.
• It is not affected by the extreme items.
• It is useful in measuring variation in the case of open-ended distributions.

Demerits of QD
• It is not based on all the items (it ignores 50% of the items, i.e., the first 25% and the last 25%).
• It is greatly influenced by sampling fluctuations.
• It is not amenable to algebraic manipulation.

4.3.3 The Mean Deviation and Coefficient of Mean Deviation
The mean deviation (MD) measures the average deviation of a set of observations about their central value, generally the mean or the median, ignoring the plus/minus sign of the deviations. In other words, the mean deviation of a set of items is defined as the arithmetic mean of the absolute deviations from a given average. Depending on the type of average used, we have different mean deviations.
• The mean deviation of a sample of n observations x1, x2, . .
.,xn (individual series) is given as
$MD = \dfrac{\sum |X_i - A|}{n}$
where $|X_i - A|$ denotes the absolute value of the deviation. Generally, the arithmetic mean and the median are used in calculating the mean deviation, so $A$ stands for the average used for calculating $MD$; that is, $A$ = median ($\tilde{x}$) or $A$ = mean ($\bar{x}$).
• In the case of discrete data arranged in a frequency distribution and of continuous grouped data, the formula for MD becomes
$MD = \dfrac{\sum f_i |X_i - A|}{n}$
where $X_i$ is the class mark of the $i$th class, $f_i$ is the frequency of the $i$th class and $n = \sum f_i$.

1. The mean deviation about the arithmetic mean is, therefore, given by
$MD(\bar{x}) = \dfrac{\sum |X_i - \bar{x}|}{n}$ for ungrouped data (individual series), and
$MD(\bar{x}) = \dfrac{\sum f_i |X_i - \bar{x}|}{n}$ for discrete data arranged in a frequency distribution and for a grouped continuous frequency distribution,
where $X_i$ is the value for discrete data arranged in a frequency distribution and the class mark of the $i$th class for continuous grouped data, $f_i$ is the frequency of the $i$th class and $n = \sum f_i$.
Steps to calculate $MD(\bar{x})$:
• Find the arithmetic mean, $\bar{x}$.
• Find the deviation of each reading from $\bar{x}$.
• Find the arithmetic mean of the deviations, ignoring sign.

2. The mean deviation about the median is given by
$MD(\tilde{x}) = \dfrac{\sum |X_i - \tilde{x}|}{n}$ for ungrouped data (individual series), and
$MD(\tilde{x}) = \dfrac{\sum f_i |X_i - \tilde{x}|}{n}$ for discrete data arranged in a frequency distribution and for a grouped continuous frequency distribution,
where $X_i$ is the value for discrete data arranged in a frequency distribution and the class mark of the $i$th class for continuous grouped data, $f_i$ is the frequency of the $i$th class and $n = \sum f_i$.
Steps to calculate $MD(\tilde{x})$

manipulated than absolute deviations. Thus, if we average the squared deviations from the mean and take the square root of the result (to compensate for the fact that the deviations were squared), we obtain the standard deviation. This overcomes the limitation of the mean deviation.

Population Variance ($\sigma^2$)
If we divide the variation (the sum of squared deviations from the mean) by the number of values in the population, we get the population variance.
This variance is the "average squared deviation from the mean".
• For ungrouped data (individual series)
$\sigma^2 = \dfrac{\sum_{i=1}^{N} (X_i - \mu)^2}{N} = \dfrac{1}{N}\left[\sum_{i=1}^{N} X_i^2 - N\mu^2\right]$
where $\mu$ is the population arithmetic mean and $N$ is the total number of observations in the population.
• For discrete data arranged in a frequency distribution and for continuous grouped data
$\sigma^2 = \dfrac{\sum f_i (X_i - \mu)^2}{N} = \dfrac{1}{N}\left[\sum f_i X_i^2 - N\mu^2\right]$
where $\mu$ is the population arithmetic mean, $X_i$ is the class mark of the $i$th class, $f_i$ is the frequency of the $i$th class and $N = \sum f_i$.

Sample Variance ($S^2$)
One would expect the sample variance to simply be the population variance with the population mean replaced by the sample mean. However, one of the major uses of a sample statistic is to estimate the corresponding population parameter, and a sample variance computed with divisor $n$ tends to underestimate the population variance. To offset this, the sum of the squares of the deviations is divided by one less than the sample size.
• For ungrouped data
$S^2 = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} = \dfrac{1}{n-1}\left[\sum_{i=1}^{n} x_i^2 - n\bar{x}^2\right]$
where $\bar{x}$ is the sample arithmetic mean and $n$ is the total number of observations in the sample.
• For discrete data arranged in a frequency distribution
If the values $x_i$ have frequencies $f_i$ ($i = 1, 2, \dots, m$), then the sample variance is given by:
$S^2 = \dfrac{\sum f_i (x_i - \bar{x})^2}{n-1} = \dfrac{1}{n-1}\left[\sum f_i x_i^2 - n\bar{x}^2\right]$, or $S^2 = \dfrac{1}{n-1}\sum_{i=1}^{m} f_i (x_i - \bar{x})^2$
• For continuous grouped data
$S^2 = \dfrac{\sum f_i (x_i - \bar{x})^2}{n-1} = \dfrac{1}{n-1}\left[\sum f_i x_i^2 - n\bar{x}^2\right]$
where $\bar{x}$ is the sample arithmetic mean, $x_i$ is the class mark of the $i$th class, $f_i$ is the frequency of the $i$th class and $n = \sum f_i$.

The Standard Deviation
There is a problem with variances. Recall that the deviations were squared, which means that the units were also squared. To get back to the same units as the original data values, the square root must be taken.
• Population standard deviation ($\sigma$): $\sigma = \sqrt{\sigma^2}$, where $\sigma^2$ is the population variance.
• Sample standard deviation ($S$): $S = \sqrt{S^2}$, where $S^2$ is the sample variance.
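To see that the definitional and the shortcut forms of the variance agree, the following sketch computes both for the days-off sample of Example 3.2 and compares them with Python's standard `statistics` module. The helper names are illustrative assumptions; treating the same nine values as a whole population in the last lines is done purely to show the effect of the divisor.

```python
import math
import statistics


def sample_variance(xs):
    """Definitional form: sum of squared deviations divided by n - 1."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


def sample_variance_shortcut(xs):
    """Shortcut form: (sum of x^2 - n * mean^2) / (n - 1)."""
    n = len(xs)
    mean = sum(xs) / n
    return (sum(x * x for x in xs) - n * mean ** 2) / (n - 1)


def population_variance(xs):
    """Same sum of squared deviations, but divided by N."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / n


days_off = [20, 26, 40, 36, 23, 42, 35, 24, 30]   # data of Example 3.2

s2 = sample_variance(days_off)
print(s2, sample_variance_shortcut(days_off), statistics.variance(days_off))
print(math.sqrt(s2))                               # sample standard deviation

# Dividing by N instead of n - 1 always gives the smaller value
print(population_variance(days_off), statistics.pvariance(days_off))
```

The shortcut form needs only the running totals of x and x squared, which is convenient for hand calculation; the definitional form is usually the safer choice in floating-point arithmetic when the values are large relative to their spread.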
Example 4.7: Find the sample variance and standard deviation of:

$x_i$  2  4  5  6  8
$f_i$  2  2  3  1  2

Solution: Prepare the following table:

$x_i$  $f_i$  $f_i x_i$  $x_i^2$  $f_i x_i^2$
2      2      4          4        8
4      2      8          16       32
5      3      15         25       75
6      1      6          36       36
8      2      16         64       128
Sum    10     49                  279

Thus, $n = \sum f_i = 10$, $\sum f_i x_i = 49$, $\sum f_i x_i^2 = 279$.
$S^2 = \dfrac{1}{n-1}\left[\sum f_i x_i^2 - n\bar{x}^2\right] = \dfrac{1}{9}\left[279 - 10\left(\dfrac{49}{10}\right)^2\right] = \dfrac{1}{9}(38.9) = 4.32$, and $S = \sqrt{4.32} = 2.08$.

Example 4.8: Find the sample variance and standard deviation for the distribution:

C.I    1-5  6-10  11-15  16-20
Freq.    4     1      2      3

Solution: In a continuous frequency distribution, $x_i$ is the class mark representing the $i$th class.

C.I     $x_i$  $f_i$  $f_i x_i$  $f_i x_i^2$
1-5     3      4      12         36
6-10    8      1      8          64
11-15   13     2      26         338
16-20   18     3      54         972
Total          10     100        1410

Here $n = \sum f_i = 10$, $\bar{x} = \dfrac{\sum f_i x_i}{n} = \dfrac{100}{10} = 10$ and $\sum f_i x_i^2 = 1410$, so that
$S^2 = \dfrac{1}{n-1}\left[\sum f_i x_i^2 - n\bar{x}^2\right] = \dfrac{1}{9}\left[1410 - 10(10)^2\right] = \dfrac{410}{9} = 45.56$, and $S = \sqrt{45.56} = 6.75$.

Properties of Variance & Standard Deviation
1. If a constant is added to (or subtracted from) all the values, the variance remains the same; i.e., for any constant $k$, $V(x_i + k) = V(x_i)$.
Example 4.9: Consider the 6 sample values $x_i$: 54, 52, 53, 50, 51, and 52. The sample variance is $V(x_i) = 2$. Now subtract 50 from each value to get $y_i$: 4, 2, 3, 0, 1, 2; the variance of this new series is also 2, i.e., $V(x) = V(y) = 2$.
2. If each and every value is multiplied by a non-zero constant $k$, the standard deviation is multiplied by $|k|$ and the variance is multiplied by $k^2$; i.e., $V(k x_i) = k^2 V(x_i)$.

Course       Average score  Standard deviation of the score
Statistics   51             12
Mathematics  72             16

In which course did the student score better as compared to his classmates?
Solution:
Z-score of the student in Statistics: $Z = \dfrac{X - \mu}{\sigma} = \dfrac{66 - 51}{12} = \dfrac{15}{12} = 1.25$
Z-score of the student in Mathematics: $Z = \dfrac{X - \mu}{\sigma} = \dfrac{80 - 72}{16} = \dfrac{8}{16} = 0.5$
From these two standard scores, we can conclude that the student scored better, relative to his classmates, in the Statistics course than in the Mathematics course.
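The frequency-table variance of Examples 4.7 and 4.8, the shift property of Example 4.9, and the two Z-scores can be reproduced with the sketch below. The function names `grouped_sample_variance` and `z_score` are hypothetical helpers chosen for illustration, and the scores 66 and 80 are the values that appear in the Z-score computation above.

```python
import math
import statistics


def grouped_sample_variance(xs, fs):
    """S^2 = (sum f*x^2 - n*mean^2) / (n - 1), with n = sum of frequencies.

    xs holds the values (or class marks) and fs the matching frequencies.
    """
    n = sum(fs)
    mean = sum(f * x for x, f in zip(xs, fs)) / n
    sum_fx2 = sum(f * x * x for x, f in zip(xs, fs))
    return (sum_fx2 - n * mean ** 2) / (n - 1)


def z_score(x, mean, sd):
    """Standard score: how many standard deviations x lies from the mean."""
    return (x - mean) / sd


# Example 4.7: discrete frequency distribution
v47 = grouped_sample_variance([2, 4, 5, 6, 8], [2, 2, 3, 1, 2])
print(round(v47, 2), round(math.sqrt(v47), 2))

# Example 4.8: class marks 3, 8, 13, 18 stand in for the class intervals
v48 = grouped_sample_variance([3, 8, 13, 18], [4, 1, 2, 3])
print(round(v48, 2), round(math.sqrt(v48), 2))

# Example 4.9: subtracting a constant leaves the variance unchanged
x = [54, 52, 53, 50, 51, 52]
y = [xi - 50 for xi in x]
print(statistics.variance(x), statistics.variance(y))

# Z-scores for the Statistics and Mathematics marks
print(z_score(66, 51, 12), z_score(80, 72, 16))
```

Comparing Z-scores rather than raw marks is what makes the two courses comparable, since each score is expressed in units of its own course's standard deviation.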
4.5 Moments, Skewness and Kurtosis
The measures of central tendency and variation discussed in the previous sections do not reveal the entire story about a frequency distribution. Two distributions may have the same mean and standard deviation but differ in the shape of the distribution. A further description of their characteristics is necessary, and it is provided by measures of skewness and kurtosis.

4.5.1 Moments
Moments are statistical tools used in statistical investigation. The moments of a distribution are the arithmetic means of the various powers of the deviations of the items from some number. In this course, we use them in the study of the skewness and kurtosis of a statistical distribution.

Moments about the origin
$M_r = \dfrac{\sum X_i^r}{n}$, where $r = 0, 1, 2, 3, \dots$
Moments about the origin for a grouped frequency distribution and for an ungrouped frequency distribution are
$M_r = \dfrac{\sum f_i X_i^r}{n}$
where $f_i$ is the frequency of $X_i$, and $X_i$ is the midpoint in the case of a grouped frequency distribution or the class value in the case of an ungrouped frequency distribution.
Note that: $M_1 = \bar{x}$, $M_0 = 1$

Moments about the mean (central moments)
$M'_r = \dfrac{\sum (X_i - \bar{x})^r}{n}$
Moments about the mean for a grouped frequency distribution and for an ungrouped frequency distribution are
$M'_r = \dfrac{\sum f_i (X_i - \bar{x})^r}{n}$
where $f_i$ is the frequency of $X_i$, and $X_i$ is the midpoint in the case of a grouped frequency distribution or the class value in the case of an ungrouped frequency distribution.
Note that: $M'_2 = SD^2$ if $n - 1$ is approximated by $n$.

Moments about any arbitrary constant $A$
$M'_r = \dfrac{\sum (X_i - A)^r}{n}$
Moments about any arbitrary constant $A$ for a grouped frequency distribution and for an ungrouped frequency distribution are
$M'_r = \dfrac{\sum f_i (X_i - A)^r}{n}$

Example 4.13: Find the first four moments about the mean for the following individual series.
$X_i$: 3 6 8 10 18
Solution: $n = 5$

S.No   $X_i$  $(X_i - \bar{x})$  $(X_i - \bar{x})^2$  $(X_i - \bar{x})^3$  $(X_i - \bar{x})^4$
1      3      -6                 36                   -216                 1296
2      6      -3                 9                    -27                  81
3      8      -1                 1                    -1                   1
4      10     1                  1                    1                    1
5      18     9                  81                   729                  6561
Total  45     0                  128                  486                  7940

Thus, $\bar{x} = \dfrac{45}{5} = 9$,
$M'_1 = \dfrac{\sum (X_i - 9)^1}{5} = 0$,
$M'_2 = \dfrac{\sum (X_i - 9)^2}{5} = \dfrac{128}{5} = 25.6$,
$M'_3 = \dfrac{\sum (X_i - 9)^3}{5} = \dfrac{486}{5} = 97.2$,
$M'_4 = \dfrac{\sum (X_i - 9)^4}{5} = \dfrac{7940}{5} = 1588$

4.5.2 Skewness
Skewness refers to a lack of symmetry (or departure from symmetry) in a distribution.
• A skewed frequency distribution is one that is not symmetrical.
• Skewness is concerned with the shape of the curve, not its size.
A distribution is said to be symmetrical when the values are uniformly distributed around the mean (the distributions of the data below and above the mean are equal). In a symmetrical distribution, the mean, median and mode coincide (i.e., mean = median = mode).

Positively skewed distribution: if the value of the mean is greater than the mode, skewness is said to be positive. In a positively skewed distribution the mean is greater than the mode and the median lies somewhere between the mean and the mode. A positively skewed distribution contains some values that are much larger than the majority of the other observations.

Negatively skewed distribution: if the value of the mode is greater than the mean, skewness is said to be negative. In a negatively skewed distribution the mode is greater than the mean and the median lies between the mean and the mode. The mean is pulled towards the low-valued items (that is, to the left). A negatively skewed distribution contains some values that are much smaller than the majority of the observations.

Note that: In moderately skewed distributions the averages have the following relationship.
(Mean - Mode) = 3(Mean - Median)

How to check the presence of skewness in a distribution?

Solution:
a/ $\alpha_3 = \dfrac{M'_3}{(M'_2)^{3/2}} = \dfrac{-2.4}{1.6^{3/2}} = -1.19 < 0$, so the distribution is negatively skewed.
b/ $\alpha_4 = \dfrac{M'_4}{(M'_2)^2} = \dfrac{5.8}{1.6^2} = 2.26 < 3$, so the curve is platykurtic.

Example 4.14: Find the coefficient of skewness and the coefficient of kurtosis for Example 4.13 above.
Solution:

i) $\alpha_3 = \dfrac{M'_3}{(M'_2)^{3/2}} = \dfrac{97.2}{(25.6)^{3/2}} = \dfrac{97.2}{129.527} = 0.75 > 0$, so the distribution is positively skewed.

ii) $\alpha_4 = \dfrac{M'_4}{(M'_2)^2} = \dfrac{1588}{25.6^2} = 2.42 < 3$, so the curve is platykurtic.

Exercise 4

1. Calculate the mean deviation about the mean, median and mode, their coefficients, and also the variance and standard deviation for the following data.

| Size of shoes | 3 | 6 | 11 | 2 | 4 | 10 | 5 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
| No. of pairs sold | 10 | 15 | 25 | 6 | 4 | 3 | 2 | 8 | 9 | 4 |

2. An analysis of the monthly wages paid (in birr) to workers in two firms A and B belonging to the same industry gives the following results.

| Value | Firm A | Firm B |
|---|---|---|
| Mean wage | 52.5 | 47.5 |
| Variance | 100 | 121 |

In which firm, A or B, is there greater variability in individual wages?

3. A meteorologist interested in the consistency of temperatures in three cities during a given week collected the following data. The temperatures for the five days of the week in the three cities were

City 1: 25, 24, 23, 26, 17
City 2: 22, 21, 24, 22, 20
City 3: 32, 27, 35, 24, 28

Which city has the most consistent temperature, based on these data?

4. Some characteristics of the annual family income distribution (in birr) in two regions are as follows:

| Region | Mean | Median | Standard deviation |
|---|---|---|---|
| A | 6250 | 5100 | 960 |
| B | 6980 | 5500 | 940 |

a) Calculate the coefficient of skewness for each region.
b) For which region is the income more consistent?

5. The median and the mode of a mesokurtic distribution are 32 and 34 respectively. The 4th moment about the mean is 243. Compute the Pearsonian coefficient of skewness and identify the type of skewness. Assume $n - 1 = n$.

6. If the standard deviation of a symmetric distribution is 10, what should be the value of the fourth moment so that the distribution is mesokurtic?

CHAPTER FIVE

5 Elementary Probability

"Life is a school of probability." Walter Bagehot

The notion that chance, or probability, can be treated numerically is relatively recent.
Indeed, for most of recorded history it was felt that what occurred in life was determined by forces that were beyond one's ability to understand. It was only during the first half of the 17th century, near the end of the Renaissance, that people became curious about the world and the laws governing its operation. Among the curious were the gamblers. A cynical person once said, "The only two sure things are death and taxes." This philosophy no doubt arose because so much in people's lives is affected by chance.

5.1 Introduction

Probability as a general concept can be defined as the chance of an event occurring. Most people are familiar with probability from observing or playing games of chance, such as card games or lotteries. Probability is the basis of inferential statistics. The basic concepts of probability are explained in this chapter. These concepts include probability experiments, sample spaces, the addition and multiplication rules, and the probabilities of complementary events. Also in this chapter, you will learn the rules for counting, the differences between permutations and combinations, and how to determine how many different combinations exist for specific situations. Section 4–5 explains how the counting rules and the probability rules can be used together to solve a wide variety of problems. Finally, in section six, the concept of probability is extended to conditional probability and independence.

At the end of this chapter students are expected to:
- Know what is meant by sample space, event, relative frequency, probability, conditional probability and independence.

5.2 Definitions of Some Probability Terms

The terms that are most frequently used and that form the cornerstone of probability are defined as follows.

Probability experiment: a process that leads to well-defined results called outcomes. For example, flipping a coin once or rolling one die once.

Outcome: the result of a single trial of a probability experiment.
It is sometimes called a sample point.

Example 5.1: When a coin is tossed once, there are two possible outcomes: head or tail. In the roll of a single die, there are six possible outcomes: 1, 2, 3, 4, 5 or 6.

Sample space: the set of all possible outcomes of a probability experiment, denoted by $S$ or $\Omega$.

Example 5.2: For the experiments in Example 5.1, $S = \{H, T\}$ and $S = \{1, 2, 3, 4, 5, 6\}$ respectively.

Event: a subset of the sample space (it contains one or more outcomes of the sample space) that is defined for a particular purpose. An event can consist of one outcome or of more than one outcome. A simple event is an event with a single outcome. A compound event consists of two or more outcomes or simple events. Events are denoted by capital letters such as A, B, F.

Example 5.3: Let A be the event that an odd number appears when a die is rolled. Then $A = \{1, 3, 5\}$.

Permutation Rule 1: Suppose now that we have $n$ objects. Reasoning similar to that just used for the 3 letters shows that there are

$$n(n-1)(n-2)\cdots 3 \cdot 2 \cdot 1 = n!$$

different permutations of the $n$ objects.

Example 5.12: A class of Stat 173 consists of 6 men and 4 women. An examination is given, and the students are ranked according to their performance. Assume that no two students obtain the same score.

A. How many different rankings are possible?
B. If the men are ranked just among themselves and the women among themselves, how many different rankings are possible?

Solution:

A. As each ranking corresponds to a particular ordered arrangement of the 10 people, the answer to this part is $10! = 3{,}628{,}800$.

B. As there are $6!$ possible rankings of the men among themselves and $4!$ possible rankings of the women among themselves, it follows from the basic principle of counting that there are $6! \cdot 4! = 720 \cdot 24 = 17{,}280$ possible rankings. (No further factor is needed, because the two groups are ranked separately and are not ordered relative to each other.)
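Permutation Rule 1 can be checked by brute force for small $n$. The sketch below uses only the Python standard library; the three letters a, b, c are an illustrative assumption (the three-letter example the rule refers to is not reproduced in this section). It lists every ordering of three objects, compares the count with $3!$, and evaluates $10!$ for part A of Example 5.12.

```python
import math
from itertools import permutations

# Three illustrative objects (assumed letters, not taken from the notes).
letters = ["a", "b", "c"]

# Enumerate every ordered arrangement of the three objects.
orderings = ["".join(p) for p in permutations(letters)]
count_three = len(orderings)  # Permutation Rule 1 says this equals 3!

# Part A of Example 5.12: rankings of 10 students with no tied scores.
rankings_ten = math.factorial(10)

print(orderings)
print(count_three, math.factorial(3))
print(rankings_ten)
```

Enumeration is only practical for small $n$; for larger $n$ the factorial formula is the sensible route.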
Permutation Rule 2: We now determine the number of permutations of a set of $n$ objects when some of the objects are indistinguishable from each other. There are

$$\frac{n!}{n_1! \, n_2! \cdots n_r!}$$

different permutations of $n$ objects, of which $n_1$ are alike, $n_2$ are alike, …, $n_r$ are alike.

Example 5.13: How many different letter arrangements can be formed using the letters of PEPPER?

Solution: $\dfrac{6!}{3! \, 2! \, 1!} = 60$ possible letter arrangements.

Permutation Rule 3: In general, if we are asked to arrange $r$ objects chosen from $n$ objects, the total number of arrangements is

$${}_nP_r = \frac{n!}{(n-r)!}$$

Example 5.14: Suppose a businessman has a choice of five locations in which to establish his business. He wishes to rank only the top three locations. In how many different ways can he rank them?

Solution: ${}_5P_3 = \dfrac{5!}{(5-3)!} = 60$ ways.

5.3.4 Combination Rule

We are often interested in determining the number of different groups of $r$ objects that could be formed from a total of $n$ objects. A selection of objects without regard to order is called a combination. That is, combinations are used when the order or arrangement is not important. The number of combinations of $r$ objects selected from $n$ objects is denoted by ${}_nC_r$ and is given by the formula

$${}_nC_r = \frac{n!}{r!\,(n-r)!} = \binom{n}{r}$$

Example 5.15: From a group of 5 women and 7 men, how many different committees consisting of 2 women and 3 men can be formed? What if 2 of the men are feuding and refuse to serve on the committee together?

Solution: As there are $\binom{5}{2}$ possible groups of 2 women and $\binom{7}{3}$ possible groups of 3 men, it follows from the basic principle of counting that there are

$$\binom{5}{2}\binom{7}{3} = \frac{5 \cdot 4}{2 \cdot 1} \cdot \frac{7 \cdot 6 \cdot 5}{3 \cdot 2 \cdot 1} = 350$$

possible committees consisting of 2 women and 3 men.
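The counts obtained so far (Example 5.13, Example 5.14 and the first part of Example 5.15) can be verified with a few lines of Python. This is a minimal sketch using the standard `math` module; `math.perm` and `math.comb` are assumed to be available (Python 3.8 or later), and the variable names are illustrative only.

```python
import math

# Permutation Rule 2: arrangements of PEPPER (3 P's, 2 E's, 1 R).
pepper = math.factorial(6) // (
    math.factorial(3) * math.factorial(2) * math.factorial(1)
)

# Permutation Rule 3: ordered choice of the top 3 out of 5 locations.
locations = math.perm(5, 3)

# Combination rule: choose 2 of 5 women and 3 of 7 men.
committees = math.comb(5, 2) * math.comb(7, 3)

print(pepper, locations, committees)
```

Integer division (`//`) keeps the multinomial count exact, since the denominator always divides $n!$.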
On the other hand, if 2 of the men refuse to serve on the committee together, then there are $\binom{2}{0}\binom{5}{3} = 10$ possible groups of 3 men containing neither of the 2 feuding men and $\binom{2}{1}\binom{5}{2} = 20$ groups of 3 men containing exactly 1 of the feuding men. It follows that there are

$$\binom{2}{0}\binom{5}{3} + \binom{2}{1}\binom{5}{2} = 30$$

groups of 3 men that do not contain both of the feuding men. Since there are $\binom{5}{2}$ ways to choose the 2 women, in this case there are $30\binom{5}{2} = 300$ possible committees.

5.4 Approaches in Probability Definition

The probability of an event is denoted by $P(\cdot)$, where $P$ stands for probability and the dot stands for any event, say A, B, G.

Approaches to probability can be divided into two groups, namely the subjective approach and the objective approach.

5.4.1 Subjective approach: A probability derived from an individual's personal judgment about whether a specific outcome is likely to occur. Subjective probabilities involve no formal calculations and only reflect the subject's opinions and past experience. Subjective probabilities differ from person to person. Because the probability is subjective, it contains a high degree of personal bias. An example of subjective probability is asking an Arsenal fan, before the football season starts, about the chances of Arsenal winning the championship. While there is no mathematical proof behind the answer, fans might still reply in actual percentage terms, such as Arsenal having a 95% chance of winning.

5.4.2 Objective approach: The probability of an event in a certain experiment is based on experimental evidence or on a random process. Within this approach to probability theory there are three sub-approaches.
These are:
- The classical approach
- The frequentist approach
- The axiomatic approach

5.4.3.1 The Classical Approach

If a procedure has $n$ different simple events, each with an equal chance of occurring, and event $A$ can occur in $s$ of these ways, then

$$P(A) = \frac{s}{n} = \frac{n(A)}{n(S)} = \frac{\text{number of elements in } A}{\text{number of elements in the sample space}}$$

Assumptions in the classical approach:
- The outcomes must be equally likely
- The experiment should not be repeated more than once
- The sample space should be finite

Example 5.16: Toss a fair coin once and find the probability of the occurrence of a head.

Solution: Since the sample space is finite (either head or tail) and the outcomes are equally likely, $P(\text{head}) = n(A)/n(S) = 1/2$.

What is the probability of selecting a white marble on the second draw, given that the first marble drawn was black?

Solution:

$$P(\text{White} \mid \text{Black}) = \frac{P(\text{Black and White})}{P(\text{Black})} = \frac{0.34}{0.47} = 0.72$$

Example 5.13: The probability that it is Friday and that a student is absent is 0.03. Since there are 5 school days in a week, the probability that it is Friday is 0.2. What is the probability that a student is absent given that today is Friday?

Solution:

$$P(\text{Absent} \mid \text{Friday}) = \frac{P(\text{Friday and Absent})}{P(\text{Friday})} = \frac{0.03}{0.2} = 0.15$$

It often happens that the knowledge that a certain event $E$ has occurred has no effect on the probability that some other event $F$ has occurred, that is, $P(F \mid E) = P(F)$. One would expect that in this case the equation $P(E \mid F) = P(E)$ would also be true. If these equations are true, we say that $F$ is independent of $E$.

Definition: Two events $E$ and $F$ are independent if both $E$ and $F$ have positive probability and if

$$P(E \mid F) = P(E) \quad \text{and} \quad P(F \mid E) = P(F).$$

Note that: if $P(E) > 0$ and $P(F) > 0$, then $E$ and $F$ are independent if and only if

$$P(E \cap F) = P(E)\,P(F).$$

Example 5.14: Suppose that we roll a pair of fair dice, so each of the 36 possible outcomes is equally likely. Let $A$ denote the event that the first die lands on 3, let $B$ be the event that the sum of the dice is 8, and let $C$ be the event that the sum of the dice is 7. A. Are A and B independent? B.
Are A and C independent?

Solution:

A. Since $A \cap B$ is the event that the first die lands on 3 and the second on 5, we see that

$$P(A \cap B) = P(\{(3,5)\}) = \frac{1}{36}.$$

On the other hand,

$$P(A) = P(\{(3,1),(3,2),(3,3),(3,4),(3,5),(3,6)\}) = \frac{6}{36}$$

and

$$P(B) = P(\{(2,6),(3,5),(4,4),(5,3),(6,2)\}) = \frac{5}{36}.$$

Therefore, since $\frac{1}{36} \neq \left(\frac{6}{36}\right)\left(\frac{5}{36}\right)$, we see that $P(A \cap B) \neq P(A)\,P(B)$, and so events A and B are not independent.

B. Events A and C are independent. This is seen by noting that

$$P(A \cap C) = P(\{(3,4)\}) = \frac{1}{36},$$

while $P(A) = \frac{1}{6}$ and

$$P(C) = P(\{(1,6),(2,5),(3,4),(4,3),(5,2),(6,1)\}) = \frac{6}{36}.$$

Therefore $P(A \cap C) = P(A)\,P(C)$, and so events A and C are independent.

CHAPTER SIX

6 Probability Distribution

Before probability distribution is defined formally, the definition of a variable is reviewed. In the first chapter, a variable was defined as a characteristic or attribute that can assume different values. Various letters of the alphabet are used to represent variables.

At the end of this chapter students are expected to:
- Know what is meant by random variable, probability distribution, probability density function, expected value and variance;
- Be familiar with some standard discrete and continuous probability distributions;
- Be able to use standard statistical tables for the Normal, t and Chi-square distributions.

6.1 The Definition of Random Variable and Probability Distribution

Definition: Let $S$ be the sample space of an experiment and let $X$ be a real-valued function defined over $S$. Then $X$ is called a random variable (or stochastic variable).

A random variable, usually shortened to r.v., is a function defined on a sample space $S$ and taking values in the real line $\mathbb{R}$. Random variables are denoted by capital letters such as X, Y, Z. Thus the value of the r.v. $X$ at the sample point $s$ is $X(s)$, and the set of all values of $X$, that is, the range of $X$, is usually denoted by $X(S)$ or $R_X$. The difference between a r.v. and a function is that the domain of a r.v.
is a sample space $S$, unlike the usual concept of a function, whose domain is a subset of $\mathbb{R}$ or of a Euclidean space of higher dimension. The use of the term "random variable" rather than "function" may be explained by the fact that a r.v. is associated with the outcomes of a random experiment. Of course, on the same sample space one may define many distinct r.v.s.

Example 6.1: Suppose we are about to learn the sexes of the three children of a certain family. The sample space of this experiment consists of the following 8 outcomes.

$$S = \{(b,b,b), (b,b,g), (b,g,b), (b,g,g), (g,b,b), (g,b,g), (g,g,b), (g,g,g)\}$$

The outcome $(g, b, b)$ means, for instance, that the youngest child is a girl, the next youngest is a boy, and the oldest is a boy. Suppose that each of these 8 possible outcomes is equally likely, so that each has probability 1/8. If we let $X$ denote the number of female children in this family, then the value of $X$ is determined by the outcome of the experiment. That is, $X$ is a random variable whose value will be 0, 1, 2 or 3:

$$X(bbb) = 0, \quad X(gbb) = X(bgb) = X(bbg) = 1, \quad X(ggb) = X(gbg) = X(bgg) = 2, \quad X(ggg) = 3$$

Example 6.2: Recording the lifetime of an electronic device or of an electrical appliance. Here $S$ is the interval $(0, T)$ or, for some justifiable reason, $S = (0, \infty)$. A r.v. $X$ of interest is $X(s) = s$, $s \in S$.

Example 6.3: Measuring the dosage of a certain medication administered to a patient until a positive reaction is observed. Here $S = (0, D)$ for some suitable $D$.

In the examples discussed above we have seen r.v.s with different kinds of values. Random variables can accordingly be grouped into two broad categories: discrete and continuous random variables.

6.1.1 Discrete Random Variable and Probability Distribution (pmf)

Definition 6.2: A random variable X is called discrete (or of the discrete type) if X takes on a finite or countably infinite number of values; that is, either finitely many values such as x1, . . .
, xn, or countably infinitely many values such as x0, x1, x2, . . . . Equivalently, a discrete random variable:
- Takes whole numbers (like 0, 1, 2, 3 etc.)
- Takes a finite or countably infinite number of values

$$\begin{aligned}
P(X = 1) &= P(\text{sale on first, no sale on second}) + P(\text{no sale on first, sale on second}) \\
&= P(\text{sale on first})\,P(\text{no sale on second}) + P(\text{no sale on first})\,P(\text{sale on second}) \\
&= 0.3(1 - 0.6) + (1 - 0.3)(0.6) = 0.54
\end{aligned}$$

Finally, the random variable $X$ will equal 2 if both appointments result in sales; thus

$$P(X = 2) = P(\text{sale on first, sale on second}) = P(\text{sale on first})\,P(\text{sale on second}) = 0.3 \times 0.6 = 0.18$$

As a check on this result, we note that

$$P(X = 0) + P(X = 1) + P(X = 2) = 0.28 + 0.54 + 0.18 = 1$$

Exercise 6.1: Check whether the function given by $f(x) = \dfrac{x + 2}{25}$ for $x = 1, 2, 3, 4, 5$ is a p.m.f.

Definition: If $X$ is a discrete random variable, the function given by

$$F(x) = P(X \le x) = \sum_{t \le x} f(t) \qquad \text{for all } x \in \mathbb{R},$$

where $f(t)$ is the value of the probability distribution (p.m.f.) of $X$ at $t$, is called the distribution function, or the cumulative distribution function, of $X$.

If $X$ takes on only a finite number of values $x_1 < x_2 < \dots < x_n$, then the distribution function is given by

$$F(x) = \begin{cases}
0 & \text{for } -\infty < x < x_1 \\
f(x_1) & \text{for } x_1 \le x < x_2 \\
f(x_1) + f(x_2) & \text{for } x_2 \le x < x_3 \\
\;\;\vdots & \\
1 & \text{for } x \ge x_n
\end{cases}$$

Example 6.9: Find the distribution function $F$ of the total number of heads obtained in four tosses of a balanced coin.

The distribution function, or cumulative distribution function, $F(x)$ is the following:

$$F(x) = \begin{cases}
0 & \text{for } x < 0 \\
\frac{1}{16} & \text{for } 0 \le x < 1 \\
\frac{5}{16} & \text{for } 1 \le x < 2 \\
\frac{11}{16} & \text{for } 2 \le x < 3 \\
\frac{15}{16} & \text{for } 3 \le x < 4 \\
1 & \text{for } x \ge 4
\end{cases}$$

Exercise 6.2: A telephone survey of households throughout Washington State is given below:

a. What is the probability that a household will have no telephone?
b. What is the probability that a household will have 2 or more telephone lines?
c. What is the probability that a household will have 2 to 4 phone lines?
d. What is the probability that a household will have no phone lines or more than 4 phone lines?
e. Who do you think is in that 3.5% of the population?
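Returning to Example 6.9, the distribution function of the number of heads in four tosses can be rebuilt from the p.m.f. by accumulating probabilities. The sketch below is a minimal illustration in Python (standard library only); it assumes the p.m.f. $f(x) = \binom{4}{x}/16$ for a balanced coin, and the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

n = 4  # number of tosses of a balanced coin

# p.m.f.: f(x) = C(4, x) / 2^4 for x = 0, 1, ..., 4 (exact fractions)
pmf = [Fraction(comb(n, x), 2 ** n) for x in range(n + 1)]

# Distribution function at each jump point: F(x) = sum of f(t) for t <= x
cdf = []
running = Fraction(0)
for p in pmf:
    running += p
    cdf.append(running)

for x, F in enumerate(cdf):
    print(x, F)
```

Using `Fraction` keeps the values exact, so they can be compared directly with the sixteenths in the piecewise definition of $F(x)$.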
6.1.2 Continuous Random Variable and Probability Distribution

Definition: A r.v. $X$ is called continuous (or of the continuous type) if $X$ takes all values in a proper interval $I \subseteq \mathbb{R}$. Equivalently, a continuous random variable:
- Takes whole or fractional numbers
- Is obtained by measuring
- Takes an infinite number of values in an interval
- Has too many values to list, unlike a discrete variable

Example 6.10: The following are examples of continuous r.v.s.

| Experiment | Random variable X | Variable values |
|---|---|---|
| Weigh 100 people | Weight | 45.1, 78, ... |
| Measure part life | Hours | 900, 875.9, … |
| Ask food spending | Spending | 54.12, 42, ... |
| Measure time between arrivals | Inter-arrival time | 0, 1.3, 2.78, ... |

Definition 6.4: A function with values $f(x)$, defined over the set of all real numbers, is called a probability density function of the continuous random variable $X$ if and only if

$$P(a \le X \le b) = \int_a^b f(x)\,dx$$

for any real constants $a \le b$. Probability density functions are also referred to as probability densities (p.d.f.), probability functions, or simply densities.

Remark:
- The probability density function $f(x)$ of a continuous random variable $X$ satisfies the following conditions:
  1. $f(x) \ge 0$ for all $x$, $-\infty < x < \infty$
  2. $\int_{-\infty}^{\infty} f(x)\,dx = 1$
- If $X$ is a continuous random variable and $a$ and $b$ are real constants with $a \le b$, then
  $P(a \le X \le b) = P(a < X \le b) = P(a \le X < b) = P(a < X < b)$

Example 6.11: If $X$ has the probability density

$$f(x) = \begin{cases} k\,e^{-3x} & \text{for } x > 0 \\ 0 & \text{elsewhere} \end{cases}$$

find the constant $k$ and $P(0.5 \le X \le 1)$.

Solution: Since $f(x)$ is a p.d.f., $\int_0^{\infty} f(x)\,dx = 1$. Hence

$$\int_0^{\infty} k\,e^{-3x}\,dx = 1 \;\Rightarrow\; -\frac{k}{3}\left[\lim_{x \to \infty} e^{-3x} - 1\right] = 1 \;\Rightarrow\; \frac{k}{3} = 1 \;\Rightarrow\; k = 3$$

and

$$P(0.5 \le X \le 1) = \int_{0.5}^{1} 3e^{-3x}\,dx = \left[-e^{-3x}\right]_{0.5}^{1} = -e^{-3} + e^{-1.5} = \frac{1}{e^{1.5}} - \frac{1}{e^{3}} \approx 0.173$$

Solution: Letting $Y = g(X) = X^2$,

$$E[g(X)] = \sum g(x)P(x) = (-1)^2 P(X = -1) + 0^2 P(X = 0) + 1^2 P(X = 1) = 1(0.2) + 0(0.5) + 1(0.3) = 0.5$$

The reader should note that $0.5 = E[X^2] \neq (E[X])^2 = 0.01$.

If $a$ and $b$ are constants, then $E(aX + b) = aE[X] + b$.

The expected value of a random variable $X$, $E[X]$, is also referred to as the mean or the first moment of $X$. The quantity $E[X^n]$, $n \ge 1$, is called the $n$th moment of $X$. By definition,

$$E[X^n] = \begin{cases} \sum x^n P(X = x), & \text{if } X \text{ is discrete} \\ \int x^n f(x)\,dx, & \text{if } X \text{ is continuous} \end{cases}$$

Exercise 6.6: The following are the annual incomes of 7 men and 7 women residents of a certain community.

Annual income (in $1000)

| Men | Women |
|---|---|
| 33.5 | 24.2 |
| 25.0 | 19.5 |
| 28.6 | 27.4 |
| 41.0 | 28.6 |
| 30.5 | 32.2 |
| 29.6 | 22.4 |
| 32.8 | 21.6 |

Suppose that a woman and a man are randomly chosen. Find the expected value of the sum of their incomes.

Solution: Let $X$ be the man's income and $Y$ the woman's income. Since $X$ is equally likely to be any of the values in the men's column, we see that

$$E[X] = \frac{1}{7}(33.5 + 25 + \dots + 32.8) = 31.571$$

Similarly,

$$E[Y] = \frac{1}{7}(24.2 + 19.5 + \dots + 21.6) = 25.129$$

Therefore, the expected value of the sum of their incomes is

$$E[X + Y] = E[X] + E[Y] = 56.7$$

That is, the expected value of the sum of their incomes is approximately $56,700.

Definition: If $X$ is a random variable with mean $\mu$, then the variance of $X$, denoted by $Var(X)$, is defined by

$$Var(X) = E[(X - \mu)^2]$$

An alternative formula for $Var(X)$ is derived as follows (recall that $\sum x P(x) = \mu$ and $\sum P(x) = 1$):

$$Var(X) = E[(X - \mu)^2] = \sum (x - \mu)^2 P(x) = \sum (x^2 - 2\mu x + \mu^2) P(x) = E[X^2] - 2\mu^2 + \mu^2 = E[X^2] - \mu^2$$

That is,

$$Var(X) = E[X^2] - (E[X])^2$$

Example 6.14: The return from a certain investment is a random variable $X$ with probability distribution

$$P\{X = -1\} = 0.7, \quad P\{X = 4\} = 0.2, \quad P\{X = 8\} = 0.1$$

Find $Var(X)$, the variance of the return.
Solution: Let us first compute the expected return:

$$\mu = E(X) = -1(0.7) + 4(0.2) + 8(0.1) = 0.9$$

To compute $Var(X)$, we use the formula $Var(X) = E(X^2) - \mu^2$. Since $X^2$ will equal $(-1)^2$, $4^2$ or $8^2$ with respective probabilities 0.7, 0.2 and 0.1, we have

$$E[X^2] = 1(0.7) + 16(0.2) + 64(0.1) = 10.3$$

Therefore,

$$Var(X) = 10.3 - (0.9)^2 = 10.3 - 0.81 = 9.49$$

Properties of Variance

1. For any random variable $X$ and constant $C$, it can be shown that
   $Var(CX) = C^2\,Var(X)$
   $Var(C + X) = Var(X)$
2. If $X$ and $Y$ are independent random variables, $Var(X + Y) = Var(X) + Var(Y)$
3. The square root of $Var(X)$ is called the standard deviation of $X$, and we denote it by $SD(X)$. That is, $SD(X) = \sqrt{Var(X)}$

6.3 Common Discrete Probability Distribution

6.3.1 Binomial Distribution

Many types of probability problems have only two outcomes, or they can be reduced to two outcomes. For example, when a coin is tossed, it can land heads or tails. A binomial experiment is a probability experiment that satisfies the following four requirements:

1. Each trial can have only two outcomes, or outcomes that can be reduced to two outcomes.
2. There must be a fixed number of trials.
3. The outcomes of each trial must be independent.
4. The probability of a success must remain the same for each trial.

The outcomes of a binomial experiment and the corresponding probabilities of these outcomes are called a binomial distribution. The probability mass function of a binomial random variable with parameters $(n, p)$ is given by

$$P(X = x) = \binom{n}{x} p^x (1 - p)^{n - x}, \qquad x = 0, 1, \dots, n$$

Example 6.15: Five fair coins are flipped. If the outcomes are assumed independent, find the probability mass function of the number of heads obtained.

Solution: If we let $X$ equal the number of heads (successes), then $X$ is a binomial random variable with parameters $(n = 5, p = \frac{1}{2})$. Hence, $P(X = x) = \binom{5}{x}\left(\frac{1}{2}\right)^x\left(\frac{1}{2}\right)^{5 - x} = \binom{5}{x}\frac{1}{32}$ for $x = 0, 1, \dots, 5$.

Solution: If we let $X$ denote the number of defective items, then $X$ is a binomial random variable with parameters $n = 10$ and $p = 0.1$.
Thus the desired probability is

$$P(X = 0) + P(X = 1) = \binom{10}{0}(0.1)^0(0.9)^{10} + \binom{10}{1}(0.1)^1(0.9)^9 = 0.7361$$

Since $\lambda = np = 10(0.1) = 1$, the Poisson approximation yields the value

$$P(X = 0) + P(X = 1) = e^{-1} + e^{-1} = 0.7358$$

Thus, even in this case, where $n$ is equal to 10 (which is not that large) and $p$ is equal to 0.1 (which is not that small), the Poisson approximation to the binomial probability is quite accurate.

6.4 Common Continuous Probability Distribution

Every continuous random variable $X$ has a curve associated with it. This curve, formally known as a probability density function, can be used to obtain probabilities associated with the random variable. This is accomplished as follows. Consider any two points $a$ and $b$, where $a$ is less than $b$. The probability that $X$ assumes a value that lies between $a$ and $b$ is equal to the area under the curve between $a$ and $b$. That is,

$$P(a \le X \le b) = \text{area under the curve between } a \text{ and } b$$

Since $X$ must assume some value, it follows that the total area under the density curve must equal 1. Also, the area under the graph of the probability density function between points $a$ and $b$ is the same regardless of whether the end points $a$ and $b$ are themselves included. That is,

$$P(a \le X \le b) = P(a < X < b)$$

6.4.1 Normal Random Variables

The most important type of random variable is the normal random variable. The probability density function of a normal random variable $X$ is determined by two parameters: the expected value and the standard deviation of $X$. We designate these values as $\mu$ and $\sigma$, respectively:

$$\mu = E[X] \quad \text{and} \quad \sigma = SD(X)$$

The normal probability density function is a bell-shaped density curve that is symmetric about the value $\mu$; its variability is measured by $\sigma$. The larger $\sigma$ is, the more variability there is in the curve.

Since the probability density function of a normal random variable $X$ is symmetric about its expected value $\mu$, it follows that $X$ is equally likely to be on either side of $\mu$.
That is,

$$P\{X < \mu\} = P\{X \ge \mu\} = 0.5$$

Not all bell-shaped symmetric density curves are normal. The normal density curves are specified by a particular formula:

$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x - \mu)^2}{2\sigma^2}}$$

A normal random variable having mean 0 and standard deviation 1 is called a standard normal variable, and its density curve is called the standard normal curve. The letter $Z$ represents a standard normal random variable, with density

$$f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{z^2}{2}}$$

Probabilities Associated with a Standard Normal Random Variable

$$Z = \frac{\text{value} - \text{mean}}{\text{standard deviation}} = \frac{X - \mu}{\sigma}$$

Once the $X$ values are transformed by using the above formula, they are called $Z$ values. A $Z$ value is the number of standard deviations that a particular $X$ value is away from the mean.

Steps to find areas under the normal distribution curve

1. Between 0 and any $Z$ value:
   a. Look up the $Z$ value in the table to get the area
2. In any tail:
   a. Look up the $Z$ value to get the area
   b. Subtract the area from 0.5
3. Between two $Z$ values on the same side of the mean:
   a. Look up both $Z$ values to get the areas
   b. Subtract the smaller area from the larger area
4. Between two $Z$ values on opposite sides of the mean:
   a. Look up both $Z$ values to get the areas
   b. Add the areas
5. Less than any $Z$ value to the right of the mean:
   a. Look up the $Z$ value to get the area
   b. Add 0.5 to the area
6. Greater than any $Z$ value to the left of the mean:
   a. Look up the $Z$ value in the table to get the area
   b. Add 0.5 to the area
7. In any two tails:
   a. Look up the $Z$ values in the table to get the areas
   b. Subtract both areas from 0.5
   c. Add the answers

The general procedure is:
- Draw the picture
- Shade the area desired
- Find the correct figure
- Follow the directions

Example 6.15: Find the area under the normal distribution curve between $Z = 0$ and $Z = 2.34$.

Solution: Draw the area as follows:

[Figure: standard normal curve with the area between 0 and 2.34 shaded]

Since the $Z$ table gives the area between 0 and any $Z$ value to the right of 0, one need only look up the $Z$ value in the table. Find 2.3 in the left column and 0.04 in the top row.
The value where the column and row meet in the table is the answer, 0.4904.

[Z table excerpt: rows 0.0, 0.1, 0.2, …; columns 0.00, 0.01, 0.02, 0.03, 0.04, …]

$$P\{-1.5 < Z < 2.5\} = P\{-1.5 < Z < 0\} + P\{0 < Z < 2.5\}$$

Since $P\{-1.5 < Z < 0\} = P\{0 < Z < 1.5\}$ by the symmetry of the normal distribution,

$$P\{-1.5 < Z < 2.5\} = P\{0 < Z < 1.5\} + P\{0 < Z < 2.5\} = 0.4332 + 0.4938 = 0.9270$$

Finding Normal Probabilities: Conversion to the Standard Normal

Let $X$ be a normal random variable with mean $\mu$ and standard deviation $\sigma$. We can determine probabilities concerning $X$ by using the fact that the variable $Z$ defined by

$$Z = \frac{X - \mu}{\sigma}$$

has a standard normal distribution. Any probability statement about $X$ can be rewritten in terms of $Z$. For example,

$$P\{X < a\} = P\left\{\frac{X - \mu}{\sigma} < \frac{a - \mu}{\sigma}\right\} = P\left\{Z < \frac{a - \mu}{\sigma}\right\}$$

where $Z$ is a standard normal random variable.

Example 6.18: IQ examination scores for sixth-graders are normally distributed with mean 100 and standard deviation 14.2.

A. What is the probability that a randomly chosen sixth-grader has a score greater than 130?
B. What is the probability that a randomly chosen sixth-grader has a score between 90 and 115?

Solution: Let $X$ denote the score of a randomly chosen student. We compute probabilities concerning $X$ by making use of the fact that the standardized variable

$$Z = \frac{X - 100}{14.2}$$

has a standard normal distribution.

A. $P\{X > 130\} = P\left\{\dfrac{X - 100}{14.2} > \dfrac{130 - 100}{14.2}\right\} = P\{Z > 2.1127\} \approx 0.017$

B. The inequality $90 < X < 115$ is equivalent to

$$\frac{90 - 100}{14.2} < \frac{X - 100}{14.2} < \frac{115 - 100}{14.2}$$

or equivalently, $-0.7042 < Z < 1.0560$. Therefore,

$$P\{90 < X < 115\} = P\{-0.7042 < Z < 1.0560\} = P\{0 < Z < 0.7042\} + P\{0 < Z < 1.0560\} \approx 0.2593 + 0.3545 \approx 0.614$$

Properties of the Normal distribution

1. The normal distribution curve is bell-shaped.
2. The mean, median and mode are equal and located at the center of the distribution.
3. The normal distribution curve is unimodal.
4. The curve is symmetrical about the mean, which is equivalent to saying that its shape is the same on both sides of a vertical line passing through the center.
5. The curve is continuous; that is, it has no gaps or holes.
6. The curve never touches the $x$ axis.
7. The total area under the normal distribution curve is equal to 1.

Relation between Binomial and Normal Distribution

The normal distribution is a limiting case of the binomial probability distribution under the following conditions:

I. $n$, the number of trials, is indefinitely large.
II. Neither $p$ nor $q$ is very small.

We know that for a binomial variable $X$ with parameters $n$ and $p$,

$$E[X] = np, \qquad Var[X] = npq$$

De Moivre proved that under the above two conditions, the distribution of the standardized binomial variable

$$Z = \frac{X - E[X]}{\sigma} = \frac{X - np}{\sqrt{npq}}$$

tends to the standard normal distribution. If $p$ and $q$ are nearly equal (that is, $p$ is close to 0.5), then the normal approximation is surprisingly good even for small values of $n$.

Relation between Poisson and Normal Distribution

If $X$ is a random variable following the Poisson distribution with parameter $\lambda$, then

$$E[X] = \lambda, \qquad Var[X] = \lambda$$

Thus the standardized Poisson variable becomes

$$Z = \frac{X - E[X]}{\sigma} = \frac{X - \lambda}{\sqrt{\lambda}}$$

It has been proved that this variable tends to a standard normal variable as $\lambda \to \infty$.

6.4.2 Chi-square Distribution: $\chi^2$ Distribution

The square of a standard normal variable is called a chi-square variable with one degree of freedom. Thus if $X$ is a random variable following a normal distribution with mean $\mu$ and standard deviation $\sigma$, then $\dfrac{X - \mu}{\sigma}$ is a standard normal variable, and $\left(\dfrac{X - \mu}{\sigma}\right)^2$ is a chi-square variate with 1 degree of freedom.
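The one-degree-of-freedom case can be checked numerically, because $P(Z^2 \le c) = P(-\sqrt{c} \le Z \le \sqrt{c})$ for a standard normal $Z$. The sketch below is a minimal illustration using only Python's standard library; the helper names `std_normal_cdf` and `chi2_1df_cdf` are hypothetical, and the value 3.841 is assumed here as the commonly tabulated 5% critical value for 1 degree of freedom.

```python
import math

def std_normal_cdf(z: float) -> float:
    """P(Z <= z) for a standard normal Z, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def chi2_1df_cdf(c: float) -> float:
    """P(Z^2 <= c) = P(-sqrt(c) <= Z <= sqrt(c)) for a standard normal Z."""
    if c <= 0:
        return 0.0
    root = math.sqrt(c)
    return std_normal_cdf(root) - std_normal_cdf(-root)

# Probability that a chi-square variate with 1 d.f. does not exceed 3.841
p = chi2_1df_cdf(3.841)
print(round(p, 3))
```

The same identity explains why the 5% chi-square critical value with 1 degree of freedom is the square of the two-sided 5% normal critical value, 1.96.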
If $x_1, x_2, \dots, x_v$ are $v$ independent random variables following normal distributions with means $\mu_1, \mu_2, \dots, \mu_v$ and standard deviations $\sigma_1, \sigma_2, \dots, \sigma_v$ respectively, then the variate

$$\chi^2 = \left(\frac{x_1 - \mu_1}{\sigma_1}\right)^2 + \left(\frac{x_2 - \mu_2}{\sigma_2}\right)^2 + \dots + \left(\frac{x_v - \mu_v}{\sigma_v}\right)^2 = Z_1^2 + Z_2^2 + \dots + Z_v^2 = \sum_{i=1}^{v} Z_i^2,$$

which is the sum of the squares of $v$ independent standard normal variates, follows the chi-square distribution with $v$ degrees of freedom.

Applications of chi-square distribution

The chi-square distribution has a number of applications, some of which are listed below:
- Chi-square test of goodness of fit
- Chi-square test for independence of attributes
- Testing whether the population variance has a specified value

6.4.3 Student's t distribution

It is often the case that one wants to calculate the size of sample needed to obtain a certain level of confidence in survey results. Unfortunately, this calculation requires prior knowledge of the population standard deviation ($\sigma$). Realistically, $\sigma$ is unknown. Often a preliminary sample will be taken so that a reasonable estimate of this critical population parameter can be made. If such a preliminary sample is not taken, but confidence intervals for the population mean are to be constructed using an unknown $\sigma$, then the distribution known as the Student t distribution can be used.