Statistical Mathematics is one of the challenging and best courses to master, Lecture notes of Mathematics

Please read these notes for better understanding about statistical math

Typology: Lecture notes

2021/2022

Uploaded on 09/13/2022

mohammed-jabateh
www.mathsbox.org.uk

A LEVEL MATHS - STATISTICS REVISION NOTES

PLANNING AND DATA COLLECTION

• PROBLEM SPECIFICATION AND ANALYSIS
What is the purpose of the investigation?
What data is needed?
How will the data be used?
• DATA COLLECTION
How will the data be collected?
How will bias be avoided?
What sample size is needed?
• PROCESSING AND REPRESENTING
How will the data be ‘cleaned’?
Which measures will be calculated?
How will the data be represented?
• INTERPRETING AND DISCUSSING

1 DATA COLLECTION

Types of data
Categorical/Qualitative data – descriptive
Numerical/Quantitative data

Sampling Techniques

Simple random sampling - each member of the population has an equal chance of being selected for the sample

Systematic – choosing from a sampling frame. If the data is numbered 1, 2, 3, 4, …, randomly select the starting point and then select every nth item in the list

Stratified - a stratified sample is one that ensures that subgroups (strata) of a given population are each adequately represented within the whole sample population of a research study.
Sample size from each subgroup:

$$\text{sample size from subgroup} = \frac{\text{size of whole sample}}{\text{size of whole population}} \times \text{population of the subgroup}$$

Quota sampling - sample selected based on specific criteria, e.g. age group

Convenience / opportunity sampling – e.g. the first 5 people who enter a leisure centre, or teachers in a single primary school surveyed to find information about working in primary education across the UK

Self-selecting sample – people volunteer to take part in a survey either remotely (internet) or in person

2 PROCESSING AND REPRESENTATION

Categorical/Qualitative data
Represented using –
- Pie charts
- Bar charts (with spaces between the bars)
- Compound/multiple bar charts
- Dot charts
- Pictograms
Modal class – used as a summary measure

Numerical/Quantitative data
Represented using –
- Frequency diagrams
- Histograms
- Cumulative frequency diagrams
- Box and whisker plots

Measures of central tendency
- Mode (can have more than one mode)
- Median – middle value of ordered data
- Mean $\dfrac{\sum x}{n}$ or $\dfrac{\sum fx}{\sum f}$
If the mean is calculated from grouped data it will be an estimated mean

Measures of spread
- Range (largest – smallest value)
- Interquartile range : Upper Quartile – Lower Quartile (not influenced by extreme values)
- Standard deviation (includes all the sample)

Finding the quartiles (sample size = n)

n is odd (Data 2, 4, 5, 7, 8, 9, 9)
Median = 7 (the middle value)
Lower Quartile : middle value of data less than the median (2, 4, 5 gives LQ = 4)
Upper Quartile : middle value of data greater than the median (8, 9, 9 gives UQ = 9)

n is even (Data 2, 4, 5, 5, 7, 8, 9, 10)
Median = 6 (midway between 5 and 7)
Lower Quartile : middle value of the lower half of the data (2, 4, 5, 5 gives LQ = 4.5)
Upper Quartile : middle value of the upper half of the data (7, 8, 9, 10 gives UQ = 8.5)

STANDARD DEVIATION (sample)
$s = \sqrt{\dfrac{S_{xx}}{n-1}}$ where $S_{xx} = \sum (x - \bar{x})^2$ or $S_{xx} = \sum x^2 - n\bar{x}^2$ or $S_{xx} = \sum fx^2 - n\bar{x}^2$
Variance $s^2 = \dfrac{S_{xx}}{n-1}$

STANDARD DEVIATION (population)
Standard deviation $\sigma = \sqrt{\dfrac{S_{xx}}{n}}$
Variance $\sigma^2 = \dfrac{S_{xx}}{n}$

Check with your syllabus/exam board to see if you are expected to divide by n or n-1 when calculating the standard deviation

7 BINOMIAL DISTRIBUTION B(n,p)

• 2 possible outcomes: probability of success = p, probability of failure = (1 - p)
• Fixed number of trials n
• The trials are independent
• E(X) = np

P(getting r successes out of n trials) = $^{n}C_{r} \times p^{r} \times (1-p)^{n-r}$

USING CUMULATIVE TABLES
• Check if you can use your calculator for this
• Remember the tables give you less than or equal to the lookup value
• List the possible outcomes and identify the ones you need to include

With possible outcomes 0, 1, 2, …, 10:
P(X < 5) includes 0, 1, 2, 3, 4, so look up x ≤ 4
P(X ≥ 4) includes 4, 5, …, 10, so use 1 – (look up x ≤ 3)

Example: Research has shown that approximately 10% of the population are left-handed. A group of 8 students are selected at random. What is the probability that fewer than 2 of them are left-handed?
X : number of left-handed students
p = 0.1, 1 – p = 0.9, n = 8
Fewer than 2 : P(0) + P(1)
P(0) = $0.9^{8}$
P(1) = $^{8}C_{1} \times 0.1 \times 0.9^{7}$
P(X < 2) = 0.813 (this can be found using tables)

8 THE NORMAL DISTRIBUTION

• Defined as $X \sim N(\mu, \sigma^2)$ where $\mu$ is the mean of the population and $\sigma^2$ is the variance
• Symmetrical distribution about the mean such that
- approximately two-thirds of the data is within 1 standard deviation of the mean
- approximately 95% of the data is within 2 standard deviations of the mean
- approximately 99.7% of the data is within 3 standard deviations of the mean
- points of inflection of the Normal curve lie one standard deviation either side of the mean
• $X \sim N(\mu, \sigma^2)$ can be transformed to the standard normal distribution $Z \sim N(0, 1)$ using $z = \dfrac{x - \mu}{\sigma}$

[Figure: Normal curve centred on $\mu$ with points of inflection marked at $\mu - \sigma$ and $\mu + \sigma$]

Calculating probabilities
Probabilities can be calculated either by using the function on a calculator or by transforming the distribution to the standard normal distribution. A sketch graph shading the required region is a good idea.
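The percentages quoted for the Normal distribution, and the claim that working with X directly gives the same answer as standardising to Z, can be checked numerically. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the values μ = 50, σ = 4 and x = 55 are arbitrary illustrations and are not taken from the notes.

```python
from statistics import NormalDist

# Illustrative parameters (an assumption, not from the notes).
mu, sigma = 50.0, 4.0
X = NormalDist(mu, sigma)   # X ~ N(mu, sigma^2)
Z = NormalDist()            # standard normal Z ~ N(0, 1)


def prob_within(k):
    """Return P(mu - k*sigma < X < mu + k*sigma)."""
    return X.cdf(mu + k * sigma) - X.cdf(mu - k * sigma)


# Proportion of the distribution within 1, 2 and 3 standard deviations.
within = {k: prob_within(k) for k in (1, 2, 3)}
for k, prob in within.items():
    print(f"within {k} standard deviation(s): {prob:.4f}")

# Standardising: P(X < x) should equal P(Z < (x - mu) / sigma).
x = 55.0
z = (x - mu) / sigma
direct = X.cdf(x)
standardised = Z.cdf(z)
print(f"direct: {direct:.4f}  standardised: {standardised:.4f}")
```

The first figure comes out at roughly 0.68, which is the "two-thirds" in the list of properties; the other two correspond to the 95% and 99.7% rules.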
Example: IQs are normally distributed with mean 100 and standard deviation 15. What percentage of the population have an IQ of less than 120?
$X \sim N(100, 15^2)$
$P(X < 120) = P\left(z < \dfrac{120 - 100}{15}\right) = P(z < 1.333) = 0.909$
90.9% of the population have an IQ less than 120

[Figure: Normal curve with the region to the left of X = 120 shaded]

Calculating the mean, standard deviation or missing value (Using Inverse Normal)
If the probability is given then you need to work backwards to find the missing value(s)

Example: The time, X minutes, to install an alarm system may be assumed to be a normal random variable such that P(X < 160) = 0.15 and P(X > 200) = 0.05. Determine, to the nearest minute, the values for the mean and standard deviation of X.
Use the tables or the calculator function to find the z values corresponding to the probabilities given:
P(z < -1.0364) = 0.15
P(z > 1.6449) = 0.05
$\dfrac{160 - \mu}{\sigma} = -1.0364$, so $160 - \mu = -1.0364\sigma$
$\dfrac{200 - \mu}{\sigma} = 1.6449$, so $200 - \mu = 1.6449\sigma$
Solving simultaneously gives $\mu = 175$ minutes and $\sigma = 15$ minutes

[Figure: Normal curve with a lower tail of 0.15 shaded below X = 160 and an upper tail of 0.05 shaded above X = 200]

Using the normal distribution to approximate a binomial distribution
For a valid result the following conditions are suggested:
X ~ B(n,p) with np > 5 and n(1-p) > 5 (i.e. p is close to ½ or n is large)
If the conditions are true then X ~ B(n,p) can be approximated using X ~ N(np, np(1-p))
(NB As the binomial distribution is discrete and the Normal distribution is continuous, some exam boards specify that a continuity correction is used. If you are calculating P(X < 80) you use P(X < 79.5) in your normal distribution calculation.)

Example: A dice is rolled 180 times. The random variable X is the number of times three is scored. Use the normal distribution to calculate P(X < 27).
$X \sim B\left(180, \tfrac{1}{6}\right)$ can be approximated by $X \sim N(30, 25)$
Without continuity correction: P(X < 27) = 0.274 (3 s.f.)
With continuity correction: P(X < 26.5) = 0.242 (3 s.f.)
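The dice example can be reproduced numerically and set beside the exact binomial probability. This is a rough sketch using only the Python standard library; the variable names are illustrative, and the exact value is obtained by summing the binomial formula rather than by any method given in the notes.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 180, 1 / 6   # 180 rolls, P(scoring a three) = 1/6

# Suggested validity conditions for the approximation.
conditions_ok = n * p > 5 and n * (1 - p) > 5

# Exact binomial probability: P(X < 27) = P(X <= 26).
exact = sum(comb(n, r) * p**r * (1 - p) ** (n - r) for r in range(27))

# Approximating distribution N(np, np(1 - p)) = N(30, 25).
approx = NormalDist(n * p, sqrt(n * p * (1 - p)))

without_cc = approx.cdf(27)    # no continuity correction
with_cc = approx.cdf(26.5)     # with continuity correction

print(f"conditions satisfied        : {conditions_ok}")
print(f"exact binomial P(X <= 26)   : {exact:.4f}")
print(f"normal, no correction       : {without_cc:.3f}")
print(f"normal, continuity corrected: {with_cc:.3f}")
```

Printing the exact value alongside the two approximations shows how much difference the continuity correction makes for this particular n and p.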
9 SAMPLING

If you are working with the mean of a sample of several observations from a population (e.g. calculating the probability that the mean $\bar{x}$ is less than a specified value) then the following distribution must be used:
$\bar{X} \sim N\left(\mu, \dfrac{\sigma^2}{n}\right)$
where n is the sample size, $\mu$ is the population mean and $\sigma^2$ is the population variance

Example: Alex spends X minutes each day looking at social media websites. X is a random variable which can be modelled by a normal distribution with mean 70 minutes and standard deviation 15 minutes. Calculate the probability that on 5 randomly selected days the mean time Alex spends on social media is greater than 85 minutes.
n = 5
$\bar{X} \sim N\left(70, \dfrac{15^2}{5}\right)$
$P(\bar{X} > 85) = 0.0127$ (3 s.f.)

10 HYPOTHESIS TESTING

Binomial
• Set up the hypotheses
H0 : p = a
H1 : p < a (one-sided test)
H1 : p ≠ a (two-sided test)
H1 : p > a (one-sided test)
• State the significance level (as a percentage) – the lower the value the more stringent the test
• State the distribution/model used in the test: Binomial (n,p)
• Calculate the probability of the observed results occurring using the assumed model
• Compare the calculated probability to the significance level – accept or reject H0
• Write a conclusion (in context)
Reject H0: “There is sufficient evidence to suggest that ……… is underestimating/overestimating …….”
Accept H0: “There is insufficient evidence to suggest that …… increase/decrease …… therefore we cannot reject the null hypothesis that p = a.”

Example: The probability that patients have to wait more than 10 minutes at a GP surgery is 0.3. One of the doctors claims that there is a decrease in the number of patients having to wait more than 10 minutes. She records the waiting times for the next 20 patients and 3 wait more than 10 minutes. Is there evidence at the 5% level to support the doctor's claim?
H0 : p = 0.3
H1 : p < 0.3
5% significance level
X = number of patients waiting more than 10 minutes
X ~ Binomial (20, 0.3)
Using tables P(X ≤ 3) = 0.107 (10.7%)
10.7% > 5%
There is insufficient evidence to suggest that the waiting times have reduced, therefore accept H0 and conclude that p = 0.3
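The GP surgery test can be checked without tables by summing the binomial formula directly. The sketch below is one way to do this in Python; `binomial_cdf` is a helper written for this illustration, not a library function, and the variable names are assumptions.

```python
from math import comb


def binomial_cdf(k, n, p):
    """Return P(X <= k) for X ~ B(n, p) by summing nCr * p^r * (1-p)^(n-r)."""
    return sum(comb(n, r) * p**r * (1 - p) ** (n - r) for r in range(k + 1))


n, p0 = 20, 0.3      # model under H0: p = 0.3, with 20 patients
observed = 3         # patients who waited more than 10 minutes
alpha = 0.05         # 5% significance level

# H1: p < 0.3 is a one-sided (lower tail) test, so use P(X <= observed).
tail_prob = binomial_cdf(observed, n, p0)
reject_h0 = tail_prob < alpha

print(f"P(X <= {observed}) = {tail_prob:.3f}")
if reject_h0:
    print("Sufficient evidence at the 5% level to reject H0")
else:
    print("Insufficient evidence at the 5% level to reject H0")
```

For an upper-tail alternative (H1 : p > a) the comparison would instead use P(X ≥ observed), which is 1 minus the value for observed – 1.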