
Quick Statistics Formula Cheat Sheet, Cheat Sheet of Statistics

This cheat sheet covers the basic principles of statistics for introductory courses.

Typology: Cheat Sheet

2019/2020

Uploaded on 10/23/2020

geryle

Partial preview of the text

STATISTICS FOR INTRODUCTORY COURSES

STATISTICS - A set of tools for collecting, organizing, presenting, and analyzing numerical facts or observations.
1. Descriptive Statistics - procedures used to organize and present data in a convenient, usable, and communicable form.
2. Inferential Statistics - procedures employed to arrive at broader generalizations or inferences from sample data to populations.

STATISTIC - A number describing a sample characteristic. Results from the manipulation of sample data according to certain specified procedures.

DATA - Characteristics or numbers that are collected by observation.

POPULATION - A complete set of actual or potential observations.

PARAMETER - A number describing a population characteristic; typically inferred from a sample statistic.

SAMPLE - A subset of the population selected according to some scheme.

RANDOM SAMPLE - A subset selected in such a way that each member of the population has an equal opportunity to be selected. Ex. lottery numbers in a fair lottery.

VARIABLE - A phenomenon that may take on different values.

MEAN - The point in a distribution of measurements about which the summed deviations are equal to zero. Average value of a sample or population.

POPULATION MEAN: $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$

SAMPLE MEAN: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$

Note: The mean is very sensitive to extreme measurements that are not balanced on both sides.

WEIGHTED MEAN - Sum of a set of observations multiplied by their respective weights, divided by the sum of the weights:

$\bar{x}_w = \frac{\sum_{i=1}^{G} w_i x_i}{\sum_{i=1}^{G} w_i}$

where $w_i$ = weight; $x_i$ = observation; $G$ = number of observation groups. Calculated from a population, sample, or groupings in a frequency distribution. Ex. In the Frequency Distribution below, the mean is 80.3, calculated by using frequencies for the $w_i$'s.
When grouped, use class midpoints for the $x_i$'s.

MEDIAN - Observation or potential observation in a set that divides the set so that the same number of observations lie on each side of it. For an odd number of values, it is the middle value; for an even number, it is the average of the middle two. Ex. In the Frequency Distribution table below, the median is 79.5.

MODE - Observation that occurs with the greatest frequency. Ex. In the Frequency Distribution table below, the mode is 88.

SUM OF SQUARES (SS) - Deviations from the mean, squared and summed:

Population SS: $\sum (x_i - \mu)^2$ or $\sum x_i^2 - \frac{(\sum x_i)^2}{N}$

Sample SS: $\sum (x_i - \bar{x})^2$ or $\sum x_i^2 - \frac{(\sum x_i)^2}{n}$

VARIANCE - The average of squared differences between observations and their mean.

POPULATION VARIANCE: $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2$

SAMPLE VARIANCE: $s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$

VARIANCES FOR GROUPED DATA

POPULATION: $\sigma^2 = \frac{1}{N}\sum_{i=1}^{G} f_i (m_i - \mu)^2$

SAMPLE: $s^2 = \frac{1}{n-1}\sum_{i=1}^{G} f_i (m_i - \bar{x})^2$

where $f_i$ is the frequency of class $i$, $m_i$ is its midpoint, and $G$ is the number of classes.

STANDARD DEVIATION - Square root of the variance. Ex. Pop. S.D.: $\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2}$

BAR GRAPH - A form of graph that uses bars to indicate the frequency of occurrence of observations.
- Histogram - a form of bar graph used with interval or ratio-scaled variables.
  - Interval Scale - a quantitative scale that permits the use of arithmetic operations. The zero point in the scale is arbitrary.
  - Ratio Scale - same as interval scale except that there is a true zero point.

FREQUENCY CURVE - A form of graph representing a frequency distribution in the form of a continuous line that traces a histogram.
- Cumulative Frequency Curve - a continuous line that traces a histogram where bars in all the lower classes are stacked up in the adjacent higher class. It cannot have a negative slope.
- Normal curve - bell-shaped curve.
- Skewed curve - departs from symmetry and tails off at one end.
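Ex. The measures above can be checked against the sheet's own frequency distribution. The following is a minimal Python sketch, not part of the original sheet: the dictionary `freq` is transcribed from the tally table (scores with a frequency of 0 are left out), and all variable names are illustrative assumptions. Whether the 62 scores are treated as a population or a sample is also an assumption, so both variances are computed.

```python
from math import sqrt

# Frequency distribution transcribed from the cheat sheet's tally table:
# score -> number of times it was observed (scores with f = 0 omitted).
freq = {
    66: 1, 67: 2, 68: 1, 69: 3, 70: 4, 72: 2, 73: 3, 74: 3, 75: 4,
    76: 2, 77: 3, 78: 1, 79: 2, 80: 1, 81: 2, 82: 1, 83: 2, 84: 5,
    85: 1, 87: 1, 88: 7, 89: 3, 90: 2, 91: 1, 93: 1, 96: 2, 99: 1,
    100: 1,
}

n = sum(freq.values())  # total number of observations

# Weighted mean: the frequencies play the role of the weights w_i.
mean = sum(x * f for x, f in freq.items()) / n

# Median: expand to a sorted list and take the middle value(s).
values = sorted(x for x, f in freq.items() for _ in range(f))
mid = n // 2
median = values[mid] if n % 2 else (values[mid - 1] + values[mid]) / 2

# Mode: the observation with the greatest frequency.
mode = max(freq, key=freq.get)

# Sum of squares, in its deviation form and its computational form.
ss_deviation = sum(f * (x - mean) ** 2 for x, f in freq.items())
sum_x = sum(f * x for x, f in freq.items())
sum_x2 = sum(f * x * x for x, f in freq.items())
ss_computational = sum_x2 - sum_x ** 2 / n

# Variance and standard deviation (population and sample versions).
population_variance = ss_deviation / n
sample_variance = ss_deviation / (n - 1)
population_sd = sqrt(population_variance)

print(n, round(mean, 1), median, mode)
```

The printed mean, median, and mode agree with the values quoted in the sheet (80.3, 79.5, and 88), and the two sum-of-squares forms agree up to floating-point rounding.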
GROUPING OF DATA

FREQUENCY DISTRIBUTION - Shows the number of times each observation occurs when the values of a variable are arranged in order according to their magnitudes.

GROUPED FREQUENCY DISTRIBUTION - A frequency distribution in which the values of the variable have been grouped into classes.

FREQUENCY DISTRIBUTION (x = score, f = frequency; tallies shown as counts)

  x    f  |   x    f  |   x    f  |   x    f
100    1  |  91    1  |  82    1  |  73    3
 99    1  |  90    2  |  81    2  |  72    2
 98    0  |  89    3  |  80    1  |  71    0
 97    0  |  88    7  |  79    2  |  70    4
 96    2  |  87    1  |  78    1  |  69    3
 95    0  |  86    0  |  77    3  |  68    1
 94    0  |  85    1  |  76    2  |  67    2
 93    1  |  84    5  |  75    4  |  66    1
 92    0  |  83    2  |  74    3  |  65    0

CUMULATIVE FREQUENCY DISTRIBUTION - A distribution which shows the total frequency through the upper real limit of each class.

CUMULATIVE PERCENTAGE DISTRIBUTION - A distribution which shows the total percentage through the upper real limit of each class.

GROUPED FREQUENCY DISTRIBUTION

CLASS     f   Cum f   Cum %
65-67     3     3      4.84
68-70     8    11     17.74
71-73     5    16     25.81
74-76     9    25     40.32
77-79     6    31     50.00
80-82     4    35     56.45
83-85     8    43     69.35
86-88     8    51     82.26
89-91     6    57     91.94
92-94     1    58     93.55
95-97     2    60     96.77
98-100    2    62    100.00

[Figures: frequency plotted against class (65-67 through 98-100); normal curve; skewed curve (left)]

PROBABILITY

Probability of occurrence of Event A: $p(A) = \frac{\text{number of outcomes favoring Event A}}{\text{total number of outcomes}}$

SAMPLE SPACE - All possible outcomes of an experiment.

TYPE OF EVENTS
- Exhaustive - two or more events are said to be exhaustive if all possible outcomes are considered. Symbolically, $p(A \text{ or } B \text{ or } \ldots) = 1$.
- Non-Exhaustive - two or more events are said to be non-exhaustive if they do not exhaust all possible outcomes.
- Mutually Exclusive - Events that cannot occur simultaneously: $p(A \text{ and } B) = 0$; and $p(A \text{ or } B) = p(A) + p(B)$. Ex. males, females.
- Non-Mutually Exclusive - Events that can occur simultaneously: $p(A \text{ or } B) = p(A) + p(B) - p(A \text{ and } B)$. Ex. males, brown eyes.
- Independent - Events whose probability is unaffected by occurrence or nonoccurrence of each other: $p(A|B) = p(A)$; $p(B|A) = p(B)$; and $p(A \text{ and } B) = p(A)\,p(B)$. Ex. gender and eye color.
- Dependent - Events whose probability changes depending upon the occurrence or non-occurrence of each other: $p(A|B)$ differs from $p(A)$; $p(B|A)$ differs from $p(B)$; and $p(A \text{ and } B) = p(A)\,p(B|A) = p(B)\,p(A|B)$. Ex. race and eye color.

JOINT PROBABILITIES - Probability that 2 or more events occur simultaneously.

MARGINAL PROBABILITIES or Unconditional Probabilities = summation of probabilities.

CONDITIONAL PROBABILITIES - Probability of A given the existence of S, written $p(A|S)$.

EXAMPLE - Given the numbers 1 to 9 as observations in a sample space:
- Events mutually exclusive and exhaustive. Example: p(all odd numbers); p(all even numbers)
- Events mutually exclusive but not exhaustive. Example: p(an even number); p(the numbers 7 and 5)
- Events neither mutually exclusive nor exhaustive. Example: p(an even number or a 2)

SAMPLING DISTRIBUTION - A theoretical probability distribution of a statistic that would result from drawing all possible samples of a given size from some population.

THE STANDARD ERROR OF THE MEAN - A theoretical standard deviation of sample means of a given sample size, drawn from some specified population.
- When based on a very large, known population, the standard error is: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
- When estimated from a sample drawn from a very large population, the standard error is: $s_{\bar{x}} = \frac{s}{\sqrt{n}}$
- The dispersion of sample means decreases as sample size is increased.

RANDOM VARIABLES - A mapping or function that assigns one and only one numerical value to each outcome in an experiment.

DISCRETE RANDOM VARIABLES - Involves rules or probability models for assigning or generating only distinct values (not fractional measurements).
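Ex. The event types can be checked on the 1-to-9 sample space from the example. This is a minimal Python sketch under the assumption that all nine outcomes are equally likely; the helper names (`p`, `mutually_exclusive`, `exhaustive`, `p_or`, `p_given`) are illustrative and not from the sheet.

```python
from fractions import Fraction

# Sample space: the numbers 1 to 9, assumed equally likely.
sample_space = set(range(1, 10))


def p(event):
    """Outcomes favoring the event divided by the total number of outcomes."""
    return Fraction(len(set(event) & sample_space), len(sample_space))


def mutually_exclusive(a, b):
    """True when the two events cannot occur together: p(A and B) = 0."""
    return p(set(a) & set(b)) == 0


def exhaustive(*events):
    """True when the events together cover every outcome: p(A or B or ...) = 1."""
    return p(set().union(*events)) == 1


def p_or(a, b):
    """General addition rule: p(A or B) = p(A) + p(B) - p(A and B)."""
    return p(a) + p(b) - p(set(a) & set(b))


def p_given(a, b):
    """Conditional probability p(A|B) = p(A and B) / p(B)."""
    return p(set(a) & set(b)) / p(b)


odd = {x for x in sample_space if x % 2 == 1}
even = sample_space - odd

print(mutually_exclusive(odd, even), exhaustive(odd, even))        # odd vs. even
print(mutually_exclusive(even, {5, 7}), exhaustive(even, {5, 7}))  # even vs. {5, 7}
print(mutually_exclusive(even, {2}), exhaustive(even, {2}))        # even vs. {2}
print(p_or(even, {2}), p_given({2}, even), p({2}))
```

Because p(2 | even) differs from p(2), "an even number" and "a 2" are dependent events in the sheet's sense. Exact fractions are used so the rules hold without rounding error.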
BINOMIAL DISTRIBUTION - A model for the sum of a series of n independent trials where each trial results in a 0 (failure) or 1 (success). Ex. Coin toss.

$p(s) = \binom{n}{s} \pi^s (1 - \pi)^{n-s}$

where $p(s)$ is the probability of $s$ successes in $n$ trials with a constant probability $\pi$ per trial, and where $\binom{n}{s} = \frac{n!}{s!\,(n-s)!}$.

Binomial mean: $\mu = n\pi$
Binomial variance: $\sigma^2 = n\pi(1 - \pi)$

As n increases, the Binomial approaches the Normal distribution.

HYPERGEOMETRIC DISTRIBUTION - A model for the sum of a series of n trials where each trial results in a 0 or 1 and is drawn from a small population with $N$ elements split between $N_1$ successes and $N_2$ failures. Then the probability of splitting the n trials between $x_1$ successes and $x_2$ failures is:

$p(x_1 \text{ and } x_2) = \frac{\dfrac{N_1!}{x_1!\,(N_1 - x_1)!} \cdot \dfrac{N_2!}{x_2!\,(N_2 - x_2)!}}{\dfrac{N!}{n!\,(N - n)!}}$

Hypergeometric mean: $\mu_1 = E(x_1) = \frac{n N_1}{N}$
Hypergeometric variance: $\sigma^2 = \left[\frac{N - n}{N - 1}\right]\left[\frac{n N_1}{N}\right]\left[\frac{N_2}{N}\right]$

POISSON DISTRIBUTION - A model for the number of occurrences of an event $x = 0, 1, 2, \ldots$, when the probability of occurrence is small, but the number of opportunities for the occurrence is large:

$p(x) = \frac{e^{-\lambda} \lambda^x}{x!}$ for $x = 0, 1, 2, 3, \ldots$ and $\lambda > 0$; otherwise $p(x) = 0$.

Poisson mean and variance: $\lambda$.

For continuous variables, frequencies are expressed in terms of areas under a curve.

CONTINUOUS RANDOM VARIABLES - Variable that may take on any value along an uninterrupted interval of a number line.

NORMAL DISTRIBUTION - bell curve; a distribution whose values cluster symmetrically around the mean (also median and mode).

$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x - \mu)^2 / 2\sigma^2}$

where
$f(x)$ = frequency at a given value
$\sigma$ = standard deviation of the distribution
$\pi$ = approximately 3.1416
$e$ = approximately 2.7183
$\mu$ = the mean of the distribution
$x$ = any score in the distribution

STANDARD NORMAL DISTRIBUTION - A normal random variable Z that has a mean of 0 and a standard deviation of 1.
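Ex. The distribution formulas above translate directly into code. The following is a minimal Python sketch using only the standard library (`math.comb` needs Python 3.8 or later); the function names and the example parameters are illustrative assumptions, not part of the sheet.

```python
from math import comb, exp, factorial, pi, sqrt


def binomial_pmf(s, n, prob):
    """p(s) = C(n, s) * prob^s * (1 - prob)^(n - s); prob is the sheet's pi."""
    return comb(n, s) * prob ** s * (1 - prob) ** (n - s)


def hypergeometric_pmf(x1, n, n1, n2):
    """Probability of x1 successes and n - x1 failures in n draws without
    replacement from a population of n1 successes and n2 failures."""
    x2 = n - x1
    return comb(n1, x1) * comb(n2, x2) / comb(n1 + n2, n)


def poisson_pmf(x, lam):
    """p(x) = e^(-lam) * lam^x / x! for x = 0, 1, 2, ... and lam > 0."""
    return exp(-lam) * lam ** x / factorial(x)


def normal_pdf(x, mu, sigma):
    """Height of the normal curve with mean mu and standard deviation sigma."""
    return exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))


# Binomial check: 10 fair coin tosses, so mean = n*pi = 5 and variance = 2.5.
n, prob = 10, 0.5
binom_mean = sum(s * binomial_pmf(s, n, prob) for s in range(n + 1))
binom_var = sum((s - binom_mean) ** 2 * binomial_pmf(s, n, prob)
                for s in range(n + 1))

# Hypergeometric check: 3 draws from 4 successes and 6 failures,
# so mean = n*N1/N = 1.2 and variance = (7/9) * 1.2 * 0.6 = 0.56.
hyper_mean = sum(x * hypergeometric_pmf(x, 3, 4, 6) for x in range(4))
hyper_var = sum((x - hyper_mean) ** 2 * hypergeometric_pmf(x, 3, 4, 6)
                for x in range(4))

print(binom_mean, binom_var, hyper_mean, hyper_var)
```

Summing over every possible outcome reproduces the mean and variance formulas given for each distribution, which is a quick way to catch a mistyped formula.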
Z-VALUES - The number of standard deviations a specific observation lies from the mean: $z = \frac{x - \mu}{\sigma}$

LEVEL OF SIGNIFICANCE - A probability value considered rare in the sampling distribution, specified under the null hypothesis, where one is willing to acknowledge the operation of chance factors. Common significance levels are 1%, 5%, and 10%. Alpha ($\alpha$) level = the lowest level for which the null hypothesis can be rejected. The significance level determines the critical region.

NULL HYPOTHESIS ($H_0$) - A statement that specifies hypothesized value(s) for one or more of the population parameters. [Ex. $H_0$: a coin is unbiased. That is, $p = 0.5$.]

ALTERNATIVE HYPOTHESIS ($H_1$) - A statement that specifies that the population parameter is some value other than the one specified under the null hypothesis. [Ex. $H_1$: a coin is biased. That is, $p \neq 0.5$.]
1. NONDIRECTIONAL HYPOTHESIS - an alternative hypothesis ($H_1$) that states only that the population parameter is different from the one specified under $H_0$. Ex. $H_1: \mu \neq \mu_0$. A Two-Tailed Probability Value is employed when the alternative hypothesis is non-directional.
2. DIRECTIONAL HYPOTHESIS - an alternative hypothesis that states the direction in which the population parameter differs from the one specified under $H_0$. Ex. $H_1: \mu > \mu_0$ or $H_1: \mu < \mu_0$. A One-Tailed Probability Value is employed when the alternative hypothesis is directional.

NOTION OF INDIRECT PROOF - Strict interpretation of hypothesis testing reveals that the null hypothesis can never be proved. [Ex. If we toss a coin 200 times and tails comes up 100 times, it is no guarantee that heads will come up exactly half the time in the long run; small discrepancies might exist. A bias can exist even at a small magnitude. We can make the assertion, however, that NO BASIS EXISTS FOR REJECTING THE HYPOTHESIS THAT THE COIN IS UNBIASED. (The null hypothesis is not rejected.) When employing the 0.05 level of significance, reject the null hypothesis when a given result occurs by chance 5% of the time or less.]

TWO TYPES OF ERRORS
- Type I Error (Type $\alpha$ Error) = the rejection of $H_0$ when it is actually true. The probability of a Type I error is given by $\alpha$.
- Type II Error (Type $\beta$ Error) = the acceptance of $H_0$ when it is actually false. The probability of a Type II error is given by $\beta$.

CENTRAL LIMIT THEOREM (for sample mean $\bar{x}$) - If $x_1, x_2, x_3, \ldots, x_n$ is a simple random sample of n elements from a large (infinite) population, with mean $\mu$ and standard deviation $\sigma$, then the distribution of $\bar{x}$ takes on the bell-shaped distribution of a normal random variable as n increases, and the distribution of the ratio

$\frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$

approaches the standard normal distribution as n goes to infinity. In practice, a normal approximation is acceptable for samples of 30 or larger.

Percentage Cumulative Distribution for selected Z values under a normal curve

Z-value             -3     -2     -1      0     +1     +2     +3
Percentile score   0.13   2.28  15.87  50.00  84.13  97.72  99.87
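Ex. The percentile table and the coin example can be reproduced numerically. The following is a minimal Python sketch that builds the standard normal cumulative distribution from `math.erf`; the function names are illustrative assumptions, and the coin test uses the normal approximation to the binomial without a continuity correction, which is a simplification rather than the sheet's stated method.

```python
from math import erf, sqrt


def z_value(x, mu, sigma):
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma


def cumulative_percent(z):
    """Percentage of a standard normal distribution lying at or below z."""
    return 50.0 * (1.0 + erf(z / sqrt(2.0)))


def z_for_sample_mean(xbar, mu, sigma, n):
    """Central limit theorem ratio: (xbar - mu) / (sigma / sqrt(n))."""
    return (xbar - mu) / (sigma / sqrt(n))


def two_tailed_p(z):
    """Two-tailed probability value for a nondirectional alternative."""
    return 2.0 * (1.0 - cumulative_percent(abs(z)) / 100.0)


def coin_z(tails, tosses, p0=0.5):
    """z for an observed tail count, using binomial mean n*p0 and
    variance n*p0*(1 - p0) under the null hypothesis p = p0."""
    return (tails - tosses * p0) / sqrt(tosses * p0 * (1.0 - p0))


# Reproduce the percentile scores for z = -3 ... +3.
percentile_table = {z: round(cumulative_percent(z), 2) for z in range(-3, 4)}
print(percentile_table)

# The sheet's coin example: 100 tails in 200 tosses gives z = 0,
# so there is no basis for rejecting the null hypothesis.
alpha = 0.05
p_fair = two_tailed_p(coin_z(100, 200))
print(p_fair, "reject H0" if p_fair <= alpha else "do not reject H0")

# A hypothetical contrasting case (not from the sheet): 120 tails in 200 tosses.
p_skewed = two_tailed_p(coin_z(120, 200))
print(p_skewed <= alpha)
```

The rounded values in `percentile_table` match the table above. For small samples, an exact binomial calculation would be a safer choice than the normal approximation used here.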