Basic Biostatistics Formulas Cheat Sheet, Cheat Sheet of Statistics

Exploratory and Summary Statistics, Probability, Inference chapters formulas

Typology: Cheat Sheet

2020/2021

Uploaded on 04/27/2021

hollyb

Basic Biostatistics Formulas

Exploratory and Summary Statistics (Chapters 3 & 4)

| Statistic | Parameter | Point Estimate | Formula | Interpretation / Notes |
|---|---|---|---|---|
| Sum of squares | df × σ² | SS | $SS = \sum_{i=1}^{n} (x_i - \bar{x})^2$ | No easy interpretation. |
| Mean | μ | $\bar{x}$ | $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$ | A measure of central location; balancing point. |
| Variance | σ² | s² | $s^2 = \frac{SS}{n-1}$ | A measure of spread expressed in units squared. |
| Standard deviation | σ | s | $s = \sqrt{s^2}$ or $\sqrt{\frac{SS}{n-1}}$ | A measure of spread expressed in data units. More appropriate for descriptive purposes. |

• Mean and standard deviation are best suited to symmetrical distributions.
• When the distribution is Normal, 68% of data points lie within ±1σ of µ, 95% within ±2σ of µ, and 99.7% within ±3σ of µ.
• For other distributions, use Chebychev’s rule (e.g., at least 75% of data lie within ±2σ of µ).

| Statistic | Formula | Interpretation |
|---|---|---|
| Median | Median has depth of $\frac{n+1}{2}$ | A measure of central location |
| Interquartile range (IQR) | $IQR = Q_3 - Q_1$ | A measure of spread, aka “hinge-spread” |
| Lower fence ($F_l$) | $F_l = Q_1 - 1.5(IQR)$ | Helps determine the lower inside value and lower outside value(s) |
| Upper fence ($F_u$) | $F_u = Q_3 + 1.5(IQR)$ | Helps determine the upper inside value and upper outside value(s) |

5-point summary:
Q0 – Minimum
Q1 – First quartile
Q2 – Median
Q3 – Third quartile
Q4 – Maximum

Notes on the boxplot:
• Boxplots provide information about location, spread, and shape. The box contains the middle 50% of the data. The line inside the box is the median.
• Anything above the upper fence or below the lower fence is “outside.” (Fences are not drawn.) Plot outside values as separate points.
• The lower whisker is drawn from Q1 to the lower inside value. The upper whisker is drawn from Q3 to the upper inside value.

Probability (Chapters 5–7)

Probability ≡ relative frequency in the population; expected proportion after a very long run of trials; can be used to quantify subjective statements.
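The "expected proportion after a very long run of trials" reading of probability can be illustrated with a small simulation. This is only a sketch and not part of the original cheat sheet: the function name `relative_frequency`, the success probability 0.3, the trial counts, and the fixed seed are all assumptions chosen for illustration.

```python
import random


def relative_frequency(p, n_trials, seed=0):
    """Proportion of successes in n_trials independent trials.

    Each trial succeeds with probability p. The seed is fixed only so
    that the illustration is repeatable (an assumption of this sketch).
    """
    rng = random.Random(seed)
    successes = sum(1 for _ in range(n_trials) if rng.random() < p)
    return successes / n_trials


# As the run of trials gets longer, the observed proportion tends to
# settle near the underlying probability (0.3 is a hypothetical value).
for n in (10, 100, 10_000, 100_000):
    print(n, relative_frequency(0.3, n))
```

Short runs can land far from 0.3, while long runs are typically much closer, which is the sense in which probability is a long-run relative frequency.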
Properties of probabilities

Basic:
(1) 0 ≤ Pr(A) ≤ 1
(2) Pr(S) = 1
(3) Pr(Ā) = 1 − Pr(A)
(4) Pr(A or B) = Pr(A) + Pr(B) for disjoint events

Advanced:
(5) If A and B are independent, Pr(A and B) = Pr(A) · Pr(B)
(6) Pr(A or B) = Pr(A) + Pr(B) − Pr(A and B)
(7) Pr(B|A) = Pr(A and B) / Pr(A)
(8) Pr(A and B) = Pr(A) · Pr(B|A)
(9) Pr(B) = Pr(B and A) + Pr(B and Ā)
(10) Bayes’ Theorem (p. 111)

Binomial variables: X ~ b(n, p), $\Pr(X = x) = {}_nC_x \, p^x q^{n-x}$ where ${}_nC_x = \frac{n!}{x!\,(n-x)!}$ and q = 1 − p.

Cumulative probability: Pr(X ≤ x) = sum of all probabilities up to and including Pr(X = x); corresponds to the AUC in the left tail of the pmf or pdf.

Normal variables: X ~ N(μ, σ). To determine Pr(X ≤ x), standardize $z = \frac{x - \mu}{\sigma}$ and look up the cumulative probability in a Z table. Use the fact that the AUC sums to 1 to determine probabilities for various ranges. To find a value that corresponds to a given probability, look up the closest $z_p$ in the Z table and then unstandardize according to $x = \mu + z_p \cdot \sigma$.

Introduction to Inference (Chapters 8–11)

The sampling distribution of the mean (SDM) is governed by the central limit theorem, law of large numbers, and square root law. When n is large, $\bar{x} \sim N(\mu, \sigma_{\bar{x}})$, where $\sigma_{\bar{x}}$ is the standard error (SE) and is equal to $\frac{\sigma}{\sqrt{n}}$. The standard error is estimated by $\frac{s}{\sqrt{n}}$ when the population standard deviation is not known.

(1 − α)100% confidence interval for μ. Use $\bar{x} \pm z_{1-\alpha/2} \cdot SE_{\bar{x}}$ when σ is known. Use $\bar{x} \pm t_{n-1,\,1-\alpha/2} \cdot SE_{\bar{x}}$ when relying on s.

Hypothesis testing basics. Know all the steps, not just the conclusion, and keep in mind that hypothesis tests require certain conditions (e.g., Normality, independence, data quality) to be valid. The steps are:
A. H0 and H1. [For a one-sample test of a mean, H0: µ = µ0, where µ0 is the mean specified by the null hypothesis.]
B. Test statistic. [For a one-sample test of a mean, use either $z_{\text{stat}} = \frac{\bar{x} - \mu_0}{SE_{\bar{x}}}$ or $t_{\text{stat}} = \frac{\bar{x} - \mu_0}{SE_{\bar{x}}}$ with df = n − 1.]
C. P-value. Convert the test statistic to a P-value. Small P → strong evidence against H0.
D. Significance level. It is unwise to draw too firm a line. However, you can use the conventions regarding marginal significance, significance, and high significance when first learning.

Power and sample size basics. Approach from an estimation, testing, or “power” perspective.

The sample size requirement for limiting the margin of error to m is given by $n = \left( z_{1-\alpha/2} \, \frac{\sigma}{m} \right)^2$.

The power of testing a mean is $1 - \beta = \Phi\left( -z_{1-\alpha/2} + \frac{|\Delta| \sqrt{n}}{\sigma} \right)$.

The sample size requirement of a one-sample z or t test is $n = \frac{\sigma^2 \left( z_{1-\beta} + z_{1-\alpha/2} \right)^2}{\Delta^2}$.

It is OK to use s as a substitute for σ in power and sample size formulas, when necessary.
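The one-sample z formulas in the inference section (standard error, confidence interval, test statistic, sample size, and power) can be sketched with Python's standard library. This is a minimal illustration under stated assumptions, not the cheat sheet's own method: the function names and example numbers are hypothetical, σ is treated as known, the P-value is taken as two-sided (the source does not say which), and sample sizes are rounded up to the next whole number.

```python
from math import ceil, sqrt
from statistics import NormalDist

Z = NormalDist()  # standard Normal: Z.cdf is Phi, Z.inv_cdf gives z_p


def z_confidence_interval(xbar, sigma, n, alpha=0.05):
    """(1 - alpha)100% CI for mu: xbar +/- z_{1-alpha/2} * SE (sigma known)."""
    se = sigma / sqrt(n)
    z = Z.inv_cdf(1 - alpha / 2)
    return xbar - z * se, xbar + z * se


def z_test(xbar, mu0, sigma, n):
    """z_stat = (xbar - mu0) / SE, with a two-sided P-value (assumption)."""
    z_stat = (xbar - mu0) / (sigma / sqrt(n))
    p_value = 2 * (1 - Z.cdf(abs(z_stat)))
    return z_stat, p_value


def n_for_margin(sigma, m, alpha=0.05):
    """n = (z_{1-alpha/2} * sigma / m)^2, rounded up (assumption)."""
    return ceil((Z.inv_cdf(1 - alpha / 2) * sigma / m) ** 2)


def power(delta, sigma, n, alpha=0.05):
    """1 - beta = Phi(-z_{1-alpha/2} + |delta| * sqrt(n) / sigma)."""
    return Z.cdf(-Z.inv_cdf(1 - alpha / 2) + abs(delta) * sqrt(n) / sigma)


def n_for_power(delta, sigma, alpha=0.05, target_power=0.80):
    """n = sigma^2 * (z_{1-beta} + z_{1-alpha/2})^2 / delta^2, rounded up."""
    z_sum = Z.inv_cdf(target_power) + Z.inv_cdf(1 - alpha / 2)
    return ceil(sigma ** 2 * z_sum ** 2 / delta ** 2)


# Hypothetical example values: xbar = 106, mu0 = 100, sigma = 15, n = 25.
print(z_confidence_interval(106, 15, 25))
print(z_test(106, 100, 15, 25))
print(n_for_margin(15, 5), n_for_power(5, 15), power(5, 15, 25))
```

When σ is unknown and s is used instead, the interval and test call for t quantiles with n − 1 degrees of freedom; the standard library has no t distribution, so that case is left out of this sketch.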



Copyright © 2024 Ladybird Srl - Via Leonardo da Vinci 16, 10126, Torino, Italy - VAT 10816460017 - All rights reserved