Statistics: Understanding Lognormal PDF, Student's t PDF, and χ² PDF, Study notes of Electronic Measurement and Instrumentation

An introduction to three probability density functions (PDFs): the lognormal PDF, the student's t PDF, and the χ² PDF. The lognormal PDF is a probability distribution that becomes Gaussian when plotted on a log scale. The student's t PDF is used for small sample sizes and is wider than the normal PDF. The χ² PDF is used to estimate the confidence interval for the population variance. Equations, graphs, and examples are included to help explain these concepts.

Typology: Study notes

2012/2013

Uploaded on 10/02/2013

sonu-kap

Other PDFs

Introduction
• There are other useful standard distributions and PDFs besides the Gaussian PDF. These include the binomial, chi-squared, exponential, gamma, lognormal, Poisson, student's t, uniform, and Weibull PDFs.
• We discuss some of these in this learning module, although not in as much detail as for the Gaussian (normal) distribution.

Lognormal PDF
• A lognormal PDF is defined as a PDF that becomes Gaussian when the x-axis is plotted as a log scale.
  o Lognormal PDFs often appear in air quality measurements, e.g., the size distribution of particles. They are also useful for some life and durability analyses of components and equipment or instruments.
  o When the PDF is plotted as usual (linear x scale), its peak is shifted towards the left (lower values), and it has a very long tail to the right (higher values). This is shown on the first plot below.
  o However, when the PDF is plotted with a logarithmic x scale, all else being equal, it is no longer skewed, but becomes symmetric. In fact, its bell shape is identical to that of a Gaussian or normal PDF. This is shown on the second plot below.
  o Another way to plot lognormal PDFs is to first convert the x values to log10(x) or ln(x), and then plot using a linear abscissa scale. Either way, the PDF again looks like a standard Gaussian PDF, as illustrated below.
  o To calculate statistics with a lognormal PDF, we substitute either log10(x) or ln(x) as our variable instead of x itself. For example, if the data are for particle diameter Dp in units of microns (μm), we let our statistics variable be x = ln[Dp/(1 μm)] instead of Dp itself. All statistics are then based on x as usual.
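As a rough illustration of the substitution described above, the following Python sketch computes the sample mean and sample standard deviation of x = ln[Dp/(1 μm)]. The diameter values and the helper name `log_statistics` are made up for illustration; they do not come from the original notes.

```python
import math
import statistics

# Hypothetical particle diameters in microns (illustrative values only).
diameters_um = [0.8, 1.1, 1.5, 2.0, 2.7, 3.9, 5.2, 8.4]


def log_statistics(diameters, reference=1.0):
    """Return (sample mean, sample standard deviation) of x = ln(Dp / reference).

    `reference` is the 1-micron scale that makes the argument of ln dimensionless.
    """
    x = [math.log(d / reference) for d in diameters]
    # statistics.stdev uses n - 1 in the denominator (sample standard deviation).
    return statistics.mean(x), statistics.stdev(x)


mean_x, s_x = log_statistics(diameters_um)

# Back-transforming the mean of ln(Dp) gives the geometric mean diameter.
geometric_mean_um = math.exp(mean_x)

print(f"mean of ln(Dp)    = {mean_x:.4f}")
print(f"std dev of ln(Dp) = {s_x:.4f}")
print(f"geometric mean Dp = {geometric_mean_um:.4f} microns")
```

All further statistics (confidence intervals and so on) would then be carried out on x, and converted back to a diameter with the exponential only at the end.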

Student's t PDF
• A student's t PDF (also sometimes called simply the t PDF) is similar to the Gaussian (normal) PDF, but is used for small sample sizes (typically when n < 30, where n is the number of data points in the sample).
  o In simple terms, when n is small, the sample mean and sample standard deviation may differ from the population mean and population standard deviation by some unknown amount. So, for a specified confidence level (typically 95%), the student's t PDF is expected to be wider than the normal (Gaussian) PDF.
  o Mathematically, the statistic called student's t is defined as
    $$t = \frac{\bar{x} - \mu}{S/\sqrt{n}},$$
    where $\bar{x}$ is the sample mean, S is the sample standard deviation, and n is the number of data points in the sample. μ is the population mean or expected value, as defined previously, but μ is not necessarily known. (This is the whole point of the student's t analysis in the first place: we want to establish some confidence level in predicting μ.)
  o Statisticians use a parameter called degrees of freedom, with notation df (we do not use italics here so as not to confuse df with the italic df, the differential of some variable f). Note that some authors use f or v (lower case italic V) as their notation for degrees of freedom.
  o We define degrees of freedom as the number of measurements minus the minimum number of measurements necessary to estimate a statistic. For example, it takes only one measurement to estimate the mean value of some quantity x. So, df = n − 1 in this case.
  o Example: If we take n = 10 measurements of pipe diameter, and calculate the sample mean, then df = n − 1 = 10 − 1 = 9. In other words, there are 9 degrees of freedom "left over" after we estimate the mean.
    [By the way, this is the reason we define standard deviation with n − 1 in the denominator instead of n itself, because we have already "used up" one degree of freedom to calculate the mean; there are thus n − 1 degrees of freedom left over to calculate the standard deviation.]
  o Mathematically, the student's t PDF is defined as
    $$f(t,\mathrm{df}) = \frac{\Gamma\!\left(\frac{\mathrm{df}+1}{2}\right)}{\sqrt{\pi\,\mathrm{df}}\;\Gamma\!\left(\frac{\mathrm{df}}{2}\right)} \left(1 + \frac{t^2}{\mathrm{df}}\right)^{-\frac{\mathrm{df}+1}{2}},$$
    where Γ is a standard mathematical function called the gamma function, defined for integers and half-integers as follows:
    ▪ If y is a whole integer (e.g., 12, 17, 25), $\Gamma(y_{\text{integer}}) = (y-1)! = (y-1)(y-2)\cdots(3)(2)(1)$. You should recall from math class the factorial operation, represented by "!".
    ▪ If y is a half-integer (e.g., 12.5, 16.5), $\Gamma(y_{\text{half-integer}}) = (y-1)(y-2)\cdots(3/2)(1/2)\sqrt{\pi}$.
    ▪ If y = 1/2, $\Gamma(1/2) = \sqrt{\pi}$.
    ▪ Examples: $\Gamma(5) = 4! = 4 \times 3 \times 2 \times 1 = 24$. $\Gamma(5.5) = 4.5 \times 3.5 \times 2.5 \times 1.5 \times 0.5 \times \sqrt{\pi} = 52.3428$, where the answer is given to 6 significant digits.
  o Like the normal PDF, the student's t PDF is symmetric about zero, except now we plot the PDF with t instead of z as the abscissa.
  o The above equation for the student's t PDF depends on both t and df, so there is actually a whole family of curves representing the student's t PDF.
  o On the next page, the student's t PDF f(t,df) is plotted as a function of t for several values of df. On the same plot is shown the normal (Gaussian) PDF, f(z) as a function of z. As you can see, as df increases, the student's t PDF approaches the normal PDF. In fact, as df → ∞, the student's t PDF becomes identical to the normal PDF.
• Table of critical values for the student's t PDF: For convenience, and when a calculator or computer is not readily available, we generate a table of the critical values associated with the t PDF.
  o We use Excel's TINV(α,df) function to calculate the critical value $t_{\alpha/2}$ for several values of df and confidence levels.
    The table is shown below. This kind of table appears in most statistics books.
  o The 95% confidence level case (α = 0.05) is highlighted since it is the engineering standard.
  o The last row is for a very large value of df, which we approximate as infinity.

The χ² PDF
• The χ² PDF is used somewhat like the student's t PDF, except that it applies to the standard deviation rather than to the mean.
  o The student's t PDF is used to estimate the confidence interval for the population mean, μ.
  o The χ² PDF is used to estimate the confidence interval for the population variance, σ². [Recall, variance is the square of the standard deviation.]
  o Mathematically, the statistic called χ² (chi-squared) is defined as
    $$\chi^2 = \sum_{i=1}^{n} \frac{(x_i - \mu)^2}{\sigma^2},$$
    where $x_i$ is an individual data point, n is the number of data points in the sample, μ is the population mean or expected value, as defined previously, and σ is the population standard deviation.
  o Comparing the above definition to that of the sample variance (square of the sample standard deviation),
    $$S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1},$$
    we see that there is a relationship between χ² and S², namely,
    $$\chi^2 = (n-1)\frac{S^2}{\sigma^2} \quad [\text{assuming } \bar{x} \approx \mu].$$
  o Mathematically, the χ² PDF is defined as
    $$f(\chi^2,\mathrm{df}) = \frac{(\chi^2)^{(\mathrm{df}-2)/2}\, e^{-\chi^2/2}}{2^{\mathrm{df}/2}\,\Gamma(\mathrm{df}/2)},$$
    where Γ is the gamma function, defined previously, and df is the degrees of freedom, also defined previously.
  o Shown here is a plot of the χ² PDF for several values of df, noting that χ² must always be ≥ 0:
  o Unlike the normal (Gaussian) or student's t PDF, the χ² PDF is not symmetric; it is skewed.
• We use the χ² PDF to estimate a confidence interval for the population variance, much like we use the student's t PDF to estimate a confidence interval for the population mean.
  o It turns out that the confidence interval for the variance is
    $$\frac{(n-1)S^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n-1)S^2}{\chi^2_{1-\alpha/2}},$$
    where $\chi^2_{\alpha/2}$ is the value of χ² at which the area under the PDF to its right (the upper tail) is equal to α/2, and $\chi^2_{1-\alpha/2}$ is the value of χ² at which the area under the PDF to its left (the lower tail) is equal to α/2.
  o This is illustrated on the PDF shown here for the case in which df = 3 (3 degrees of freedom).
  o Just as with the t PDF, we can generate a table of critical χ² values as a function of df and α.
  o The main difference is that the χ² PDF is not symmetric, so we must list on our table two critical values: one for $\chi^2_{\alpha/2}$ and one for $\chi^2_{1-\alpha/2}$.
  o For example, at 95% confidence level, α = 0.05, α/2 = 0.025, and 1 − α/2 = 0.975. For df = 3, as in the plot, $\chi^2_{1-\alpha/2}$ = 0.2158 and $\chi^2_{\alpha/2}$ = 9.3484.
  o Excel provides a built-in function called CHIINV(probability,df) that calculates the critical values $\chi^2_{\alpha/2}$ and $\chi^2_{1-\alpha/2}$. For example, if df = 3 and α = 0.05 (95% confidence level), the critical values are $\chi^2_{1-\alpha/2}$ = CHIINV(0.975,3) = 0.2158 and $\chi^2_{\alpha/2}$ = CHIINV(0.025,3) = 9.3484, as plotted above.

[Figure: χ² PDF for df = 3, f(χ²) versus χ². The confidence interval (area = 1 − α) lies between $\chi^2_{1-\alpha/2}$ and $\chi^2_{\alpha/2}$, with area = α/2 in each tail.]

[Figure: χ² PDF, f(χ²,df) versus χ², for df = 3, 4, 5, and 7.]

• Table of critical values for the χ² PDF: For convenience, and when a computer or calculator is not readily available, we generate a table of the critical values associated with the χ² PDF.
  o We use Excel's CHIINV(probability,df) function to calculate critical values $\chi^2_{\alpha/2}$ and $\chi^2_{1-\alpha/2}$ for several values of df and confidence levels. The table is shown below. This kind of table appears in most statistics books.
  o The 95% confidence level case (α = 0.05) is highlighted since it is the engineering standard.
  o The last row shows that as df gets large, both values of critical χ² approach infinity.
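
The critical values quoted above for df = 3 can also be checked without Excel. The Python sketch below is one possible way to do so using only the standard library: it evaluates the χ² cumulative distribution with a series expansion of the regularized incomplete gamma function and inverts it by bisection. This is a swapped-in numerical technique, not necessarily what Excel's CHIINV does internally. The function names (`chi2_cdf`, `chi2_critical`, `variance_confidence_interval`) and the sample values n = 4 and S² = 2.0 are assumptions chosen purely for illustration.

```python
import math


def chi2_cdf(x, df):
    """Area under the chi-squared PDF from 0 to x (lower-tail probability).

    Uses the series for the regularized lower incomplete gamma function
    P(a, z) with a = df/2 and z = x/2. Intended for moderate df only,
    since math.gamma overflows for very large arguments.
    """
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term, total, k = 1.0, 1.0, 1
    while term > 1e-15 * total:
        term *= z / (a + k)
        total += term
        k += 1
    return total * math.exp(a * math.log(z) - z) / math.gamma(a + 1.0)


def chi2_critical(p_upper, df):
    """Value of chi-squared with upper-tail area p_upper (same convention as CHIINV)."""
    target = 1.0 - p_upper
    lo, hi = 0.0, max(1.0, float(df))
    while chi2_cdf(hi, df) < target:  # expand the bracket until it contains the root
        hi *= 2.0
    for _ in range(200):  # bisection
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def variance_confidence_interval(s2, n, alpha=0.05):
    """Confidence interval (lower, upper) for the population variance."""
    df = n - 1
    lower = df * s2 / chi2_critical(alpha / 2.0, df)
    upper = df * s2 / chi2_critical(1.0 - alpha / 2.0, df)
    return lower, upper


# These should agree with the values 0.2158 and 9.3484 quoted in the text for df = 3.
print(f"chi2_(1-alpha/2) = {chi2_critical(0.975, 3):.4f}")
print(f"chi2_(alpha/2)   = {chi2_critical(0.025, 3):.4f}")

# Hypothetical sample: n = 4 measurements (df = 3) with sample variance S^2 = 2.0.
low, high = variance_confidence_interval(2.0, 4)
print(f"95% confidence interval for the variance: {low:.3f} to {high:.3f}")
```

Because the χ² PDF is skewed, the resulting interval is not centered on S²; the upper limit lies much farther from S² than the lower limit does, especially for small df.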