Lecture 5 on BST 631: Statistical Theory I – Kui Zhang, 09/06/2007

Review of the previous lecture

Example: how to calculate probabilities of events (using disjointness, independence, and De Morgan's Laws), and how to calculate the distribution of a random variable.

Definition: random variable; the probability function induced by a random variable.

Chapter 1 – Probability Theory

1.5 Distribution Functions

Definition 1.5.1: The cumulative distribution function or cdf of a random variable $X$, denoted by $F_X(x)$, is defined by
\[ F_X(x) = P_X(X \le x), \quad \text{for all } x. \]

Definition and Theorem: Suppose we have a sample space $S = \{s_1, s_2, \ldots, s_n\}$ with a probability function $P$, and we define a random variable $X$ with range $\mathcal{X} = \{x_1, \ldots, x_m\}$. Then we can define a function $P_X$ on $\mathcal{X}$ in the following way:
\[ P_X(X = x_i) = P(\{s_j \in S : X(s_j) = x_i\}). \]
Then $P_X$ is a probability function defined on the sample space $\mathcal{X}$. We call it the induced probability function on $\mathcal{X}$. We can simply write $P_X(X = x_i)$ as $P(X = x_i)$. This definition can be extended to countable (infinite) sample spaces (but not to uncountable sample spaces). To generalize this definition, for any set $A \in \mathcal{B}$, where $\mathcal{B}$ is an appropriate sigma algebra defined on $\mathcal{X}$ (in the general case, $\mathcal{B}$ is derived from the original sigma algebra defined on the sample space $S$), we can define:
\[ P_X(X \in A) = P(\{s \in S : X(s) \in A\}). \]

Example 1.5.2: Tossing three coins. $X$ = number of heads. Then we have:
\[
F_X(x) =
\begin{cases}
0 & \text{if } -\infty < x < 0 \\
1/8 & \text{if } 0 \le x < 1 \\
4/8 & \text{if } 1 \le x < 2 \\
7/8 & \text{if } 2 \le x < 3 \\
1 & \text{if } 3 \le x < \infty
\end{cases}
\]

Theorem 1.5.3: The function $F(x)$ is a cdf if and only if the following three conditions hold:
(1) $\lim_{x \to -\infty} F(x) = 0$ and $\lim_{x \to \infty} F(x) = 1$.
(2) $F(x)$ is a nondecreasing function of $x$.
(3) $F(x)$ is right-continuous; that is, for every number $x_0$,
\[ \lim_{x \downarrow x_0} F(x) = F(x_0). \]

Proof of necessity:

(2) For all $x < y$, define $A = \{s \in S : X(s) \le x\}$ and $B = \{s \in S : X(s) \le y\}$. Then $A \subset B$, and therefore
\[ F(x) = P(\{s \in S : X(s) \le x\}) = P(A) \le P(B) = P(\{s \in S : X(s) \le y\}) = F(y). \]

(1) First, $F(x) = P(\{s \in S : X(s) \le x\})$, so $0 \le F(x) \le 1$. Define $A_n = \{s \in S : n - 1 < X(s) \le n\}$ for $n = \ldots, -1, 0, 1, \ldots$. Then $S = \bigcup_{n=-\infty}^{\infty} A_n$, $P(A_n) = F(n) - F(n-1)$, and the sets $A_n$ are disjoint. Therefore,
\[ 1 = P(S) = \sum_{n=-\infty}^{\infty} P(A_n) = \sum_{n=-\infty}^{\infty} \bigl(F(n) - F(n-1)\bigr) = \lim_{n \to \infty} F(n) - \lim_{m \to -\infty} F(m). \]
Because $F(x)$ is nondecreasing, we have $\lim_{x \to \infty} F(x) = \lim_{n \to \infty} F(n)$ and $\lim_{x \to -\infty} F(x) = \lim_{m \to -\infty} F(m)$. Since $0 \le F(x) \le 1$, the difference of these two limits can equal 1 only if $\lim_{x \to \infty} F(x) = 1$ and $\lim_{x \to -\infty} F(x) = 0$.

Example 1.5.6: A random variable that is neither continuous nor discrete. Consider the following function:
\[
F_X(y) =
\begin{cases}
\dfrac{1 - \varepsilon}{1 + e^{-y}} & \text{if } y < 0 \\[2ex]
\varepsilon + \dfrac{1 - \varepsilon}{1 + e^{-y}} & \text{if } y \ge 0
\end{cases}
\qquad \text{for some } 0 < \varepsilon < 1.
\]
This cdf has a jump of size $\varepsilon$ at $y = 0$ and is continuous everywhere else.

Definition 1.5.8: The random variables $X$ and $Y$ are identically distributed if, for every set $A \in \mathcal{B}^1$, $P(X \in A) = P(Y \in A)$, where $\mathcal{B}^1$ is the smallest sigma algebra containing all the intervals of real numbers of the form $(a, b)$, $[a, b)$, $(a, b]$, and $[a, b]$.

Theorem 1.5.10: The following two statements are equivalent:
1. The random variables $X$ and $Y$ are identically distributed.
2. $F_X(x) = F_Y(x)$ for every $x$.

Example 1.5.9 (Identically distributed random variables): If a fair coin is tossed $n$ times, define the following random variables: $X$ = number of heads observed, and $Y$ = number of tails observed. It is easy to prove that $X$ and $Y$ have the same distribution, but they are different random variables. In fact, $X(s) + Y(s) = n$ for every outcome $s$.

1.6 Density and Mass Functions

Definition 1.6.1: The probability mass function (pmf) of a discrete random variable $X$ is given by
\[ f_X(x) = P(X = x) \quad \text{for all } x. \]
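The induced probability function, the cdf of Example 1.5.2, and the claim of Example 1.5.9 can all be checked by brute-force enumeration of the sample space. The following is only a minimal sketch: the function names (`induced_pmf`, `cdf`, `heads`, `tails`) and the choice $n = 3$ are illustrative assumptions, not part of the lecture.

```python
from fractions import Fraction
from itertools import product


def induced_pmf(n, rv):
    """Induced pmf of rv on the sample space of n fair coin tosses."""
    outcomes = list(product("HT", repeat=n))
    prob = Fraction(1, len(outcomes))  # all outcomes equally likely
    pmf = {}
    for s in outcomes:
        # P(X = x) = P({s in S : X(s) = x})
        pmf[rv(s)] = pmf.get(rv(s), Fraction(0)) + prob
    return pmf


def cdf(pmf, x):
    """F(x) = P(X <= x): sum of the point masses at values <= x."""
    return sum((p for v, p in pmf.items() if v <= x), Fraction(0))


def heads(s):
    return s.count("H")


def tails(s):
    return s.count("T")


n = 3  # assumed number of tosses, matching Example 1.5.2
pmf_x = induced_pmf(n, heads)
pmf_y = induced_pmf(n, tails)

# cdf of X at -1, 0, 1, 2, 3: the values 0, 1/8, 1/2, 7/8, 1 of Example 1.5.2
print([cdf(pmf_x, x) for x in (-1, 0, 1, 2, 3)])

# X and Y have the same distribution ...
print(pmf_x == pmf_y)
# ... although X(s) + Y(s) = n for every outcome s
print(all(heads(s) + tails(s) == n for s in product("HT", repeat=n)))
```

Exact rational arithmetic with `Fraction` avoids floating-point noise, so the comparison `pmf_x == pmf_y` is an exact check that the two induced probability functions coincide.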
Note: The pmf gives the point probabilities of a discrete random variable $X$.

Example 1.6.2: From Example 1.5.4, the geometric distribution has pmf given by
\[
f_X(x) = P(X = x) =
\begin{cases}
(1 - p)^{x-1} p & \text{for } x = 1, 2, \ldots \\
0 & \text{otherwise.}
\end{cases}
\]
It follows that for positive integers $a \le b$, we have
\[ P(a \le X \le b) = \sum_{k=a}^{b} f_X(k) = \sum_{k=a}^{b} (1 - p)^{k-1} p, \]
and in particular, if $a = 1$, then
\[ P(X \le b) = \sum_{k=1}^{b} f_X(k) = F_X(b). \]

Example: Twenty telephones have just been received at an authorized service center. 4 of these telephones are corded, 10 are cordless, and 6 are cellular. Suppose we select phones one by one until we get a corded phone. Obtain the pmf of $X$ = the number of phones selected until a corded phone is selected.

Solution: Selection is without replacement. The event $\{X = k\}$ requires that the first $k - 1$ phones selected all come from the 16 non-corded phones and that the $k$-th phone is one of the 4 corded phones among the remaining $20 - (k - 1)$. The pmf of $X$ is therefore:
\[
P(X = k) =
\begin{cases}
\dfrac{\binom{16}{k-1}}{\binom{20}{k-1}} \cdot \dfrac{4}{20 - (k - 1)} & \text{if } k = 1, 2, \ldots, 17 \\[2ex]
0 & \text{otherwise.}
\end{cases}
\]

When $X$ is a continuous random variable with cdf $F_X(x)$, how do we get its pdf?

Since $\{X = x\} \subset \{x - \varepsilon < X \le x\}$ for any $\varepsilon > 0$, we have from Theorem 1.2.9(c) that
\[ P(X = x) \le P(x - \varepsilon < X \le x) = F_X(x) - F_X(x - \varepsilon) \quad \text{for any } \varepsilon > 0. \]
Letting $\varepsilon \downarrow 0$ and using the continuity of $F_X$, we get $P(X = x) = 0$.

Discrete: $F_X(x) = \sum_{a \le x} f_X(a)$

Continuous: $F_X(x) = \int_{-\infty}^{x} f_X(t)\, dt$

By the Fundamental Theorem of Calculus, at every point $x$ where $f_X$ is continuous we have
\[ \frac{d}{dx} F_X(x) = f_X(x). \]

Definition 1.6.3: The probability density function or pdf, $f_X(x)$, of a continuous random variable $X$ is the function that satisfies
\[ F_X(x) = \int_{-\infty}^{x} f_X(t)\, dt \quad \text{for all } x. \]
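The pmf in the telephone example above can be sanity-checked numerically. The sketch below assumes sampling without replacement, as in the solution; the names `pmf` and `pmf_sequential` and the constants are hypothetical helpers introduced for illustration. It evaluates the binomial-coefficient formula, compares it with a draw-by-draw product, and confirms that the probabilities sum to 1.

```python
from fractions import Fraction
from math import comb

TOTAL, CORDED = 20, 4
OTHER = TOTAL - CORDED  # 16 cordless or cellular phones


def pmf(k):
    """P(X = k): the first k-1 draws are non-corded, the k-th is corded."""
    if not 1 <= k <= OTHER + 1:
        return Fraction(0)
    first_non_corded = Fraction(comb(OTHER, k - 1), comb(TOTAL, k - 1))
    return first_non_corded * Fraction(CORDED, TOTAL - (k - 1))


def pmf_sequential(k):
    """Same probability, built one draw at a time as a cross-check."""
    if not 1 <= k <= OTHER + 1:
        return Fraction(0)
    p = Fraction(1)
    for i in range(k - 1):
        # i non-corded phones already removed from both counts
        p *= Fraction(OTHER - i, TOTAL - i)
    return p * Fraction(CORDED, TOTAL - (k - 1))


support = range(1, OTHER + 2)  # k = 1, ..., 17

print(pmf(1), pmf(2))                # 1/5 16/95
print(sum(pmf(k) for k in support))  # 1
print(all(pmf(k) == pmf_sequential(k) for k in support))
```

The support stops at $k = 17$ because after 16 non-corded phones have been removed, the next phone must be corded.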