Understanding Conditional Probability: Definition, Rules, and Applications - Prof. Dilip S, Study notes of Statistics

An in-depth exploration of conditional probability: its definition, consistency with various models, axioms, rules, and applications. Topics include the chain rule (or product rule), the theorem of total probability, and examples such as the birthday surprise problem. The notes also discuss the importance of conditional probabilities in probabilistic analyses.

Typology: Study notes

2009/2010

Uploaded on 02/24/2010

koofers-user-6et

ECE 313: Probability with Engineering Applications, Fall 2000
Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign
ECE 313 - Lecture 13 © 2000 Dilip V. Sarwate, University of Illinois at Urbana-Champaign, All Rights Reserved

Slide 1 of 39: Introduction
• The conditional probability of an event B given that event A occurred is our revised estimate of the chances that B occurred in light of partial knowledge of the outcome of the experiment, viz. knowing that A occurred
• To avoid trivialities, we assume that A, sometimes called the conditioning event, has nonzero probability

Slide 2 of 39: Definition of conditional probability
• The conditional probability of B given A is denoted by P(B|A)
• Read this as “the probability of B given A” or “the probability of B conditioned on A”
• Definition: If P(A) > 0, P(B|A) is defined as P(B|A) = P(AB)/P(A)
• P(B|A) can be larger than, smaller than, or the same as P(B)

Slide 3 of 39: Consistent with various models
• The definition of conditional probability is consistent with
  - the classical approach to probability
  - the relative frequency approach
• Conditional probabilities can also be discussed for events defined in terms of random variables
• P{X = k | X > n}? or P{X ≤ k | a < X < b}?
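The definition on Slide 2, and the remark that P(B|A) can be larger than, smaller than, or the same as P(B), can be checked by direct enumeration. The following is a minimal sketch under the classical (equally likely outcomes) model; the two-dice sample space and the event names are illustrative assumptions, not part of the lecture.

```python
from fractions import Fraction
from itertools import product

# Assumed sample space: the 36 equally likely outcomes of two fair dice.
OMEGA = list(product(range(1, 7), repeat=2))

def prob(event):
    """Classical probability: favorable outcomes over total outcomes."""
    return Fraction(sum(1 for w in OMEGA if event(w)), len(OMEGA))

def cond_prob(b, a):
    """P(B|A) = P(AB)/P(A), defined only when P(A) > 0."""
    p_a = prob(a)
    if p_a == 0:
        raise ValueError("conditioning event has zero probability")
    return prob(lambda w: a(w) and b(w)) / p_a

# Hypothetical events, chosen to show all three cases.
def first_is_six(w):
    return w[0] == 6

def sum_at_least_10(w):
    return w[0] + w[1] >= 10

def sum_at_most_4(w):
    return w[0] + w[1] <= 4

def second_is_even(w):
    return w[1] % 2 == 0

for name, b in [("sum >= 10", sum_at_least_10),
                ("sum <= 4", sum_at_most_4),
                ("second die even", second_is_even)]:
    print(name, "P(B) =", prob(b), " P(B|A) =", cond_prob(b, first_is_six))
```

Conditioning on the first die showing six raises the probability of a large sum, lowers the probability of a small sum to zero, and leaves the probability that the second die is even unchanged.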

Slide 4 of 39: Geometric RVs are memoryless
• Let X denote a geometric random variable with parameter p
• For k > 0, P{X = k+r | X > r} = P{X = k}
• Given that the event {X > r} has occurred, that is, the first r trials ended in a “failure”, the probability that we need to wait for an additional k trials to observe the first success is the same as P{X = k}
• It’s as if the first r trials are forgotten!

Slide 5 of 39: Binomial random variables
• Let X denote a binomial random variable with parameters (n, p)
• Given that the event {X = k} has occurred, the conditional probability that the j-th trial resulted in a success is k/n, independent of the value of p
• The conditional probability of successes on both the i-th and j-th trials (i ≠ j) is k(k–1)/[n(n–1)]
• and so on

Slide 6 of 39: Axioms are satisfied
• Conditional probabilities are a probability measure, that is, they satisfy the axioms of probability theory
• All the consequences of the axioms (rules of probability) also apply to conditional probabilities
• Caveat: Everything must be conditioned on the same event. No mixing and matching allowed

Slide 7 of 39: Rules? What rules?
• P(Ω|A) = 1
• P(∅|A) = 0
• P(B^c|A) = 1 – P(B|A)
• If B ⊂ C, then P(B|A) ≤ P(C|A)
• If BC = ∅, then P((B ∪ C)|A) = P(B|A) + P(C|A)
• More generally, P((B ∪ C)|A) = P(B|A) + P(C|A) – P(BC|A)
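The memoryless property stated on Slide 4 follows in one line from the definition of conditional probability. The derivation below assumes the convention that X counts the number of trials up to and including the first success, so that P{X = k} = p(1–p)^(k–1) for k ≥ 1, and that 0 < p < 1 so that the conditioning event has nonzero probability.

```latex
\begin{align*}
P\{X > r\} &= (1-p)^{r}
  && \text{(the first $r$ trials all fail)} \\
P\{X = k+r \mid X > r\}
  &= \frac{P\{X = k+r,\ X > r\}}{P\{X > r\}}
   = \frac{P\{X = k+r\}}{P\{X > r\}}
  && \text{(since $\{X = k+r\} \subset \{X > r\}$ for $k > 0$)} \\
  &= \frac{p\,(1-p)^{k+r-1}}{(1-p)^{r}}
   = p\,(1-p)^{k-1}
   = P\{X = k\}.
\end{align*}
```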

Slide 8 of 39: Left side versus right side
• An expression such as P((B ∪ C)|(A ∪ D)) is commonly written as P(B ∪ C|A ∪ D)
• Everything to the right of the vertical bar is the conditioning event; it is a single set
• Everything to the left of the vertical bar is the conditioned event; it is a single set
• Even if A, B, C, and D are disjoint, P(B ∪ C|A ∪ D) ≠ P(B) + P(C|A) + P(D)

Slide 9 of 39: Is that all there is to it?
• OK, so you can update your probabilities to conditional probabilities if you know that event A occurred
  - Is that all there is to it?
  - Is the notion of conditional probability just a one-trick pony?
  - Surely life holds more than that?
• Actually, conditional probabilities are fundamental tools in probabilistic analyses

Slide 10 of 39: The chain rule or product rule
• P(B|A) = P(AB)/P(A)
• P(AB) = P(B|A)P(A)
• Note that P(AB) can also be expressed as P(A|B)P(B)
• The conditional probability P(B|A) can be used to compute the joint probability P(AB): multiply P(B|A) by P(A), the probability of the conditioning event

Slide 11 of 39: Generalization of the chain rule
• More generally, P(ABCD…) = P(A)P(B|A)P(C|AB)P(D|ABC)…
• Product of first two terms is P(AB)
• P(C|AB)P(AB) = P(ABC), so that the product of the first three terms is P(ABC), and so on …
• For ABCD… to occur, A must occur, and if A has occurred, so must B (with probability P(B|A)); if both A and B, then C must …
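The generalized chain rule on Slide 11 can be illustrated with a small draw-without-replacement experiment. This is a sketch under assumed numbers: the urn contents (3 green, 2 red) and the events A, B, C (first, second, third ball green) are hypothetical choices made only for the example. The conditional probabilities are written down by counting the balls that remain, and their product is compared with a brute-force enumeration of all ordered draws.

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical urn: 3 green and 2 red balls, drawn one at a time without replacement.
BALLS = ["g1", "g2", "g3", "r1", "r2"]

# Every ordered draw of 3 distinct balls is equally likely.
DRAWS = list(permutations(BALLS, 3))

def prob(event):
    """Probability of an event over the equally likely ordered draws."""
    return Fraction(sum(1 for d in DRAWS if event(d)), len(DRAWS))

def is_green(ball):
    return ball.startswith("g")

# Chain rule: P(ABC) = P(A) P(B|A) P(C|AB), each factor found by counting remaining balls.
p_a = Fraction(3, 5)           # 3 of 5 balls are green
p_b_given_a = Fraction(2, 4)   # after one green is removed, 2 of 4 are green
p_c_given_ab = Fraction(1, 3)  # after two greens are removed, 1 of 3 is green
chain = p_a * p_b_given_a * p_c_given_ab

# Direct enumeration of the joint event "all three draws are green".
enumerated = prob(lambda d: all(is_green(ball) for ball in d))

print("chain rule:", chain, " enumeration:", enumerated)
```

The two numbers agree, which is the point of the rule: the joint probability is built up one conditioning step at a time, without ever listing the sample space.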

Slide 12 of 39: Applications of the chain rule
• Example: A random sample of size k is drawn without replacement from the set {1, 2, … , n}. What is the probability that the sample is exactly {1, 2, 3, … , k–1, n}?
• Simple answer: There are $\binom{n}{k}$ equally likely subsets that could have been drawn, and so the desired probability is just $\binom{n}{k}^{-1}$

Slide 25 of 39: Applications
• Example: Box I has 3 green and 2 red balls, while Box II has 2 green and 2 red balls. A ball is drawn at random from Box I and transferred to Box II. Then, a ball is drawn at random from Box II. What is the probability that the ball drawn from Box II is green?
• Note that the color of the ball transferred from Box I to Box II is not known

Slide 26 of 39: Example (continued)
• The color of the ball transferred is not known, but it’s either green or red for sure!
[Figure: Box I and Box II]

Slide 27 of 39: Example (continued)
• Box I has 3g, 2r; Box II has 2g, 2r
• After the transfer, Box II has 5 balls in it
• G = event ball drawn from Box II is green
• A = event ball transferred is red
• P(G|A) = 2/5
• P(G|A^c) = 3/5
• P(A) = 2/5
• P(G) = P(G|A)P(A) + P(G|A^c)P(A^c) = (2/5)(2/5) + (3/5)(3/5) = 13/25
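The two-box computation on Slide 27 can be reproduced mechanically. The sketch below conditions on the color of the transferred ball and applies the theorem of total probability; the function and variable names are illustrative, not from the lecture.

```python
from fractions import Fraction

def total_probability(cond_probs, weights):
    """P(G) = sum over i of P(G|A_i) P(A_i), for a partition A_1, A_2, ..."""
    if sum(weights) != 1:
        raise ValueError("partition probabilities must sum to 1")
    return sum(c * w for c, w in zip(cond_probs, weights))

def p_green_after_transfer(box1, box2):
    """Probability that a ball drawn from Box II is green after one random transfer from Box I."""
    n1 = sum(box1.values())
    n2 = sum(box2.values()) + 1  # Box II gains exactly one ball
    cond_probs, weights = [], []
    for color, count in box1.items():
        # P(transferred ball has this color)
        weights.append(Fraction(count, n1))
        # P(G | transferred ball has this color)
        greens = box2["green"] + (1 if color == "green" else 0)
        cond_probs.append(Fraction(greens, n2))
    return total_probability(cond_probs, weights)

box1 = {"green": 3, "red": 2}
box2 = {"green": 2, "red": 2}
p_g = p_green_after_transfer(box1, box2)
print(p_g)  # 13/25
```

Exact fractions are used so that the result can be compared directly with the hand computation (2/5)(2/5) + (3/5)(3/5).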

Slide 28 of 39: A built-in test for checking answers
• The probability of event A is the weighted average of P(A|B) and P(A|B^c)
• P(A) = P(A|B)P(B) + P(A|B^c)P(B^c) = P(A|B)P(B) + P(A|B^c)[1 – P(B)]
• The linear function y = a•x + b•(1 – x) has value b at x = 0 and a at x = 1
• For 0 < x < 1, y is between a and b
• P(A) is between P(A|B) and P(A|B^c)

Slide 29 of 39: Example (checking our work)
• P(G|A) = 2/5
• P(G|A^c) = 3/5
• P(G) = P(G|A)P(A) + P(G|A^c)P(A^c) = (2/5)(2/5) + (3/5)(3/5) = 13/25
• Check: P(G|A) = 2/5 ≤ P(G) = 13/25 ≤ P(G|A^c) = 3/5
• If the check is satisfied, it does not imply that your work is right; there may be other mistakes, e.g. you computed P(G) = 12/25
• But, if the check is not satisfied, …

Slide 30 of 39: Generalizations of the theorem I
• P(A) = P(A|B)P(B) + P(A|B^c)P(B^c)
• Since conditional probabilities form a probability measure, a similar result also holds for conditional probabilities
• P(A|C) = P(A|BC)P(B|C) + P(A|B^c C)P(B^c|C)
• All probabilities in the first equation are now conditioned on C (in addition to any previously existing conditioning)
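The conditioned form of the theorem on Slide 30 can be verified numerically on any small joint distribution. The following sketch assumes an arbitrary, hypothetical assignment of probabilities to the eight atoms determined by membership in A, B, and C; the weights carry no meaning beyond being positive. It also applies the built-in test from Slide 28 to the conditional probabilities.

```python
from fractions import Fraction
from itertools import product

# Hypothetical joint distribution over the 8 atoms (in A?, in B?, in C?).
# The integer weights are arbitrary and are normalized to sum to 1.
WEIGHTS = [3, 1, 4, 1, 5, 9, 2, 6]
ATOMS = list(product([True, False], repeat=3))
TOTAL = sum(WEIGHTS)
P = {atom: Fraction(w, TOTAL) for atom, w in zip(ATOMS, WEIGHTS)}

def prob(event):
    """Probability of an event given as a predicate on (a, b, c)."""
    return sum(p for atom, p in P.items() if event(*atom))

def cond(event, given):
    """P(event | given) = P(event and given) / P(given)."""
    return prob(lambda a, b, c: event(a, b, c) and given(a, b, c)) / prob(given)

# Left side: P(A|C)
lhs = cond(lambda a, b, c: a, lambda a, b, c: c)

# Right side: P(A|BC)P(B|C) + P(A|B^c C)P(B^c|C)
p_a_bc = cond(lambda a, b, c: a, lambda a, b, c: b and c)
p_a_bcc = cond(lambda a, b, c: a, lambda a, b, c: (not b) and c)
p_b_c = cond(lambda a, b, c: b, lambda a, b, c: c)
p_bc_c = cond(lambda a, b, c: not b, lambda a, b, c: c)
rhs = p_a_bc * p_b_c + p_a_bcc * p_bc_c

print("P(A|C) =", lhs, " right side =", rhs)
# Built-in test: P(A|C) must lie between P(A|BC) and P(A|B^c C).
print(min(p_a_bc, p_a_bcc) <= lhs <= max(p_a_bc, p_a_bcc))
```

Note that every term is conditioned on C, in line with the caveat that everything must be conditioned on the same event.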

Slide 31 of 39: Example
P(A|C) = P(A|BC)P(B|C) + P(A|B^c C)P(B^c|C)
• A = event that a flight is late in arriving
• B = event that flight is arriving at O’Hare
• C = event that flight is a United Airlines flight
• P(A|BC) = probability that a United Airlines flight arriving at O’Hare is late
• P(A|B^c C) = probability that a United Airlines flight arriving elsewhere is late

Slide 32 of 39: Example (continued)
P(A|C) = P(A|BC)P(B|C) + P(A|B^c C)P(B^c|C)
• A = event that a flight is late in arriving
• B = event that flight is arriving at O’Hare
• C = event that flight is a United Airlines flight
• P(B|C) = probability that a United Airlines flight is arriving at O’Hare
• P(B^c|C) = probability that a United Airlines flight is arriving elsewhere

Slide 33 of 39: Example (continued)
P(A|C) = P(A|BC)P(B|C) + P(A|B^c C)P(B^c|C)
• P(A|BC), P(A|B^c C), P(B|C), and P(B^c|C) can all be estimated (for example, via relative frequencies) by United Airlines or by the FAA
• P(A|C) = probability that a United Airlines flight is late can then be computed (and published in the newspapers)

Slide 34 of 39: Generalizations of the theorem II
• Given a countable partition A1, A2, … , An, … of the sample space, P(B) = P(B|A1)P(A1) + P(B|A2)P(A2) + … + P(B|An)P(An) + …
• The theorem as presented originally was the finite case n = 2 of this more general result
• The two generalizations can also be combined: condition throughout on C!
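The combined generalization mentioned at the end of Slide 34 can be written out explicitly. The statement below is a sketch in the lecture's notation; it assumes P(C) > 0, and terms with P(A_i C) = 0 are taken to contribute zero to the sum.

```latex
\begin{align*}
P(BC) &= \sum_{i} P(B A_i C)
  && \text{(the events $B A_i C$ are disjoint and their union is $BC$)} \\
      &= \sum_{i} P(B \mid A_i C)\, P(A_i \mid C)\, P(C)
  && \text{(chain rule applied to each term)} \\
P(B \mid C) &= \frac{P(BC)}{P(C)}
             = \sum_{i} P(B \mid A_i C)\, P(A_i \mid C).
\end{align*}
```

Taking the partition to be {B, B^c} and renaming the events recovers the two-term formula P(A|C) = P(A|BC)P(B|C) + P(A|B^c C)P(B^c|C) used in the flight example.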

Slide 35 of 39: Generalization of built-in test I
• Suppose that P(B) = P(B|A1)P(A1) + P(B|A2)P(A2) + … + P(B|An)P(An) + …
• This is a weighted sum of the P(B|Ai)
• If P(B|Aj) is the smallest of the P(B|Ai), then replacing each P(B|Ai) by P(B|Aj) gives P(B) ≥ P(B|Aj)•[P(A1) + P(A2) + … ] = P(B|Aj)

Slide 36 of 39: Generalization of built-in test II
• Suppose that P(B) = P(B|A1)P(A1) + P(B|A2)P(A2) + … + P(B|An)P(An) + …
• If P(B|Ak) is the largest of the P(B|Ai), then replacing each P(B|Ai) by P(B|Ak) gives P(B) ≤ P(B|Ak)•[P(A1) + P(A2) + … ] = P(B|Ak)
• Conclusion: P(B|Aj) ≤ P(B) ≤ P(B|Ak)

Slide 37 of 39: Another Example
• You and a friend (also taking ECE 313) are at a party with N–1 other people when suddenly a conga line forms. Assume that all (N+1)! orderings are equally likely
• What is the probability that your friend is ahead of you in the conga line?
• Answer: 1/2 (by symmetry)
• If the correct answer were some p ≠ 1/2, then by the same symmetry you would be ahead of your friend with probability p as well, and the two probabilities could not sum to 1

Slide 38 of 39: Do it by the theorem…
• Both you and your friend are equally likely to be anywhere in the conga line
• P(you are in j-th position) = 1/(N + 1)
• P(friend ahead|you in j-th) = (j – 1)/N
• Why j–1? Why N and not N+1?
• P(friend ahead) = sum over j = 1, … , N+1 of [(j–1)/N]•[1/(N+1)] = [0 + 1 + … + N]/[N•(N + 1)] = 1/2
• 1 + 2 + … + N = N•(N + 1)/2 !!!!
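The conga-line calculation on Slide 38 can be checked two ways for small N: by evaluating the total-probability sum exactly, and by brute-force enumeration of all (N+1)! orderings. This is a minimal sketch; the function names and the guest labels are illustrative assumptions, and the enumeration is only practical for small N.

```python
from fractions import Fraction
from itertools import permutations

def friend_ahead_by_theorem(n):
    """Sum over j = 1..n+1 of P(friend ahead | you in j-th) * P(you in j-th)."""
    return sum(Fraction(j - 1, n) * Fraction(1, n + 1) for j in range(1, n + 2))

def friend_ahead_by_enumeration(n):
    """Count orderings of the n+1 people in which the friend precedes you."""
    people = ["you", "friend"] + [f"guest{i}" for i in range(n - 1)]
    orderings = list(permutations(people))
    ahead = sum(1 for o in orderings if o.index("friend") < o.index("you"))
    return Fraction(ahead, len(orderings))

for n in range(1, 6):
    print(n, friend_ahead_by_theorem(n), friend_ahead_by_enumeration(n))
```

Both columns should agree with the symmetry argument for every N tried, which also answers the slide's questions: there are j–1 positions ahead of you, and the friend is equally likely to occupy any of the N positions you do not.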

Slide 39 of 39: Summary
• The chain rule or product rule allows us to compute a joint probability (i.e. probability of an intersection) as the product of various conditional probabilities
• The theorem of total probability allows us to find an unconditional probability from conditional probabilities
• We discussed some examples of the applications of these rules