ELEC 303 – Random Signals
Lecture 15 – Bayesian Statistical Inference, Hypothesis Testing, MAP, LMS
Dr. Farinaz Koushanfar
ECE Dept., Rice University
Oct 22, 2008

Lecture outline
• Reading: 8.2–8.3
• Bayesian inference and the posterior distribution
• Point estimation
• Hypothesis testing
• Bayesian least mean squares (LMS) estimation

Bayesian inference and the posterior distribution
• Unknown quantity of interest: Θ
• Observations (measurements, or an observation vector): X = (X1, X2, …, Xn)
• We assume that we know
  – a prior distribution pΘ or fΘ
  – a conditional distribution pX|Θ or fX|Θ
• A complete answer is described by the posterior pΘ|X(θ|x)
• [Block diagram: the prior pΘ and the conditional pX|Θ feed the observation process, which produces x; the posterior calculation yields pΘ|X(·|X=x), used for point estimates, error analysis, etc.]

Four versions of Bayes rule
• Θ discrete, X discrete:
  pΘ|X(θ|x) = pΘ(θ) pX|Θ(x|θ) / Σθ′ pΘ(θ′) pX|Θ(x|θ′)
• Θ discrete, X continuous:
  pΘ|X(θ|x) = pΘ(θ) fX|Θ(x|θ) / Σθ′ pΘ(θ′) fX|Θ(x|θ′)
• Θ continuous, X discrete:
  fΘ|X(θ|x) = fΘ(θ) pX|Θ(x|θ) / ∫ fΘ(θ′) pX|Θ(x|θ′) dθ′
• Θ continuous, X continuous:
  fΘ|X(θ|x) = fΘ(θ) fX|Θ(x|θ) / ∫ fΘ(θ′) fX|Θ(x|θ′) dθ′

Point estimation
• A point estimate is a single numerical value representing our best guess of Θ
• An estimator is a RV of the form Θ̂ = g(X) for some function g; different g's correspond to different estimators
• An estimate θ̂ is the value of the estimator determined by the observed value x of X
• The MAP rule sets the estimate θ̂ to a value that maximizes the posterior distribution
• Once the value x of X is observed, the conditional expectation (LMS) estimator sets θ̂ to E[Θ|X=x]

A couple of remarks on estimation
• If the posterior is unimodal and symmetric around its conditional mean, the maximum occurs at the mean, so the MAP estimate is the
same as the conditional expectation
• If Θ is continuous, the MAP estimate may be obtainable analytically, e.g., by setting the derivative of the posterior to zero

Example (cont'd)
• Juliet is late by a random amount X ~ U[0, θ]
• θ is unknown, modeled as the value of a RV Θ ~ U[0, 1]
• Given that Juliet was late by x on the first date, how does Romeo calculate the posterior?
• From before:
  fΘ|X(θ|x) = 1 / (θ |log x|),  for x ≤ θ ≤ 1
• Since this is decreasing in θ, the MAP estimate equals x – this can be optimistic
• The conditional expectation estimator:
  E[Θ|X=x] = ∫x^1 θ · 1/(θ |log x|) dθ = (1 − x) / |log x|

Hypothesis testing
• Binary hypothesis testing: two cases
• Once the value x of X is observed, use the Bayes rule to calculate the posterior pΘ|X(θ|x)
• Select the hypothesis with the larger posterior
• If gMAP(x) is the selected hypothesis, the probability of a correct decision is P(Θ = gMAP(x) | X = x)
• If Si is the set of all x for which the MAP rule selects hypothesis θi, the overall probability of a correct decision is P(Θ = gMAP(X)) = Σi P(Θ = θi, X ∈ Si)
• The probability of error is Σi P(Θ ≠ θi, X ∈ Si)

Multiple hypotheses: example – biased coin, single toss
• Two biased coins, with head probabilities p1 and p2
• Randomly select a coin and infer its identity based on a single toss
• Θ = 1 (Hypothesis 1), Θ = 2 (Hypothesis 2)
• X = 0 (tail), X = 1 (head)
• The MAP rule compares pΘ(1) pX|Θ(x|1) against pΘ(2) pX|Θ(x|2)
• This reduces to comparing pX|Θ(x|1) and pX|Θ(x|2) (WHY? The coin is selected at random, so the priors pΘ(1) = pΘ(2) = 1/2 cancel)
• E.g., p1 = 0.46 and p2 = 0.52, and the outcome is a tail

LMS – example 1
• Let Θ ~ U[4, 10]
• Suppose we observe Θ with additive noise W: X = Θ + W
• Assume W ~ U[−1, +1], independent of Θ
• Find the LMS estimate of Θ, given X

LMS – example 2
• Consider the date example, where Juliet is late by a RV X ~ U[0, Θ], with Θ ~ U[0, 1]
• The MAP estimate: x
• The LMS estimate: E[Θ|X=x] = ∫x^1 θ · 1/(θ |log x|) dθ = (1 − x) / |log x|
• Find the conditional mean squared error for the MAP and the LMS estimates

Properties of the estimation error
• The estimation error Θ̃ = Θ − Θ̂ is unbiased, i.e., it has zero conditional and unconditional mean:
  E[Θ̃] = 0,  E[Θ̃ | X = x] = 0 for all x
• The estimation error Θ̃ is uncorrelated with the estimate Θ̂:
  Cov(Θ̂, Θ̃) = 0
• The variance of Θ can be decomposed as
  var(Θ) = var(Θ̂) + var(Θ̃)

Uninformative observation
• We say that the observation X is uninformative if the mean squared error E[Θ̃²] = var(Θ̃) is the same as var(Θ), the unconditional variance of Θ
• When is this the case?

Case of multiple observations
• The same discussion applies for a vector of observations (a multidimensional RV)
• For all estimators g(X1, …, Xn):
  E[(Θ − E[Θ|X1,…,Xn])²] ≤ E[(Θ − g(X1,…,Xn))²]
• This is often difficult to implement
  – the joint PDF of Θ, X1, …, Xn is hard to compute
  – E[Θ|X1,…,Xn] can be a very complicated function of X1,…,Xn

Case of multiple parameters
• Multiple parameters Θ1, …, Θm need estimation
• The LMS criterion commonly used is
  E[(Θ1 − Θ̂1)²] + … + E[(Θm − Θ̂m)²]
• This is equivalent to finding, for each i, the estimator Θ̂i that minimizes E[(Θi − Θ̂i)²], so that we are dealing with m decoupled estimation problems, one for each unknown
• This yields Θ̂i = E[Θi | X1, …, Xn] for all i
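To make the MAP-vs-LMS comparison in the date example concrete, here is a minimal Python sketch (not part of the lecture; the observed value x = 0.3 and the helper names are illustrative). It integrates against the posterior f(θ|x) = 1/(θ|log x|) on [x, 1] from the slides, checks the closed-form LMS estimate (1 − x)/|log x|, and computes the conditional mean squared error of both estimates.

```python
import math

def posterior(theta, x):
    # Posterior from the slides: f(theta | x) = 1 / (theta * |log x|), x <= theta <= 1
    return 1.0 / (theta * abs(math.log(x)))

def cond_expectation(g, x, n=200_000):
    # Midpoint-rule approximation of E[g(Theta) | X = x] = integral of g * posterior over [x, 1]
    h = (1.0 - x) / n
    return h * sum(g(x + (i + 0.5) * h) * posterior(x + (i + 0.5) * h, x)
                   for i in range(n))

x = 0.3                                    # illustrative observed lateness
lms = cond_expectation(lambda t: t, x)     # numeric E[Theta | X = x]
closed_form = (1 - x) / abs(math.log(x))   # (1 - x)/|log x| from the slides

mse_map = cond_expectation(lambda t: (t - x) ** 2, x)    # MAP estimate is x
mse_lms = cond_expectation(lambda t: (t - lms) ** 2, x)  # LMS estimate

print(lms, closed_form)   # both ~0.5814
print(mse_map, mse_lms)   # LMS error is smaller, as the optimality inequality guarantees
```

The LMS estimate's conditional MSE is smaller than the MAP estimate's, a direct instance of the inequality E[(Θ − E[Θ|X])²] ≤ E[(Θ − g(X))²] above.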