Lecture Slides on Information Theory - Introduction to Cryptography | CIS 400

Material type: lecture notes; Class: Selected Topics; Subject: Computer & Information Science; University: Syracuse University; Term: Spring 2005

INFORMATION THEORY
CIS 400/628 — Spring 2005, Introduction to Cryptography
Based on Chapter 15 of Trappe and Washington.

SHANNON'S INFORMATION THEORY

- Late 1940s.
- Concerned with the amount of information, not with whether it is informative.
- Typical problem: how much can we compress a message and still be able to reconstruct it from the compressed version?
- The focus is on collections of messages and probabilities on them, so that common messages get short encodings and uncommon ones get longer encodings.

— 1 —

PROBABILITY REVIEW, CONTINUED

DEFINITION. Suppose (X, p_X) is a probability space and S : X → Y. S is called a Y-valued random variable on X, and for y ∈ Y,
  p_S(y) =def p_X({ x ∈ X : S(x) = y }) = Prob[S = y].

DEFINITION. Suppose (X, p_X) is a probability space and S : X → Y and T : X → Z are random variables. Then
  p_{S,T}(y, z) =def p_X({ x ∈ X : S(x) = y and T(x) = z }) = Prob[S = y, T = z].

EXAMPLE. X = { 1, ..., 6 }, S : X → { 0, 1 }, T : X → { 0, 1 }, where
  S(x) = 1 if x is even, 0 if x is odd;
  T(x) = 1 if x < 3, 0 if x ≥ 3.

— 4 —

STILL MORE PROBABILITY

DEFINITION. S : X → Y and T : X → Z are independent iff for all y ∈ Y and z ∈ Z:
  Prob[S = y, T = z] = Prob[S = y] · Prob[T = z].

EXAMPLE.
- S : { 1, ..., 6 } → { 0, 1 }, S(x) = 1 ⟺ x is even.
- T : { 1, ..., 6 } → { 0, 1 }, T(x) = 1 ⟺ x < 3.
- U : { 1, ..., 6 } → { 0, 1 }, U(x) = 1 ⟺ x is prime.
S and T are independent. S and U are not independent.

DEFINITION. Suppose S : X → Y, T : X → Z, and Prob[T = z] > 0. Then the conditional probability of y given z is
  Prob[S = y | T = z] =def Prob[S = y, T = z] / Prob[T = z].
Sometimes Prob[S = y | T = z] is written p_Y(y | z).

— 5 —

BAYES'S THEOREM

Note: if S and T are independent, then Prob[S = y | T = z] = Prob[S = y].

Bayes's Theorem. If Prob[S = y] > 0 and Prob[T = z] > 0, then
  Prob[S = y | T = z] = Prob[S = y] · Prob[T = z | S = y] / Prob[T = z].

(Proof on board.)

— 6 —

EXAMPLE APPLICATIONS

EXAMPLE: a fair coin. X = { heads, tails }, p(heads) = p(tails) = 1/2.
  H(X) = -(1/2 · log2(1/2) + 1/2 · log2(1/2)) = -(-1/2 - 1/2) = 1.
It takes 1 bit to describe the outcome.

EXAMPLE: an unfair coin. Suppose 0 < p < 1, with Prob[heads] = p and Prob[tails] = 1 - p.
  H(unfair coin toss) = -p · log2(p) - (1 - p) · log2(1 - p).

EXAMPLE: a fair n-sided die.
  H(a roll) = -(1/n) · log2(1/n) - ... - (1/n) · log2(1/n) = log2(n).

EXAMPLE: flipping two fair coins. Heads: no points. Tails: 1 point. Two flips: sum the points.
Outcomes 0, 1, 2 with probabilities 1/4, 1/2, 1/4.
  H(two coin flips) = -(1/4) · log2(1/4) - (1/2) · log2(1/2) - (1/4) · log2(1/4) = 3/2
  = the average number of yes/no questions needed to tell the result:
    Is there exactly one head? Are there two heads?

— 9 —

JOINT AND CONDITIONAL ENTROPY

Suppose S : X → Y, T : X → Z, and U : X → Y × Z where U(x) = (S(x), T(x)).

  H(S, T) =def - Σ_{y ∈ Y} Σ_{z ∈ Z} p_{S,T}(y, z) · log2 p_{S,T}(y, z).

This is just the entropy of U.

We define the conditional entropy of T given S by:

  H(T | S) =def Σ_y p_S(y) · H(T | S = y)
            = - Σ_y p_S(y) · ( Σ_z p_T(z | y) · log2 p_T(z | y) )
            = - Σ_y Σ_z p_{S,T}(y, z) · log2 p_T(z | y)      (since p_{S,T}(y, z) = p_T(z | y) · p_S(y))
            = the uncertainty of T given S.

— 10 —

JOINT AND CONDITIONAL ENTROPY, CONTINUED

CHAIN RULE THEOREM. H(X, Y) = H(X) + H(Y | X).
The uncertainty of (X, Y) = the uncertainty of X + the uncertainty of Y given that X happened.

THEOREM.
a. H(X) ≤ log2 |X|, with equality iff all elements of X are equally likely.
   You are most uncertain under uniform distributions.
b. H(X, Y) ≤ H(X) + H(Y).
   The information in (X, Y) is at most the information of X plus the information of Y.
c. H(Y | X) ≤ H(Y), with equality only if X and Y are independent.
   Knowing X cannot make you less certain about Y.

Proof of c. By the Chain Rule, H(X) + H(Y | X) = H(X, Y). By b, H(X, Y) ≤ H(X) + H(Y). So H(X) + H(Y | X) ≤ H(X) + H(Y), and cancelling H(X) gives H(Y | X) ≤ H(Y).

— 11 —
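A quick numerical check of these definitions (not part of the original slide deck): the Python sketch below rebuilds the die-based random variables S, T, and U from slides 4-5, verifies the two independence claims, and confirms the chain rule H(S, T) = H(S) + H(T | S) and part c of the theorem above. The helper names dist, joint, H, and cond_entropy are invented for this sketch.

```python
# Minimal sketch (not from the slides): entropy, joint and conditional entropy,
# and the chain rule, using the fair-die random variables S, T, U from slides 4-5.
from collections import defaultdict
from math import log2

# Probability space: a fair six-sided die.
X = {x: 1 / 6 for x in range(1, 7)}

def S(x): return 1 if x % 2 == 0 else 0       # 1 iff x is even
def T(x): return 1 if x < 3 else 0            # 1 iff x < 3
def U(x): return 1 if x in (2, 3, 5) else 0   # 1 iff x is prime

def dist(f):
    """p_S(y) = p_X({x : S(x) = y}): the distribution of a random variable."""
    p = defaultdict(float)
    for x, px in X.items():
        p[f(x)] += px
    return dict(p)

def joint(f, g):
    """Joint distribution p_{S,T}(y, z)."""
    p = defaultdict(float)
    for x, px in X.items():
        p[(f(x), g(x))] += px
    return dict(p)

def H(p):
    """Shannon entropy (bits) of a distribution given as {outcome: probability}."""
    return -sum(q * log2(q) for q in p.values() if q > 0)

def cond_entropy(t, s):
    """H(T | S) = sum_y p_S(y) * H(T | S = y), computed straight from the definition."""
    ps, pst = dist(s), joint(s, t)
    return sum(py * H({z: pyz / py for (y, z), pyz in pst.items() if y == y0})
               for y0, py in ps.items())

def independent(f, g):
    """Check Prob[S = y, T = z] == Prob[S = y] * Prob[T = z] for all y, z."""
    pf, pg, pfg = dist(f), dist(g), joint(f, g)
    return all(abs(pfg.get((y, z), 0.0) - pf[y] * pg[z]) < 1e-12
               for y in pf for z in pg)

print("S, T independent:", independent(S, T))   # True, as claimed on slide 5
print("S, U independent:", independent(S, U))   # False

# Chain rule: H(S, T) = H(S) + H(T | S).
assert abs(H(joint(S, T)) - (H(dist(S)) + cond_entropy(T, S))) < 1e-9
# Part c of the theorem on slide 11: conditioning cannot increase entropy.
assert cond_entropy(U, S) <= H(dist(U)) + 1e-9
print("H(S) =", H(dist(S)), " H(T|S) =", cond_entropy(T, S), " H(S,T) =", H(joint(S, T)))
```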
PERFECT SECRECY

GOAL: use information theory to explain how one-time pads provide "perfect secrecy".

- P: plaintexts, each with a certain probability.
- C: ciphertexts, with the induced probabilities.
- K: keys, assumed to be chosen independently of the plaintext.

EXAMPLE.
P = { a, b, c }: Prob[a] = 0.5, Prob[b] = 0.3, Prob[c] = 0.2.
K = { k1, k2 }: Prob[k1] = 0.5, Prob[k2] = 0.5.
C = { U, V, W }.

Encryption table e_k(x):

  e_k(x)    a    b    c
  k1        U    V    W
  k2        U    W    V

Prob[U] = 0.5, Prob[V] = 0.25, Prob[W] = 0.25.

What can Eve learn from an intercepted ciphertext?

— 14 —

PERFECT SECRECY, CONTINUED

DEFINITION. A cryptosystem has perfect secrecy iff H(P | C) = H(P).

THEOREM. The one-time pad has perfect secrecy.

Proof setup:
- z = size of the alphabet, e.g., 2, 26, 256, etc.
- P = strings of length L (z^L many).
- Each key k is a vector of shifts (s_1, ..., s_L), and p_K(k) = z^(-L).
- C = P.
- For c ∈ C,
  p_C(c) = Σ { Prob_P(x) · Prob_K(k) : x ∈ P, k ∈ K, e_k(x) = c }.
  (Since P and K are independent, Prob[P = x, K = k] = Prob_P(x) · Prob_K(k).)

— 15 —

PROOF CONTINUED

  p_C(c) = Σ { Prob_P(x) · Prob_K(k) : x ∈ P, k ∈ K, e_k(x) = c }
         = z^(-L) · Σ { Prob_P(x) : x ∈ P, k ∈ K, e_k(x) = c }.

Observation: given x and c, there is exactly one k such that e_k(x) = c. So

  Σ { Prob_P(x) : x ∈ P, k ∈ K, e_k(x) = c } = 1.

Therefore p_C(c) = z^(-L), and H(K) = H(C) = log2(z^L).

  H(P, K, C) = H(P, K) = H(P) + H(K)         (P and K independent)
  H(P, K, C) = H(P, C) = H(P | C) + H(C)

So H(P) + H(K) = H(P | C) + H(C), and since H(K) = H(C),
  ∴ H(P) = H(P | C).  QED

For RSA, H(P | C) = 0. Why?

— 16 —

THE ENTROPY OF ENGLISH, III

How do we compute H_English = lim_{n→∞} H(L^n)/n?

Shannon's idea:
- First suppose you had an optimal "next-letter guesser": given a prefix, it ranks the letters (from 1 to 26) from most to least likely to come next.

    i t i s s u n n y t o d a y
    2 1 1 1 4 3 2 1 4 1 1 1 1 1

- Run a text through it and record which guess number each letter corresponds to.
- From the predictor plus "21114321411111" we can recover the text.
- Use a native English speaker as the "next-letter predictor" and gather statistics (assume determinism).

— 19 —

THE ENTROPY OF ENGLISH, IV

- Given a text plus the sequence of guesses, let q_i = the frequency of guess number i.
- Shannon: 0.72 ≈ Σ_{i=1}^{26} i · (q_i - q_{i+1}) · log2(i) ≤ H_English ≤ - Σ_{i=1}^{26} q_i · log2(q_i) ≈ 1.42.
- Since H_random text = 4.18, (info in English) : (info in random text) ≈ 1 : 4.
- So English is about 75% redundant.

— 20 —
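To connect the perfect-secrecy definition back to the example on slide 14, the following sketch (not part of the original deck) computes H(P) and H(P | C), using the chain rule H(P | C) = H(P, C) - H(C), first for that three-message cryptosystem and then for a one-time (shift) pad on a three-letter alphabet. The first system leaks information, since intercepting U tells Eve the plaintext is a, while the pad comes out with H(P | C) = H(P). The helper analyze and the dictionary-based encryption tables are invented for this sketch.

```python
# Minimal sketch (not from the slides): compare H(P|C) with H(P) for the
# slide-14 example and for a one-time (shift) pad on the same message space.
from collections import defaultdict
from math import log2

def H(p):
    """Shannon entropy (bits) of a distribution given as {outcome: probability}."""
    return -sum(q * log2(q) for q in p.values() if q > 0)

def analyze(pP, pK, enc):
    """Given Prob[P], Prob[K], and an encryption table enc[k][x] -> ciphertext,
    return (H(P), H(P|C)), assuming the key is chosen independently of the plaintext."""
    pC = defaultdict(float)    # induced ciphertext distribution
    pPC = defaultdict(float)   # joint distribution of (plaintext, ciphertext)
    for k, pk in pK.items():
        for x, px in pP.items():
            c = enc[k][x]
            pC[c] += px * pk
            pPC[(x, c)] += px * pk
    return H(pP), H(dict(pPC)) - H(dict(pC))   # H(P|C) = H(P,C) - H(C)

# Example from slide 14: NOT perfectly secret (seeing U tells Eve the plaintext is a).
pP = {'a': 0.5, 'b': 0.3, 'c': 0.2}
pK = {'k1': 0.5, 'k2': 0.5}
enc = {'k1': {'a': 'U', 'b': 'V', 'c': 'W'},
       'k2': {'a': 'U', 'b': 'W', 'c': 'V'}}
hP, hPgC = analyze(pP, pK, enc)
print(f"slide-14 example: H(P) = {hP:.3f}, H(P|C) = {hPgC:.3f}")  # ~1.485 vs ~0.485

# One-time pad on the 3-letter alphabet {a, b, c}, messages of length 1:
# every shift 0, 1, 2 is a key with probability 1/3, i.e. p_K(k) = z^(-L) with z = 3, L = 1.
z = 3
pK_otp = {k: 1 / z for k in range(z)}
enc_otp = {k: {x: chr(ord('a') + (ord(x) - ord('a') + k) % z) for x in pP}
           for k in pK_otp}
hP, hPgC = analyze(pP, pK_otp, enc_otp)
print(f"one-time pad:     H(P) = {hP:.3f}, H(P|C) = {hPgC:.3f}")  # equal: perfect secrecy
```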