ENEE 739C: Advanced Topics in Signal Processing: Coding Theory
Instructor: Alexander Barg

Lecture 3 (draft; 9/1/03). Decoding of codes.

Our task of code construction is not complete if we cannot decode the codes we are studying. Decoding is another major task of coding theory, studied in a variety of scenarios.

Definition 1. Given a code $C \subset \mathcal{H}_q^n$ and a point $y \in \mathcal{H}_q^n$, maximum likelihood, or complete, decoding takes $y$ to (one of) the code vectors closest to it in the Hamming distance.

This procedure could actually be called minimum distance decoding; the name "maximum likelihood" will be justified in a short while. Intuitively it seems natural to assume that this decoding is the best we can hope for in terms of minimizing the error rate. Note that if a vector $x \in C$ (say, of a binary code) is sent over a binary symmetric channel $W$, the probability that a vector $y$ is received at its output is
$$P(y|x) = W^n(y|x) = p^{w}(1-p)^{n-w}, \qquad w = d(x,y).$$
If $c$ is the decoding result, the probability of a decoding error equals $1 - P(c|y)$. Therefore, the mapping $\psi$ that minimizes the probability of error for the codeword $c_i$ is given by
$$\psi(y) = c_i \iff P(c_i|y) > P(c_j|y) \quad \forall j \ne i.$$
Letting $P(y) = \sum_{c \in C} W^n(y|c)$ we have
$$W^n(y|c) = \frac{P(c|y)\,P(y)}{P(c)}.$$
If we assume that $P(c) = 1/M$ for every $c \in C$, then maximum likelihood decoding is defined equivalently as
$$\psi(y) = c_i \iff W^n(y|c_i) > W^n(y|c_j) \quad \forall j \ne i.$$
Finally, note that maximizing $W^n(y|c)$ is equivalent to minimizing the distance $d(y,c)$, so minimum distance decoding is indeed maximum-likelihood, or ML, decoding.

We know from ENEE 722 that ML decoding of a binary linear $[n,k,d]$ code $C$ can be implemented by the standard array (or the syndrome trellis, or other equivalent means). The problem is that for codes of large length and of rate not very close to 0 or 1 all these methods are very computationally involved. Therefore coding theory is also concerned with restricted decoding procedures. One of the most commonly used is bounded distance decoding. This name is used loosely for all procedures that find the closest codeword to the received vector within a sphere of some fixed radius $t$ if this sphere contains codewords, and return "erasure" if it does not. For $t = \lfloor (d-1)/2 \rfloor$ such a codeword (if it exists) is always unique; for greater $t$ the sphere will sometimes contain more than one codeword. When the sphere contains several codewords it is sometimes desirable to output them all as the decoding result: this procedure is called list decoding, and the decoding result a list. Clearly, minimum distance decoding is the same as bounded distance decoding with $t = r(C)$, the covering radius of the code. In general, the intuition about bounded distance decoding is as follows: shrinking the radius increases the probability of erasure and decreases the probability of undetected error. The other extreme (compared to minimum distance decoding), called (pure) error detection, is considered toward the end of this lecture.

To describe ML decoding in geometric terms, define the Voronoi region $D(c,C)$ of a code point $c \in C$ as follows:
$$D(c,C) = \{\, y \in \mathcal{H}_q^n : d(y,c) < d(y,c') \ \ \forall c' \in C \setminus \{c\} \,\}.$$
This defines a partition of $\mathcal{H}_q^n$ into decoding (Voronoi) regions of the code points (with ties broken arbitrarily). We have
$$P_e(c) = \Pr[\mathcal{H}_q^n \setminus D(c,C)],$$
where $\Pr$ is computed with respect to the binomial probability distribution defined by the channel and with "center" at $c$.
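For concreteness, here is a minimal Python sketch (not part of the original notes) of minimum distance decoding by exhaustive search, directly following Definition 1. The code list, the received vector, and the tie-breaking rule (the first closest codeword wins) are illustrative choices; the search visits every codeword, which is precisely why ML decoding is infeasible for long codes of moderate rate.

```python
def hamming_distance(x, y):
    """Number of coordinates in which two equal-length tuples differ."""
    return sum(a != b for a, b in zip(x, y))

def minimum_distance_decode(code, y):
    """Complete (ML) decoding: return a codeword closest to y in Hamming
    distance; ties are broken by the order of the codewords in `code`."""
    return min(code, key=lambda c: hamming_distance(c, y))

# Illustration: the binary [3,1] repetition code {000, 111}
code = [(0, 0, 0), (1, 1, 1)]
print(minimum_distance_decode(code, (1, 0, 1)))   # -> (1, 1, 1)
```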
What to remember: ML decoding is optimal but generally infeasible (except for short codes, codes with very few codewords, or codes of rate close to one).

Define the average and the maximum error probability of the code $C$ as
$$P_e(C) = \frac{1}{M}\sum_{c \in C} P_e(c), \qquad P_{e,m}(C) = \max_{c \in C} P_e(c).$$
The quantity $P_e(C)$ is one of the main parameters of the code, with lots of attention devoted to it in the literature. For communication applications the error probability is more important than the minimum distance, which does not provide much information about the decoding performance except at low noise. For bounded distance decoding the probabilities of undetected error (miscorrection) and of erasure can be defined analogously:
$$P_{e,t}(C) = \frac{1}{M}\sum_{c \in C} \Pr\Big[\bigcup_{c' \in C\setminus\{c\}} B_t(c') \,\Big|\, c\Big] \quad \text{(undetected error)},$$
$$P_{x,t}(C) = \frac{1}{M}\sum_{c \in C} \Pr\Big[\mathcal{H}_q^n \setminus \bigcup_{c' \in C} B_t(c') \,\Big|\, c\Big] \quad \text{(erasure)}.$$

Remark. For a linear code $C$ the Voronoi regions of all code points are congruent (the set of correctable errors is the same for every code point), and so $P_e(C) = P_e(c)$, $P_{e,t}(C) = P_{e,t}(c)$, etc., for any code point $c$.

Computing $P_e(C)$ is generally a difficult task. For instance, for a linear code $C$
$$P_e(C) = \sum_{x \text{ a coset non-leader}} \Big(\frac{p}{q-1}\Big)^{\mathrm{wt}(x)} (1-p)^{n-\mathrm{wt}(x)}.$$
However, for most codes no reasonable bounds on the weight distribution of coset leaders are known (a few exceptions, apart from trivial cases like the Hamming code, are the duals of the 2-error-correcting BCH codes and a few related cases [1, 2]). Hence we are faced with a search for bounds on $P_e(C)$ from both sides. This is one of the main topics of coding theory, with hundreds of papers devoted to it; it is also the main topic of such books as [4, 9, 3].

Bounds on the error probability $P_e(C)$ of complete decoding

Bounding $P_e(C)$ is a difficult task. Essentially the only technique available relies upon the distance distribution of the code $C$; it is likely that this is the best way to estimate $P_e$ in the general case. Traditionally, upper bounds on $P_e(C)$ receive more attention than lower bounds. This is justifiable because upper bounds give some level of confidence in estimating the performance of communication systems. It is important to realize right away that, with all the literature devoted to these bounds, the mainstream technique relies on a combination of several trivial observations presented in this section. One simple idea, the union bound principle, is to upper-bound the error probability as follows:
$$(1)\qquad P_e(C) \le \frac{1}{M}\sum_{c \in C}\sum_{c' \in C\setminus\{c\}} P(c \to c'),$$
where $P(c \to c') := \Pr\{x \in \mathcal{H}_q^n : d(x,c') \le d(x,c)\}$ is the probability of the half-space cut out by the median hyperplane between $c$ and $c'$. It is clear that this approach is generally not optimal because the probability of large parts of $\mathcal{H}_q^n$ is counted many times.

Theorem 1. Let $C \subset \mathcal{H}_q^n$ be a code with distance $d$ used over the qSC with crossover probability $p/(q-1)$. Then the error probability of complete decoding satisfies
$$(2)\qquad P_e(C) \le \sum_{w=d}^{n} B_w \sum_{e=\lceil w/2\rceil}^{n} \pi(e) \sum_{s=0}^{e} p^{w}_{e,s},$$
where $p^{w}_{e,s}$ is the intersection number of $\mathcal{H}_q^n$ and $\pi(e) = \big(\frac{p}{q-1}\big)^{e}(1-p)^{n-e}$.
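As an illustrative aside (not part of the original notes): for the binary symmetric channel the pairwise term in (1) depends only on $w = d(c,c')$, since it is the probability that at least $\lceil w/2 \rceil$ of the $w$ positions where $c$ and $c'$ differ are flipped. The Python sketch below evaluates the resulting bound from a code's distance distribution; the $[7,4]$ Hamming code, whose weight (= distance) distribution is $A_3 = 7$, $A_4 = 7$, $A_7 = 1$, is used as sample input.

```python
from math import comb

def pairwise_error_prob(w, p):
    """P(c -> c') over a BSC(p) for codewords at Hamming distance w:
    probability that at least ceil(w/2) of the w differing positions flip."""
    t = (w + 1) // 2
    return sum(comb(w, j) * p**j * (1 - p)**(w - j) for j in range(t, w + 1))

def union_bound(distance_distribution, p):
    """Right-hand side of (1) for a binary code over a BSC(p).
    distance_distribution: {w: B_w} for w >= 1 (for a linear code this is
    simply the weight distribution of the nonzero codewords)."""
    return sum(B_w * pairwise_error_prob(w, p)
               for w, B_w in distance_distribution.items())

# [7,4] Hamming code at crossover probability p = 0.01
print(union_bound({3: 7, 4: 7, 7: 1}, 0.01))
```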
[...]

Theorem 5. Let $C = \log_2 q - h_q(p)$. For any rate $0 \le R \le C$ there exists a sequence of codes $C_n \subset \mathcal{H}_q^n$, $n = 1, 2, \dots$, of length $n$ such that $R(C_n) \to R$ as $n \to \infty$ and
$$(6)\qquad P_e(C_n) < \exp\big(-n(E(R,p) - o(1))\big),$$
where $E(R,p) > 0$ is a convex, monotone decreasing function of $R$. Conversely, for $R > C$ the error probability $P_e(C)$ of any sequence of codes approaches one.

This theorem is a simple corollary of Lemma 2 and the Bhattacharyya bound if one substitutes into it the weight profile $\alpha_0$ of a random linear code. However, I prefer another proof, which develops geometric intuition about the decoding process (one of the next lectures).

What to remember: There are code sequences which under complete (ML) decoding have error probability falling exponentially with the code length for any code rate below capacity. For instance, random linear codes have this property (and achieve the best known error exponent), and generally a good code sequence is expected to have this property. Analogous results hold for a large class of information transmission channels. The quantity $C = 1 - h_2(p)$, which depends only on $p$, is called the capacity of the channel. For the Gaussian channel with signal-to-noise ratio $A$,
$$(7)\qquad C = \tfrac12 \ln(1 + A).$$

Though Theorem 5 says that there are codes of sufficiently large length for which $P_e$ falls as long as $R(C) < C$, and that for any code $P_e \to 1$ if $R(C) > C$, it does not imply any conclusions for a particular code $C$. However, a similar result is still possible. We begin with an auxiliary statement.

Theorem 6. Suppose a code $C \subset \mathcal{H}_q^n$ is used for transmission over a qSC with transition probability $p$, $0 < p < 1 - q^{-1}$. Then the error probability of complete decoding $P_e(C) = P_e(C,p)$ is a continuous, monotone increasing function of $p$, and $P_e(C,0) = 0$, $P_e(C, 1-q^{-1}) > 1/2$.

Proof: Only the inequality $P_e(C, 1-q^{-1}) > 1/2$ is nonobvious. Let $C = (c_1, c_2, \dots, c_M)$ be a code and let $D(c_i, C)$ be the Voronoi region of the codeword $c_i$. We will prove that $|D(c_i, C)| < q^n/2$ for any $1 \le i \le M$ if $M \ge 3$. First let $M = 3$. The claim is true by inspection for $n = 2$. Suppose it is true for any code of length $n-1$ and consider a code of length $n$. Puncturing it on any fixed coordinate, we obtain a code $C'$ for which $|D(c'_i, C')| < q^{n-1}/2$ for all $i = 1, 2, 3$. Consider the way the Voronoi regions change in the transition from $C'$ to $C$. Every vector $y' \in D(c'_i, C')$ can be augmented in $q$ ways to a vector $y$ of length $n$; even if all these vectors fall in $D(c_i, C)$, we still have
$$|D(c_i, C)| \le q\,|D(c'_i, C')| < \frac{q^{n-1}}{2}\, q = \frac{q^n}{2}.$$
Thus the fact that $|D(c_i, C)| < q^n/2$ is justified by induction on $n$ for $M = 3$. Now perform induction on $M$: for the code $C \setminus \{c_M\}$ the claim is true by the induction hypothesis, and adding a codeword can only shrink the Voronoi regions of $c_1, \dots, c_{M-1}$. Thus the full claim is justified by symmetry.

Now observe that the qSC with transition probability $1 - q^{-1}$ induces on $\mathcal{H}_q^n$ the uniform distribution. Together with our assumption that the decoding $\psi(x)$ always ends in error if there are two or more codewords at an equal distance from $x$, we obtain
$$P_e(C) = \frac{1}{M}\sum_{c \in C}\big(1 - P_c(c)\big) = \frac{1}{M}\sum_{c \in C} q^{-n}\big(q^n - |D(c,C)|\big) > q^{-n}\,\frac{q^n}{2} = \frac12,$$
where $P_c(c)$ is the probability of correct decoding conditioned on the fact that $c$ is transmitted.

This theorem enables us to give the following definition.

Definition 2. Suppose a code $C$ is used over a qSC with complete decoding. The transition probability $\theta$ is called the threshold probability of $C$ if $P_e(C, \theta) = 1/2$.

The term "threshold" suggests that $\theta$ separates in some way the values of $P_e(C,p)$. This is indeed the case, as shown by the following theorem [8], stated for binary codes.

Theorem 7. Let $C$ be a binary linear code of any length and distance $d$ used over a BSC with crossover probability $p$. Then for any codeword $c \in C$
$$P_e(c) \le 1 - \Phi\Big[\sqrt{d}\,\big(\sqrt{-\ln(1-\theta)} - \sqrt{-\ln(1-p)}\big)\Big] \quad \text{for } 0 < p < \theta,$$
$$P_e(c) \ge 1 - \Phi\Big[\sqrt{d}\,\big(\sqrt{-\ln(1-\theta)} - \sqrt{-\ln(1-p)}\big)\Big] \quad \text{for } \theta < p < 1,$$
where $\Phi(x) = \int_{-\infty}^{x} e^{-z^2/2}\,dz/\sqrt{2\pi}$.
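To get a feel for Theorem 7, the following sketch (not from the notes; the values $d = 100$ and $\theta = 0.11$ are made up purely for illustration, since the threshold $\theta$ must be determined separately for a given code) evaluates the upper bound below the threshold using the standard normal CDF.

```python
from math import erf, log, sqrt

def Phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def theorem7_upper_bound(d, theta, p):
    """Upper bound of Theorem 7 on P_e(c) for a binary linear code of
    distance d and threshold probability theta over a BSC(p), 0 < p < theta."""
    s = sqrt(-log(1.0 - theta)) - sqrt(-log(1.0 - p))
    return 1.0 - Phi(sqrt(d) * s)

# Hypothetical code with d = 100 and threshold theta = 0.11
for p in (0.05, 0.08, 0.10):
    print(p, theorem7_upper_bound(100, 0.11, p))
```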
To see that $P_e(C)$ exhibits a threshold behavior, recall the asymptotics
$$1 - \Phi(x) \sim \frac{1}{\sqrt{2\pi}\,x}\, e^{-x^2/2} \qquad (x \to +\infty).$$
Let $s = \sqrt{-\ln(1-\theta)} - \sqrt{-\ln(1-p)}$ and assume that $d \to \infty$; then
$$P_e(C) \lesssim \frac{1}{\sqrt{2\pi}\,s}\, e^{-d s^2/2}.$$
Similarly, for $\theta < p$ the error probability satisfies $P_e(C) \ge 1 - o(1)$. Note that if the relative distance of a code is separated from zero, then for very low noise the error probability $P_e(C)$ falls as an exponential function of $n$. Therefore in this case the nontrivial part of the theorem is the fact that $P_e$ exhibits a threshold behavior for large $n$, jumping from almost 0 to almost 1 for a very small change in the input noise level. The result is in fact even more general: it shows that for any code sequence such that $\ln n = o(d)$ the error probability still has a threshold behavior. The proof is difficult and will not be given here. It involves a very interesting general method of modern probabilistic combinatorics, so you are encouraged to consult the original.

What to remember: Every code sequence with relative distance separated from zero has a sharp threshold.

Error detection

Finally, consider a very simple decoding scheme called error detection. The vector $y$ received from a qSC($p$) is checked for containment in the code: if $y \in C$ then this is the decoding result; if not, the decoder declares that an error has been detected. This is particularly simple for linear codes: decoding amounts to computing the syndrome $H y^T$ and comparing the result to zero. The probability of undetected error for various classes of codes has been studied extensively in the literature [5]. Given a code $C$ with distance enumerator $B(x,y)$, this probability can be written as follows:
$$P_u(C,p) = B\Big(1-p,\ \frac{p}{q-1}\Big) - (1-p)^n.$$
Proof: Let $\pi(i) = \big(\frac{p}{q-1}\big)^{i}(1-p)^{n-i}$. We compute
$$P_u(C,p) = \frac{1}{M}\sum_{c \in C}\sum_{c' \in C\setminus\{c\}} \pi(d(c,c')) = \frac{1}{M}\sum_{c \in C}\sum_{i=1}^{n} B_i(c)\,\pi(i) = \sum_{i=1}^{n} B_i(C)\,\pi(i) = B\Big(1-p,\ \frac{p}{q-1}\Big) - (1-p)^n.$$

Let us use this result together with random linear codes to derive a lower bound on the maximum attainable exponent of the probability of undetected error for binary codes of rate $R$ used over a BSC($p$). Let us first give a formal definition of the exponent:
$$E_u(n,R,p) = \max_{C \text{ an } (n,\,nR) \text{ code}} \Big(-\frac{1}{n}\log_2 P_u(C,p)\Big), \qquad E_u(R,p) = \lim_{n \to \infty} E_u(n,R,p).$$
Whether this limit exists is again unknown, so this definition should be handled similarly to the definition of $R(\delta)$.

Theorem 8 [6]. Let $T(\delta_{\mathrm{GV}}(R), p) = h_2(\delta_{\mathrm{GV}}(R)) + D(\delta_{\mathrm{GV}}(R)\|p)$. We have
$$E_u(R,p) \ge T(\delta_{\mathrm{GV}}(R), p), \qquad 0 \le R \le 1 - h_2(p),$$
$$E_u(R,p) \ge 1 - R, \qquad 1 - h_2(p) < R \le 1.$$

Proof: Take a random $[n, k = Rn, d]$ linear code $C$ with weight distribution given by $A_0 = 1$ and $A_w \le n^2 \binom{n}{w} 2^{k-n}$ for $w \ge d$. For large $n$ the asymptotic behavior of the weight profile is given by Corollary 5 of Lecture 1. Substituting this into the expression for $P_u$, we obtain
$$P_u(C) \le \sum_{w=d}^{n} n^2 \binom{n}{w} 2^{k-n} (1-p)^{n-w} p^{w}.$$
Switching to exponents (the exponent of the sum is determined by its largest term), we obtain
$$E_u(R,p) \ge \min_{\omega \ge \delta_{\mathrm{GV}}(R)} \big(1 - R + D(\omega\|p)\big).$$
Since $D'_\omega(\omega\|p) = \log\frac{\omega(1-p)}{(1-\omega)p}$, the unrestricted minimum of the exponent is attained at $\omega = p$. Thus if $p > \delta_{\mathrm{GV}}(R)$, or equivalently $R \ge 1 - h_2(p)$, we can substitute $\omega = p$ and obtain $1 - R$ in the exponent (since $D(p\|p) = 0$). If $p \le \delta_{\mathrm{GV}}(R)$, the dominating term in the sum for $P_u$ is the first one, i.e., $w = d \to n\delta_{\mathrm{GV}}(R)$. The number of codewords of minimum weight in the code is nonexponential in $n$, and the exponent of $P_u$ equals the exponent of $p^{d}(1-p)^{n-d}$, which is $T(\delta_{\mathrm{GV}}(R), p)$.
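To make the distance-enumerator formula concrete: for a binary linear code over a BSC($p$), an undetected error occurs exactly when the error pattern is itself a nonzero codeword, so $P_u(C,p) = \sum_{i \ge 1} A_i\, p^i (1-p)^{n-i}$. A small Python sketch (illustrative only, not part of the notes), again using the $[7,4]$ Hamming code's weight distribution, follows.

```python
def undetected_error_prob(weight_distribution, n, p):
    """P_u(C, p) for a binary linear code over a BSC(p):
    sum over nonzero weights i of A_i * p^i * (1-p)^(n-i)."""
    return sum(A_i * p**i * (1 - p)**(n - i)
               for i, A_i in weight_distribution.items() if i >= 1)

# [7,4] Hamming code: A_0 = 1, A_3 = 7, A_4 = 7, A_7 = 1, at p = 0.01
print(undetected_error_prob({0: 1, 3: 7, 4: 7, 7: 1}, 7, 0.01))
```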
References

1. P. Charpin, Tools for coset weight enumerators of some codes, Finite fields: theory, applications, and algorithms (Las Vegas, NV, 1993), Contemp. Math., vol. 168, Amer. Math. Soc., Providence, RI, 1994, pp. 1–14.
2. P. Charpin, Weight distributions of cosets of two-error-correcting binary BCH codes, extended or not, IEEE Trans. Inform. Theory 40 (1994), no. 5, 1425–1442.
3. I. Csiszár and J. Körner, Information theory: Coding theorems for discrete memoryless channels, Akadémiai Kiadó, Budapest, 1981.
4. R. G. Gallager, Information theory and reliable communication, John Wiley & Sons, New York, 1968.
5. T. Kløve and V. I. Korzhik, Error detecting codes, Kluwer, Boston, 1995.
6. V. I. Levenshtein, Bounds on the probability of undetected error, Problems of Information Transmission 13 (1977), no. 1, 3–18.
7. C. E. Shannon, Probability of error for optimal codes in a Gaussian channel, Bell Syst. Techn. Journ. 38 (1959), no. 3, 611–656.
8. J.-P. Tillich and G. Zémor, Discrete isoperimetric inequalities and the probability of decoding error, Combinatorics, Probability and Computing 9 (2000), 465–479.
9. A. J. Viterbi and J. K. Omura, Principles of digital communication and coding, McGraw-Hill, New York, 1979.