ART code I: A New Method of Optimal Coding

Artyom M. Grigoryan
Department of Electrical and Computer Engineering
The University of Texas at San Antonio
amgrigoryan@utsa.edu

Notes for class 4663, from the presentation at the International IEEE Conference ITCC-2002.

A new technique is introduced for optimally encoding a given source whose statistical properties are described by a first-order model. The calculation of the minimum codeword lengths is based on the consecutive redistribution of the self-information of the symbols, in accordance with their probabilities, at each stage of the encoding. The proposed method performs equally well for an arbitrary order of symbol probabilities. Although the codewords themselves are generated by a separate combinatorial procedure, the overall computational cost of the proposed method is lower than that of the Huffman code.

. . .

The self-information ε_2 of a_2 becomes

    ε'_2 = ε_2 − (D_1 / Σ_{n>1} p_n) · p_2.

A new length l'_2 of the codeword c_2 is calculated as

    l'_2 = ε'_2 / p_2,

and [l'_2] bits are assigned to the codeword of the letter a_2. The remainder of the self-information of the letter a_2,

    D_2 = ε̃_2 − ε'_2 = p_2 [l'_2] − ε'_2,

is distributed among the remaining (m − 2) letters a_3, a_4, ..., a_m in proportion to their probabilities. At the following steps, the letters a_3, a_4, ..., a_m are processed similarly.

The algorithm results in a set of bit counts [l_1], [l_2], ..., [l_m] to be used for encoding the corresponding letters a_1, a_2, ..., a_m. If the Kraft-McMillan condition holds,

    1/2^[l_1] + 1/2^[l_2] + ... + 1/2^[l_m] ≤ 1,

then there exists a uniquely decodable code for the alphabet A_m in which [l_i] bits are used to obtain the codewords c(a_i), i = 1, ..., m.

Example: Consider the alphabet A_5 = {a_1, a_2, ..., a_5} whose elements have the probabilities p_1 = 0.4, p_2 = p_3 = 0.2, and p_4 = p_5 = 0.1. The entropy rate of A_5 is ε = 2.122 bits/letter.

Step 1: For the letter a_1, p_1 = 0.4, l_1 = −log_2 p_1 = 1.3219, ε_1 = 0.52877, and [l_1] = 2 bits are assigned to encode a_1. The self-information of the letter a_1 increases by the value

    D_1 = p_1 [l_1] − ε_1 = 0.4 · 2 − 0.52877 = 0.27123.

This amount of self-information is subtracted from the self-information of the remaining letters ε_2, ε_3, ε_4, and ε_5, in accordance with their probabilities, i.e. in the proportions D_1/3, D_1/3, D_1/6, and D_1/6, respectively.

. . .

Tables: Steps 3, 4, and 5.

After Step 3:

    a_i   p_i   ε'_i      D_i        ε̃_i   l'_i     [l'_i]
    1     0.4   0.52877   0.27123    0.8    1.3219   2
    2     0.2   0.37398   0.02602    0.4    1.8698   2
    3     0.2   0.36096   0.03904    0.4    1.8048   2
    4     0.1   0.28047   (receives −(1/2)·D_3 → 0.26096)
    5     0.1   0.28047   (receives −(1/2)·D_3 → 0.26096)

After Step 4:

    a_i   p_i   ε'_i      D_i        ε̃_i   l'_i     [l'_i]
    1     0.4   0.52877   0.27123    0.8    1.3219   2
    2     0.2   0.37398   0.02602    0.4    1.8698   2
    3     0.2   0.36096   0.03904    0.4    1.8048   2
    4     0.1   0.26096   0.03904    0.3    2.6096   3
    5     0.1   0.26096   (receives −1·D_4 → 0.22192)

After Step 5:

    a_i   p_i   ε'_i      D_i        ε̃_i   l'_i     [l'_i]
    1     0.4   0.52877   0.27123    0.8    1.3219   2
    2     0.2   0.37398   0.02602    0.4    1.8698   2
    3     0.2   0.36096   0.03904    0.4    1.8047   2
    4     0.1   0.26096   0.03904    0.3    2.6096   3
    5     0.1   0.22192   0.07808    0.3    2.2192   3

The last remainder D_5 of the self-information of a_5 equals the redundancy of the code:

    R = Σ_{k=1..5} p_k [l'_k] − ε = D_5 = 0.07808 bits/letter.

The lengths of the codewords are {2, 2, 2, 3, 3}. Therefore 12 bits are required for encoding the alphabet A_5 by the proposed method.
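To make the redistribution procedure concrete, the following is a minimal Python sketch of the length calculation for this decreasing-order case. The function name art_code_lengths and the ceiling-based rounding of l'_i are illustrative assumptions; the notes' handling of arbitrary probability orderings, where the remainder D_i can be negative (see Table 10 below), is not reproduced here.

```python
import math

def art_code_lengths(p):
    """Sketch of the self-information redistribution described above.

    Assumes the probabilities p are given in non-increasing order and
    that each adjusted length l'_i is rounded up to the next integer.
    """
    m = len(p)
    # Self-information carried by each letter: eps_i = p_i * log2(1/p_i).
    eps = [pi * math.log2(1.0 / pi) for pi in p]
    lengths = []
    for i in range(m):
        l_prime = eps[i] / p[i]              # adjusted length l'_i
        li = math.ceil(l_prime - 1e-12)      # [l'_i] bits (small tolerance for float noise)
        lengths.append(li)
        D = p[i] * li - eps[i]               # remainder of self-information D_i
        rest = sum(p[i + 1:])
        if rest > 0:
            # Redistribute D_i over the remaining letters, proportionally to p_k.
            for k in range(i + 1, m):
                eps[k] -= D * p[k] / rest
    return lengths

if __name__ == "__main__":
    p = [0.4, 0.2, 0.2, 0.1, 0.1]
    print(art_code_lengths(p))  # expected [2, 2, 2, 3, 3], as in the worked example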
Indeed, the Kraft-McMillan inequality is fulfilled, and the following code can be considered:

    c(a_1) = 00,  c(a_2) = 01,  c(a_3) = 10,  c(a_4) = 110,  c(a_5) = 111.

Other possible codes are {11, 00, 01, 100, 101}, {10, 11, 00, 010, 011}, and {01, 10, 11, 000, 001}.

The variance of the codeword length for these codes is 0.23664, and the average codeword length is 2.20 bits/letter. That is, the encoding procedure is optimal.

. . .

Now let the letters be processed with their probabilities in increasing order (Table 6):

    a_i   p_i   ε'_i      D_i        ε̃_i   l'_i      [l'_i]
    5     0.1   0.33219   0.06781    0.4    3.32193   4
    4     0.1   0.32466   0.07534    0.4    3.24659   4
    3     0.2   0.43048   0.16952    0.6    2.15241   3
    2     0.2   0.37398   0.02602    0.4    1.86988   2
    1     0.4   0.32192   0.07808    0.4    0.80480   1

The sequence of codeword lengths is {4, 4, 3, 2, 1}, which requires 14 bits for encoding the alphabet A_5. The variance of the codeword length is 0.58652, which is greater than the variance obtained in the previous example, but the average codeword length is the same, 2.20 bits/letter.

. . .

Since p_2 < p_1, ε_2 < ε_1, and l'_2 = 2.32193, only two bits are assigned to the codeword c(a_2), and the remainder of its self-information,

    −D_2 = ε_2 − 2 · 0.2 = 0.06439,

is added to the letters a_1, a_4, a_3, a_5 in accordance with their probabilities. In other words, we consider that l_2 = [l_2] + m_2, where 0 ≤ m_2 < 1.

Table 10:

    a_i   p_i   ε_i       ε'_i      D_i        ε̃_i   l'_i      [l'_i]
    2     0.2   0.46439   0.46439   −0.06439   0.4    2.32193   2
    1     0.4   0.52877   0.56097    0.23904   0.8    1.15241   2
    4     0.1   0.33219   0.28048    0.01952   0.3    2.84383   3
    3     0.2   0.46439   0.34795    0.05205   0.4    1.07309   2
    5     0.1   0.33219   0.22192    0.07808   0.3    2.21920   3

The codeable sequence of lengths is {2, 2, 2, 3, 3}.

. . .

Conclusions

A new approach was presented for computing the optimal codeword lengths for a given source described by a first-order model.

The main idea of this approach is to transfer and redistribute the self-information of the encoded symbols to the remaining symbols still to be encoded, at each stage of the calculation.

The algorithm is simpler than the Huffman code and provides optimal encoding of the source irrespective of the ordering of the symbol probabilities. Due to its simplicity, the proposed method can be used in real-time applications, as well as in applications that demand fixed transmission rates.
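As a closing illustration (not part of the original slides), the sketch below checks the Kraft-McMillan condition for the lengths {2, 2, 2, 3, 3} and builds one possible prefix code from them. It uses a standard canonical-code construction, which is not necessarily the separate combinatorial procedure referred to in the notes; the function names are illustrative.

```python
from fractions import Fraction

def kraft_sum(lengths):
    """Kraft-McMillan sum; a uniquely decodable code with these
    lengths exists iff the sum is <= 1."""
    return sum(Fraction(1, 2 ** l) for l in lengths)

def canonical_codewords(lengths):
    """Assign prefix-free codewords consistent with the given lengths
    (canonical-code style construction)."""
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])
    codes = [None] * len(lengths)
    code, prev_len = 0, lengths[order[0]]
    for idx in order:
        code <<= lengths[idx] - prev_len          # pad up to the next length
        codes[idx] = format(code, "0{}b".format(lengths[idx]))
        prev_len = lengths[idx]
        code += 1
    return codes

if __name__ == "__main__":
    p = [0.4, 0.2, 0.2, 0.1, 0.1]
    lengths = [2, 2, 2, 3, 3]
    print(kraft_sum(lengths))                          # 1 -> condition holds
    print(canonical_codewords(lengths))                # ['00', '01', '10', '110', '111']
    print(sum(pi * li for pi, li in zip(p, lengths)))  # average length: 2.2 bits/letter
```

Run as written, this reproduces the code c(a_1) = 00, ..., c(a_5) = 111 and the average length of 2.20 bits/letter quoted above.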