Huffman Coding, Study notes of Design and Analysis of Algorithms


Typology: Study notes

2021/2022

Uploaded on 09/07/2022

adnan_95 🇮🇶


CSE 5311 Design and Analysis of Algorithms, Dept. CSE, UT Arlington
Lecture 17 Greedy algorithms: Huffman Coding
Junzhou Huang, Ph.D.
Department of Computer Science and Engineering

Data Compression
• Suppose we have a 1,000,000,000-character (1G) data file that we wish to include in an email.
• Suppose the file contains only the 26 letters {a,…,z}.
• Suppose each letter $\alpha$ in {a,…,z} occurs with frequency $f_\alpha$.
• Suppose we encode each letter with a binary code.
• If we use a fixed-length code, we need 5 bits for each character.
• The resulting message length is $5\,(f_a + f_b + \cdots + f_z)$ bits.
• Can we do better?

How to decode?
• At first it is not obvious how decoding will happen, but it is possible if we use prefix codes.

Prefix Codes
• No encoding of a character can be a prefix of the longer encoding of another character. For example, we could not encode t as 01 and x as 01101, since 01 is a prefix of 01101.
• A binary tree representation generates a prefix code, provided all letters are leaves.

Prefix codes
• A message can be decoded uniquely.
• Follow the tree from the root until you reach a leaf, output that letter, then return to the root and repeat.
• Draw a few more trees and produce the codes!
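The decoding rule above (read bits until a complete codeword is reached, emit its letter, start over) can be sketched in a few lines of Python. This is only an illustration: the three-symbol table `TOY_CODE` and the function name `decode` are assumptions made for the example and do not come from the lecture. Matching against a dictionary of codewords is equivalent to walking the code tree from the root to a leaf, because no codeword is a prefix of another.

```python
def decode(bits, code_table):
    """Decode a string of '0'/'1' characters using a prefix code.

    code_table maps each symbol to its codeword. Because the code is
    prefix-free, the first codeword that matches is the only possible one.
    """
    inverse = {code: symbol for symbol, code in code_table.items()}
    decoded = []
    current = ""
    for bit in bits:
        current += bit              # walk one edge further down the tree
        if current in inverse:      # reached a leaf
            decoded.append(inverse[current])
            current = ""            # return to the root
    if current:
        raise ValueError("trailing bits do not form a complete codeword")
    return decoded


# Hypothetical prefix code, chosen only for this illustration.
TOY_CODE = {"a": "0", "b": "10", "c": "11"}

message = decode("010110", TOY_CODE)
print(message)
```

With the t = 01, x = 01101 assignment that the slide rules out, this loop would always emit t as soon as it saw 01 and could never produce x, which is why the prefix property matters.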
Greedy Algorithms
• Many optimization problems can be solved using a greedy approach
  – The basic principle is that locally optimal decisions may be used to build an optimal solution
  – But the greedy approach does not lead to an optimal solution for every problem
  – The key is knowing which problems work with this approach and which do not
• We will study
  – The problem of generating Huffman codes

Greedy algorithms
• A greedy algorithm always makes the choice that looks best at the moment
  – My everyday examples:
    Driving in Los Angeles, NY, or Boston for that matter
    Playing cards
    Investing in stocks
    Choosing a university
  – The hope: a locally optimal choice will lead to a globally optimal solution
  – For some problems, it works
• Greedy algorithms tend to be easier to code

David Huffman's idea
• A term paper at MIT
• Build the tree (code) bottom-up in a greedy fashion
• Origami aficionado

Building the Encoding Tree
[Figures not included in the text preview]

Lemma 16.2
• Setting (as in the standard exchange argument): T is an optimal tree, a and b are sibling leaves of maximum depth in T, x and y are the two characters of lowest frequency, T' is T with a and x exchanged, and T'' is T' with b and y exchanged.
• Without loss of generality, assume $f[a] \le f[b]$ and $f[x] \le f[y]$
• The cost difference between T and T' is

$$
\begin{aligned}
B(T) - B(T') &= \sum_{c \in C} f[c]\, d_T(c) - \sum_{c \in C} f[c]\, d_{T'}(c) \\
&= f[x]\, d_T(x) + f[a]\, d_T(a) - f[x]\, d_{T'}(x) - f[a]\, d_{T'}(a) \\
&= f[x]\, d_T(x) + f[a]\, d_T(a) - f[x]\, d_T(a) - f[a]\, d_T(x) \\
&= (f[a] - f[x])\,(d_T(a) - d_T(x)) \ge 0
\end{aligned}
$$

• Exchanging b and y in the same way gives $B(T') - B(T'') \ge 0$, so $B(T'') \le B(T)$. But T is optimal, so $B(T) \le B(T'')$, and hence $B(T'') = B(T)$.
• Therefore T'' is an optimal tree in which x and y appear as sibling leaves of maximum depth.
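The exchange step in Lemma 16.2 can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the lecture: the frequencies and leaf depths are made up, the helper names `cost` and `swap_leaves` are hypothetical, and the starting tree is deliberately not optimal so that each exchange visibly lowers the cost. It represents a tree only by the depth of each leaf, which is all that $B(T) = \sum_c f[c]\, d_T(c)$ depends on.

```python
def cost(freq, depth):
    """B(T): sum over all characters of frequency times leaf depth."""
    return sum(freq[c] * depth[c] for c in freq)


def swap_leaves(depth, p, q):
    """Return the depth map after exchanging the positions of leaves p and q."""
    swapped = dict(depth)
    swapped[p], swapped[q] = depth[q], depth[p]
    return swapped


# Made-up example: x and y have the lowest frequencies,
# a and b are sibling leaves at maximum depth.
freq = {"c": 10, "x": 2, "y": 3, "a": 7, "b": 8}
depth_T = {"c": 1, "x": 2, "y": 3, "a": 4, "b": 4}

depth_T1 = swap_leaves(depth_T, "a", "x")    # T'  : exchange a and x
depth_T2 = swap_leaves(depth_T1, "b", "y")   # T'' : exchange b and y

diff_1 = cost(freq, depth_T) - cost(freq, depth_T1)
diff_2 = cost(freq, depth_T1) - cost(freq, depth_T2)

# Closed forms from the lemma's derivation.
closed_1 = (freq["a"] - freq["x"]) * (depth_T["a"] - depth_T["x"])
closed_2 = (freq["b"] - freq["y"]) * (depth_T1["b"] - depth_T1["y"])

print(diff_1, closed_1)
print(diff_2, closed_2)
```

Both differences are non-negative, matching the sign argument in the derivation: the frequency factor is non-negative because x and y are the least frequent characters, and the depth factor is non-negative because a and b sit at maximum depth.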
Correctness of Huffman's Algorithm
• Here T' is the tree obtained from T by replacing the sibling leaves x and y with a single leaf z, where $f[z] = f[x] + f[y]$.
• Observation: $B(T) = B(T') + f[x] + f[y] \;\Rightarrow\; B(T') = B(T) - f[x] - f[y]$
  – For each $c \in C - \{x, y\}$: $d_T(c) = d_{T'}(c) \;\Rightarrow\; f[c]\, d_T(c) = f[c]\, d_{T'}(c)$
  – $d_T(x) = d_T(y) = d_{T'}(z) + 1$
  – $f[x]\, d_T(x) + f[y]\, d_T(y) = (f[x] + f[y])(d_{T'}(z) + 1) = f[z]\, d_{T'}(z) + (f[x] + f[y])$

B(T') = B(T) - f[x] - f[y]
• $B(T) = 45 \cdot 1 + 12 \cdot 3 + 13 \cdot 3 + 5 \cdot 4 + 9 \cdot 4 + 16 \cdot 3 = 224$
• Merge the leaves with frequencies 5 and 9 into z:14
• $B(T') = 45 \cdot 1 + 12 \cdot 3 + 13 \cdot 3 + (5 + 9) \cdot 3 + 16 \cdot 3 = 210 = B(T) - 5 - 9$

Example: Huffman Coding
• We then pick the two nodes with the smallest frequencies and combine them to form a new node
  – The selection of these nodes is the greedy part
• The two selected nodes are removed from the set and replaced by the combined node
• This continues until only one node is left in the set

Example: Huffman Coding
[Figures: tree construction. Only the leaves and the internal-node weights at each shown step survive in the text preview.]
Leaves (character, frequency): e,3 d,2 u,2 l,2 sp,2 k,1 b,1 v,1 i,1 s,1
Internal-node weights at the shown steps:
• 2
• 2, 2, 3, 4
• 2, 2, 3, 4, 4
• 2, 2, 3, 4, 4, 5
• 2, 2, 3, 4, 4, 5, 7, 9, 16 (root)
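The merge loop described in the example (repeatedly remove the two lowest-frequency nodes and put back their combination) maps naturally onto a min-heap. The following is a minimal sketch using Python's standard `heapq` module, run on the example's leaf frequencies. The function name `build_huffman_tree`, the tuple-based tree representation, and the tie-breaking counter are choices made for this illustration, not something the slides specify; the sketch also assumes a non-empty frequency table.

```python
import heapq
from itertools import count


def build_huffman_tree(freqs):
    """Greedy bottom-up construction of a Huffman tree.

    Returns (tree, merged_weights). A leaf is a symbol string, an internal
    node is a (left, right) tuple, and merged_weights records the weight of
    each internal node in the order it was created.
    """
    tie = count()  # unique tie-breaker so the heap never compares nodes
    heap = [(weight, next(tie), symbol) for symbol, weight in freqs.items()]
    heapq.heapify(heap)

    merged_weights = []
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)    # smallest remaining weight
        w2, _, right = heapq.heappop(heap)   # second smallest
        merged_weights.append(w1 + w2)
        heapq.heappush(heap, (w1 + w2, next(tie), (left, right)))

    return heap[0][2], merged_weights


# Leaf frequencies from the example above ("sp" stands for the space character).
EXAMPLE_FREQS = {"e": 3, "d": 2, "u": 2, "l": 2, "sp": 2,
                 "k": 1, "b": 1, "v": 1, "i": 1, "s": 1}

tree, merged_weights = build_huffman_tree(EXAMPLE_FREQS)
print(merged_weights)        # weights of the internal nodes, in creation order
print(sum(merged_weights))   # equals the total encoded length B(T) in bits
```

The sequence of internal-node weights depends only on the frequencies, so it matches the weights in the example regardless of how ties are broken. The shape of the tree, and therefore the individual codewords, can differ between equally good tie-breaking choices.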
Example: Huffman Coding
• Now we assign codes to the tree by placing a 0 on every left branch and a 1 on every right branch
• A traversal of the tree from the root to a leaf gives the Huffman code for that leaf's character
• Note that no code is the prefix of another code

Example: Huffman Coding
[Figure: final tree with internal-node weights 2, 2, 3, 4, 4, 5, 7, 9, 16]

Character   Frequency   Code
e           3           00
d           2           010
u           2           011
l           2           100
sp          2           101
i           1           1100
s           1           1101
k           1           1110
b           1           11110
v           1           11111
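The final code table can be checked mechanically. The sketch below copies the codewords and frequencies from the example and verifies two things: that no codeword is a prefix of another, and how many bits the 16-character message takes. The helper names are hypothetical, and the comparison against a fixed-length code (4 bits per character for 10 distinct symbols) is a calculation added for this illustration rather than a figure from the slides.

```python
from math import ceil, log2

# Codewords and frequencies copied from the example's final table.
SLIDE_CODES = {"e": "00", "d": "010", "u": "011", "l": "100", "sp": "101",
               "i": "1100", "s": "1101", "k": "1110",
               "b": "11110", "v": "11111"}
SLIDE_FREQS = {"e": 3, "d": 2, "u": 2, "l": 2, "sp": 2,
               "i": 1, "s": 1, "k": 1, "b": 1, "v": 1}


def is_prefix_free(codes):
    """True if no codeword is a prefix of a different codeword."""
    words = list(codes.values())
    return not any(
        i != j and words[j].startswith(words[i])
        for i in range(len(words))
        for j in range(len(words))
    )


def encoded_length(freqs, codes):
    """Total bits: sum of frequency times codeword length."""
    return sum(freqs[c] * len(codes[c]) for c in freqs)


huffman_bits = encoded_length(SLIDE_FREQS, SLIDE_CODES)

# Fixed-length baseline: enough bits to distinguish all symbols.
bits_per_symbol = ceil(log2(len(SLIDE_FREQS)))
fixed_bits = bits_per_symbol * sum(SLIDE_FREQS.values())

print(is_prefix_free(SLIDE_CODES))
print(huffman_bits, fixed_bits)
```

The Huffman total equals the sum of the internal-node weights in the tree (2 + 2 + 3 + 4 + 4 + 5 + 7 + 9 + 16), since each internal node contributes its weight once for every level it adds to the leaves beneath it.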