Download Huffman Coding-Advance Analysis Design-Lecture Slides and more Slides Design and Analysis of Algorithms in PDF only on Docsity! Lecture No. 26 Huffman Coding Dr Nazir A. Zafar Advanced Algorithms Analysis and Design docsity.com • Huffman Problem • Problem Analysis – Binary coding techniques – Prefix codes • Algorithm of Huffman Coding Problem • Time Complexity • Road Trip Problem – Analysis and Greedy Algorithm • Conclusion Dr Nazir A. Zafar Advanced Algorithms Analysis and Design Today Covered docsity.com Refinement in Text Encoding Dr Nazir A. Zafar Advanced Algorithms Analysis and Design • Now a better code is given by the following encoding: ‹space› = 000, A = 0010, E = 0011, s = 010, c = 0110, g = 0111, h = 1000, i = 1001, l = 1010, t = 1011, e = 110, n = 111 • Then we encode the phrase as 0010 (A) 111 (n) 000 ( ) 0011 (E) 111 (n) 0111 (g) 1010 (l) 1001 (i) 010 (s) 1000 (h) 000 ( ) 010 (s) 110 (e) 111 (n) 1011 (t) 110 (e) 111 (n) 0110 (c) 110 (e) • This requires 65 bits ≈ 9 bytes. Much improvement. • The technique behind this improvement, i.e., Huffman coding which we will discuss later on. docsity.com There are many ways to represent a file of information. Binary Character Code (or Code) – each character represented by a unique binary string. • Fixed-Length Code – If = {0, 1} then – All possible combinations of two bit strings x = {00, 01, 10, 11} – If there are less than four characters then two bit strings enough – If there are less than three characters then two bit strings not economical Major Types of Binary Coding Dr. Nazir A. Zafar Advanced Algorithms Analysis and Design docsity.com • Fixed-Length Code – All possible combinations of three bit strings x x = {000, 001, 010, 011, 100, 101, 110, 111} – If there are less than nine characters then three bit strings enough – If there are less than five characters then three bit strings not economical and can be considered two bit strings – If there are six characters then needs 3 bits to represent, following could be one representation. a = 000, b = 001, c = 010, d = 011, e = 100, f = 101 Dr. Nazir A. Zafar Advanced Algorithms Analysis and Design Fixed Length Code docsity.com Why are prefix codes? Dr Nazir A. Zafar Advanced Algorithms Analysis and Design • Encoding simple for any binary character code; • Decoding also easy in prefix codes. This is because no codeword is a prefix of any other. Example 1 • If a = 0, b = 101, and c = 100 in prefix code then the string: 0101100 is coded as 0·101·100 Example 2 • In code words: {0, 1, 10, 11}, receiver reading “1” at the start of a code word would not know whether – that was complete code word “1”, or – prefix of the code word “10” or of “11” docsity.com • Tree representation of prefix codes Prefix codes and binary trees Dr. Nazir A. Zafar Advanced Algorithms Analysis and Design A 00 B 010 C 0110 D 0111 E 10 F 11 A 0 B 0 0 0 0 1 1 1 1 1 C D E F docsity.com • In Huffman coding, variable length code is used • Data considered to be a sequence of characters. • Huffman codes are a widely used and very effective technique for compressing data – Savings of 20% to 90% are typical, depending on the characteristics of the data being compressed. • Huffman’s greedy algorithm uses a table of the frequencies of occurrence of the characters to build up an optimal way of representing each character as a binary string. • Now let us see an example to understand the concepts used in Huffman coding Huffman Codes Dr. Nazir A. Zafar Advanced Algorithms Analysis and Design docsity.com • Given a tree T corresponding to a prefix code. For each character c in the alphabet C, – let f (c) denote the frequency of c in the file and – let dT(c) denote the depth of c’s leaf in the tree. – dT(c) is also the length of the codeword for character c. – The number of bits required to encode a file is – which we define as the cost of the tree T. Cost of Tree Corresponding to Prefix Code Dr. Nazir A. Zafar Advanced Algorithms Analysis and Design Cc T cdcfTB )()()( docsity.com Huffman (C) 1 n ← |C| 2 Q ← C 3 for i ← 1 to n - 1 4 do allocate a new node z 5 left[z] ← x ← Extract-Min (Q) 6 right[z] ← y ← Extract-Min (Q) 7 f [z] ← f [x] + f [y] 8 Insert (Q, z) 9 return Extract-Min(Q) Return root of the tree. Algorithm: Constructing a Huffman Codes Dr. Nazir A. Zafar Advanced Algorithms Analysis and Design docsity.com Example: Constructing a Huffman Codes Dr. Nazir A. Zafar Advanced Algorithms Analysis and Design a:45b:13c:12 d:16e:9f:5 The initial set of n = 6 nodes, one for each letter. Number of iterations of loop are 1 to n-1 (6-1 = 5) Q: docsity.com Constructing a Huffman Codes Dr. Nazir A. Zafar Advanced Algorithms Analysis and Design for i ← 3 Allocate a new node z left[z] ← x ← Extract-Min (Q) = z:14 right[z] ← y ← Extract-Min (Q) = d:16 f [z] ← f [x] + f [y] (14 + 16 = 30) Insert (Q, z) Q: a:45 0 25 c:12 b:13 1 0 30 14 f:5 e:9 d:16 10 1 Q: a:45d:16 0 14 f:5 e:9 1 0 25 c:12 b:13 1 docsity.com Constructing a Huffman Codes Dr. Nazir A. Zafar Advanced Algorithms Analysis and Design for i ← 4 Allocate a new node z left[z] ← x ← Extract-Min (Q) = z:25 right[z] ← y ← Extract-Min (Q) = z:30 f [z] ← f [x] + f [y] (25 + 30 = 55) Insert (Q, z) Q: a:45 0 55 25 30 14 f:5 e:9 c:12 b:13 d:16 10 1010 1 Q: a:45 0 25 c:12 b:13 1 0 30 14 f:5 e:9 d:16 10 1 docsity.com Constructing a Huffman Codes Dr. Nazir A. Zafar Advanced Algorithms Analysis and Design for i ← 5 Allocate a new node z left[z] ← x ← Extract-Min (Q) = a:45 right[z] ← y ← Extract-Min (Q) = z:55 f [z] ← f [x] + f [y] (45 + 55 = 100) Insert (Q, z) Q: 0 100 55 25 30 14 f:5 e:9 c:12 b:13 d:16 a:45 10 10 1010 1 docsity.com