Huffman Codes and Algorithms: Lecture 22 & 23, Study notes of Algorithms and Programming

These notes cover Huffman codes, a greedy algorithm used for data compression. They show how to store a chromosome of 130 million characters using 2 bits per character instead of 1 byte, introduce the concept of prefix codes, and present the Huffman encoding algorithm, the properties of its optimal solution, and a greedy algorithm for implementing it.

Typology: Study notes

Pre 2010

Uploaded on 09/17/2009



Design and analysis of algorithms
Lecture 22 & 23
Edyta Szymańska, edyta@cc.gatech.edu
CS3510 A, Fall 2005

Huffman codes - another greedy algorithm

Store a map of a chromosome, i.e. a string of 130 million characters: A, C, G, T.

How to do this?
Default: 1 byte per character? NO! 2 bits per character suffice:
A : 00, C : 01, G : 10, T : 11, total = 260 megabits used.

Extra information: the characters appear in the string with different frequencies, namely
f[A] = 70 · 10^6, f[C] = 3 · 10^6, f[G] = 20 · 10^6, f[T] = 37 · 10^6.
Thus, it should be worth assigning a shorter bit string to A than to C.

Prefix codes

Prefix codes: no codeword can be a prefix of a different codeword (to avoid ambiguity in decoding).
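The prefix condition can be checked mechanically. The following is a minimal Python sketch, not part of the lecture: the function name is_prefix_free and the deliberately ambiguous code AMBIGUOUS are illustrative assumptions.

```python
def is_prefix_free(code):
    """Return True if no codeword is a prefix of a different codeword.

    `code` maps each character to its codeword, written as a string of 0s and 1s.
    """
    words = list(code.values())
    for i, a in enumerate(words):
        for j, b in enumerate(words):
            # Compare every ordered pair of distinct entries.
            if i != j and b.startswith(a):
                return False
    return True


# The fixed-length code from the slides: 2 bits per character.
FIXED = {"A": "00", "C": "01", "G": "10", "T": "11"}

# Hypothetical code (an assumption for illustration): "0" is a prefix of "01".
AMBIGUOUS = {"A": "0", "C": "01", "G": "10", "T": "11"}

print(is_prefix_free(FIXED))      # True
print(is_prefix_free(AMBIGUOUS))  # False
```

With the ambiguous code, the bit string 010 decodes both as AG (0, 10) and as CA (01, 0), which is the situation the prefix condition rules out.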
The following is a proper prefix code:
A : 0, C : 110, G : 111, T : 10
Total number of bits used: 213 · 10^6, an improvement of about 18% (47 · 10^6 of 260 · 10^6 bits saved).

Representation: binary tree (each edge is labeled 0 or 1, each internal node carries the sum of the frequencies below it, in millions)

root
  0: A[70]
  1: [60]
    0: T[37]
    1: [23]
      0: C[3]
      1: G[20]

Huffman encoding algorithm

The tree representation also provides a decoding scheme.

Properties of the optimal solution:
The optimal solution is represented by a full binary tree.
The two characters with the smallest frequencies must be together at the bottom of the tree, as children of the lowest internal node of the tree.

Greedy algorithm:

HUFFMAN(C)
  n := |C|
  Q := C
  for i := 1 to n − 1 do
    allocate a new node z
    left[z] := x := EXTRACT-MIN(Q)
    right[z] := y := EXTRACT-MIN(Q)
    f[z] := f[x] + f[y]
    INSERT(Q, z)
  return EXTRACT-MIN(Q)

Example:
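The lecture's own example is not reproduced here. As a stand-in, the following is a minimal Python sketch of the greedy procedure, run on the chromosome frequencies (in millions). It assumes a binary heap (the standard heapq module) as the priority queue Q, single-character string symbols, and at least two symbols; the names huffman_tree, codewords, and decode are illustrative, not from the lecture. The 0/1 labels it produces may differ from the codewords listed above, but the codeword lengths and the total cost agree.

```python
import heapq
from itertools import count


def huffman_tree(freq):
    """Greedy construction: repeatedly merge the two least frequent nodes.

    Leaves are symbols (strings); internal nodes are (left, right) tuples.
    """
    tiebreak = count()  # keeps heap entries comparable when frequencies tie
    heap = [(f, next(tiebreak), sym) for sym, f in freq.items()]
    heapq.heapify(heap)
    for _step in range(len(freq) - 1):
        fx, _, x = heapq.heappop(heap)  # x := EXTRACT-MIN(Q)
        fy, _, y = heapq.heappop(heap)  # y := EXTRACT-MIN(Q)
        heapq.heappush(heap, (fx + fy, next(tiebreak), (x, y)))  # INSERT(Q, z)
    return heap[0][2]


def codewords(tree, prefix=""):
    """Read codewords off the tree: left edge = 0, right edge = 1."""
    if not isinstance(tree, tuple):
        return {tree: prefix}
    left, right = tree
    table = codewords(left, prefix + "0")
    table.update(codewords(right, prefix + "1"))
    return table


def decode(bits, tree):
    """Walk from the root; emit a symbol and restart whenever a leaf is hit."""
    out, node = [], tree
    for bit in bits:
        node = node[int(bit)]
        if not isinstance(node, tuple):
            out.append(node)
            node = tree
    return "".join(out)


FREQ = {"A": 70, "C": 3, "G": 20, "T": 37}  # frequencies in millions
tree = huffman_tree(FREQ)
code = codewords(tree)
total = sum(FREQ[s] * len(code[s]) for s in FREQ)
print(code)
print(total)  # 213, i.e. 213 million bits

message = "GATTACA"
encoded = "".join(code[ch] for ch in message)
print(decode(encoded, tree) == message)  # True
```

With a binary heap, each EXTRACT-MIN and INSERT costs O(log n), so the n − 1 iterations take O(n log n) time overall.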