Comp 411 Computer Organization: Problem Set #1 Solutions - Prof. Leona Mcmillan, Assignments of Computer Architecture and Organization

Solutions to problem set #1 in the comp 411 computer organization course, covering topics such as nucleic acid representation, codon representation, and error detection in binary matrices. It includes calculations and explanations for each problem.

Typology: Assignments

Pre 2010

Uploaded on 03/10/2009

koofers-user-wbr
Comp 411 Computer Organization
Fall 2007
Problem Set #1 Solutions

Problem 1

a) If all nucleic acids are equally likely, a 3-nucleic acid sequence can be represented with 6 bits:

$\log_2\left(\frac{4 \cdot 4 \cdot 4}{1}\right) = \log_2\left(\frac{64}{1}\right) = 6$

Given that there are 20 amino acids and 1 stop code, the minimal number of bits to encode a single item in a protein chain is:

$\log_2\left(\frac{21}{1}\right) \approx 4.39$

The numbers do not agree. An explanation should note something along the lines of less information being stored in the amino acid representation, or the smaller set size of the amino acids.

b) There are 64 codons, and 61 of them encode amino acids. Since the only information given is that the codon represents an amino acid, the bits conveyed are:

$\log_2\left(\frac{64}{61}\right) \approx 0.07$

Similarly, for the 3 stop codes:

$\log_2\left(\frac{64}{3}\right) \approx 4.42$

Since there are 6 possible codons for Serine:

$\log_2\left(\frac{64}{6}\right) \approx 3.42$

c) There are 37 codons that contain the T nucleotide and 64 possible codons. Thus, the bits conveyed are:

$\log_2\left(\frac{64}{37}\right) \approx 0.79$

To figure out how many bits of information are added by knowing that the codon is a stop code, subtract the original amount of information conveyed from the new amount of information conveyed:

$\log_2\left(\frac{64}{3}\right) - \log_2\left(\frac{64}{37}\right) \approx 4.415 - 0.791 \approx 3.62$

You could also simply use $\log_2\left(\frac{37}{3}\right)$ to find the amount of additional information.

d) There are 20 amino acids, 3 of which are bases:

$\log_2\left(\frac{20}{3}\right) \approx 2.74$

There are 64 possible codons and 10 of them encode bases:

$\log_2\left(\frac{64}{10}\right) \approx 2.68$

There are 2 ways to encode Lysine, so:

$\log_2\left(\frac{64}{2}\right) - \log_2\left(\frac{64}{10}\right) \approx 5 - 2.68 \approx 2.32$

Again, you could also simply evaluate $\log_2\left(\frac{10}{2}\right)$ to find the amount of additional information.

e) Across the set of 64 codons, there are 32 available transitions at each position. Thus, let us consider each position in a codon and its influence on the final value.
For the right-most position, there is only one transition that changes the resulting amino acid: Isoleucine ↔ Methionine. For the middle position, all 32 of the 32 possible transitions change the protein. For the left-most position, 30 of the 32 transitions change the protein (TTA ↔ CTA and TTG ↔ CTG do not). Thus, the number of bits conveyed is:

$\log_2\left(\frac{32 + 32 + 32}{1 + 32 + 30}\right) = \log_2\left(\frac{96}{63}\right) \approx 0.61$

f) In the case of Glycine, since the first 2 nucleic acids are all that is needed to identify the resulting amino acid, no bits of information are conveyed in the last nucleic acid.

g) Using the entropy formula given in the lecture, $-\sum_i p_i \log_2(p_i)$, the entropy of the codes is:

$-(0.24 \log_2(0.24) + 0.14 \log_2(0.14) + 0.12 \log_2(0.12) + 0.5 \log_2(0.5))$
$\approx -(0.24(-2.06) + 0.14(-2.84) + 0.12(-3.06) + 0.5(-1))$
$\approx -(-0.49 - 0.40 - 0.37 - 0.50)$
$\approx 1.76$

The bits wasted using a fixed-length scheme are:

$2 - 1.76 \approx 0.24$

h) The string 0011011101010 can be decoded as follows (only the last acid in the codon is shown):

$\underbrace{0}_{C}\ \underbrace{0}_{C}\ \underbrace{110}_{A}\ \underbrace{111}_{T}\ \underbrace{0}_{C}\ \underbrace{10}_{G}\ \underbrace{10}_{G}\ \underbrace{10}_{G}$

i) The expected length is:

$1000(0.5 \cdot 1 + 0.24 \cdot 2 + 0.14 \cdot 3 + 0.12 \cdot 3) = 1760$

Since GGT and GGA have an encoded length of 3, the worst case is:

$1000(3) = 3000$

Answers will vary, but generally, when compared to $1000 \cdot \log_2\left(\frac{4}{1}\right) = 2000$, the worst case of 3000 (a 50% increase) seems very poor.

3. Error at 2,2
Corrected matrix:
000
101
10

4. Error in row 1 parity bit
Corrected matrix:
0110
1001
0110
100

c) Answers will vary. Should include mention of 2 errors in a single row or column.

d) 13 is the only index that is present in p0, p2, and p3. Thus the bit at index 13 has the error. Since p0, p2, and p3 marked the error, e0 = 1, e1 = 0, e2 = 1, e3 = 1, and e4 = 0. Written in the order e4 e3 e2 e1 e0, the binary representation is 01101. Note that e0 represents the least significant bit.

e) Consider that the index with the error is 13 and the binary representation of checked error bits is 01101.
The error bits (01101) encode the index of the error (13) in binary form.

f) Answers will vary, but they should mention that certain combinations of double-bit errors are not detectable. Solutions may include such things as a parity bit that checks the other parity bits.

Problem 4

a) Since there are 10 possibilities:

$\log_2\left(\frac{10}{1}\right) \approx 3.32$

b) f(d) encodes as follows:

$f(d) = \begin{cases} 1 & d = 0 \\ 2 & 1 \le d \le 2 \\ 4 & 3 \le d \le 5 \\ 8 & d \ge 6 \end{cases}$

The probabilities of f(d) are:

$p(f(d)) = \begin{cases} \frac{1}{10} & f(d) = 1 \\ \frac{2}{10} & f(d) = 2 \\ \frac{3}{10} & f(d) = 4 \\ \frac{4}{10} & f(d) = 8 \end{cases}$

In the case of 1, the information about d is:

$\log_2\left(\frac{10}{1}\right) \approx 3.32$

For 8, the information conveyed is:

$\log_2\left(\frac{10}{4}\right) \approx 1.32$

c) The average, using the formula $\sum_i p_i \log_2\left(\frac{1}{p_i}\right)$, is:

$\frac{1}{10} \log_2\left(\frac{10}{1}\right) + \frac{1}{5} \log_2\left(\frac{5}{1}\right) + \frac{3}{10} \log_2\left(\frac{10}{3}\right) + \frac{4}{10} \log_2\left(\frac{10}{4}\right)$
$\approx \frac{1}{10}(3.32) + \frac{1}{5}(2.32) + \frac{3}{10}(1.74) + \frac{4}{10}(1.32)$
$\approx 1.85$

d) Since step 2 always operates on a pair of elements, it will repeat until there is a single element left in S. Thus, it will iterate n − 1 times.

e) Answers will vary. Should be similar to this:

1
    6/10
        3/10 (4)
        3/10
            2/10 (2)
            1/10 (1)
    4/10 (8)

Element   Encoding   Probability
1         011        1/10
2         010        2/10
4         00         3/10
8         1          4/10
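The numbers in Problem 4 c) through e) can be cross-checked with a short script. The following is a minimal Python sketch, not the course's reference implementation: the names PROBS, huffman_code, expected_length, and entropy are hypothetical, and the rule for assigning 0 and 1 to the two merged subtrees is an arbitrary choice, so the exact bit patterns may differ from the table in e) while the code lengths should agree.

```python
import heapq
from fractions import Fraction
from math import log2

# Probabilities of f(d) from part b): 1 -> 1/10, 2 -> 2/10, 4 -> 3/10, 8 -> 4/10.
PROBS = {1: Fraction(1, 10), 2: Fraction(2, 10),
         4: Fraction(3, 10), 8: Fraction(4, 10)}


def huffman_code(probs):
    """Build a Huffman code; return ({symbol: bitstring}, number_of_merges)."""
    # Heap entries are (probability, tie_breaker, {symbol: partial_code}).
    # The integer tie_breaker keeps comparisons away from the dicts.
    heap = [(p, i, {sym: ""}) for i, (sym, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    merges = 0
    while len(heap) > 1:
        # Remove the two least probable elements and merge them.
        p_lo, _, lo = heapq.heappop(heap)
        p_hi, _, hi = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in lo.items()}   # arbitrary: lower gets 0
        merged.update({s: "1" + c for s, c in hi.items()})
        heapq.heappush(heap, (p_lo + p_hi, next_id, merged))
        next_id += 1
        merges += 1
    return heap[0][2], merges


def expected_length(probs, codes):
    """Average number of bits per symbol under the given code."""
    return sum(p * len(codes[s]) for s, p in probs.items())


def entropy(probs):
    """Average information, sum of p * log2(1/p), as in part c)."""
    return sum(float(p) * log2(1 / float(p)) for p in probs.values())


if __name__ == "__main__":
    codes, merges = huffman_code(PROBS)
    for sym in sorted(codes):
        print(sym, codes[sym], PROBS[sym])
    print("merges:", merges)
    print("expected length:", float(expected_length(PROBS, codes)))
    print("entropy:", round(entropy(PROBS), 2))
```

With four symbols the loop performs three merges, which matches the n − 1 iterations argued in d). The resulting code lengths (3, 3, 2, 1) give an expected length of 1.9 bits per symbol, slightly above the 1.85-bit average from c).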