Comp 411 Computer Organization: Problem Set #1 Solutions - Prof. Leona Mcmillan, Assignments of Computer Architecture and Organization

University of North Carolina (UNC) - Chapel Hill Computer Architecture and Organization

Prof. Leona Mcmillan

Solutions to Problem Set #1 in the Comp 411 Computer Organization course, covering nucleic acid representation, codon representation, and error detection in binary matrices, with calculations and explanations for each problem.

Typology: Assignments

Pre 2010

Uploaded on 03/10/2009

koofers-user-wbr 🇺🇸

Partial preview of the text

Comp 411 Computer Organization
Fall 2007
Problem Set #1 Solutions

Problem 1

a) If all nucleic acids are equally likely, a 3-nucleic-acid sequence can be represented with 6 bits:

$$\log_2\left(\frac{4 \cdot 4 \cdot 4}{1}\right) = \log_2\left(\frac{64}{1}\right) = 6$$

Given that there are 20 amino acids and 1 stop code, the minimal number of bits needed to encode a single item in a protein chain is:

$$\log_2\left(\frac{21}{1}\right) \approx 4.39$$

The numbers do not agree. An explanation should note something along the lines of less information being stored in the amino acid representation, or the smaller set size of the amino acids.

b) There are 64 codons, and 61 of them encode amino acids. Since the only information given is that the codon represents an amino acid, the bits conveyed are:

$$\log_2\left(\frac{64}{61}\right) \approx 0.07$$

Similarly, for the 3 stop codes:

$$\log_2\left(\frac{64}{3}\right) \approx 4.42$$

Since there are 6 possible codons for Serine:

$$\log_2\left(\frac{64}{6}\right) \approx 3.42$$

c) There are 37 codons that contain the T nucleotide, out of 64 possible codons. Thus, the bits conveyed are:

$$\log_2\left(\frac{64}{37}\right) \approx 0.79$$

To find how many bits of information are added by knowing that the codon is a stop code, subtract the original amount of information conveyed from the new amount of information conveyed:

$$\log_2\left(\frac{64}{3}\right) - \log_2\left(\frac{64}{37}\right) \approx 4.415 - 0.791 \approx 3.62$$

You could also simply use $\log_2\left(\frac{37}{3}\right)$ to find the amount of additional information.

d) There are 20 amino acids, 3 of which are bases:

$$\log_2\left(\frac{20}{3}\right) \approx 2.74$$

There are 64 possible codons, and 10 of them encode bases:

$$\log_2\left(\frac{64}{10}\right) \approx 2.68$$

There are 2 ways to encode Lysine, so:

$$\log_2\left(\frac{64}{2}\right) - \log_2\left(\frac{64}{10}\right) \approx 5 - 2.68 \approx 2.32$$

Again, you could also simply evaluate $\log_2\left(\frac{10}{2}\right)$ to find the amount of additional information.

e) Across the set of 64 codons, there are 32 available transitions at each position. Thus, let us consider each position in a codon and its influence on the final value.
For the right-most position, there is only one transition that changes the resulting amino acid: Isoleucine ↔ Methionine. For the middle position, all 32 of the possible 32 transitions change the protein. For the left-most position, 30 of the 32 transitions change the protein (TTA ↔ CTA and TTG ↔ CTG do not). Thus, the number of bits conveyed is:

$$\log_2\left(\frac{32 + 32 + 32}{1 + 32 + 30}\right) = \log_2\left(\frac{96}{63}\right) \approx 0.61$$

f) In the case of Glycine, the first 2 nucleic acids are all that is needed to identify the resulting amino acid, so no bits of information are conveyed by the last nucleic acid.

g) Using the entropy formula given in the lecture, $-\sum_i p_i \log_2(p_i)$, the entropy of the codes is:

$$-\left(0.24 \log_2(0.24) + 0.14 \log_2(0.14) + 0.12 \log_2(0.12) + 0.5 \log_2(0.5)\right)$$
$$\approx -\left(0.24(-2.06) + 0.14(-2.84) + 0.12(-3.06) + 0.5(-1)\right)$$
$$\approx -(-0.49 - 0.40 - 0.37 - 0.5) \approx 1.76$$

The bits wasted using a fixed-length scheme are:

$$2 - 1.76 \approx 0.24$$

h) The string 0011011101010 can be decoded as follows (only the last acid in the codon is shown):

Bits:  0  0  110  111  0  10  10  10
Acid:  C  C  A    T    C  G   G   G

i) The expected length is:

$$1000(0.5 \cdot 1 + 0.24 \cdot 2 + 0.14 \cdot 3 + 0.12 \cdot 3) = 1760$$

Since GGT and GGA have an encoded length of 3, the worst case is:

$$1000(3) = 3000$$

Answers will vary, but generally, when compared to $1000 \cdot \log_2\left(\frac{4}{1}\right) = 2000$, the worst case of 3000 (a 50% increase) seems very poor.

3. Error at 2,2. Corrected matrix:

000
101
10

4. Error in row 1 parity bit. Corrected matrix:

0110
1001
0110
100

c) Answers will vary. They should mention 2 errors in a single row or column.

d) 13 is the only index that is present in p0, p2, and p3. Thus the bit at index 13 has the error. Since p0, p2, and p3 marked the error, e0 = 1, e1 = 0, e2 = 1, e3 = 1, and e4 = 0. Written from e4 down to e0, the binary representation of the error bits is 01101. Note that e0 represents the least significant bit.

e) Consider that the index with the error is 13 and the binary representation of the checked error bits is 01101.
The error bits (01101) encode the index of the error (13) in binary form.

f) Answers will vary, but they should mention that certain combinations of double-bit errors are not detectable. Solutions may include such things as a parity bit that checks the other parity bits.

Problem 4

a) Since there are 10 possibilities:

$$\log_2\left(\frac{10}{1}\right) \approx 3.32$$

b) f(d) encodes as follows:

$$f(d) = \begin{cases} 1 & d = 0 \\ 2 & 1 \le d \le 2 \\ 4 & 3 \le d \le 5 \\ 8 & d \ge 6 \end{cases}$$

The probabilities of f(d) are:

$$p(f(d)) = \begin{cases} \frac{1}{10} & f(d) = 1 \\ \frac{2}{10} & f(d) = 2 \\ \frac{3}{10} & f(d) = 4 \\ \frac{4}{10} & f(d) = 8 \end{cases}$$

In the case of 1, the information about d is:

$$\log_2\left(\frac{10}{1}\right) \approx 3.32$$

For 8, the information conveyed is:

$$\log_2\left(\frac{10}{4}\right) \approx 1.32$$

c) The average, using the formula $\sum_i p_i \log_2\left(\frac{1}{p_i}\right)$, is:

$$\frac{1}{10}\log_2\left(\frac{10}{1}\right) + \frac{1}{5}\log_2\left(\frac{5}{1}\right) + \frac{3}{10}\log_2\left(\frac{10}{3}\right) + \frac{4}{10}\log_2\left(\frac{10}{4}\right)$$
$$\approx \frac{1}{10}(3.32) + \frac{1}{5}(2.32) + \frac{3}{10}(1.74) + \frac{4}{10}(1.32) \approx 1.85$$

d) Since step 2 always operates on a pair of elements, it will repeat until there is a single element left in S. Thus, it will iterate n − 1 times.

e) Answers will vary. They should be similar to this (left branches are labeled 0, right branches 1):

              1
            /   \
        6/10     4/10 (8)
        /   \
  3/10 (4)   3/10
             /   \
       2/10 (2)   1/10 (1)

Element  Encoding  Probability
1        011       1/10
2        010       2/10
4        00        3/10
8        1         4/10
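The answers to parts d) and e) can be checked with a short Python sketch. It assumes that the procedure referred to in part d) is the standard Huffman construction (repeatedly merging the two least probable elements of S), which the preview text does not state explicitly. The function name `huffman_code_lengths` and the use of integer weights (counts out of 10) are illustrative choices, not part of the original solutions.

```python
import heapq
from math import log2


def huffman_code_lengths(weights):
    """Return ({symbol: code length}, number of merge steps).

    Assumption: the standard Huffman procedure, i.e. repeatedly
    merge the two lowest-weight entries until one entry remains.
    """
    # Heap entries: (weight, unique tie-breaker, {symbol: depth so far}).
    heap = [(w, i, {sym: 0}) for i, (sym, w) in enumerate(weights.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    merges = 0
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Every symbol under the new parent moves one level deeper.
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
        merges += 1
    return heap[0][2], merges


# Values of f(d) and how many of the 10 digits map to each (Problem 4b).
weights = {1: 1, 2: 2, 4: 3, 8: 4}
lengths, merges = huffman_code_lengths(weights)

total = sum(weights.values())
avg_len = sum(weights[s] * lengths[s] for s in weights) / total
avg_info = sum((w / total) * log2(total / w) for w in weights.values())

print(lengths, merges, avg_len, round(avg_info, 2))
```

Under these assumptions the code lengths come out as 3, 3, 2, and 1 bits for elements 1, 2, 4, and 8, matching the table above, and the loop runs n − 1 = 3 times. The exact bit patterns depend on how the left and right branches are labeled, so the sketch compares only code lengths. The resulting average code length of 1.9 bits sits slightly above the average information of about 1.85 bits from part c).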
