CS411 Database Systems
Fall 2008
HW#3
Due: 3:15pm CST, 10/30/08

Note: Print your name and NetID in the upper right corner of every page of your submission. Hand in your stapled homework to Donna Coleman in 2120 SC. If Donna is not in the office, slide your homework under the door. Expect to lose points if your handwritten answer is unclear or misread by the grader.

Problem 1 Secondary storage management (20 points, each part 10 points)

(a) Suppose that we have a 4096-byte block in which we store records of 100 bytes. The block header contains an offset table, which is an array of 2-byte pointers to records within the block. In the offset table, pointers are packed as tightly as possible; there is no empty space between pointers. We also ignore the space for other information in the block header. On an average day, two records per block are inserted, and one record is deleted. A deleted record must have its pointer replaced by a “tombstone” because there may be dangling pointers to it. We assume that the deletion on any day always occurs after the insertions. If the block is initially empty, after how many days will there be no room to insert any more records?

Answer: Each day the block gains a net of one record (100 bytes) and two pointers (2 × 2 = 4 bytes): two records are inserted and one is deleted, but the pointer of the deleted record stays behind as a “tombstone”. The first pointer refers to the surviving record and the second one is the tombstone. That is 104 bytes per day. Since 4096/104 ≈ 39.4, after 39 days there will not be any room to insert a new record in the block.

(b) Relational database systems have always preferred to use fixed-length records (tuples) if possible. Give three reasons for this preference.

Answer:
(a) It is faster to locate a fixed-length record in a block.
(b) The offset table for fixed-length records needs less space, as the system needs to know only their relative positions inside the block. Also, they do not require any record header information.
(c) The update operation is simpler and faster for fixed-length records, whereas if the record size is variable, the system has to perform almost the same operations as it does for an insertion and a deletion.

Problem 4 Hash Table (24 points, each part 12 points)

Consider indexing the following key values using an extensible hash table. Suppose that we insert the keys in the order 56, 48, 42, 67, 44, 71, 60, 24. The hash function h(n) for key n is h(n) = n mod 32; i.e., the hash function is the remainder after the key value is divided by 32. Thus, the hash value is a 5-bit value. Assume that each bucket can hold 2 data items.

(a) Draw a hash table, which contains both the array of pointers in main memory and the buckets (i.e., data blocks) in secondary storage, after the first four keys are inserted. Show the keys along with their hash values in the buckets. Be sure to indicate the number of bits in the hash value that are used in the array. (We referred to this value with a variable i in the lecture slides.) Also, indicate the “nub” value of each block.

Answer: i = 1. The table is shown in figure 2.

Figure 2: Extensible hash table (array entries 0 and 1; each block has nub 1; bucket entries 01010,42; 00011,67; 11000,56; 10000,48)

(b) Suppose that we use a linear hash table instead. Draw a hash table in a similar way, after the first four keys are inserted. You do not have to specify the “nub” value of each block in this question. Note that an extension of the table is necessary when the average number of records per block exceeds 80% of the number of records that fill one block.

Answer: i = 2, n = 3, r = 4. The value of r/n must be less than 2 × 0.8 = 1.6; here r/n = 4/3 ≈ 1.33. The table is shown in figure 3.

Figure 3: Linear hash table (buckets 00, 01, 10; bucket entries 11000,56; 10000,48; 01010,42; 00011,67)
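The bucket assignments in both answers can be checked with a short sketch. It is not part of the original solution and rests on two assumptions: the extensible hash table indexes its array by the first i bits of the hash value, and the linear hash table uses the last i bits, redirecting a value m ≥ n to bucket m − 2^(i−1). The function names `extensible_bucket` and `linear_bucket` are made up for illustration. The sketch only computes the final placement for the given i and n; it does not simulate the bucket splits that lead there.

```python
def h(key: int) -> str:
    """5-bit hash value of key under h(n) = n mod 32, as a bit string."""
    return format(key % 32, "05b")


def extensible_bucket(key: int, i: int) -> str:
    """Array entry for key, assuming the FIRST i bits of h(key) are used."""
    return h(key)[:i]


def linear_bucket(key: int, i: int, n: int) -> str:
    """Bucket for key with n buckets, assuming the LAST i bits of h(key) are used."""
    m = int(h(key)[-i:], 2)
    if m >= n:
        # Bucket m does not exist yet: clear the leading bit of the i-bit value.
        m -= 2 ** (i - 1)
    return format(m, f"0{i}b")


keys = [56, 48, 42, 67]  # the first four keys, in insertion order

ext = {}  # extensible hash table from part (a): i = 1
lin = {}  # linear hash table from part (b): i = 2, n = 3
for k in keys:
    ext.setdefault(extensible_bucket(k, 1), []).append(k)
    lin.setdefault(linear_bucket(k, 2, 3), []).append(k)

for name, table in (("extensible", ext), ("linear", lin)):
    for bucket in sorted(table):
        print(name, bucket, [(h(k), k) for k in table[bucket]])

# Load check from part (b): r/n must stay below 2 * 0.8 = 1.6.
print("r/n =", len(keys) / 3)
```

Under these assumptions, keys 42 and 67 share the bucket for prefix 0 and keys 56 and 48 share the bucket for prefix 1 in the extensible table, and no bucket in either table holds more than two items.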