Docsity
Docsity

Prepare for your exams
Prepare for your exams

Study with the several resources on Docsity


Earn points to download
Earn points to download

Earn points by helping other students or get them with a premium plan


Guidelines and tips
Guidelines and tips

Database Systems Homework 3: Problem Set, Assignments of Deductive Database Systems

Problems related to database systems, including merge-sort for large relations, pointer swizzling for data storage, data representation for student records, indexing on sequential files, and b+tree operations. Problems involve calculating minimum memory size, expected pointer following cost, record layout, minimal number of tuples for an index, and b+tree modifications.

Typology: Assignments

Pre 2010

Uploaded on 03/10/2009

koofers-user-ctn-1
koofers-user-ctn-1 🇺🇸

10 documents

1 / 4

Toggle sidebar

Related documents


Partial preview of the text

Download Database Systems Homework 3: Problem Set and more Assignments Deductive Database Systems in PDF only on Docsity! CS411 Database Systems Fall 2005 HW#3 Due: 11:30am CST, 11/18/05 Note: Print your name and NetID in the upper right corner of every page of your submission. Problem 1 Merge-Sort Suppose we have a relation with 15000 tuples and each tuple requires 10 bytes. We have a machine whose main memory and disk-block size (80 bytes) are sufficient to sort the 15000 tuples using TPMMS (Two-Phase, Multiway Merge-Sort). What is the minimum size of the main memory? Problem 2 Pointer Swizzling Suppose that the important actions related to data storage take the following times, in some arbitrary time units: • On-demand swizzling of a pointer: 30; • Automatic swizzling of pointers: 20 per pointer; • Following a swizzled pointer: 1; • Following a unswizzled pointer: 10. Suppose we design a pointer-swizzling control scheme like the following. At the beginning, we automatically swizzle 20% of the pointers and leave the rest unswizzled. Once a pointer is followed, we swizzle it by probability 0.5. If an unswizzled pointer has been followed twice, we swizzle it. Suppose there are 100 pointers in our data. The number of times that they are followed by a program is distributed according to the following histogram. times of being followed 0 1 2 3 number of pointers 10 40 30 20 What’s the expected cost of this program in terms of pointer following? Problem 3 Data Representation A student record consists of the following fixed-length fields: date of birth and SSN, each 8 bytes long. It also has the following variable-length fields: name and office. The records are augmented by an additional repeating field that represents exams. Each exam requires a date (12 bytes) and an integer result (4 bytes) of the exam. Pointers within a record require 4 bytes, and the record length is a 4-byte integer. You may assume that no alignment of fields is required. (a) Show the layout of students records if: 1 (1) The variable-length fields and repeating exams are kept within the record itself. (2) The variable-length fields and repeating exams are stored outside of the record, with pointers to them in the record. (b) The exam results of a student has a probability p of being inqueried. There are n test results per student in average. The average lengths of name and office are 8 bytes and 12 bytes, respectively. Suppose the cost function of the above scheme (1) is len, where len is the record length. The cost of scheme (2) is len + 100 × p, for penalizing accessing exam results on seperate blocks. What is the minimal value of n as a function of p, so that the above scheme (2) is better than scheme (1). Problem 4 Indexes on Sequential Files Given a relation of 200,000 tuples. Suppose each block could hold 10 tuples or 20 key-pointer pairs. The index is built on the key field of the relation (thus no duplicate search keys) and the file is sorted according to the key. Answer the following questions: a) How many blocks do we need for a dense index of this relation? b) How many blocks do we need for a sparse index of this relation? c) How many blocks do we need for a 2-level index of this relation (given that the first level is dense)? d) Suppose we use a 3-level index whose first level is sparse. If at most 12000 blocks can be used to store the index, what is the maximal number of tuples that the relation can have? Problem 5 B+tree Execute the following operations on Fig. 13.23 (page 635 of the textbook). Show the detailed steps, using Example 13.23 and Example 13.24 as examples. (a) Lookup the record with key 29. (b) Lookup the record with key 38. (c) Lookup all records in the range 25 to 38. (d) Lookup all records with keys less than 25. Problem 6 B+tree Consider the B+ tree index of degree 2 shown in the figure below. For each problem below, you only need to show the final resulting B+ Tree. a) Show the tree that would result from inserting a data entry with key 40 into this tree. b) Show the B+ tree that would result from inserting a data entry with key 30 into the original tree. c) Show the B+ tree that would result from deleting the data entry with key 45 from the original tree. 2
Docsity logo



Copyright © 2024 Ladybird Srl - Via Leonardo da Vinci 16, 10126, Torino, Italy - VAT 10816460017 - All rights reserved