Download The Relational Data Model: Understanding Relations, Schemas, and States and more Study Guides, Projects, Research Design in PDF only on Docsity! CS3200 – Database Design・ ・・ Spring 2018・ ・・ Derbinsky The Relational Data Model (ALL the Vocabulary) Lecture 2 January 9, 2018 The Relational Data Model 1 CS3200 – Database Design・ ・・ Spring 2018・ ・・ Derbinsky A Quick Reminder • One of the key features of a DBMS is use of data models to support “data independence” – The conceptual representation is independent of underlying storage and/or operation implementation January 9, 2018 The Relational Data Model 2 ER Diagrams Relations CS3200 – Database Design・ ・・ Spring 2018・ ・・ Derbinsky Motivation • A formal mathematical basis for databases – Set theory and first-order predicate logic – Allows scientists to advance theoretically • A foundation for efficient and usable database management systems – Allows companies/developers to advance end- user products • Note: some aspects of the model are not adhered to by modern RDBMSs January 9, 2018 The Relational Data Model 5 CS3200 – Database Design・ ・・ Spring 2018・ ・・ Derbinsky Relational Database A database consists of… i. a set of relations (tables) ii. a set of integrity constraints A database is in a valid state if it satisfies all integrity constraints (else invalid state) January 9, 2018 The Relational Data Model 6 Pop Quiz: What is a set? CS3200 – Database Design・ ・・ Spring 2018・ ・・ Derbinsky A Relation A relation consists of… i. its schema, describing structure ii. its state, or current populated data January 9, 2018 The Relational Data Model 7 Name SSN Phone Address Age GPA Ben Bayer 305-61-2435 555-1234 1 Foo Lane 19 3.21 Chung-cha Kim 422-11-2320 555-9876 2 Bar Court 25 3.53 Barbara Benson 533-69-1238 555-6758 3 Baz Blvd 19 3.25 STUDENT Schema State CS3200 – Database Design・ ・・ Spring 2018・ ・・ Derbinsky Exercise Diagrammatically produce a relation HAT according to the following schema; the relation state should have at least three tuples HAT(Team, Size, Color) • dom(Team) = { RedSox, Bruins, Celtics, Patriots, Revolution } • dom(Size) = { S, M, L, XL } • dom(Color) = { Black, Blue, White, Red, Green, Yellow } How many tuples are possible in this relation? January 9, 2018 The Relational Data Model 10 CS3200 – Database Design・ ・・ Spring 2018・ ・・ Derbinsky Answer |dom(Team)|⇥ |dom(Size)|⇥ |dom(Color)| 5⇥ 4⇥ 6 120 January 9, 2018 The Relational Data Model 11 Team Size Color RedSox M Red Revolution S White Bruins XL Yellow HAT CS3200 – Database Design・ ・・ Spring 2018・ ・・ Derbinsky Tuples: Theory vs. Implementation • Relation state is formally defined as a set of tuples, implying… – No inherent order – No duplicates • In real database systems, the rows on disk will have an ordering, but the relation definition sets no preference as to this ordering – We will discuss later in physical design how to establish an ordering to improve query efficiency • Additionally, real database systems implement a bag of tuples, allowing duplicate rows January 9, 2018 The Relational Data Model 12 CS3200 – Database Design・ ・・ Spring 2018・ ・・ Derbinsky Violation of 1NF: Composite January 9, 2018 The Relational Data Model 15 SSN Dorm Room 305-61-2435 555 Huntington 1 422-11-2320 Baker 2 533-69-1238 555 Huntington 3 DORMDorm Info SSN Dorm Room 305-61-2435 555 Huntington 1 422-11-2320 Baker 2 533-69-1238 555 Huntington 3 DORM vs. CS3200 – Database Design・ ・・ Spring 2018・ ・・ Derbinsky Violation of 1NF: Multi-Valued January 9, 2018 The Relational Data Model 16 SSN Class 305-61-2435 COMP355 422-11-2320 COMP355 533-69-1238 MATH650 305-61-2435 MATH650 422-11-2320 BIOL110 CLASS SSN Class 305-61-2435 COMP355, MATH 650 422-11-2320 COMP355, BIOL110 533-69-1238 MATH650 CLASS vs. CS3200 – Database Design・ ・・ Spring 2018・ ・・ Derbinsky Model Constraints Categories of restrictions on data in a relational database 1. Inherent in the data model (implicit) 2. Schema-based (explicit) 3. Application-based (or triggers/assertions) 4. Data dependencies Relates to “goodness” of database design; we will revisit in normalization January 9, 2018 The Relational Data Model 17 CS3200 – Database Design・ ・・ Spring 2018・ ・・ Derbinsky Key Constraints A key is a set of attribute(s) satisfying two properties: 1. Two distinct tuples in any state of the relation cannot have identical values for all the attributes of the key (superkey) 2. No attribute can be removed from the key and still have #1 hold (minimal superkey) A relation may have multiple keys (each is a candidate key). Relations commonly have a primary key (underlined, PK; typically small number of attributes, used to identify tuples), and may also have some number of additional unique key(s). January 9, 2018 The Relational Data Model 20 CS3200 – Database Design・ ・・ Spring 2018・ ・・ Derbinsky Exercise Is the following a valid state of DOCTOR? January 9, 2018 The Relational Data Model 21 Number First Last 1 William Hartnell 2 Patrick Troughton 3 Jon Pertwee 4 Tom Baker 5 Peter Davison 6 Colin Baker 7 Sylvester McCoy 8 Paul McGann DOCTOR Number First Last 9 Christopher Eccleston 10 David Tennant 11 Matt Smith 12 Peter Capaldi 13 Jodie Whittaker CS3200 – Database Design・ ・・ Spring 2018・ ・・ Derbinsky Answer Is the following a valid state of DOCTOR? January 9, 2018 The Relational Data Model 22 Underline = primary key Req #1: Two distinct tuples cannot have identical values for all the attributes of the key – NOT TRUE! Number First Last 1 William Hartnell 2 Patrick Troughton 3 Jon Pertwee 4 Tom Baker 5 Peter Davison 6 Colin Baker 7 Sylvester McCoy 8 Paul McGann Number First Last 9 Christopher Eccleston 10 David Tennant 11 Matt Smith 12 Peter Capaldi 13 Jodie Whittaker DOCTOR CS3200 – Database Design・ ・・ Spring 2018・ ・・ Derbinsky Entity Integrity In a tuple, no attribute that is part of the PK can be NULL Basic justification: if PK is used to identify a tuple, then none of its component parts can be left unknown January 9, 2018 The Relational Data Model 25 CS3200 – Database Design・ ・・ Spring 2018・ ・・ Derbinsky Exercise List all candidate key(s) for the current state of DOCTOR. January 9, 2018 The Relational Data Model 26 Number First Last 1 William Hartnell 2 Patrick Troughton 3 Jon Pertwee 4 Tom Baker 5 Peter Davison 6 Colin Baker 7 Sylvester McCoy 8 Paul McGann DOCTOR Number First Last 9 Christopher Eccleston 10 David Tennant 11 Matt Smith 12 Peter Capaldi 13 Jodie Whittaker 14 NULL NULL CS3200 – Database Design・ ・・ Spring 2018・ ・・ Derbinsky Answer List all candidate key(s) for the current state of DOCTOR. January 9, 2018 The Relational Data Model 27 PK = { Number } Number First Last 1 William Hartnell 2 Patrick Troughton 3 Jon Pertwee 4 Tom Baker 5 Peter Davison 6 Colin Baker 7 Sylvester McCoy 8 Paul McGann DOCTOR Number First Last 9 Christopher Eccleston 10 David Tennant 11 Matt Smith 12 Peter Capaldi 13 Jodie Whittaker 14 NULL NULL CS3200 – Database Design・ ・・ Spring 2018・ ・・ Derbinsky Exercise Given the above relational schema, for which attribute(s) that refer to STUDENT(SSN), if any, is it permissible to have a value of NULL? January 9, 2018 The Relational Data Model 30 CS3200 – Database Design・ ・・ Spring 2018・ ・・ Derbinsky Answer Given the above relational schema, for which attribute(s) that refer to STUDENT(SSN), if any, is it permissible to have a value of NULL? January 9, 2018 The Relational Data Model 31 CS3200 – Database Design・ ・・ Spring 2018・ ・・ Derbinsky Chinook January 9, 2018 The Relational Data Model 32 CS3200 – Database Design・ ・・ Spring 2018・ ・・ Derbinsky Delete Referential Integrity • Tuple being deleted is referenced by foreign keys from other tuples Possible actions • Reject deletion • Cascade (propagate deletion) • Set default/NULL referencing attribute values (careful with primary key) January 9, 2018 The Relational Data Model 35 CS3200 – Database Design・ ・・ Spring 2018・ ・・ Derbinsky Update • If modifying neither part of primary key nor foreign key, need only check… – Domain • Modifying primary key… – Like Delete then Insert • Modifying foreign key… – Like Insert Actions typically similar to Delete with separate options. January 9, 2018 The Relational Data Model 36 CS3200 – Database Design・ ・・ Spring 2018・ ・・ Derbinsky Transactions A transaction is a sequence of database operations, including retrieval and update(s) START Read or write Read or write Read or write … COMMIT or ROLLBACK January 9, 2018 The Relational Data Model 37