Input/Output Systems: Disk Systems, Dependability, and RAID Technologies - Prof. Jiang Li, Study notes of Computer Architecture and Organization

This document, authored by Dr. Jiang Li of the Dept. of Systems & Computer Science at Howard University, covers input/output (I/O) systems, focusing on disk systems, dependability, and redundant array of independent disks (RAID) technologies. Topics include bytes/sec and transfers/sec, I/O bus connections, response time, diversity of devices, faults and service interruption, mean time to repair (MTTR), mean time between failures (MTBF), availability, disk storage, disk latency, flash storage, flash types, RAID 1 through 6, bus types, bus signals and synchronization, interrupts, I/O data transfer, and file system and web benchmarks.

Typology: Study notes

Pre 2010

Uploaded on 08/18/2009

koofers-user-hv0

Jiang Li, Dept. of Systems & Computer Science, Howard Univ.

Slide 1: Input/Output, Disk Systems (8.1, 8.2, 8.4 ~ 8.7, 8.9)
Dr. Jiang Li
Slides adapted from various sources (e.g. VT, RPI, UCSB, etc.)

Slide 2: Introduction
- I/O devices can be characterized by
  - Behaviour: input, output, storage
  - Partner: human or machine
  - Data rate: bytes/sec, transfers/sec
- I/O bus connections

Slide 5: Dependability Measures
- Reliability: mean time to failure (MTTF)
  - A measure of continuous service accomplishment
- Service interruption: mean time to repair (MTTR)
- Mean time between failures
  - MTBF = MTTF + MTTR
- Availability = MTTF / (MTTF + MTTR)
  - A measure of service accomplishment with respect to the alternation between accomplishment and interruption
- Improving availability
  - Increase MTTF: fault avoidance, fault tolerance, fault forecasting
  - Reduce MTTR: improved tools and processes for diagnosis and repair

Slide 6: Disk Storage
- Nonvolatile, rotating magnetic storage
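
The MTBF and availability formulas on the Dependability Measures slide can be tried out with a few lines of Python. This is only a sketch: the function names and the MTTF/MTTR figures are made up for illustration and do not come from the slides.

```python
def mtbf(mttf_hours: float, mttr_hours: float) -> float:
    """Mean time between failures: MTBF = MTTF + MTTR."""
    return mttf_hours + mttr_hours


def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Availability = MTTF / (MTTF + MTTR), a fraction between 0 and 1."""
    return mttf_hours / (mttf_hours + mttr_hours)


# Hypothetical device figures (assumptions, not taken from the slides).
mttf = 1_000_000.0  # hours until failure, on average
mttr = 24.0         # hours to repair, on average

print(f"MTBF         = {mtbf(mttf, mttr):.0f} hours")
print(f"Availability = {availability(mttf, mttr):.6f}")

# Reducing MTTR (faster diagnosis and repair) improves availability.
print(f"With MTTR/2  = {availability(mttf, mttr / 2):.6f}")
```

The last line illustrates the "Reduce MTTR" bullet: with MTTF unchanged, halving the repair time moves availability closer to 1.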

Slide 7: Magnetic Disks
- A magnetic disk consists of 1-12 platters (metal or glass disks covered with magnetic recording material on both sides), with diameters between 1 and 3.5 inches
- Each platter comprises concentric tracks (5-30K), and each track is divided into sectors (100-500 per track, each about 512 bytes)
- Each sector records
  - Sector ID, data (512 bytes, 4096 bytes proposed), error correcting code (ECC, used to hide defects and recording errors), synchronization fields and gaps
- A movable arm holds the read/write heads for each disk surface and moves them all in tandem, so a cylinder of data is accessible at a time

Slide 10: Disk Access Time Example
- Average seek time: 6ms
- Transfer rate: 50 MB/sec
- Controller overhead is 0.2ms
- What is the average time to read or write a 512-byte sector for a disk of 10000RPM?

Average disk access time
  = Average seek time + Average rotational delay + Transfer time + Controller overhead
  = 6.0ms + 0.5 rotation / (10000RPM / (60000ms/min)) + 0.5KB / (50MB/sec) + 0.2ms
  = 6.0ms + 3.0ms + 0.01ms + 0.2ms
  ≈ 9.2ms

Slide 11: Disk Performance Issues
- Manufacturers quote average seek time
  - Based on all possible seeks
  - Locality and OS scheduling lead to smaller actual average seek times
- Smart disk controllers allocate physical sectors on disk
  - Present logical sector interface to host
- Disk/motherboard interface
  - SCSI, ATA, SATA
- Disk drives include caches
  - Prefetch sectors in anticipation of access
  - Avoid seek and rotational delay

Slide 12: Flash Storage
- Nonvolatile semiconductor storage
  - 100× to 1000× faster than disk
  - Smaller, lower power, more robust
  - But more $/GB (between disk and DRAM)
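
The arithmetic of the Disk Access Time Example can be checked with a short Python sketch. The function and parameter names are illustrative, and the code assumes 1 MB = 10^6 bytes, which the slides do not state.

```python
def disk_access_time_ms(seek_ms, rpm, sector_bytes, transfer_mb_per_s, controller_ms):
    """Average access time = seek + rotational delay + transfer + controller overhead."""
    # On average the head waits half a rotation; rpm / 60000 is rotations per ms.
    rotational_ms = 0.5 / (rpm / 60_000.0)
    # Assumption: 1 MB = 10**6 bytes. Convert seconds to milliseconds.
    transfer_ms = sector_bytes / (transfer_mb_per_s * 1e6) * 1000.0
    return seek_ms + rotational_ms + transfer_ms + controller_ms


# Values from the slide: 6 ms seek, 10000 RPM, 512-byte sector,
# 50 MB/sec transfer rate, 0.2 ms controller overhead.
t = disk_access_time_ms(6.0, 10_000, 512, 50.0, 0.2)
print(f"Average disk access time: {t:.2f} ms")
```

With the slide's numbers the rotational delay is 3.0 ms and the transfer time about 0.01 ms, so seek time and rotational delay dominate the total of roughly 9.2 ms.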

Slide 15: RAID 1 & 2
- RAID 1: Mirroring
  - N + N disks, replicate data
    - Write data to both data disk and mirror disk
    - On disk failure, read from mirror
- RAID 2: Error correcting code (ECC)
  - N + E disks (e.g., 10 + 4)
  - Split data at bit level across N disks
  - Generate E-bit ECC
  - Too complex, not used in practice

Slide 16: RAID 3: Bit-Interleaved Parity
- N + 1 disks
  - Data striped across N disks at byte level
  - Redundant disk stores parity
  - For example: with 9 disks, bit 0 is in disk-0, bit 1 is in disk-1, ..., bit 7 is in disk-7; disk-8 maintains parity for all 8 bits
  - Read access
    - Read all disks
  - Write access
    - Generate new parity and update all disks
  - On failure
    - Use parity to reconstruct missing data
- Not widely used

Slide 17: RAID 4: Block-Interleaved Parity
- N + 1 disks
  - Data striped across N disks at block level
  - Redundant disk stores parity for a group of blocks
  - Read access
    - Read only the disk holding the required block
  - Write access
    - Just read disk containing modified block, and parity disk
    - Calculate new parity, update data disk and parity disk
  - On failure
    - Use parity to reconstruct missing data
- Not widely used

Slide 20: RAID 6: P + Q Redundancy
- N + 2 disks
  - Like RAID 5, but two lots of parity
  - Greater fault tolerance through more redundancy
- Multiple RAID
  - More advanced systems give similar fault tolerance with better performance
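
The parity bullets on the RAID 3 and RAID 4 slides ("use parity to reconstruct missing data" and "calculate new parity, update data disk and parity disk") can be illustrated with a small Python sketch. It assumes the parity is a bytewise XOR across the data disks, which is the usual choice but is not spelled out in the slides; the block contents and the helper name `xor_blocks` are made up.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equal-length byte blocks (assumed parity function)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data disks (N = 3), one 2-byte block each.
data = [b"\x0f\xf0", b"\x33\x55", b"\xaa\x01"]
parity = xor_blocks(data)  # contents of the redundant disk

# On failure of disk 1: XOR the surviving data blocks with the parity block.
rebuilt = xor_blocks([data[0], data[2], parity])
print("rebuilt matches lost block:", rebuilt == data[1])

# RAID 4 small write: read only the old data block and the old parity,
# then new_parity = old_parity XOR old_data XOR new_data.
new_block = b"\x12\x34"
new_parity = xor_blocks([parity, data[1], new_block])
full_recompute = xor_blocks([data[0], new_block, data[2]])
print("small-write parity matches full recompute:", new_parity == full_recompute)
```

The second half shows why a RAID 4 write needs to touch only the modified data disk and the parity disk: the other data disks cancel out of the XOR.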

Slide 21: RAID Summary
- RAID can improve performance and availability
- RAID 1-5 can tolerate a single fault: mirroring (RAID 1) has a 100% overhead, while parity (RAID 3, 4, 5) has modest overhead
- Can tolerate multiple faults by having multiple check functions; each additional check can cost an additional disk (RAID 6)
- RAID 6 and RAID 2 (memory-style ECC) are not commercially employed
- High availability requires hot swapping
- Assumes independent disk failures
  - Too bad if the building burns down!
- See "Hard Disk Performance, Quality and Reliability"
  - http://www.pcguide.com/ref/hdd/perf/index.htm

Slide 22: Interconnecting Components
- Need interconnections between
  - CPU, memory, I/O controllers
- Bus: shared communication channel
  - Parallel set of wires for data and synchronization of data transfer
  - Can become a bottleneck
- Performance limited by physical factors
  - Wire length, number of connections
- More recent alternative: high-speed serial connections with switches
  - Like networks

Slide 25: I/O Bus Examples

                     Firewire           USB 2.0                      PCI Express                              Serial ATA  Serial Attached SCSI
Intended use         External           External                     Internal                                 Internal    External
Devices per channel  63                 127                          1                                        1           4
Data width           4                  2                            2/lane                                   4           4
Peak bandwidth       50MB/s or 100MB/s  0.2MB/s, 1.5MB/s, or 60MB/s  250MB/s/lane (1×, 2×, 4×, 8×, 16×, 32×)  300MB/s     300MB/s
Hot pluggable        Yes                Yes                          Depends                                  Yes         Yes
Max length           4.5m               5m                           0.5m                                     1m          8m
Standard             IEEE 1394          USB Implementers Forum       PCI-SIG                                  SATA-IO     INCITS TC T10

Slide 26: Typical x86 PC I/O System
[Figure: block diagram. Components: P4 processor, memory controller hub (north bridge), I/O controller hub (south bridge), main memory, graphics output, 1 Gb Ethernet, CD/DVD, tape, disk. Link labels: system bus 800 MHz, 6.4 GB/sec; DDR 400, 3.2 GB/sec; 2.1 GB/sec; 266 MB/sec (two links); Serial ATA 150 MB/s; USB 2.0 60 MB/s; 100 MB/s (two links).]

Slide 27: I/O Management
- I/O is mediated by the OS
  - Multiple programs share I/O resources
    - Need protection and scheduling
  - I/O causes asynchronous interrupts
    - Same mechanism as exceptions
  - I/O programming is fiddly
    - OS provides abstractions to programs

Slide 30: Polling
- Periodically check I/O status register
  - If device ready, do operation
  - If error, take action
- Common in small or low-performance real-time embedded systems
  - Predictable timing
  - Low hardware cost
- In other systems, wastes CPU time

Slide 31: Interrupts
- When a device is ready or an error occurs
  - Controller interrupts CPU
- Interrupt is like an exception
  - But not synchronized to instruction execution
  - Can invoke handler between instructions
  - Cause information often identifies the interrupting device
- Priority interrupts
  - Devices needing more urgent attention get higher priority
  - Can interrupt the handler for a lower priority interrupt

Slide 32: I/O Data Transfer
- Polling and interrupt-driven I/O
  - CPU transfers data between memory and I/O data registers
  - Time consuming for high-speed devices
- Direct memory access (DMA)
  - OS provides starting address in memory
  - I/O controller transfers to/from memory autonomously
  - Controller interrupts on completion or error

Slide 35: Measuring I/O Performance
- I/O performance depends on
  - Hardware: CPU, memory, controllers, buses
  - Software: operating system, database management system, application
  - Workload: request rates and patterns
- I/O system design can trade off between response time and throughput
  - Measurements of throughput often done with constrained response time
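
The loop described on the Polling slide (check the status register, operate if ready, act on error) can be sketched in Python against a simulated device. `FakeDevice`, `poll_read`, and the status values are hypothetical stand-ins; a real driver would read a hardware status register instead.

```python
class FakeDevice:
    """Hypothetical device that becomes ready after a fixed number of status reads."""

    READY, BUSY, ERROR = "ready", "busy", "error"

    def __init__(self, ready_after, payload):
        self._remaining = ready_after
        self._payload = payload

    def status(self):
        """Stand-in for reading the I/O status register."""
        if self._remaining > 0:
            self._remaining -= 1
            return self.BUSY
        return self.READY

    def read_data(self):
        """Stand-in for reading the I/O data register."""
        return self._payload


def poll_read(device, max_polls):
    """Check the status up to max_polls times; return (data, polls_used)."""
    for polls in range(1, max_polls + 1):
        state = device.status()
        if state == FakeDevice.READY:
            return device.read_data(), polls  # device ready: do the operation
        if state == FakeDevice.ERROR:
            raise IOError("device reported an error")  # error: take action
    raise TimeoutError("device not ready within the polling budget")


data, polls = poll_read(FakeDevice(ready_after=3, payload=b"hi"), max_polls=10)
print(data, "after", polls, "status checks")
```

Every status check that returns busy is CPU time spent doing no useful work, which is the cost the slide refers to and the reason interrupts and DMA are preferred outside small embedded systems.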

Slide 36: Transaction Processing Benchmarks
- Transactions
  - Small data accesses to a DBMS
  - Interested in I/O rate, not data rate
- Measure throughput
  - Subject to response time limits and failure handling
  - ACID (Atomicity, Consistency, Isolation, Durability)
  - Overall cost per transaction
- Transaction Processing Performance Council (TPC) benchmarks (www.tpc.org)
  - TPC-APP: B2B application server and web services
  - TPC-C: on-line order entry environment
  - TPC-E: on-line transaction processing for brokerage firm
  - TPC-H: decision support (business-oriented ad-hoc queries)

Slide 37: File System & Web Benchmarks
- SPEC System File System (SFS)
  - Synthetic workload for NFS server, based on monitoring real systems
  - Results
    - Throughput (operations/sec)
    - Response time (average ms/operation)
- SPEC Web Server benchmark
  - Measures simultaneous user sessions, subject to required throughput/session
  - Three workloads: Banking, Ecommerce, and Support

Slide 40: I/O System Design Example
- A CPU sustains 3 billion instructions per second
- Average 100000 instructions in the OS per I/O operation
- The user program runs 200000 instructions per I/O operation
- A memory backplane bus capable of sustaining a transfer rate of 1GB/sec
- SCSI Ultra320 controllers with a transfer rate of 320MB/sec and accommodating up to 7 disks
- Disk drives with a read/write bandwidth of 75MB/sec and an average seek plus rotational latency of 6ms
- The workload consists of 64KB reads (the blocks are sequential on a track), i.e. each I/O transfers 64KB
- What is the max sustainable I/O rate, and how many disks and SCSI controllers are required?

Slide 41: I/O System Design Example (cont'd)
- Max I/O rate of CPU = 3 × 10^9 / (200000 + 100000) = 10000 I/Os per sec
- Max I/O rate of bus = 10^9 / (64 × 10^3) = 15625 I/Os per sec
- The CPU is the bottleneck
- Time per I/O at disk = 6 ms + 64KB / (75MB/sec) ≈ 6.9 ms
- Each disk can complete 1000 / 6.9 ≈ 146 I/Os per sec, so we need 10000 / 146 ≈ 69 disks

Slide 42: I/O System Design Example (cont'd)
- Transfer rate required for the SCSI controller
  - (64KB / 6.9 ms) × 7 ≈ 64.9 MB/sec < 320 MB/sec
- We need 69 / 7 ≈ 10 SCSI controllers
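
The design example above can be reproduced step by step in Python. The variable names are illustrative, and the sketch assumes decimal units (64KB = 64 × 10^3 bytes, 1GB = 10^9 bytes), matching the bus calculation on the slide. It carries unrounded intermediate values, so the controller load comes out slightly above the slide's 64.9 MB/sec, which uses the rounded 6.9 ms.

```python
import math

# Figures from the problem statement (decimal units assumed).
cpu_instr_per_s = 3e9
instr_per_io = 200_000 + 100_000   # user program + OS instructions per I/O
bus_bytes_per_s = 1e9
io_bytes = 64e3                    # each I/O transfers 64KB
disk_bytes_per_s = 75e6
disk_latency_s = 6e-3              # average seek plus rotational latency
ctrl_bytes_per_s = 320e6
disks_per_ctrl = 7

cpu_io_rate = cpu_instr_per_s / instr_per_io
bus_io_rate = bus_bytes_per_s / io_bytes
max_io_rate = min(cpu_io_rate, bus_io_rate)   # the slower resource is the bottleneck

time_per_io_s = disk_latency_s + io_bytes / disk_bytes_per_s
disk_io_rate = 1.0 / time_per_io_s

disks = math.ceil(max_io_rate / disk_io_rate)
controllers = math.ceil(disks / disks_per_ctrl)

# Load on one controller with all 7 disks transferring at their sustained rate.
ctrl_load = disks_per_ctrl * io_bytes / time_per_io_s

print(f"CPU limit: {cpu_io_rate:.0f} I/Os per sec, bus limit: {bus_io_rate:.0f} I/Os per sec")
print(f"Time per I/O at disk: {time_per_io_s * 1000:.2f} ms")
print(f"Disks needed: {disks}, SCSI controllers needed: {controllers}")
print(f"Controller load: {ctrl_load / 1e6:.1f} MB/sec (limit {ctrl_bytes_per_s / 1e6:.0f} MB/sec)")
```

Both counts are rounded up with `math.ceil` because a fractional disk or controller cannot be bought; this is what the "≈ 69" and "≈ 10" on the slides stand for.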