Error Detection and Correction: Hamming Code and Single Error Correction - Prof. David L.

Error Detection & Correction – Page 1 of 18, CSCI 4717 – Computer Architecture

CSCI 4717/5717 Computer Architecture
Topic: Error Detection & Correction
Reading: Stallings, Section 5.2

Error Correction in Memory
• Types of errors: hard or soft
• Hard Failure
  – Permanent defect caused by:
    – Harsh environmental abuse (including static electricity)
    – Manufacturing defect
    – Wear such as trace erosion
• Soft Error
  – Random, non-destructive
  – Caused by electrical or EM/radioactive glitches
  – No permanent damage to memory

Error Detection & Correction
• Additional information must be stored to detect these errors
• When M bits of data are stored, they are run through a function f that creates a K-bit code
• M+K bits are then stored in memory
• When data is read out, it is once again run through function f and the resulting K bits of code are compared with the stored K bits of code
• In some cases, the error can be corrected (error correcting codes)
• In all cases, an error code is generated

Error Correcting Code Function
[Figure: error correcting code function]

Hamming Error Correction Code
• One way to detect specific bit errors is to use multiple parity bits, each bit responsible for the parity of a smaller, overlapping portion of the data
• A flipped bit in the data would show up as a parity error in the overlapping groups of which it was a member and not in the other groups
• This would handle single-bit corrections

4-bit Hamming Code
• Below is an
example of a 4-bit word broken into 3 groups; each group has a parity bit to generate even parity.
• Dn represents a data bit while Pn represents a parity bit

Data bits: D3=1, D2=0, D1=1, D0=1. Parity bits: P0, P1, P2.

Group      Data bits in group    Parity bit
Group A    1 0 1                 0
Group B    1 1 1                 1
Group C    1 0 1                 0

4-bit Hamming Code (continued)
Can be represented graphically using three intersecting circles.

4-bit Hamming Code (continued)
• Areas are defined as:
  – A and B, but not C
  – A and C, but not B
  – B and C, but not A
  – A and B and C
• Each non-intersecting area holds a parity bit, chosen so that it and the three intersecting areas within the same circle have even parity.
• A change in only one intersecting area will make parity odd in 2 or all 3 of the circles, indicating which intersection changed.

General Single-Bit Error Correction
• The mechanics of the typical error correction/detection system are created with XOR gates
  – Odd number of ones input to an XOR → 1 output
  – Even number of ones input to an XOR → 0 output
• Upon data retrieval, two K-bit values are generated:
  – The stored K-bit value
  – The K-bit value generated from the stored data
• A bit-by-bit comparison is performed on these two values, generating a K-bit result
  – 0's in bit positions where there is no error
  – 1's in bit positions where the two bits disagree
  – The K-bit result is called a syndrome word

Generation of Syndrome Word
[Figure: generation of syndrome word]

Syndrome Word
• All zeros means that the data was successfully retrieved
• For data with M bits and K code bits, there are M+K possible single-bit errors, i.e., there could be an error in the data OR in the K-bit code
• For a K-bit syndrome word, there are $2^K - 1$ (minus one for the no-error case) possible values to
represent single-bit errors
• Therefore, for the system to uniquely identify bit errors, $2^K - 1 \geq M + K$

Single Error Correcting (SEC) Code Example
• Assume M = 8
• First, how big does K have to be?
  – K = 3: $2^3 - 1 \geq 8 + 3$? (7 is not ≥ 11)
  – K = 4: $2^4 - 1 \geq 8 + 4$? (15 is ≥ 12)
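
The sizing check for K, and the syndrome comparison it relies on, can be sketched in Python. This is an illustrative sketch, not the course's implementation. It uses the non-strict form 2^K − 1 ≥ M + K, which suffices because only M + K nonzero syndrome values are needed; the 4-bit example with three parity bits meets it with equality (7 = 4 + 3). The function names (`min_check_bits`, `parity_bits`, `syndrome`, `locate`) and the `GROUPS` table are hypothetical. In particular, which data bits belong to Group A versus Group C is an assumption chosen to match the parity values in the 4-bit example.

```python
def min_check_bits(m):
    """Smallest K satisfying 2**K - 1 >= m + K for m data bits."""
    k = 1
    while (1 << k) - 1 < m + k:
        k += 1
    return k


# Assumed group membership (data-bit indices) for the 4-bit example.
# D3 sits in all three groups; which of A and C holds D1 versus D0 is a guess.
GROUPS = {"A": (3, 2, 1), "B": (3, 1, 0), "C": (3, 2, 0)}


def parity_bits(data):
    """Even-parity bit for each group: the XOR of that group's data bits."""
    bits = {}
    for group, members in GROUPS.items():
        p = 0
        for i in members:
            p ^= data[i]
        bits[group] = p
    return bits


def syndrome(data, stored_parity):
    """XOR the recomputed parity bits with the stored ones, group by group."""
    recomputed = parity_bits(data)
    return {g: recomputed[g] ^ stored_parity[g] for g in GROUPS}


def locate(syn):
    """Map a syndrome to the single bit it implicates, or None if all zero."""
    failing = {g for g, bit in syn.items() if bit}
    if not failing:
        return None
    if len(failing) == 1:
        # Only one group disagrees, so that group's parity bit was hit.
        return "parity bit of group " + next(iter(failing))
    for i in range(4):
        if {g for g, members in GROUPS.items() if i in members} == failing:
            return "D" + str(i)
    return None


word = {3: 1, 2: 0, 1: 1, 0: 1}      # D3=1, D2=0, D1=1, D0=1
stored = parity_bits(word)            # {'A': 0, 'B': 1, 'C': 0}

corrupted = dict(word)
corrupted[2] ^= 1                     # flip D2 to simulate a soft error

print(min_check_bits(8))                      # 4
print(locate(syndrome(word, stored)))         # None
print(locate(syndrome(corrupted, stored)))    # D2
```

Flipping D2 makes groups A and C disagree while B still matches, so the syndrome points at the one data bit shared by A and C only. A flipped parity bit shows up as a single disagreeing group.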