CMSC 132: Object-Oriented Programming II
Hashing
Department of Computer Science
University of Maryland, College Park

Overview
    Hashing
    Scattering Hash Values
    Hash Function
    Hash Tables
        Open Addressing
        Chaining

Hashing
    Approach
        Use hash function to convert key into number (hash value)
        Hash value is used as index in hash table

Hashing
    Hash table
        Array indexed using hash values
        Hash table A with size N
        Indices of A range from 0 to N-1
        Store in A[ hashValue % N ]

Art and Magic of hashCode( )
    There is no "right" hashCode function
    Art involved in finding good hashCode function
    Also for finding hashCode to hashBucket function
    From java.util.HashMap

        static int hashBucket(Object x, int N) {
            int h = x.hashCode();
            // Scramble the bits so that similar hash codes scatter
            h += ~(h << 9);
            h ^= (h >>> 14);
            h += (h << 4);
            h ^= (h >>> 10);
            // |h % N| < N, so the result is a valid index in 0..N-1
            return Math.abs(h % N);
        }

Hash Function
    Example
        hashCode("apple") = 5
        hashCode("watermelon") = 3
        hashCode("grapes") = 8
        hashCode("kiwi") = 0
        hashCode("strawberry") = 9
        hashCode("mango") = 6
        hashCode("banana") = 2
    Perfect hash function
        Unique values for each key

        Index   Key
        0       kiwi
        1
        2       banana
        3       watermelon
        4
        5       apple
        6       mango
        7
        8       grapes
        9       strawberry

Hash Function
    Suppose now
        hashCode("apple") = 5
        hashCode("watermelon") = 3
        hashCode("grapes") = 8
        hashCode("kiwi") = 0
        hashCode("strawberry") = 9
        hashCode("mango") = 6
        hashCode("banana") = 2
        hashCode("orange") = 3
    Collision
        Same hash value for multiple keys
        Here "orange" and "watermelon" both hash to 3

        Index   Key
        0       kiwi
        1
        2       banana
        3       watermelon
        4
        5       apple
        6       mango
        7
        8       grapes
        9       strawberry

Types of Hash Tables
    Open addressing
        Store objects in each table entry
    Chaining (bucket hashing)
        Store lists of objects in each table entry

Open Addressing Hashing
    Approach
        Hash table contains objects
        Probe ⇒ examine table entry
        Collision
            Move K entries past current location
            Wrap around table if necessary
    Find location for X
        1. Examine entry at A[ key(X) ]
        2. If entry = X, found
        3. If entry = empty, X not in hash table
        4. Else increment location by K, repeat

Open Addressing Hashing
    Approach
        Linear probing
            K = 1
            May form clusters of contiguous entries
    Deletions
        Find location for X
        If X is inside a cluster, leave a "non-empty" marker in its place
        so that later probes continue past the deleted entry
    Insertion
        Find location for X
        Insert if X not in hash table
        Can insert X at the first "non-empty" marker encountered

Efficiency of Open Addressing
    Load factor = entries / table size
    Hashing is efficient for load factor < 90%

Chaining (Bucket Hashing)
    Approach
        Hash table contains lists of objects
    Find location for X
        Find hash code key for X
        Examine list at table entry A[ key ]
    Collision
        Multiple entries in list for entry

Chaining Example
    Hash codes
        H(A) = 6
        H(C) = 6
        H(B) = 7
        H(D) = 7
    Hash table
        Size = 8 elements
        Λ = empty entry

        Index   1  2  3  4  5  6  7  8
        Entry   Λ  Λ  Λ  Λ  Λ  Λ  Λ  Λ

Chaining Example
    Operations
        Insert A, Insert B, Insert C

    After Insert A
        Entry 6: A
        All other entries: Λ
    After Insert B
        Entry 6: A
        Entry 7: B
        All other entries: Λ
    After Insert C
        Entry 6: C → A
        Entry 7: B
        All other entries: Λ

Chaining Example
    Operations
        Find B, Find A

    Table is unchanged
        Entry 6: C → A
        Entry 7: B
        All other entries: Λ
    Find B: examine the list at entry 7, B is found
    Find A: examine the list at entry 6, C is checked first, then A is found

Efficiency of Chaining
    Load factor = entries / table size
    Average case
        Evenly scattered entries
        Operations = O( load factor )
    Worst case
        Entries mostly have same hash value
        Operations = O( entries )
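
The chaining approach can be sketched in Java as follows. This is a minimal illustration, not code from the course: the class name ChainedTable and its methods (insert, find, remove, chainLength, loadFactor) are hypothetical, indices run from 0 to N-1 rather than 1 to 8 as in the example, and new keys are placed at the front of the list, which is an assumption about how the example builds its lists.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

/** Minimal chained hash table sketch (hypothetical class, not from the slides). */
class ChainedTable<K> {
    // A[0..N-1]: every table entry holds a list of keys
    private final List<LinkedList<K>> buckets;
    private int entries = 0;

    ChainedTable(int n) {
        buckets = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            buckets.add(new LinkedList<>());
        }
    }

    // Map a key to an index in 0..N-1; floorMod copes with negative hash codes.
    private int index(K key) {
        return Math.floorMod(key.hashCode(), buckets.size());
    }

    /** Adds key at the front of its list; returns false if it was already present. */
    boolean insert(K key) {
        LinkedList<K> list = buckets.get(index(key));
        if (list.contains(key)) {
            return false;
        }
        list.addFirst(key);
        entries++;
        return true;
    }

    /** Examines only the list stored at the key's table entry. */
    boolean find(K key) {
        return buckets.get(index(key)).contains(key);
    }

    boolean remove(K key) {
        boolean removed = buckets.get(index(key)).remove(key);
        if (removed) {
            entries--;
        }
        return removed;
    }

    /** Number of keys stored in the list at table entry i. */
    int chainLength(int i) {
        return buckets.get(i).size();
    }

    /** Load factor = entries / table size. */
    double loadFactor() {
        return (double) entries / buckets.size();
    }
}
```

With Integer keys, whose hashCode() is the integer value itself, inserting 6, 7, and 14 into a table of size 8 puts 6 and 14 in the same list (a collision at entry 6) and 7 alone at entry 7. A find then only scans one list, which is why the average cost tracks the load factor when entries are evenly scattered.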
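
For open addressing with linear probing (K = 1), the find, insert, and delete steps described earlier might look like the sketch below. The class LinearProbingTable, the DELETED sentinel, and all method names are hypothetical. As a simplification, this sketch leaves a marker on every deletion instead of checking whether the entry sits inside a cluster, and it assumes keys are never null.

```java
/** Minimal open-addressing sketch with linear probing (hypothetical class). */
class LinearProbingTable {
    // "Non-empty" marker left behind by deletions so probes keep going
    private static final Object DELETED = new Object();
    private final Object[] table;   // null means an empty entry
    private int entries = 0;

    LinearProbingTable(int n) {
        table = new Object[n];
    }

    // Starting index for a key, in 0..N-1
    private int home(Object key) {
        return Math.floorMod(key.hashCode(), table.length);
    }

    /** Returns the index holding key, or -1 if key is not in the table. */
    int find(Object key) {
        int i = home(key);
        for (int probes = 0; probes < table.length; probes++) {
            Object e = table[i];
            if (e == null) {
                return -1;                      // empty entry: key not in table
            }
            if (e != DELETED && e.equals(key)) {
                return i;                       // found
            }
            i = (i + 1) % table.length;         // move K = 1 entries, wrap around
        }
        return -1;                              // every entry was probed
    }

    /** Inserts key at the first empty entry or marker; false if already present. */
    boolean insert(Object key) {
        if (find(key) >= 0) {
            return false;
        }
        int i = home(key);
        for (int probes = 0; probes < table.length; probes++) {
            if (table[i] == null || table[i] == DELETED) {
                table[i] = key;
                entries++;
                return true;
            }
            i = (i + 1) % table.length;
        }
        throw new IllegalStateException("table is full");
    }

    /** Replaces key with the marker; false if key was not in the table. */
    boolean delete(Object key) {
        int i = find(key);
        if (i < 0) {
            return false;
        }
        table[i] = DELETED;
        entries--;
        return true;
    }

    /** Load factor = entries / table size. */
    double loadFactor() {
        return (double) entries / table.length;
    }
}
```

In a table of size 10, the Integer keys 3, 13, and 23 all start at index 3 and form a cluster at indices 3, 4, and 5. Deleting 13 leaves the marker at index 4, so a later find for 23 still probes past it, and a later insert of 33 can reuse that slot.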