Design Patterns: Brute-Force and Greedy Approaches to Counting Change and Pattern Matching

Outline

Today:
• Design Patterns: Brute-Force and Greedy
• Counting Change Problem
• Pattern Matching Algorithms as Examples of 2 Design Patterns
  • Brute Force
  • Simplified Boyer-Moore

Sugih Jamin (jamin@eecs.umich.edu)

Design Patterns

What is a design pattern?

Design patterns we look at in this course:
• brute force
• divide and conquer
• recursive
• amortized
• greedy, usually involving “heuristics”
• branch and bound
• backtracking
• dynamic programming

Solution: Brute-force Approach

Try all subsets of P:
• S_1 = 1,0,1,0,0,1, . . .
• S_2 = 0,1,1,1,0,0, . . .
• S_3 = 1,0,1,1,1,0, . . .
• . . .
• How many possible subsets are there?

Feasible solution set: all S_i for which ∑ d_{c_i} = A (the chosen coins add up to A)
Objective function: the S_i that minimizes ∑ s_i (the fewest coins)

What is the time complexity to compute the sums?

Total time complexity of this approach:
• worst case:
• best case:

Brute-force Algorithm

Solves a problem in the most simple, direct, or obvious way
• does not take advantage of structure or pattern in the problem
• usually involves exhaustive search of the solution space
• pro: often simple to implement
• con: usually not the most efficient way

Greedy Approach

Pick coin with largest denomination first:
• return largest coin p_i from P such that d_{p_i} ≤ A
• A -= d_{p_i}
• find next largest coin

What is the time complexity of the algorithm?

Solution not necessarily optimal:
• consider A = 20 and D = {15,10,10,1,1,1,1,1}
• greedy returns 6 coins, optimal requires only 2 coins!

Solution not guaranteed:
• consider A = 20 and D = {15,10,10}
• greedy picks 15 and finds no solution!
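The two counting-change approaches above can be sketched in C. This is a minimal sketch, not the lecture's code: the names greedy_change and brute_force_change, the int-array representation of D, the bit-mask encoding of each subset S_i, and the -1 "no solution" return value are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* qsort comparator: orders denominations from largest to smallest. */
static int desc(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x < y) - (x > y);
}

/* Greedy (hypothetical helper): take the largest remaining coin that
 * still fits, subtract it from A, move on to the next largest coin.
 * Sorts d[] in place. Returns the number of coins used, or -1 if the
 * coins picked this way never reach A exactly. */
int greedy_change(int *d, int n, int A)
{
    int used = 0;
    qsort(d, (size_t)n, sizeof d[0], desc);
    for (int i = 0; i < n && A > 0; i++) {
        if (d[i] <= A) {
            A -= d[i];
            used++;
        }
    }
    return A == 0 ? used : -1;
}

/* Brute force (hypothetical helper): enumerate every subset of the n
 * coins as a bit mask, keep the feasible ones (sum == A), and return
 * the smallest coin count among them, or -1 if no subset is feasible. */
int brute_force_change(const int *d, int n, int A)
{
    int best = -1;
    assert(n >= 0 && n < 31);            /* mask must fit in unsigned long */
    for (unsigned long mask = 0; mask < (1UL << n); mask++) {
        int sum = 0, count = 0;
        for (int i = 0; i < n; i++) {
            if (mask & (1UL << i)) {     /* bit i set: coin i is in S */
                sum += d[i];
                count++;
            }
        }
        if (sum == A && (best < 0 || count < best))
            best = count;
    }
    return best;
}
```

With A = 20, this sketch gives 6 coins (greedy) versus 2 coins (brute force) for D = {15,10,10,1,1,1,1,1}, and -1 (greedy) versus 2 coins (brute force) for D = {15,10,10}, in line with the two examples above.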
Alphabet Space

The string doesn’t have to consist only of letters from a human language

Alphabet space Σ:
• English language: “The quick brown fox jumped over the lazy dog”
• DNA sequence: “cagacagacagata”
• binary data: “10111100001010111000111100101”

Alphabet size, |Σ|:
• English language: 26 letters
• DNA sequence: 4 characters (‘c’, ‘g’, ‘a’, ‘t’)
• binary data: 2 digits (‘1’, ‘0’)

Pattern Matching Algorithms

• Brute-force
• Simplified Boyer-Moore: greedy, but falls back to brute-force
• Knuth-Morris-Pratt: memoized
• Original Boyer-Moore: memoized

Brute-force Pattern Matching

(each row is one alignment of P against T; the numbers count the character comparisons made at that alignment)

T: a a b c b d a a a a b c a c b a a c
P: a c b a a c   CMPs 1-2
     a c b a a c   CMPs 3-4
       a c b a a c   CMP 5
         a c b a a c   CMP 6
           a c b a a c   CMP 7
             a c b a a c   CMP 8
               a c b a a c   CMPs 9-10
                 a c b a a c   CMPs 11-12
                   a c b a a c   CMPs 13-14
                     a c b a a c   CMPs 15-16
                       a c b a a c   CMP 17
                         a c b a a c   CMP 18
                           a c b a a c   CMPs 19-24

int                          // index of matching start in T
bfmatch(char *T, char *P)    // T text, P pattern

What is the time complexity of the algorithm?
- best case:
- worst case:

Simplified-BM Example

T: a a b c b d a a a a b c a c b a a c
P: a c b a a c   ; d != c by SBM-0
               a c b a a c   ; by SBM-1
                   a c b a a c   ; by SBM-2.1
                     a c b a a c   ; by SBM-2.2
                           a c b a a c   ; by SBM-2.1
--------------------------------------------------------
T: a a b c b d a a a a b c a c b a a c
P: a c b a a c   ; last[d] = -1
               a c b a a c   ; last[b] = 2 < 4
                   a c b a a c   ; last[c] = 5 > 3
                     a c b a a c   ; last[b] = 2 < 5
                           a c b a a c   ; match!

Simplified-BM last[] Computation

How do you determine l?
Just as in the greedy count-change algorithm, pre-compute the information (heuristics) you need

In this case, pre-compute l for every letter of the alphabet, store these in last[]:
• initialize each member of last[|Σ|] to -1
• go through P in reverse to determine the last occurrence of each letter

For the example P in the previous slide, last[]:

a   b   c   d
4   2   5   -1

Simplified BM Time Complexity

What is the worst-case time complexity of the algorithm?

What is the average-case time complexity?

Works well for a large alphabet and a longish pattern with few different characters; empirically, for English words, SBM requires about 0.3n CMPs (character comparisons)
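The last[] pre-computation and the two matchers can be sketched in C to make the comparison counts concrete. This is a sketch under stated assumptions, not the lecture's code: bfmatch gains a hypothetical cmps out-parameter for counting comparisons, compute_last and sbmatch are names invented here, and sbmatch uses the common textbook last-occurrence shift i += m - min(j, 1 + l). That shift reproduces the alignments in the Simplified-BM example, but it is not necessarily identical to the slides' SBM-0 through SBM-2.2 rules.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Brute force: try every alignment of P in T, comparing left to right.
 * Returns the index in T where the match starts, or -1.
 * *cmps (an added, hypothetical parameter) counts character comparisons. */
int bfmatch(const char *T, const char *P, long *cmps)
{
    assert(T && P && cmps);
    int n = (int)strlen(T), m = (int)strlen(P);
    *cmps = 0;
    for (int i = 0; i + m <= n; i++) {
        int j = 0;
        while (j < m) {
            (*cmps)++;
            if (T[i + j] != P[j])
                break;
            j++;
        }
        if (j == m)
            return i;
    }
    return -1;
}

/* last[c] = index of the last occurrence of character c in P, or -1. */
void compute_last(const char *P, int last[UCHAR_MAX + 1])
{
    int m = (int)strlen(P);
    for (int c = 0; c <= UCHAR_MAX; c++)
        last[c] = -1;
    for (int j = m - 1; j >= 0; j--) {       /* scan P in reverse */
        unsigned char c = (unsigned char)P[j];
        if (last[c] < 0)                     /* first hit from the right */
            last[c] = j;
    }
}

/* Last-occurrence matcher (assumed shift rule, see the text above).
 * Compares right to left; on a mismatch at T[i], P[j] it looks up
 * l = last[T[i]] and advances i by m - min(j, 1 + l). */
int sbmatch(const char *T, const char *P, long *cmps)
{
    assert(T && P && cmps);
    int n = (int)strlen(T), m = (int)strlen(P);
    int last[UCHAR_MAX + 1];
    *cmps = 0;
    if (m == 0)
        return 0;
    compute_last(P, last);
    int i = m - 1, j = m - 1;                /* i indexes T, j indexes P */
    while (i < n) {
        (*cmps)++;
        if (T[i] == P[j]) {
            if (j == 0)
                return i;                    /* whole pattern matched */
            i--;
            j--;
        } else {
            int l = last[(unsigned char)T[i]];
            i += m - (j < l + 1 ? j : l + 1);
            j = m - 1;
        }
    }
    return -1;
}
```

Called on the example T = "aabcbdaaaabcacbaac" and P = "acbaac", both functions return index 12. bfmatch counts 24 comparisons, as in the brute-force trace, while this sbmatch counts 13.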