Mergesort: Implementation, Complexity, and Comparison with Other Sorts

An in-depth look at the Mergesort algorithm, including its implementation in Java, the number of compares and array accesses it uses, and its comparison with other sorting algorithms such as quicksort and insertion sort. The document also includes a proof of the divide-and-conquer recurrence for Mergesort and a discussion on practical improvements like using insertion sort for small subarrays and eliminating the copy to the auxiliary array.

Typology: Schemes and Mind Maps

2021/2022

Uploaded on 09/12/2022

pauleen 🇬🇧


Algorithms, Fourth Edition
Robert Sedgewick | Kevin Wayne
http://algs4.cs.princeton.edu

2.2 MERGESORT
‣ mergesort
‣ bottom-up mergesort
‣ sorting complexity
‣ comparators
‣ stability

Two classic sorting algorithms: mergesort and quicksort

Critical components in the world's computational infrastructure.
・Full scientific understanding of their properties has enabled us to develop them into practical system sorts.
・Quicksort honored as one of the top 10 algorithms of the 20th century in science and engineering.

Mergesort. [this lecture]
Quicksort. [next lecture]

... ...

Abstract in-place merge demo

Goal. Given two sorted subarrays a[lo] to a[mid] and a[mid+1] to a[hi], replace with sorted subarray a[lo] to a[hi].

    before:  a[] = E E G M R | A C E R T     (a[lo..mid] sorted, a[mid+1..hi] sorted)
    after:   a[] = A C E E E G M R R T       (a[lo..hi] sorted)

Merging: Java implementation

[Figure: aux[] = A G L O R | H I M S T and a[] = A G H I L M so far, with indices i and j into aux[] and k into a[].]

    private static void merge(Comparable[] a, Comparable[] aux, int lo, int mid, int hi) {
        // copy a[lo..hi] to aux[lo..hi]
        for (int k = lo; k <= hi; k++)
            aux[k] = a[k];

        // merge back to a[lo..hi]
        int i = lo, j = mid+1;
        for (int k = lo; k <= hi; k++) {
            if      (i > mid)              a[k] = aux[j++];   // left half exhausted
            else if (j > hi)               a[k] = aux[i++];   // right half exhausted
            else if (less(aux[j], aux[i])) a[k] = aux[j++];   // right key is strictly smaller
            else                           a[k] = aux[i++];   // left key is smaller or equal
        }
    }

Mergesort: animation

[Animation: 50 random items. Legend: algorithm position, in order, current subarray, not in order.]
[Animation: 50 reverse-sorted items. Same legend.]
http://www.sorting-algorithms.com/merge-sort

Mergesort: empirical analysis

Running time estimates:
・Laptop executes 10^8 compares/second.
・Supercomputer executes 10^12 compares/second.

                insertion sort (N^2)                 mergesort (N log N)
    computer    thousand   million     billion      thousand   million    billion
    home        instant    2.8 hours   317 years    instant    1 second   18 min
    super       instant    1 second    1 week       instant    instant    instant

Bottom line. Good algorithms are better than supercomputers.

Divide-and-conquer recurrence: proof by induction

Proposition. If D(N) satisfies D(N) = 2 D(N/2) + N for N > 1, with D(1) = 0, then D(N) = N lg N.

Pf 2. [assuming N is a power of 2]
・Base case: N = 1.
・Inductive hypothesis: D(N) = N lg N.
・Goal: show that D(2N) = (2N) lg (2N).

    D(2N) = 2 D(N) + 2N                 given
          = 2 N lg N + 2N               inductive hypothesis
          = 2 N (lg (2N) - 1) + 2N      algebra
          = 2 N lg (2N)                 QED

Mergesort: number of array accesses

Proposition. Mergesort uses ≤ 6 N lg N array accesses to sort an array of length N.

Pf sketch. The number of array accesses A(N) satisfies the recurrence:

    A(N) ≤ A(⌈N/2⌉) + A(⌊N/2⌋) + 6N   for N > 1, with A(1) = 0.

Key point. Any algorithm with the following structure takes N log N time:

    public static void linearithmic(int N) {
        if (N == 0) return;
        linearithmic(N/2);   // solve two problems
        linearithmic(N/2);   // of half the size
        linear(N);           // do a linear amount of work
    }

Notable examples. FFT, hidden-line removal, Kendall-tau distance, …

Mergesort analysis: memory

Proposition. Mergesort uses extra space proportional to N.
Pf. The array aux[] needs to be of length N for the last merge.

    two sorted subarrays:  A C D G H I M N U V | B E F J O P Q R S T
    merged result:         A B C D E F G H I J M N O P Q R S T U V

Def. A sorting algorithm is in-place if it uses ≤ c log N extra memory.
Ex. Insertion sort, selection sort, shellsort.

Challenge 1 (not hard). Use aux[] array of length ~ ½ N instead of N.
Challenge 2 (very hard). In-place merge. [Kronrod 1969]

Mergesort: practical improvements

Stop if already sorted.
・Is largest item in first half ≤ smallest item in second half?
・Helps for partially-ordered arrays.
[Figure: the two halves A B C D E F G H I J and M N O P Q R S T U V are already in order, so no merge is needed.]

    private static void sort(Comparable[] a, Comparable[] aux, int lo, int hi) {
        if (hi <= lo) return;
        int mid = lo + (hi - lo) / 2;
        sort(a, aux, lo, mid);
        sort(a, aux, mid+1, hi);
        if (!less(a[mid+1], a[mid])) return;   // halves already in order: skip the merge
        merge(a, aux, lo, mid, hi);
    }

Mergesort: practical improvements

Eliminate the copy to the auxiliary array. Save time (but not space) by switching the role of the input and auxiliary array in each recursive call.

    // merge from a[] to aux[]
    private static void merge(Comparable[] a, Comparable[] aux, int lo, int mid, int hi) {
        int i = lo, j = mid+1;
        for (int k = lo; k <= hi; k++) {
            if      (i > mid)          aux[k] = a[j++];
            else if (j > hi)           aux[k] = a[i++];
            else if (less(a[j], a[i])) aux[k] = a[j++];
            else                       aux[k] = a[i++];
        }
    }

    // assumes aux[] is initialized to a[] once, before the recursive calls
    private static void sort(Comparable[] a, Comparable[] aux, int lo, int hi) {
        if (hi <= lo) return;
        int mid = lo + (hi - lo) / 2;
        sort(aux, a, lo, mid);        // switch roles of aux[] and a[]
        sort(aux, a, mid+1, hi);
        merge(a, aux, lo, mid, hi);
    }

Java 6 system sort

Basic algorithm for sorting objects = mergesort.
・Cutoff to insertion sort = 7.
・Stop-if-already-sorted test.
・Eliminate-the-copy-to-the-auxiliary-array trick.

    Arrays.sort(a)

http://www.java2s.com/Open-Source/Java/6.0-JDK-Modules/j2me/java/util/Arrays.java.html

Bottom-up mergesort: Java implementation

    public class MergeBU {
        private static void merge(...) { /* as before */ }

        public static void sort(Comparable[] a) {
            int N = a.length;
            Comparable[] aux = new Comparable[N];
            for (int sz = 1; sz < N; sz = sz+sz)                // subarray size doubles on each pass
                for (int lo = 0; lo < N-sz; lo += sz+sz)        // merge adjacent pairs of subarrays
                    merge(a, aux, lo, lo+sz-1, Math.min(lo+sz+sz-1, N-1));
        }
    }

Bottom line. Simple and non-recursive version of mergesort, but about 10% slower than recursive, top-down mergesort on typical systems.

Mergesort: visualizations

[Figure: visualizations of top-down mergesort (cutoff = 12) and bottom-up mergesort (cutoff = 12).]

Natural mergesort

Idea. Exploit pre-existing order by identifying naturally-occurring runs.

    input            1 5 10 16 3 4 23 9 13 2 7 8 12 14
    first run       [1 5 10 16] 3 4 23 9 13 2 7 8 12 14
    second run       1 5 10 16 [3 4 23] 9 13 2 7 8 12 14
    merge two runs  [1 3 4 5 10 16 23] 9 13 2 7 8 12 14

Tradeoff. Fewer passes vs. extra compares per pass to identify runs.

Complexity of sorting

Computational complexity. Framework to study efficiency of algorithms for solving a particular problem X.

Model of computation. Allowable operations.
Cost model. Operation count(s).
Upper bound. Cost guarantee provided by some algorithm for X.
Lower bound. Proven limit on cost guarantee of all algorithms for X.
Optimal algorithm. Algorithm with best possible cost guarantee for X (lower bound ~ upper bound).

Example: sorting.
・Model of computation: decision tree (can access information only through compares, e.g., Java Comparable framework).
・Cost model: # compares.
・Upper bound: ~ N lg N from mergesort.
・Lower bound:
・Optimal algorithm:

Decision tree (for 3 distinct keys a, b, and c)

    a < b?
    ├─ yes: b < c?
    │   ├─ yes: a b c
    │   └─ no:  a < c?
    │       ├─ yes: a c b
    │       └─ no:  c a b
    └─ no:  a < c?
        ├─ yes: b a c
        └─ no:  b < c?
            ├─ yes: b c a
            └─ no:  c b a

・Height of tree = worst-case number of compares.
・Each edge may carry code between compares (e.g., a sequence of exchanges).
・Each leaf corresponds to one (and only one) ordering; there is (at least) one leaf for each possible ordering.

Complexity of sorting

Example: sorting.
・Model of computation: decision tree.
・Cost model: # compares.
・Upper bound: ~ N lg N from mergesort.
・Lower bound: ~ N lg N.
・Optimal algorithm = mergesort.
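The upper bound in the example above can be checked empirically. The following is a minimal sketch, not code from the lecture: it instruments the merge shown earlier to count compares on an int[] array. The class name CompareCount and the helpers countCompares() and lgFactorial() are hypothetical. The quantity lg(N!) is the standard decision-tree bound on worst-case compares (a tree with N! leaves has height at least lg N!); it is included here as an assumption, since the excerpt states only the ~ N lg N result.

```java
import java.util.Random;

public class CompareCount {
    private static long compares;   // number of key compares performed so far

    private static boolean less(int v, int w) {
        compares++;
        return v < w;
    }

    // Same merge as in the text, specialized to int keys.
    private static void merge(int[] a, int[] aux, int lo, int mid, int hi) {
        for (int k = lo; k <= hi; k++) aux[k] = a[k];
        int i = lo, j = mid + 1;
        for (int k = lo; k <= hi; k++) {
            if      (i > mid)              a[k] = aux[j++];
            else if (j > hi)               a[k] = aux[i++];
            else if (less(aux[j], aux[i])) a[k] = aux[j++];
            else                           a[k] = aux[i++];
        }
    }

    private static void sort(int[] a, int[] aux, int lo, int hi) {
        if (hi <= lo) return;
        int mid = lo + (hi - lo) / 2;
        sort(a, aux, lo, mid);
        sort(a, aux, mid + 1, hi);
        merge(a, aux, lo, mid, hi);
    }

    // Sorts a[] in place and returns the number of compares used.
    public static long countCompares(int[] a) {
        compares = 0;
        sort(a, new int[a.length], 0, a.length - 1);
        return compares;
    }

    // lg(N!) = lg 2 + lg 3 + ... + lg N
    public static double lgFactorial(int n) {
        double sum = 0.0;
        for (int i = 2; i <= n; i++) sum += Math.log(i) / Math.log(2);
        return sum;
    }

    public static void main(String[] args) {
        int n = 1024;                                   // a power of 2
        int[] a = new Random(42).ints(n, 0, 1_000_000).toArray();
        long c = countCompares(a);
        System.out.println("compares used = " + c);
        System.out.println("lg(N!)        = " + lgFactorial(n));
        System.out.println("N lg N        = " + n * (Math.log(n) / Math.log(2)));
    }
}
```

The count for one particular input may fall below lg(N!), because the lower bound constrains only the worst case over all inputs; the N lg N upper bound holds for every input when N is a power of 2.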
First goal of algorithm design: optimal algorithms.

Complexity results in context

Compares? Mergesort is optimal with respect to number of compares.
Space? Mergesort is not optimal with respect to space usage.

Lessons. Use theory as a guide.
Ex. Design sorting algorithm that guarantees ½ N lg N compares?
Ex. Design sorting algorithm that is both time- and space-optimal?

Complexity results in context (continued)

Lower bound may not hold if the algorithm can take advantage of:
・The initial order of the input. Ex: insertion sort requires only a linear number of compares on partially-sorted arrays.
・The distribution of key values. Ex: 3-way quicksort requires only a linear number of compares on arrays with a constant number of distinct keys. [stay tuned]
・The representation of the keys. Ex: radix sort requires no key compares; it accesses the data via character/digit compares.

Sort countries by total medals

[Figure: medal table (gold, silver, bronze, total) for the United States, China, Russia, Great Britain, Germany, Japan, Australia, France, South Korea, and Italy, sorted by total medals.]

Sort music library by artist

[Figure: music library listing with columns Name, Artist, Time, Album, sorted by artist.]

Sort music library by song name

[Figure: the same music library listing, sorted by song name.]

Comparator interface: system sort

To use with Java system sort:
・Create Comparator object.
・Pass as second argument to Arrays.sort().

    String[] a;
    ...
    Arrays.sort(a);                                          // uses natural order
    ...
    // each call below uses an alternate order defined by a Comparator<String> object
    Arrays.sort(a, String.CASE_INSENSITIVE_ORDER);
    ...
    Arrays.sort(a, Collator.getInstance(new Locale("es")));
    ...
    Arrays.sort(a, new BritishPhoneBookOrder());
    ...

Bottom line. Decouples the definition of the data type from the definition of what it means to compare two objects of that type.

Comparator interface: using with our sorting libraries

To support comparators in our sort implementations:
・Use Object instead of Comparable.
・Pass Comparator to sort() and less() and use it in less().

    // insertion sort using a Comparator
    public static void sort(Object[] a, Comparator comparator) {
        int N = a.length;
        for (int i = 0; i < N; i++)
            for (int j = i; j > 0 && less(comparator, a[j], a[j-1]); j--)
                exch(a, j, j-1);
    }

    private static boolean less(Comparator c, Object v, Object w) {
        return c.compare(v, w) < 0;
    }

    private static void exch(Object[] a, int i, int j) {
        Object swap = a[i]; a[i] = a[j]; a[j] = swap;
    }

Comparator interface: implementing

To implement a comparator:
・Define a (nested) class that implements the Comparator interface.
・Implement the compare() method.

    import java.util.Comparator;

    public class Student {
        private final String name;
        private final int section;
        ...

        public static class ByName implements Comparator<Student> {
            public int compare(Student v, Student w) {
                return v.name.compareTo(w.name);
            }
        }

        public static class BySection implements Comparator<Student> {
            public int compare(Student v, Student w) {
                // this trick works here since there is no danger of overflow
                return v.section - w.section;
            }
        }
    }

Stability

A typical application. First, sort by name; then sort by section.

    Selection.sort(a, new Student.ByName());

    Andrews   3   A   664-480-0023   097 Little
    Battle    4   C   874-088-1212   121 Whitman
    Chen      3   A   991-878-4944   308 Blair
    Fox       3   A   884-232-5341   11 Dickinson
    Furia     1   A   766-093-9873   101 Brown
    Gazsi     4   B   766-093-9873   101 Brown
    Kanaga    3   B   898-122-9643   22 Brown
    Rohde     2   A   232-343-5555   343 Forbes

    Selection.sort(a, new Student.BySection());

    Furia     1   A   766-093-9873   101 Brown
    Rohde     2   A   232-343-5555   343 Forbes
    Chen      3   A   991-878-4944   308 Blair
    Fox       3   A   884-232-5341   11 Dickinson
    Andrews   3   A   664-480-0023   097 Little
    Kanaga    3   B   898-122-9643   22 Brown
    Gazsi     4   B   766-093-9873   101 Brown
    Battle    4   C   874-088-1212   121 Whitman

@#%&@! Students in section 3 no longer sorted by name.

A stable sort preserves the relative order of items with equal keys.

Q. Which sorts are stable?
A. Need to check algorithm (and implementation).
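The name-then-section example above can be reproduced with a short sketch. This is an illustration under stated assumptions, not the lecture's code: the class StabilityDemo and its helpers roster(), names(), and sortByNameThenSection() are hypothetical, the Student type keeps only the name and section fields, and the section comparator uses Integer.compare rather than subtraction. The two sorts are the usual insertion sort and selection sort.

```java
import java.util.Comparator;
import java.util.StringJoiner;

public class StabilityDemo {
    // Hypothetical stand-in for Student, keeping only the two sort keys.
    public static final class Student {
        final String name;
        final int section;
        Student(String name, int section) { this.name = name; this.section = section; }
    }

    static final Comparator<Student> BY_NAME    = (v, w) -> v.name.compareTo(w.name);
    static final Comparator<Student> BY_SECTION = (v, w) -> Integer.compare(v.section, w.section);

    // Insertion sort: stable, because an item never moves past an equal item.
    public static <T> void insertionSort(T[] a, Comparator<T> c) {
        for (int i = 0; i < a.length; i++)
            for (int j = i; j > 0 && c.compare(a[j], a[j-1]) < 0; j--)
                exch(a, j, j-1);
    }

    // Selection sort: not stable, because a long-distance exchange can
    // carry an item past equal items.
    public static <T> void selectionSort(T[] a, Comparator<T> c) {
        for (int i = 0; i < a.length; i++) {
            int min = i;
            for (int j = i + 1; j < a.length; j++)
                if (c.compare(a[j], a[min]) < 0) min = j;
            exch(a, i, min);
        }
    }

    private static <T> void exch(T[] a, int i, int j) {
        T swap = a[i]; a[i] = a[j]; a[j] = swap;
    }

    // The eight students from the example (name and section only).
    public static Student[] roster() {
        return new Student[] {
            new Student("Andrews", 3), new Student("Battle", 4),
            new Student("Chen", 3),    new Student("Fox", 3),
            new Student("Furia", 1),   new Student("Gazsi", 4),
            new Student("Kanaga", 3),  new Student("Rohde", 2)
        };
    }

    public static String names(Student[] a) {
        StringJoiner joiner = new StringJoiner(" ");
        for (Student s : a) joiner.add(s.name);
        return joiner.toString();
    }

    // Sort by name, then by section with either a stable or an unstable sort.
    public static String sortByNameThenSection(boolean stable) {
        Student[] a = roster();
        insertionSort(a, BY_NAME);
        if (stable) insertionSort(a, BY_SECTION);
        else        selectionSort(a, BY_SECTION);
        return names(a);
    }

    public static void main(String[] args) {
        System.out.println("stable:   " + sortByNameThenSection(true));
        System.out.println("unstable: " + sortByNameThenSection(false));
    }
}
```

With the stable sort, the section 3 students stay in name order (Andrews, Chen, Fox, Kanaga); with selection sort they come out as Chen, Fox, Andrews, Kanaga, matching the second table above.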
Stability when sorting on a second key

    sorted by time        sorted by location      sorted by location
                          (not stable)            (stable)
    Chicago 09:00:00      Chicago 09:25:52        Chicago 09:00:00
    Phoenix 09:00:03      Chicago 09:03:13        Chicago 09:00:59
    Houston 09:00:13      Chicago 09:21:05        Chicago 09:03:13
    Chicago 09:00:59      Chicago 09:19:46        Chicago 09:19:32
    Houston 09:01:10      Chicago 09:19:32        Chicago 09:19:46
    Chicago 09:03:13      Chicago 09:00:00        Chicago 09:21:05
    Seattle 09:10:11      Chicago 09:35:21        Chicago 09:25:52
    Seattle 09:10:25      Chicago 09:00:59        Chicago 09:35:21
    Phoenix 09:14:25      Houston 09:01:10        Houston 09:00:13
    Chicago 09:19:32      Houston 09:00:13        Houston 09:01:10
    Chicago 09:19:46      Phoenix 09:37:44        Phoenix 09:00:03
    Chicago 09:21:05      Phoenix 09:00:03        Phoenix 09:14:25
    Seattle 09:22:43      Phoenix 09:14:25        Phoenix 09:37:44
    Seattle 09:22:54      Seattle 09:10:25        Seattle 09:10:11
    Chicago 09:25:52      Seattle 09:36:14        Seattle 09:10:25
    Chicago 09:35:21      Seattle 09:22:43        Seattle 09:22:43
    Seattle 09:36:14      Seattle 09:10:11        Seattle 09:22:54
    Phoenix 09:37:44      Seattle 09:22:54        Seattle 09:36:14

    (not stable: no longer sorted by time within each city; stable: still sorted by time within each city)

Stability: insertion sort

Proposition. Insertion sort is stable.
Pf. Equal items never move past each other.

    public class Insertion {
        public static void sort(Comparable[] a) {
            int N = a.length;
            for (int i = 0; i < N; i++)
                for (int j = i; j > 0 && less(a[j], a[j-1]); j--)
                    exch(a, j, j-1);
        }
    }

    i   j    0    1    2    3    4
    0   0    B1   A1   A2   A3   B2
    1   0    A1   B1   A2   A3   B2
    2   1    A1   A2   B1   A3   B2
    3   2    A1   A2   A3   B1   B2
    4   4    A1   A2   A3   B1   B2
             A1   A2   A3   B1   B2

Stability: mergesort

Proposition. Mergesort is stable.
Pf. Suffices to verify that merge operation is stable.

    public class Merge {
        private static void merge(...) { /* as before */ }

        private static void sort(Comparable[] a, Comparable[] aux, int lo, int hi) {
            if (hi <= lo) return;
            int mid = lo + (hi - lo) / 2;
            sort(a, aux, lo, mid);
            sort(a, aux, mid+1, hi);
            merge(a, aux, lo, mid, hi);
        }

        public static void sort(Comparable[] a) { /* as before */ }
    }

Stability: mergesort

Proposition. Merge operation is stable.
Pf. Takes from left subarray if equal keys.

    private static void merge(...) {
        for (int k = lo; k <= hi; k++)
            aux[k] = a[k];

        int i = lo, j = mid+1;
        for (int k = lo; k <= hi; k++) {
            if      (i > mid)              a[k] = aux[j++];
            else if (j > hi)               a[k] = aux[i++];
            else if (less(aux[j], aux[i])) a[k] = aux[j++];
            else                           a[k] = aux[i++];   // equal keys: take from the left
        }
    }

    index   0    1    2    3    4   |   5    6    7    8    9    10
    key     A1   A2   A3   B    D   |   A4   A5   C    E    F    G

Sorting summary

                 in-place?   stable?   best          average    worst        remarks
    selection    ✔                     ½ N^2         ½ N^2      ½ N^2        N exchanges
    insertion    ✔           ✔         N             ¼ N^2      ½ N^2        use for small N or partially ordered
    shell        ✔                     N log_3 N     ?          c N^(3/2)    tight code; subquadratic
    merge                    ✔         ½ N lg N      N lg N     N lg N       N log N guarantee; stable
    timsort                  ✔         N             N lg N     N lg N       improves mergesort when preexisting order
    ?            ✔           ✔         N             N lg N     N lg N       holy sorting grail
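The claim that merge is stable because it takes from the left subarray on equal keys can be illustrated with the example keys A1 A2 A3 B D | A4 A5 C E F G. The sketch below is hypothetical (class StableMergeDemo, the takeLeftOnTies flag, and a less() that compares only the first character are assumptions made for the demonstration); it contrasts the merge rule from the text with a variant that takes from the right on ties.

```java
public class StableMergeDemo {
    // Keys are the first character only, so "A1" and "A4" count as equal keys;
    // the digit just records the original relative order.
    private static boolean less(String v, String w) {
        return v.charAt(0) < w.charAt(0);
    }

    // Merges a[lo..mid] with a[mid+1..hi] and returns the result as a new array.
    // takeLeftOnTies = true  follows the rule in the text (stable).
    // takeLeftOnTies = false is a deliberately altered variant (not stable).
    public static String[] merge(String[] a, int lo, int mid, int hi, boolean takeLeftOnTies) {
        String[] aux = a.clone();
        String[] out = a.clone();
        int i = lo, j = mid + 1;
        for (int k = lo; k <= hi; k++) {
            if      (i > mid) out[k] = aux[j++];
            else if (j > hi)  out[k] = aux[i++];
            else {
                boolean takeRight = takeLeftOnTies
                        ? less(aux[j], aux[i])     // right only if strictly smaller
                        : !less(aux[i], aux[j]);   // right if smaller or equal
                out[k] = takeRight ? aux[j++] : aux[i++];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String[] a = {"A1", "A2", "A3", "B", "D", "A4", "A5", "C", "E", "F", "G"};
        System.out.println(String.join(" ", merge(a, 0, 4, 10, true)));
        System.out.println(String.join(" ", merge(a, 0, 4, 10, false)));
    }
}
```

With the rule from the text the A keys come out in their original order A1 through A5; the altered variant emits A4 and A5 ahead of A1, which is why the choice between a strict and a non-strict comparison in merge decides stability.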
Copyright © 2024 Ladybird Srl - Via Leonardo da Vinci 16, 10126, Torino, Italy - VAT 10816460017 - All rights reserved