Lecture No.14

4.3.5 Average-case Analysis of Quicksort

We will now show that in the average case, quicksort runs in Θ(n log n) time. Recall that when we talked about average case at the beginning of the semester, we said that it depends on some assumption about the distribution of inputs. However, in the case of quicksort, the analysis does not depend on the distribution of the input at all. It depends only on the random choices of pivots that the algorithm makes. This is good, because it means that the analysis of the algorithm's performance is the same for all inputs. In this case the average is computed over all possible random choices that the algorithm might make for the pivot index in the second step of the QuickSort procedure above.

To analyze the average running time, we let T(n) denote the average running time of QuickSort on a list of size n. It will simplify the analysis to assume that all of the elements are distinct. The algorithm has n random choices for the pivot element, and each choice occurs with equal probability 1/n. So we can modify the above recurrence to compute an average rather than a max. The time T(n) is the weighted sum of the times taken for the various choices of q, where choice q splits the list into sublists of sizes q and n - q - 1:

$$T(n) = \frac{1}{n}\sum_{q=0}^{n-1}\bigl(T(q) + T(n-q-1) + n\bigr)$$

I.e.,

$$T(n) = \frac{1}{n}\bigl(T(0) + T(n-1) + n\bigr) + \frac{1}{n}\bigl(T(1) + T(n-2) + n\bigr) + \frac{1}{n}\bigl(T(2) + T(n-3) + n\bigr) + \cdots + \frac{1}{n}\bigl(T(n-1) + T(0) + n\bigr)$$

We have not seen such a recurrence before. Solving it by expansion is possible but rather tricky. We will instead attempt a constructive induction. We know that we want a Θ(n log n) bound. Let us assume that T(n) ≤ cn log n for n ≥ 2, where c is a constant.

For the base case n = 2 we have

$$T(2) = \frac{1}{2}\Bigl[\bigl(T(0) + T(1) + 2\bigr) + \bigl(T(1) + T(0) + 2\bigr)\Bigr] = T(0) + T(1) + 2 = 4$$

where the last step takes T(0) = T(1) = 1 as the base values of the recurrence (the choice consistent with the value 4 used next). We want this to be at most c2 log 2, i.e., T(2) ≤ c2 log 2 or 4 ≤ c2 log 2, therefore c ≥ 4/(2 log 2) ≈ 2.88. The numerical value 2.88 corresponds to taking log as the natural logarithm.
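The recurrence and the base case can be checked numerically before carrying out the induction. The following is a minimal sketch, not part of the original notes: the function name `average_times` is made up for illustration, the base values T(0) = T(1) = 1 are an assumption chosen to match T(2) = 4, and `math.log` is the natural logarithm, which is what yields c ≈ 2.88.

```python
import math


def average_times(n_max, t0=1.0, t1=1.0):
    """Tabulate T(0..n_max) for T(n) = (1/n) * sum_{q=0}^{n-1} (T(q) + T(n-q-1) + n).

    t0 and t1 are the assumed base values T(0) and T(1).
    """
    T = [t0, t1]
    prefix = t0 + t1  # running sum T(0) + ... + T(n-1)
    for n in range(2, n_max + 1):
        # The sums over T(q) and T(n-q-1) add up the same values,
        # so the recurrence reduces to T(n) = (2/n) * prefix + n.
        t_n = 2.0 * prefix / n + n
        T.append(t_n)
        prefix += t_n
    return T


T = average_times(1000)
c = 4 / (2 * math.log(2))  # smallest c allowed by the base case n = 2

# Check T(n) <= c * n * ln(n) for 2 <= n <= 1000 (small tolerance for rounding).
bound_holds = all(T[n] <= c * n * math.log(n) + 1e-9 for n in range(2, 1001))

print(T[2])         # 4.0
print(round(c, 3))  # 2.885
print(bound_holds)  # True
```

Such a check only confirms the bound over the tested range; the induction argument is still what establishes it for all n.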

For the induction step, we assume that n ≥ 3. The induction hypothesis is that for any n′ with 2 ≤ n′ < n, we have T(n′) ≤ c n′ log n′. We want to prove that the bound also holds for T(n). By expanding T(n) and moving the factor of n outside the sum we have

$$T(n) = \frac{1}{n}\sum_{q=0}^{n-1}\bigl(T(q) + T(n-q-1)\bigr) + n = \frac{1}{n}\left(\sum_{q=0}^{n-1} T(q) + \sum_{q=0}^{n-1} T(n-q-1)\right) + n$$

Observe that the two sums add up the same values: one counts up and the other counts down. Thus we can replace them with $2\sum_{q=0}^{n-1} T(q)$. We will extract T(0) and T(1) and treat them specially, because these two do not follow the formula: the bound cn log n is assumed only for n ≥ 2. Applying the induction hypothesis to T(q) for 2 ≤ q < n, we have