Analysis of Randomized Algorithms: Expectation and High Probability Bounds

This document covers the analysis of randomized algorithms, focusing on the concepts of expectation and high-probability bounds. It treats the max-two problem and the select algorithm, gives an expected work analysis and a high-probability span analysis for quicksort, and mentions the connection between quicksort and Treaps.

Chapter 11. Randomized Algorithms

The theme of this chapter is randomized algorithms: algorithms that make use of randomness in their computation. You might know of quicksort, which is efficient on average when it uses a random pivot, but can be bad for any pivot that is selected without randomness. Even though analyzing randomized algorithms can be difficult, randomization turns out to be a crucial technique for algorithm design, making the extra effort well worth it.

For example, for some problems randomized algorithms are simpler or faster than non-randomized algorithms. The problem of primality testing (PT), which is to determine whether an integer is prime, is a good example. In the late 1970s Miller and Rabin developed a famous and simple randomized algorithm for the problem that requires only polynomial work. For over 20 years it was not known whether the problem could be solved in polynomial work without randomization. Eventually a polynomial-time deterministic algorithm was developed, but it is much more complicated and computationally more costly than the randomized version. Hence in practice everyone still uses the randomized version.

There are many other problems for which a randomized solution is simpler or cheaper than the best non-randomized solution. In this chapter, after covering the prerequisite background, we will consider some such problems. The first is the following simple problem:

Question: How many comparisons do we need to find the top two largest numbers in a sequence of n distinct numbers?

Without the help of randomization, there is a trivial algorithm for finding the top two largest numbers in a sequence that requires about 2n − 3 comparisons. We show, however, that if the order of the input is randomized, then the same algorithm uses only n + O(log n) comparisons in expectation (on average). This matches a more complicated deterministic version based on tournaments.

Randomization plays a particularly important role in developing parallel algorithms, and analyzing such algorithms introduces some new challenges. In this chapter we will look at two randomized algorithms with significant parallelism: one for finding the k-th order statistic of a sequence, and the other quicksort. In future chapters we will cover many other randomized algorithms.

In this book we require that randomized algorithms always return the correct answer, but their costs (work and span) will depend on random choices. Such algorithms are sometimes called Las Vegas algorithms. Algorithms that run in a fixed amount of time, but may or may not return the correct answer, depending on random choices, are called Monte Carlo algorithms.

11.1 Expectation versus High Probability

In analyzing costs for a randomized algorithm there are two types of bounds that are useful: expected bounds and high-probability bounds.

Expected bounds tell us about the average cost across all random choices made by the algorithm. For example, if an algorithm has O(n) expected work, it means that averaged over all random choices it makes in all runs, the algorithm performs O(n) work. Since expected bounds are averaged over all random choices in all possible runs, there can be runs that require more or less work. For example, once in every n runs the algorithm might require O(n^2) work, and (or) once in every sqrt(n) runs it might require O(n^{3/2}) work.

High-probability bounds, on the other hand, tell us that it is very unlikely that the cost will be above some bound. For a problem of size n, we say that some property holds with high probability if it holds with probability at least 1 − 1/n^k for some constant k > 1; the complementary event then has the very small probability 1/n^k. Now if we had n experiments, each of which fails with probability at most 1/n^k, we can use the union bound to argue that the probability that any of them fails is at most n · 1/n^k = 1/n^{k−1}. This means that for k ≥ 2, all n experiments together still succeed with high probability, namely with probability at least 1 − 1/n^{k−1}. High-probability bounds are typically stronger than expected bounds.

Expected bounds are quite convenient when analyzing work (or running time, in traditional sequential algorithms). This is because linearity of expectation (Chapter 10) allows adding expectations across the components of an algorithm to get the overall expected work. For example, if the algorithm performs n tasks, each of which takes on average 2 units of work, then the total work on average across all tasks will be n × 2 = 2n units.

Unfortunately this kind of composition does not work when analyzing the span of an algorithm, because doing so requires taking the maximum of random variables rather than their sum. For example, if we had n tasks, each of which has an expected span of 2 units of time, we cannot say that the expected span across all tasks is 2 units. It could be that most of the time each task has a span of 2 units, but that with probability 1/n a task requires n units. The expected span for each task is still close to 2 units, but if we have n tasks, chances are high that one task will take n units, and the expected maximum will be close to n rather than 2. We therefore cannot compose the expected span from each task by taking a maximum.
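To see concretely why expectations compose for work but not for span, here is a small Python simulation (our own illustration, not part of the text). Each of n tasks takes 2 units of time except with probability 1/n, when it takes n units; the average per-task time stays small, while the maximum over n tasks is almost always large.

    import random

    def task_time(n):
        # Each task usually takes 2 units; with probability 1/n it takes n units.
        return n if random.random() < 1.0 / n else 2

    n = 1000
    trials = 200
    work_total = 0
    span_total = 0
    for _ in range(trials):
        times = [task_time(n) for _ in range(n)]
        work_total += sum(times)  # work: task times add up
        span_total += max(times)  # span: the maximum over parallel tasks

    print("average time per task:", work_total / (trials * n))  # about 3
    print("average span (max)   :", span_total / trials)        # on the order of n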
Unlike expected bounds, high-probability bounds can allow us to bound span. For example, let's say we know that every task finishes in 2 units of time with probability 1 − 1/n^2, or equivalently that each task takes more than 2 units of time with probability 1/n^2 and takes at most n units of time otherwise. Now with n tasks, the probability that there will be at least one task taking more than 2 units of time is, by the union bound, at most n · 1/n^2 = 1/n. [...]

11.2 Finding the Two Largest

[...] Let Y be the number of comparisons performed on that permutation. Writing X_i for the indicator random variable that is 1 when element a_i triggers a second comparison (that is, when a_i > m_2, the second-largest among the first i − 1 elements), this can be defined as

$Y = 1 + (n - 2) + \sum_{i=3}^{n} X_i.$

We are interested in computing the expected value of Y. By linearity of expectation, we have

$E[Y] = E\left[1 + (n - 2) + \sum_{i=3}^{n} X_i\right] = 1 + (n - 2) + \sum_{i=3}^{n} E[X_i].$

Our task therefore boils down to computing E[X_i] for i = 3, ..., n. To compute this expectation, we ask ourselves: what is the probability that a_i > m_2? A moment's thought shows that the condition a_i > m_2 holds exactly when a_i is either the largest or the second-largest element in ⟨a_1, ..., a_i⟩. So ultimately we are asking: what is the probability that a_i is the largest or the second-largest element in a randomly permuted sequence of length i?

To compute this probability, we note that each element is equally likely to be anywhere in the permuted sequence (we chose a random permutation). In particular, if we look at the k-th largest element, it has a 1/i chance of being at position a_i. (You should also try to work this out using a counting argument.) Therefore the probability that a_i is the largest or the second-largest element in ⟨a_1, ..., a_i⟩ is 1/i + 1/i = 2/i, and so E[X_i] = 1 · (2/i) = 2/i.

Plugging this into the expression for E[Y], we obtain

$E[Y] = 1 + (n - 2) + \sum_{i=3}^{n} \frac{2}{i} = n - 4 + 2H_n,$

where H_n is the n-th Harmonic number. But we know that H_n ≤ 1 + ln n, so we get E[Y] ≤ n − 2 + 2 ln n.
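The bound E[Y] ≤ n − 2 + 2 ln n is easy to check by simulation. Here is a Python sketch (our own; the name max_two is illustrative, and distinct keys are assumed) that shuffles the input and counts comparisons exactly as Y does: one for the first pair, one per remaining element, and one extra whenever a_i beats the current second-largest.

    import random, math

    def max_two(a):
        # Returns the two largest elements of a, plus the comparison count.
        comparisons = 1
        m1, m2 = (a[0], a[1]) if a[0] > a[1] else (a[1], a[0])
        for x in a[2:]:
            comparisons += 1
            if x > m2:              # a second comparison is needed
                comparisons += 1
                if x > m1:
                    m1, m2 = x, m1
                else:
                    m2 = x
        return (m1, m2), comparisons

    n, trials = 10_000, 100
    total = 0
    for _ in range(trials):
        a = list(range(n))
        random.shuffle(a)           # the random permutation supplies the randomness
        _, c = max_two(a)
        total += c

    print("average comparisons:", total / trials)
    print("n - 2 + 2 ln n     :", n - 2 + 2 * math.log(n))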
We can also use the following bound on Harmonic sums: H_n = ln n + O(1), or more precisely

$H_n = 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} = \ln n + \gamma + \varepsilon_n,$

where γ is the Euler-Mascheroni constant, which is approximately 0.57721..., and ε_n ~ 1/(2n), which tends to 0 as n approaches infinity. This shows that the summation and the integral of 1/x are almost identical, up to a small additive constant and a low-order vanishing term.

11.3 Order Statistics

In statistics, computing the order statistics of a sample, which we may represent as a sequence, has many important applications. We can state the problem precisely as follows.

Problem 11.3 (Order statistics). Given a sequence a, an integer k with 0 ≤ k < |a|, and a comparison < defining a total order over the elements of the sequence, find the k-th order statistic, i.e., the k-th smallest element, of the sequence.

We can solve this problem by sorting first and selecting the k-th element, but this would require O(n log n) work, assuming that comparisons require constant work. We wish to do better; in particular, we would like to achieve linear work while still achieving O(log^2 n) span. For simplicity, let's assume that sequences consist of unique elements, and consider the following simple algorithm. Based on the contraction design technique, the algorithm uses randomization to contract the problem to a smaller instance.

Algorithm 11.4 (contracting k-th smallest).

    select a k =
      let
        p = a[0]
        ℓ = ⟨ x ∈ a | x < p ⟩
        r = ⟨ x ∈ a | x > p ⟩
      in
        if (k < |ℓ|) then select ℓ k
        else if (k < |a| − |r|) then p
        else select r (k − (|a| − |r|))
      end

Example 11.5. Example runs of select illustrated by a "pivot tree." For illustrative purposes, one can show all possible recursive calls being explored down to singleton sequences; in reality, the algorithm explores only one root-to-leaf path per run, such as the path taken when searching for the first-order statistic (k = 0), the fifth (k = 4), or the eighth (k = 7). [Figure: pivot tree for the keys ⟨3, 2, 6, 1, 5, 7, 4, 8⟩; the root pivot 3 splits the keys into ⟨2, 1⟩ and ⟨6, 5, 7, 4, 8⟩, and so on down to singletons.]

The algorithm divides the input into left and right sequences, ℓ and r, figures out which side the k-th smallest element must be in, and recursively explores that side. When exploring the right side, r, the parameter k needs to be adjusted downward, since all elements less than or equal to the pivot p are being thrown out: there are |a| − |r| such elements.

As written, the algorithm picks as pivot the first key in the sequence rather than a random key. As with the two-largest problem, we can add randomness by first randomly permuting an input sequence t to generate the sequence a, and then applying select to a. This is equivalent to picking a random pivot at each step of the contraction.

Let's analyze the work and span of the randomized algorithm, where we pick pivots uniformly at random. Let n = |a| and define X(n) = max(|ℓ|, |r|)/|a|, the fractional size of the larger side. Notice that X(n) is an upper bound on the fractional size of the side the algorithm actually recurs into.
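Here is a compact Python rendering of Algorithm 11.4 (a sketch, ours), with the random-permutation step replaced by choosing a random pivot directly, which as noted above is equivalent. Distinct elements are assumed, and list comprehensions stand in for the two parallel filter calls.

    import random

    def select(a, k):
        # Returns the k-th smallest element of a (k = 0 gives the minimum).
        p = random.choice(a)                # random pivot
        l = [x for x in a if x < p]         # keys below the pivot
        r = [x for x in a if x > p]         # keys above the pivot
        if k < len(l):
            return select(l, k)
        elif k < len(a) - len(r):
            return p                        # k lands on the pivot itself
        else:
            return select(r, k - (len(a) - len(r)))

    a = [3, 2, 6, 1, 5, 7, 4, 8]
    print(select(a, 0), select(a, 4), select(a, 7))   # 1 5 8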
Now, since lines 3 and 4 are simply two filter calls, we have the following recurrences:

$W(n) \le W(X(n) \cdot n) + O(n)$
$S(n) \le S(X(n) \cdot n) + O(\log n)$

Let's first look at the work recurrence. Specifically, we are interested in E[W(n)]. First, let's try to get a sense of what happens in expectation. The key quantity in bounding the expectation is bounding E[X(n)]. To this end, note first that all pivots are equally likely, so we can plot the sizes of ℓ and r as a function of the pivot's rank. [...]

In applying Markov's inequality, we choose 1 because we know that the algorithm terminates for that input size. Increasing the constant factor from 10 to 20 would decrease the probability to roughly n^{−7.3}; for n = 10^6, for example, this is on the order of 10^{−44}, which is astronomically unlikely. We have therefore shown that the number of steps is O(log n) with high probability. Each step has span O(log n), so the overall span is O(log^2 n) with high probability.

Using the high-probability bound, we can bound the expected span via the total expectation theorem. For brevity, let the random variable Y denote the number of contraction steps, and split on whether Y exceeds 10 lg n:

$E[S] = \sum_{y} \Pr[Y = y] \cdot E[S \mid Y = y]$
$\le (1 - n^{-3.15}) \cdot O(\lg^2 n) + n^{-3.15} \cdot O(n \lg n)$
$= O(\lg^2 n).$

The expected bound follows from the fact that with high probability the depth of the recursive calls is O(lg n), and each recursive call has O(lg n) span because it requires a sequence filter. The unlikely case, in which the number of steps exceeds 10 lg n, contributes only a vanishing amount to the expectation, as long as the worst-case span is a polynomial of degree less than 3.15.

In summary, we have shown that the select algorithm, on an input of size n, does O(n) work in expectation and has O(log^2 n) span with high probability. As mentioned at the start of the chapter, we will typically analyze work using expectation and span using high probability.

11.4 Quicksort

Moving on to a more complex algorithm, let's analyze the work and span of the randomized quicksort algorithm. In later chapters we will see that the analysis of quicksort presented here is effectively identical to the analysis of a certain type of balanced tree called Treaps. It is also the same as the analysis of "unbalanced" binary search trees under random insertion.

Consider the quicksort algorithm given in Algorithm 11.7. In this algorithm we intentionally leave the pivot-choosing step unspecified, because the property we are discussing holds regardless of the choice of pivot.

Algorithm 11.7 (Quicksort).

    quicksort a =
      if |a| = 0 then a
      else let
        p  = pick a pivot from a
        a₁ = ⟨ x ∈ a | x < p ⟩
        a₂ = ⟨ x ∈ a | x = p ⟩
        a₃ = ⟨ x ∈ a | x > p ⟩
        (s₁, s₃) = (quicksort a₁ ∥ quicksort a₃)
      in
        s₁ ++ a₂ ++ s₃
      end

There is plenty of parallelism in this version of quicksort: there is parallelism due to the two recursive calls, and the filters for selecting the elements less than, equal to, and greater than the pivot can themselves be parallel.

Note that each call to quicksort either makes no recursive calls (the base case) or two recursive calls, so the call tree is binary. We will often find it convenient to map a run of quicksort to a binary search tree (BST) representing the recursive calls along with the pivots chosen; we will sometimes refer to this tree as the call tree or pivot tree. We will use this call-tree representation to reason about the properties of quicksort, e.g., the comparisons performed and its span. An example is shown in Example 11.8.

Example 11.8. An example run of quicksort along with its pivot tree, on the keys ⟨7, 4, 2, 3, 5, 8, 1, 6⟩. [Figure: the root pivot 3 splits the keys into ⟨2, 1⟩ and ⟨7, 4, 5, 8, 6⟩, and so on down the tree.]
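To make the structure of Algorithm 11.7 concrete, here is a minimal sequential Python sketch of it (our own rendering; the list comprehensions stand in for the three filters, and the two recursive calls would run in parallel in the version analyzed here).

    import random

    def quicksort(a):
        # Three-way quicksort with a random pivot.
        if len(a) <= 1:
            return a
        p = random.choice(a)
        a1 = [x for x in a if x < p]    # keys less than the pivot
        a2 = [x for x in a if x == p]   # keys equal to the pivot
        a3 = [x for x in a if x > p]    # keys greater than the pivot
        return quicksort(a1) + a2 + quicksort(a3)

    print(quicksort([7, 4, 2, 3, 5, 8, 1, 6]))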
Let's consider some strategies for picking a pivot.

• Always pick the first element: If the sequence is sorted in increasing order, then picking the first element is the same as picking the smallest element. We end up with a lopsided recursion tree of depth n. The total work is O(n^2), since n − i keys remain at level i and hence we do n − i − 1 comparisons at that level, for a total of $\sum_{i=0}^{n-1} (n - i - 1)$. Similarly, if the sequence is sorted in decreasing order, we end up with a recursion tree that is lopsided in the other direction. In practice, it is not uncommon for the input of a sort function to be a sequence that is already sorted or nearly sorted.

• Pick the median of three elements: Another strategy is to take the first, middle, and last elements and pick the median of them. For sorted lists the split is even, so each side contains half of the original size, and the depth of the tree is O(log n). Although this strategy avoids the pitfall with sorted sequences, it is still possible to be unlucky, and in the worst case the costs and tree depth are the same as with the first strategy. This is the strategy used by many library implementations of quicksort. Can you think of a way to slow down a quicksort implementation that uses this strategy by picking an adversarial input?

• Pick an element randomly: It is not immediately clear what the resulting depth is, but intuitively, when we choose a random pivot, the size of each side is not far from n/2 in expectation. This doesn't give us a proof, but it gives us hope that this strategy will result in a tree of depth O(log n) in expectation or with high probability. Indeed, picking a random pivot gives us expected O(n log n) work and O(log^2 n) span for quicksort, and an expected O(log n)-depth tree, as we will show.

Analysis of Quicksort

To develop some intuition for the span analysis, let's consider the probability that we split the input sequence more or less evenly. If we select a pivot that is greater than t_{n/4} and less than t_{3n/4}, then X(n) is at most 3n/4. Since all keys are equally likely to be selected as pivot, this happens with probability (3n/4 − n/4)/n = 1/2. [Figure: the keys in sorted order, divided into the first n/4 keys, the middle n/2 keys (any of which yields a split no worse than 3/4), and the last n/4 keys.]

This observation implies that at each level of the call tree (every time a new pivot is selected), the size of the input to each call decreases by a constant fraction (of 3/4) with probability at least 1/2. Over two levels, the probability that the input size decreases by a factor of 3/4 is the probability that it decreases at either step, which is at least 1 − (1/2 · 1/2) = 3/4. More generally, after m levels the probability that the input size has decreased by a factor of 3/4 is at least 1 − (1/2)^m, which approaches 1 quickly; for example, for m = 10 this probability is greater than 0.999. Thus we can conclude that quicksort behaves like a balanced divide-and-conquer algorithm, and should complete after c log n levels for some constant c.

We now make this intuition more precise. There are many methods of analysis that we could use. In the rest of this section we consider one in detail, which is based on counting, and outline another, which is based on establishing a recurrence that can then be solved. For the analysis, we assume a priority-based selection technique for pivots: at the start of the algorithm, we assign each key a priority chosen uniformly at random from the real interval [0, 1], so that each key has a unique priority. We then pick as pivot the key with the highest priority. Notice that once the priorities are decided, the algorithm is completely deterministic. In addition, we assume a version of quicksort that compares the pivot p to each key once (instead of three times, once to generate each of a₁, a₂, and a₃).
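The priority-based rule is easy to prototype. Below is a Python sketch (ours; the name priority_quicksort and the dictionary representation of priorities are our choices, and distinct keys are assumed). It draws one random priority per key up front; after that, every pivot choice, and hence the entire run, is deterministic.

    import random

    def priority_quicksort(a, priority=None):
        # Pick as pivot the key with the highest random priority.
        if priority is None:
            priority = {x: random.random() for x in a}   # fixed once, up front
        if len(a) <= 1:
            return a
        p = max(a, key=lambda x: priority[x])
        less    = [x for x in a if x < p]
        greater = [x for x in a if x > p]
        return (priority_quicksort(less, priority) + [p]
                + priority_quicksort(greater, priority))

    print(priority_quicksort([7, 4, 2, 3, 5, 8, 1, 6]))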
[...]

• The pivot is less than t_i or greater than t_j. Then t_i and t_j are either both in a₁ or both in a₃, respectively. Whether t_i and t_j are compared will be determined in some later recursive call to quicksort.

With this intuition in mind, we can establish the following claim.

Claim 11.12. For i < j, t_i and t_j are compared if and only if p_i or p_j has the highest priority among p_i, p_{i+1}, ..., p_j.

Proof. Assume first that t_i (respectively t_j) has the highest priority. In this case, all the elements in the subsequence t_i, ..., t_j move together in the call tree until t_i (respectively t_j) is selected as pivot; when it is selected, t_i and t_j are compared. This proves the first half of the claim.

For the second half, assume that t_i and t_j are compared. For the purposes of contradiction, assume that there is a key t_k, i < k < j, with a higher priority between them. In any collection of keys that includes t_i and t_j, t_k becomes a pivot before either of them. Since t_i < t_k < t_j, it separates t_i and t_j into different buckets, so they are never compared. This is a contradiction; thus we conclude there is no such t_k. □

Therefore, for t_i and t_j to be compared, p_i or p_j has to be larger than all the priorities in between. Since there are j − i + 1 keys in this range (including both i and j), and each is equally likely to have the highest priority, the probability that either i or j has the highest priority is 2/(j − i + 1). Therefore,

$E[X_{ij}] = \Pr[X_{ij} = 1] = \Pr[p_i \text{ or } p_j \text{ is the maximum among } \{p_i, \ldots, p_j\}] = \frac{2}{j - i + 1}.$

This bound indicates that the closer two keys are in the sorted order t, the more likely it is that they are compared. For example, the key t_i is compared to t_{i+1} with probability 1. It is easy to understand why if we consider the corresponding pivot tree: one of t_i and t_{i+1} must be an ancestor of the other, since there is no other element that could be the root of a subtree that has t_i in its left subtree and t_{i+1} in its right subtree. Regardless of which is the ancestor, t_i and t_{i+1} are compared. If we consider t_i and t_{i+2}, there could be such an element, namely t_{i+1}, which could have t_i in its left subtree and t_{i+2} in its right subtree. That is, with probability 1/3, t_{i+1} has the highest priority of the three and t_i is not compared to t_{i+2}, while with probability 2/3, one of t_i and t_{i+2} has the highest priority and the two are compared.

In general, the probability of two elements being compared is inversely proportional to the number of elements between them in sorted order: the further apart they are, the less likely they are to be compared; analogously, the further apart, the less likely one is to be the ancestor of the other in the corresponding pivot tree.

Hence, the expected number of comparisons made by randomized quicksort is

$E[Y(n)] \le \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} E[X_{ij}] = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \frac{2}{j - i + 1} = \sum_{i=1}^{n-1} \sum_{k=2}^{n-i+1} \frac{2}{k} \le 2 \sum_{i=1}^{n-1} H_n \le 2 n H_n \in O(n \log n).$

Note that in deriving the asymptotic bound we used the fact that H_n = ln n + O(1).
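Claim 11.12 turns the question of whether t_i and t_j are compared into a question about random priorities, which makes the 2/(j − i + 1) probability easy to check empirically. The following Python sketch (ours; the positions i and j are arbitrary) samples priorities and tests the claim's condition directly instead of running quicksort.

    import random

    def compared(priority, i, j):
        # By Claim 11.12: t_i and t_j are compared iff p_i or p_j is the
        # largest priority among p_i, ..., p_j.
        top = max(priority[i:j + 1])
        return top == priority[i] or top == priority[j]

    n, trials = 20, 100_000
    i, j = 3, 9
    hits = sum(compared([random.random() for _ in range(n)], i, j)
               for _ in range(trials))
    print("empirical :", hits / trials)
    print("predicted :", 2 / (j - i + 1))   # 2/7, about 0.286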
Indirectly, we have also shown that the average work for the basic deterministic quicksort (which always picks the first element) is O(n log n): just shuffle the data randomly and then apply the basic quicksort. Since shuffling the input randomly results in the same input as picking random priorities and then reordering the data so that the priorities are in decreasing order, the basic quicksort on the shuffled input does the same operations as randomized quicksort on the input in the original order. Thus, averaged over all permutations of the input, the work of the basic quicksort is O(n log n).

Expected span of Quicksort. We now analyze the span of quicksort. All we really need to calculate is the depth of the pivot tree, since each level of the tree has span O(log n), needed for the filter. We argue that the depth of the pivot tree is O(log n) by relating it to the number of contraction steps of the randomized select we considered in Section 11.3. We refer to the i-th node of the pivot tree as the node corresponding to the i-th smallest key; this is also the i-th node in an in-order traversal.

Claim 11.13. The path from the root to the i-th node of the pivot tree follows the same steps as select with k = i. That is to say, the distribution of the pivots selected along the path and the sizes of the subproblems are identical.

This is true because select is the same as quicksort except that it only goes down one of the two recursive branches, namely the branch that contains the k-th key.

Recall that for select, we showed that the length of the path is more than 10 lg n with probability at most 1/n^{3.15}, so the probability of any single path being longer than 10 lg n is tiny. This does not suffice to conclude that no path is longer than 10 lg n, however, because there are many paths in the pivot tree and only one needs to be long to impact the span. Luckily, we don't have too many paths to begin with, and we can take advantage of this by using the union bound, which says that the probability of the union of a collection of events is at most the sum of the probabilities of the events. Consider the event that the depth of a given node is larger than 10 lg n, which has probability at most 1/n^{3.15}. The total probability that any of the n leaves has depth larger than 10 lg n is then

$\Pr[\text{depth of quicksort pivot tree} > 10 \lg n] \le \frac{n}{n^{3.15}} = \frac{1}{n^{2.15}}.$

We thus have our high-probability bound on the depth of the pivot tree. The overall span of randomized quicksort is therefore O(log^2 n) with high probability. As with select, we can establish an expected bound by using the total expectation theorem; we leave this as an exercise for the reader.

Alternative Analysis. Another way to analyze the work of quicksort is to write a recurrence for the expected work (number of comparisons) directly. This is the approach taken by Tony Hoare in his original paper. For simplicity, we assume there are no equal keys (equal keys only reduce the cost). The recurrence for the number of comparisons Y(n) done by quicksort is then

$Y(n) = Y(X(n)) + Y(n - X(n) - 1) + n - 1,$

where the random variable X(n) is the size of the sequence a₁ (we write X(n) instead of X_n to avoid double subscripts). We can now write an equation for the expectation of Y(n):

$E[Y(n)] = E[Y(X(n)) + Y(n - X(n) - 1) + n - 1]$
$= E[Y(X(n))] + E[Y(n - X(n) - 1)] + n - 1$
$= \frac{1}{n} \sum_{i=0}^{n-1} \left( E[Y(i)] + E[Y(n - i - 1)] \right) + n - 1,$

where the last equality holds because all positions of the pivot are equally likely, so we can average over them. This recurrence can be solved by guessing the answer and verifying it by substitution; it gives the same result as our previous method. We leave this as an exercise.
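The recurrence can also be evaluated numerically, as a sanity check on the closed-form bound. The Python sketch below (our own) tabulates E[Y(n)] bottom-up with a running prefix sum and compares the result against the 2nH_n bound from the counting argument.

    from math import log

    # E[Y(n)] = (2/n) * sum_{i<n} E[Y(i)] + n - 1, by symmetry of the two sums.
    N = 2000
    EY = [0.0] * (N + 1)
    prefix = 0.0                 # running sum of EY[0..n-1]
    for n in range(1, N + 1):
        EY[n] = 2.0 * prefix / n + n - 1
        prefix += EY[n]

    H = sum(1.0 / k for k in range(1, N + 1))
    print("E[Y(N)] by recurrence:", EY[N])
    print("2 N H_N upper bound  :", 2 * N * H)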
We can use a similar strategy to analyze span. Recall that in randomized quicksort, at each recursive call we partition the input sequence a of length n into three subsequences a₁, a₂, and a₃, such that elements in the subsequences are less than, equal to, and greater than the pivot, respectively. Let the random variable X(n) = max(|a₁|, |a₃|) be the size of the larger side; the span of quicksort is determined by the sizes of these larger subsequences. For ease of analysis, we assume that |a₂| = 0, as more equal elements only decrease the span. Since the partitioning uses filter, we have the following recurrence for the span on an input of size n:

$S(n) = S(X(n)) + O(\log n).$

For the analysis, we condition the span on the random variable denoting the size of the larger half and apply the total expectation theorem.
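As a final sanity check on the high-probability depth bound, the following Python sketch (ours; the parameters are arbitrary) estimates how often the pivot tree of a random run is deeper than 10 lg n. Unlike the algorithm itself, it recurses on both sides so that it can measure the full tree depth.

    import random
    from math import log2

    def pivot_tree_depth(keys):
        # Depth of the pivot tree built by random-pivot quicksort.
        if len(keys) <= 1:
            return len(keys)
        p = random.choice(keys)
        left = [x for x in keys if x < p]
        right = [x for x in keys if x > p]
        return 1 + max(pivot_tree_depth(left), pivot_tree_depth(right))

    n, trials = 1000, 200
    deep = sum(pivot_tree_depth(list(range(n))) > 10 * log2(n)
               for _ in range(trials))
    print("10 lg n bound        :", 10 * log2(n))   # about 99.7
    print("fraction exceeding it:", deep / trials)  # essentially always 0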