Algorithms/Randomization

When deterministic algorithms are driven to their limits in the attempt to solve hard problems, a useful technique for speeding up the computation is randomization. A randomized algorithm has access to a random source, which can be imagined as tossing coins during the computation; depending on the outcome of a toss, the algorithm may split its computation into different paths.

There are two main types of randomized algorithms: Las Vegas algorithms and Monte Carlo algorithms. A Las Vegas algorithm may use randomness to speed up its computation, but it must always return the correct answer for its input. Monte Carlo algorithms do not have the latter restriction; that is, they are allowed to return wrong answers. However, a wrong answer must only occur with small probability, otherwise the Monte Carlo algorithm would not be of any use.
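
To make the distinction concrete, here is a small Python sketch; the membership-testing task and the function names are illustrative, not part of the original text. The Las Vegas routine is always correct but its running time is random, while the Monte Carlo routine always runs in bounded time but may answer incorrectly with small probability.

import random

# Las Vegas: the answer is always correct; only the running time is random.
def las_vegas_find(vals, target):
    """Return an index i with vals[i] == target, assuming target occurs in vals."""
    while True:
        i = random.randrange(len(vals))
        if vals[i] == target:
            return i                     # guaranteed correct when it returns

# Monte Carlo: always fast (k probes), but a False answer may be wrong.
def monte_carlo_contains(vals, target, k=20):
    """Sample k random entries; report whether target was seen.

    If target makes up a constant fraction of vals, the probability of a
    wrong False answer shrinks exponentially in k.
    """
    return any(vals[random.randrange(len(vals))] == target for _ in range(k))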

Many approximation algorithms use randomization.

Ordered Statistics

Before covering randomized techniques, we'll start with a deterministic problem whose solution leads to a problem that utilizes randomization. Suppose you have an unsorted array of values and you want to find

  • the maximum value,
  • the minimum value, and
  • the median value.

In the immortal words of one of our former computer science professors, "How can you do?"

find-max

First, it's relatively straightforward to find the largest element:

// find-max -- returns the maximum element
function find-max(array vals[1..n]): element
  let result := vals[1]
  for i from 2 to n:
    result := max(result, vals[i])
  repeat
  
  return result
end

An assignment of −∞ to result would work as well, but this is a useless call to the max function, since the first element compared would simply get set as result. By initializing result to vals[1], the function requires only n − 1 comparisons. (Moreover, in languages capable of metaprogramming, the data type may not be strictly numerical, and there might be no good way of assigning −∞; using vals[1] is type-safe.)

A similar routine to find the minimum element can be done by calling the min function instead of the max function.
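
For comparison, here is the same routine as a Python sketch (an illustrative translation of the pseudocode; in practice the built-in max does the job):

def find_max(vals):
    """Return the maximum element of a non-empty sequence.

    Initializing result to vals[0], rather than to -infinity, is type-safe
    and uses exactly len(vals) - 1 comparisons.
    """
    result = vals[0]
    for v in vals[1:]:
        if v > result:
            result = v
    return result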

find-min-max

But now suppose you want to find the min and the max at the same time; here's one solution:

// find-min-max -- returns the minimum and maximum element of the given array
function find-min-max(array vals): pair
  return pair {find-min(vals), find-max(vals)}
end

Because find-max and find-min both make n − 1 calls to the max or min functions (when vals has n elements), the total number of comparisons made in find-min-max is 2n − 2.

However, some redundant comparisons are being made. These redundancies can be removed by "weaving" together the min and max functions:

// find-min-max -- returns the minimum and maximum element of the given array
function find-min-max(array vals[1..n]): pair
  let min := ∞
  let max := -∞
  
  if n is odd:
    min := max := vals[1]
    vals := vals[2,..,n]          // we can now assume n is even
    n := n - 1
  fi
  
  for i:=1 to n by 2:             // consider pairs of values in vals
    if vals[i] < vals[i + 1]:
      let a := vals[i]
      let b := vals[i + 1]
    else:
      let a := vals[i + 1]
      let b := vals[i]            // invariant: a <= b
    fi
    
    if a < min: min := a fi
    if b > max: max := b fi
  repeat
  
  return pair {min, max}
end

Here, we only loop n/2 times instead of n times, but for each iteration we make three comparisons. Thus, the number of comparisons made is (3/2)n = 1.5n, only about three-quarters of the 2n − 2 comparisons made by the original algorithm.

Only three comparisons need to be made instead of four because, by construction, it's always the case that a ≤ b. (In the first branch of the "if", we actually know more specifically that a < b, but in the else branch we can only conclude that a ≤ b.) This property is utilized by noting that a doesn't need to be compared with the current maximum, because b is already greater than or equal to a, and similarly, b doesn't need to be compared with the current minimum, because a is already less than or equal to b.
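
Here is the weaving technique as a Python sketch (an illustrative translation of the pseudocode above):

def find_min_max(vals):
    """Return (min, max) of a non-empty sequence using about 1.5n comparisons."""
    n = len(vals)
    if n % 2 == 1:                    # peel off one element so the rest pairs up
        lo = hi = vals[0]
        start = 1
    else:
        lo, hi = float('inf'), float('-inf')
        start = 0
    for i in range(start, n - 1, 2):  # consider pairs of values
        a, b = vals[i], vals[i + 1]
        if a > b:
            a, b = b, a               # invariant: a <= b
        if a < lo:
            lo = a
        if b > hi:
            hi = b
    return lo, hi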

In software engineering, there is a struggle between using libraries and writing customized algorithms. In this case, the min and max functions were not reused, in order to get a faster find-min-max routine. Such an operation would probably not be the bottleneck in a real-life program; however, if testing reveals the routine should be faster, such an approach should be taken. Typically, the solution that reuses libraries is better overall than writing customized solutions. Techniques such as open implementation and aspect-oriented programming may help manage this tension to get the best of both worlds, but regardless it's a useful trade-off to recognize.

find-median

Finally, we need to consider how to find the median value. One approach is to sort the array and then extract the median from position vals[n/2]:

// find-median -- returns the median element of vals
function find-median(array vals[1..n]): element
  assert (n > 0)
  
  sort(vals)
  return vals[n / 2]
end

If our values are not numbers from a small enough range (or otherwise cannot be sorted by a radix sort), the sort above is going to require O(n log n) steps.

However, it is possible to extract the kth ordered statistic in O(n) time. The key is eliminating the sort: we don't actually require the entire array to be sorted in order to find the median, so there is some waste in sorting the entire array first. One technique we'll use to accomplish this is randomness.

Before presenting a non-sorting find-median function, we introduce a divide and conquer-style operation known as partitioning. What we want is a routine that finds a random element in the array and then partitions the array into three parts:

  1. elements that are less than or equal to the random element;
  2. elements that are equal to the random element; and
  3. elements that are greater than or equal to the random element.

These three sections are denoted by two integers: j and i. The partitioning is performed "in place" in the array:

// partition -- break the array into three partitions based on a randomly picked element
function partition(array vals): pair{j, i}

Note that when the random element picked is actually represented three or more times in the array it's possible for entries in all three partitions to have the same value as the random element. While this operation may not sound very useful, it has a powerful property that can be exploited: When the partition operation completes, the randomly picked element will be in the same position in the array as it would be if the array were fully sorted!

This property might not sound so powerful, but recall the optimization for the find-min-max function: we noticed that by picking elements from the array in pairs and comparing them to each other first, we could reduce the total number of comparisons needed (because the current min and max values need to be compared with only one value each, and not two). A similar concept is used here.

While the code for partition is not magical, it has some tricky boundary cases; one way to implement it is with a three-way, "Dutch national flag" style sweep:

// partition -- break the array into three ordered partitions from a random element
function partition(array vals): pair{j, i}
  let m := 0
  let n := vals.length - 1
  let irand := random(m, n)             // returns any index from m to n
  let x := vals[irand]                  // the randomly picked pivot value
  let i := m - 1, j := n + 1, k := m
  while k < j:                          // vals[0..i] < x, vals[i+1..k-1] = x, vals[j..n] > x
    if vals[k] < x:
      i := i + 1; swap(vals[i], vals[k]); k := k + 1
    else-if vals[k] > x:
      j := j - 1; swap(vals[j], vals[k])    // re-examine the swapped-in value
    else:
      k := k + 1                        // vals[k] = x; leave it in the middle
    fi
  repeat
  return pair {j, i}
end

We can use partition as a subroutine for a general find operation:

// find -- moves elements in vals such that location k holds the value it would when sorted
function find(array vals, integer k)
  assert (0 <= k < vals.length)        // k must be a valid index
  if vals.length <= 1:
    return
  fi
  
  let pair (j, i) := partition(vals)
  if k <= i:
    find(vals[0,..,i], k)                      // the kth value lies in the left partition
  else-if j <= k:
    find(vals[j,..,vals.length - 1], k - j)    // the kth value lies in the right partition
  fi
  // otherwise i < k < j, and vals[k] is already where it would be if vals were sorted
end

Which leads us to the punch-line:

// find-median -- returns the median element of vals
function find-median(array vals): element
  assert (vals.length > 0)
  
  let median_index := vals.length / 2   // integer division
  find(vals, median_index)
  return vals[median_index]
end
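
For readers who want something executable, here is a Python sketch of the whole pipeline: a randomized three-way partition, an iterative version of find, and find_median on top. It is an illustrative translation of the pseudocode above, not code from a library.

import random

def partition(vals, lo, hi):
    """Three-way partition of vals[lo..hi] around a random pivot.

    Returns (i, j) such that vals[lo..i] < pivot, vals[i+1..j-1] == pivot
    (and already in final sorted position), and vals[j..hi] > pivot.
    """
    x = vals[random.randint(lo, hi)]       # the randomly picked pivot value
    i, j, k = lo - 1, hi + 1, lo
    while k < j:
        if vals[k] < x:
            i += 1
            vals[i], vals[k] = vals[k], vals[i]
            k += 1
        elif vals[k] > x:
            j -= 1
            vals[j], vals[k] = vals[k], vals[j]   # re-examine the swapped-in value
        else:
            k += 1
    return i, j

def find(vals, k, lo=0, hi=None):
    """Rearrange vals so that vals[k] holds the value it would hold if sorted."""
    if hi is None:
        hi = len(vals) - 1
    while lo < hi:
        i, j = partition(vals, lo, hi)
        if k <= i:
            hi = i          # the kth value lies in the left partition
        elif k >= j:
            lo = j          # the kth value lies in the right partition
        else:
            return          # vals[k] equals the pivot and is already in place

def find_median(vals):
    mid = len(vals) // 2
    find(vals, mid)
    return vals[mid]

For example, find_median([3, 1, 4, 1, 5]) rearranges the list in place and returns 3.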

One consideration that might cross your mind is, "Is the random call really necessary?" For example, instead of picking a random pivot, we could always pick the middle element. Because our algorithm works on all possible arrays, one might conclude that, averaged over every possible input, the running time is the same as in our analysis that used the random function: over the set of all possible arrays, the middle element is going to be just as "random" as anything else. But there's a pitfall in this reasoning: typically, the input to an algorithm in a program isn't random at all. For example, the input has a higher probability of being sorted than chance alone would suggest. Likewise, because it is real data from real programs, the data might have other patterns in it that could lead to suboptimal results.

To put this another way: for the randomized median finding algorithm, there is a very small probability it will run suboptimally, independent of what the input is; while for a deterministic algorithm that just picks the middle element, there is a greater chance it will run poorly on some of the most frequent input types it will receive. This leads us to the following guideline:

Randomization Guideline:
If your algorithm depends upon randomness, be sure you introduce the randomness yourself instead of depending upon the data to be random.

Note that there are "derandomization" techniques that can take an average-case fast randomized algorithm and turn it into a fully deterministic algorithm. Sometimes the overhead of derandomization is so great that only very large datasets yield any gains. Nevertheless, derandomization in itself has theoretical value.

The randomized find algorithm was invented by C. A. R. "Tony" Hoare. While Hoare is an important figure in computer science, he may be best known in general circles for his quicksort algorithm, which we discuss in the next section.

Quicksort

The median-finding partitioning algorithm in the previous section is actually very close to the implementation of a full-blown sorting algorithm.
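
To see just how close, here is a Python sketch of a randomized quicksort built on the three-way partition function from the find-median sketch above (that reuse is an assumption of this sketch, not something the original text provides): after partitioning, the middle block is already in its sorted position, so we only recurse on the two outer blocks.

def quicksort(vals, lo=0, hi=None):
    """Sort vals in place, reusing the three-way partition defined earlier."""
    if hi is None:
        hi = len(vals) - 1
    if lo >= hi:
        return                       # zero or one element: already sorted
    i, j = partition(vals, lo, hi)   # middle block is already in place
    quicksort(vals, lo, i)           # sort the values less than the pivot
    quicksort(vals, j, hi)           # sort the values greater than the pivot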

Shuffling an Array

To shuffle an array of n elements: pick a random position from 1 to n and swap that element with the first; then pick a random position from 2 to n and swap that element with the second; and so on. This procedure is commonly known as the Fisher-Yates (or Knuth) shuffle.
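
In Python (using 0-based indices), the procedure might look like this sketch:

import random

def shuffle(vals):
    """Shuffle vals in place so that every permutation is equally likely."""
    n = len(vals)
    for i in range(n - 1):
        j = random.randrange(i, n)     # pick from positions i..n-1, inclusive
        vals[i], vals[j] = vals[j], vals[i]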

Equal Multivariate Polynomials

As of now, there is no known deterministic polynomial-time algorithm for deciding whether two multivariate polynomials are equal, but there is a randomized polynomial-time solution: evaluate both polynomials at randomly chosen points and compare the results. (The canonical example of such a problem used to be IsPrime, but a deterministic polynomial-time solution for primality has since been found.)
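
The following Python sketch shows the idea as a Monte Carlo test; the polynomials are taken to be black-box callables, and the sampling range and trial count are illustrative choices. By the Schwartz-Zippel lemma, two distinct polynomials of total degree at most d agree at a point drawn uniformly from a set of size S with probability at most d/S, so a handful of trials makes the error probability negligible.

import random

def probably_equal(p, q, num_vars, degree, trials=20):
    """Monte Carlo test of whether p and q define the same polynomial.

    If p == q this always returns True; if p != q, each trial wrongly
    agrees with probability at most degree/sample_range, so the overall
    error probability is at most (degree/sample_range) ** trials.
    """
    sample_range = 100 * max(degree, 1)    # much larger than the degree
    for _ in range(trials):
        point = [random.randrange(sample_range) for _ in range(num_vars)]
        if p(*point) != q(*point):
            return False                   # found a witness: definitely not equal
    return True

For instance, probably_equal(lambda x, y: (x + y)**2, lambda x, y: x*x + 2*x*y + y*y, num_vars=2, degree=2) returns True.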

Skip Lists

The point of skip lists is to show how randomization can sometimes make a data structure easier to understand and implement than the comparatively complex balanced trees. A skip list keeps its elements in a sorted linked list, but each element is also promoted, with probability 1/2 per level, into higher-level "express" lists that let a search skip over long runs of elements; searches and insertions take O(log n) expected time without any rebalancing machinery.
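
As a sketch of how little code the structure needs (a Python illustration; the class and method names are illustrative, and a complete implementation would also support deletion):

import random

class SkipNode:
    def __init__(self, key, height):
        self.key = key
        self.next = [None] * height            # next[l]: successor at level l

class SkipList:
    MAX_HEIGHT = 16

    def __init__(self):
        self.head = SkipNode(None, SkipList.MAX_HEIGHT)   # sentinel, holds no key

    def _random_height(self):
        # Toss coins: promote to each higher level with probability 1/2.
        height = 1
        while height < SkipList.MAX_HEIGHT and random.random() < 0.5:
            height += 1
        return height

    def contains(self, key):
        node = self.head
        for level in reversed(range(SkipList.MAX_HEIGHT)):
            while node.next[level] is not None and node.next[level].key < key:
                node = node.next[level]        # skip ahead on the express lane
        node = node.next[0]
        return node is not None and node.key == key

    def insert(self, key):
        # Record, at every level, the last node before the insertion point.
        update = [None] * SkipList.MAX_HEIGHT
        node = self.head
        for level in reversed(range(SkipList.MAX_HEIGHT)):
            while node.next[level] is not None and node.next[level].key < key:
                node = node.next[level]
            update[level] = node
        new = SkipNode(key, self._random_height())
        for level in range(len(new.next)):
            new.next[level] = update[level].next[level]
            update[level].next[level] = new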

Derandomization

Deterministic versions of quicksort exist that perform as well as randomized quicksort in the average case and are guaranteed to perform at least that well in all cases; best of all, no randomization is needed. Some perspective on using randomization is worthwhile here, too: some randomized algorithms give you better confidence probabilities than the actual hardware itself! (For example, sunspots can randomly flip bits in hardware, causing failure, which is a risk we take quite often.)

The main idea is as follows: look at the array in blocks of 5 elements, pick the median of each block (O(1) to pick), and put all the medians into an array (O(n)); then recursively pick the medians of that array, repeating until fewer than 5 elements remain. This recursive median construction takes time T(n) = T(n/5) + O(n), which by the master theorem is O(n). Thus, in O(n) time we can find the pivot. This version of quicksort needs no random source and never performs poorly; what remains to be shown is that the element picked out is sufficiently good as a pivot that the sort is still O(n log n) no matter what the input is.
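
A Python sketch of just this recursive pivot construction (not the full deterministic quicksort):

def median_of_medians(vals):
    """Deterministically pick a pivot value from a non-empty list.

    Running time satisfies T(n) = T(n/5) + O(n), which is O(n) overall.
    """
    if len(vals) < 5:
        return sorted(vals)[len(vals) // 2]
    # Median of each 5-element block: constant work per block, O(n) total.
    blocks = (vals[i:i + 5] for i in range(0, len(vals), 5))
    medians = [sorted(block)[len(block) // 2] for block in blocks]
    return median_of_medians(medians)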

Exercises

  1. Write a find-min function and run it on several different inputs to demonstrate its correctness.


