RANDOMIZED ALGORITHM
A randomized algorithm is one that uses random numbers to decide what to do next in its logic. Randomness is typically added to a standard algorithm to reduce its running time (time complexity) or the memory it uses (space complexity).
The area of randomized algorithms has grown rapidly over the last few decades. During this time, randomized algorithms evolved from a tool in computational number theory to a widely used technique in a variety of applications. Simplicity and speed are the two advantages that have supported this development: for many applications, a randomized algorithm is the simplest or the fastest algorithm available.
A randomized algorithm M with domain A and discrete range B can be viewed as a mapping
M: A -> Δ(B), where Δ(B) is the set of probability distributions over B. On input a ∈ A, the algorithm M outputs M(a) = b with probability (M(a))_b, for each b ∈ B. The probability space is over the coin flips of the algorithm M.
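As a small, purely illustrative instance of this definition, the Python sketch below defines a toy randomized algorithm M over a three-element range B and estimates the output distribution (M(a))_b empirically; the function, the probabilities, and the input value are our own choices, not taken from the text above.

import random
from collections import Counter

def M(a):
    # A toy randomized algorithm: on any input a, output "b1" with
    # probability 1/2, "b2" with probability 1/4, and "b3" otherwise.
    r = random.random()              # one "coin flip" in [0, 1)
    if r < 0.5:
        return "b1"
    elif r < 0.75:
        return "b2"
    return "b3"

# Estimate the distribution (M(a))_b by running M many times on the same input.
counts = Counter(M(7) for _ in range(100000))
for b, c in sorted(counts.items()):
    print(b, c / 100000)             # close to 0.5, 0.25, 0.25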
How does randomization work:
Computers can create genuinely random numbers by sampling unpredictable physical data, such as mouse movements or fan noise, and extracting random bits from it. This source of unpredictability is called entropy.
They can also use an algorithm to generate “pseudorandom” numbers, which look random even though they are not.
Generating a Pseudo-Random number:
• Accept a seed (or key) as an initial input number.
• Apply a series of mathematical operations to that seed to produce the result.
• Use the resulting number as the seed for the next iteration.
• Repeat the procedure to create the illusion of randomness (a minimal sketch follows this list).
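The Python sketch below follows these steps with a linear congruential generator (LCG), one simple way to realize the seed-and-iterate recipe; the multiplier, increment, and modulus are common textbook constants, and the class name is our own.

class LCG:
    # A minimal linear congruential pseudo-random number generator:
    # each output becomes the seed (state) for the next iteration.
    def __init__(self, seed):
        self.state = seed                            # step 1: accept a seed
        self.a, self.c, self.m = 1103515245, 12345, 2**31

    def next(self):
        # step 2: apply mathematical operations to the current seed
        self.state = (self.a * self.state + self.c) % self.m
        # step 3: the result is kept as the seed for the next call
        return self.state

rng = LCG(seed=42)
print([rng.next() % 100 for _ in range(5)])          # step 4: repeat for more values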
Importance of randomized algorithms:
• Randomized algorithms are frequently faster, both in terms of worst-case asymptotic performance and numerical implementation.
• Randomized algorithms are also often simpler and easier to understand than their deterministic counterparts.
• Their outputs are often easier to interpret, which is useful in applications where the analyst’s time is more valuable than computing time.
• Randomized numerical algorithms can also be tuned to take better advantage of modern computing architectures; they are used, for example, to solve problems involving computations on large data matrices.
Now, let’s illustrate this with a problem:
Problem: Two friends each have an N-bit file on their computers and want to check whether they have the same file. To do so, one of them must transmit some information to the other over the network, and they want to keep this information as small as possible because of bandwidth constraints.
Deterministic solution: To solve this task deterministically, one friend has to send the entire file to the other, who then compares the two files and returns the result. But what if the file is 10 GB in size? Sending such a large file over the network for this seemingly simple task is impractical.
Randomized solution: View the file as a sequence of bits, interpret it as a decimal number (D), reduce it modulo some large prime number (P), and send this residue (M) along with the prime to the other friend. The other friend computes the number corresponding to his own file, reduces it modulo P, and compares it to M. If the residues match, they can say with very high probability that the files are the same; if they do not match, the files are certainly different. The amount of data exchanged is only O(log N) bits.
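A minimal Python sketch of this fingerprinting idea is shown below, assuming both friends already agree on (or transmit) the same large prime P; the particular prime, file contents, and function name are illustrative choices, not part of the original description.

def fingerprint(file_bytes, p):
    # Interpret the file's bits as one large number D and return D mod p.
    d = int.from_bytes(file_bytes, byteorder="big")
    return d % p

P = 2**61 - 1                              # a large (Mersenne) prime known to both friends

file_a = b"the quick brown fox"            # friend A's file
file_b = b"the quick brown fox"            # friend B's file

m = fingerprint(file_a, P)                 # friend A sends only m (and P) over the network
print(fingerprint(file_b, P) == m)         # friend B compares; a mismatch proves the files differ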
Randomized algorithms can be classified into two main types:
1. A Las Vegas algorithm is a randomized algorithm that always produces a correct result or reports failure. However, the runtime of a Las Vegas algorithm is a random variable: it varies from run to run, even on the same input.
Randomized Quick Sort, for example, always returns a correctly sorted list. On average it takes O(n log n) time, but in the worst case it can take as long as O(n²).
Example:
Randomized quick sort:
QuickSort(A[low…high])
{
    if (low < high)
    {
        idx = Partition(A[low…high]);   // returns the final index of the pivot
        QuickSort(A[low…idx-1]);
        QuickSort(A[idx+1…high]);
    }
}
Partition()
There are two versions, deterministic and randomized. In the deterministic version, the pivot is a fixed element of the subarray (for example, the first or the last element), while in the randomized version we pick an element of the subarray uniformly at random and use it as the pivot, as sketched below.
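A runnable Python sketch of the randomized version is given below; the in-place Lomuto-style partition and the function names are our own choices for illustration.

import random

def randomized_partition(a, low, high):
    # Pick a random pivot, move it to the end, then partition (Lomuto scheme).
    pivot_idx = random.randint(low, high)            # randomized pivot choice
    a[pivot_idx], a[high] = a[high], a[pivot_idx]
    pivot = a[high]
    i = low - 1
    for j in range(low, high):
        if a[j] <= pivot:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[high] = a[high], a[i + 1]
    return i + 1                                     # the pivot's final index

def randomized_quicksort(a, low=0, high=None):
    if high is None:
        high = len(a) - 1
    if low < high:
        idx = randomized_partition(a, low, high)
        randomized_quicksort(a, low, idx - 1)
        randomized_quicksort(a, idx + 1, high)

data = [9, 3, 7, 1, 8, 2, 5]
randomized_quicksort(data)
print(data)                                          # [1, 2, 3, 5, 7, 8, 9]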
2. A Monte Carlo algorithm is a randomized algorithm whose output has a certain bounded probability of being incorrect.
For example, suppose we use a non-perfect hash function to compute hash values for two separate strings and then compare the hash values to decide whether the strings are equal. This behaves like a Monte Carlo algorithm: we almost always get the correct answer, but there is a small chance that two different strings have the same hash value (a collision).
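The Python sketch below illustrates this hashing-based equality test with a deliberately weak (non-perfect) hash, namely a small polynomial rolling hash over a tiny modulus, so that collisions, and hence wrong answers, are actually possible; the hash parameters are our own.

def weak_hash(s, base=31, mod=1009):
    # A deliberately weak polynomial rolling hash over a small modulus,
    # so that distinct strings can collide.
    h = 0
    for ch in s:
        h = (h * base + ord(ch)) % mod
    return h

def probably_equal(s1, s2):
    # Monte Carlo style equality test: a True answer may be wrong (a hash
    # collision), while unequal hashes prove the strings differ.
    return weak_hash(s1) == weak_hash(s2)

print(probably_equal("hello", "hello"))   # True, and the strings really are equal
print(probably_equal("hello", "world"))   # almost certainly False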
Example:
Randomized MINCUT:
MinCut(G, w):
    w(minCut) ← ∞
    while |V| > 1
        s-t-phaseCut ← MinCutPhase(G, w)
        if w(s-t-phaseCut) < w(minCut)
            minCut ← s-t-phaseCut
        Merge(G, s, t)
    return minCut
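The pseudocode above relies on a MinCutPhase helper that is not spelled out here. As a self-contained, runnable illustration of a Monte Carlo minimum-cut computation, the Python sketch below implements Karger’s random contraction algorithm instead, a classic randomized min-cut method; the edge-list representation, function names, and number of repetitions are our own choices.

import random

def karger_min_cut(edges, num_runs=100):
    # Karger's random contraction algorithm (Monte Carlo): repeatedly
    # contract a random edge until two super-vertices remain; the edges
    # between them form a candidate cut.  Repeating the whole procedure
    # many times makes returning a non-minimum cut unlikely.
    best = float("inf")
    for _ in range(num_runs):
        label = {v: v for edge in edges for v in edge}   # vertex -> super-vertex

        def find(v):
            while label[v] != v:
                v = label[v]
            return v

        alive = set(label)
        while len(alive) > 2:
            u, v = random.choice(edges)
            ru, rv = find(u), find(v)
            if ru == rv:
                continue               # both endpoints already merged
            label[rv] = ru             # contract the edge (merge rv into ru)
            alive.discard(rv)
        cut = sum(1 for u, v in edges if find(u) != find(v))
        best = min(best, cut)
    return best

# Example: a 4-cycle with one chord; its minimum cut has size 2.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
print(karger_min_cut(edges))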
A comparison of the Las Vegas and Monte Carlo algorithms with an example:
Let us discuss whether any Las Vegas algorithm can be converted into a Monte Carlo algorithm:
A Monte Carlo algorithm is an algorithm with deterministic runtime whose (nondeterministic) result has a probability ϵ < 1/2 of being wrong.
A Las Vegas algorithm is an algorithm with nondeterministic runtime that always produces the right output and has a known finite expected runtime.
Hence, we can transform a Las Vegas algorithm A with expected runtime P(n) into a Monte Carlo algorithm A′ by running A for at most time P(n)/ϵ. If A has not returned an output within this time, we terminate it and return an arbitrary output. Clearly, A′ has a deterministic runtime bound of P(n)/ϵ. Since A’s runtime T is a nonnegative random variable with expectation P(n), Markov’s inequality bounds the probability of cutting A off early (and thus returning a possibly incorrect value): Pr[T ≥ P(n)/ϵ] ≤ P(n) / (P(n)/ϵ) = ϵ.
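A minimal Python sketch of this conversion is shown below. It assumes the Las Vegas algorithm is given as a single probing step that either succeeds or reports failure, so that the number of steps plays the role of the runtime; the budget expected_steps/eps corresponds to P(n)/ϵ, and all names are illustrative.

import random

def find_a_step(arr):
    # One step of a toy Las Vegas algorithm: probe a random index and
    # return it if it holds an 'a', otherwise report failure with None.
    i = random.randrange(len(arr))
    return i if arr[i] == 'a' else None

def las_vegas_to_monte_carlo(step, expected_steps, eps, fallback):
    # Run the Las Vegas step at most expected_steps / eps times.  By Markov's
    # inequality, the probability of exhausting the budget (and returning the
    # possibly wrong fallback value) is at most eps.
    budget = int(expected_steps / eps)
    for _ in range(budget):
        result = step()
        if result is not None:
            return result          # a genuine, certainly correct answer
    return fallback                # deterministic runtime, but possibly wrong

arr = ['a', 'b'] * 8               # half 'a's: an 'a' is found after 2 steps on average
print(las_vegas_to_monte_carlo(lambda: find_a_step(arr),
                               expected_steps=2, eps=0.01, fallback=-1))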
Example:
Input: An array of n ≥ 2 elements, half of which are ‘a’s and the other half ‘b’s.
Output: Find an ‘a’ in the array.
The algorithm is shown in two versions: a Las Vegas algorithm and a Monte Carlo algorithm.
Las Vegas algorithm:
findingA_LV(array A, n)
begin
    repeat
        Randomly select one element out of n elements.
    until ‘a’ is found
end
This algorithm succeeds with probability 1. The number of iterations varies and can be arbitrarily large, but since each probe hits an ‘a’ with probability 1/2, the expected number of iterations is
1·(1/2) + 2·(1/2)² + 3·(1/2)³ + … = 2.
Since this is a constant, the expected runtime over many calls is Θ(1).
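A runnable Python sketch of this Las Vegas version is given below; the function name mirrors the pseudocode, and the test array is our own.

import random

def findingA_LV(A, n):
    # Las Vegas: keep probing random positions until an 'a' is found.
    # The answer is always correct, but the number of iterations is random.
    iterations = 0
    while True:
        iterations += 1
        i = random.randrange(n)
        if A[i] == 'a':
            return i, iterations

A = ['a'] * 8 + ['b'] * 8
random.shuffle(A)
index, iters = findingA_LV(A, len(A))
print(index, iters)                # iters averages about 2 over many runs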
Monte Carlo algorithm:
findingA_MC(array A, n, k)
begin
    i = 0
    repeat
        Randomly select one element out of n elements.
        i = i + 1
    until i = k or ‘a’ is found
end
The algorithm succeeds if an ‘a’ is found; otherwise it fails. Since each probe hits an ‘a’ with probability 1/2, the chance of finding an ‘a’ within k iterations is 1 − (1/2)^k.
This algorithm does not guarantee success, but its runtime is bounded: the number of iterations is always less than or equal to k. Taking k to be constant, both the expected and worst-case runtime are Θ(1).
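A matching Python sketch of the Monte Carlo version is shown below; again the function name follows the pseudocode and the test data is our own.

import random

def findingA_MC(A, n, k):
    # Monte Carlo: probe at most k random positions.  The runtime is bounded
    # by k, but the search may fail (return None) even though an 'a' exists.
    for _ in range(k):
        i = random.randrange(n)
        if A[i] == 'a':
            return i
    return None                    # failure: no 'a' found within k probes

A = ['a'] * 8 + ['b'] * 8
random.shuffle(A)
print(findingA_MC(A, len(A), k=10))   # fails with probability (1/2)**10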