
Choosing the number of iterations

We have established that the state vector of the register $\mathsf{Q}$ in Grover's algorithm remains in the two-dimensional subspace spanned by $\vert A_0\rangle$ and $\vert A_1\rangle$ once the initialization step has been performed.

The goal is to find an element $x\in A_1$, and this goal will be accomplished if we can obtain the state $\vert A_1\rangle$ — for if we measure this state, we're guaranteed to get a measurement outcome $x\in A_1$. Given that the state of $\mathsf{Q}$ after $t$ iterations in step 2 is

G^t \vert u \rangle = \cos\bigl((2t + 1)\theta\bigr) \vert A_0\rangle + \sin\bigl((2t + 1)\theta\bigr) \vert A_1\rangle,

we should choose $t$ so that

\langle A_1 \vert G^t \vert u \rangle = \sin\bigl((2t + 1)\theta\bigr)

is as close to $1$ as possible in absolute value, to maximize the probability of obtaining $x\in A_1$ from the measurement. For any angle $\theta \in (0,2\pi)$, the value $\sin((2t + 1)\theta)$ oscillates as $t$ increases, though it is not necessarily periodic — there's no guarantee that we'll ever get the same value twice.

Naturally, in addition to making the probability of obtaining an element $x\in A_1$ from the measurement large, we would also like to choose $t$ to be as small as possible, because $t$ applications of the operation $G$ require $t$ queries to the function $f$. Because we're aiming to make $\sin((2t + 1)\theta)$ close to $1$ in absolute value, a natural way to do this is to choose $t$ so that

(2t + 1) \theta \approx \frac{\pi}{2}.

Solving for $t$ yields

t \approx \frac{\pi}{4\theta} - \frac{1}{2}.

Of course, $t$ must be an integer, so we won't necessarily be able to hit this value exactly — but what we can do is to take the closest integer to this value, which is

t = \Bigl\lfloor \frac{\pi}{4\theta} \Bigr\rfloor.

This is the recommended number of iterations for Grover's algorithm. As we proceed with the analysis, we'll see that the closeness of this integer to the target value naturally affects the performance of the algorithm.

(As an aside, if the target value $\pi/(4\theta) - 1/2$ happens to be exactly halfway between two integers, this expression for $t$ is what we get by rounding up. We could alternatively round down, which makes sense because it means one fewer query — but this point is secondary and unimportant for the purposes of this lesson.)

Recalling that the value of the angle $\theta$ is given by the formula

\theta = \sin^{-1}\biggl(\sqrt{\frac{\vert A_1\vert}{N}}\biggr),

we see that the recommended number of iterations $t$ depends on the number of strings in $A_1$. This presents a challenge if we don't know how many solutions we have, as we'll discuss later.
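As a concrete illustration, here's a minimal Python sketch that computes $\theta$ and the recommended $t$ from $N$ and the number of solutions $\vert A_1\vert$ (the function name choose_iterations is ours, chosen for this lesson rather than taken from any library):

```python
import math

def choose_iterations(N: int, s: int) -> int:
    """Recommended number of Grover iterations for N items and s solutions."""
    theta = math.asin(math.sqrt(s / N))       # theta = arcsin(sqrt(s / N))
    return math.floor(math.pi / (4 * theta))  # t = floor(pi / (4 theta))

print(choose_iterations(1024, 1))  # prints 25
```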


First, let's focus on the situation in which there's a single string $x$ such that $f(x)=1$. Another way to say this is that we're considering an instance of the Unique search problem. In this case we have

\theta = \sin^{-1}\biggl( \sqrt{\frac{1}{N}} \biggr),

which can be conveniently approximated as

\theta = \sin^{-1}\biggl( \sqrt{\frac{1}{N}} \biggr) \approx \sqrt{\frac{1}{N}}

when $N$ gets large. If we substitute $\theta = 1/\sqrt{N}$ into the expression

t = \Bigl\lfloor \frac{\pi}{4\theta} \Bigr\rfloor

we obtain

t = \Bigl\lfloor \frac{\pi}{4}\sqrt{N} \Bigr\rfloor.

Recalling that $t$ is not only the number of times the operation $G$ is performed, but also the number of queries to the function $f$ required by the algorithm, we see that we're on track to obtaining an algorithm that requires $O(\sqrt{N})$ queries.

Now we'll investigate how well the recommended choice of $t$ works. The probability that the final measurement results in the unique solution can be expressed explicitly as

p(N,1) = \sin^2 \bigl( (2t + 1) \theta \bigr).

The first argument, $N$, refers to the number of items we're searching over, and the second argument, which is $1$ in this case, refers to the number of solutions. A bit later we'll use the same notation more generally, where there are multiple solutions.

Here's a table of the probabilities of success for increasing values of $N = 2^n$.

\begin{array}{ll}
N & p(N,1)\\
\hline
2 & 0.5000000000\\
4 & 1.0000000000\\
8 & 0.9453125000\\
16 & 0.9613189697\\
32 & 0.9991823155\\
64 & 0.9965856808\\
128 & 0.9956198657\\
256 & 0.9999470421\\
512 & 0.9994480262\\
1024 & 0.9994612447\\
2048 & 0.9999968478\\
4096 & 0.9999453461\\
8192 & 0.9999157752\\
16384 & 0.9999997811\\
32768 & 0.9999868295\\
65536 & 0.9999882596
\end{array}
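These values can be reproduced directly from the formulas above; here's a short Python check (a sketch using only the standard library):

```python
import math

for n in range(1, 17):
    N = 2 ** n
    theta = math.asin(math.sqrt(1 / N))     # angle for a unique solution
    t = math.floor(math.pi / (4 * theta))   # recommended number of iterations
    p = math.sin((2 * t + 1) * theta) ** 2  # success probability p(N, 1)
    print(f"{N:6d}  {p:.10f}")
```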

Notice that these probabilities are not strictly increasing. In particular, we have an interesting anomaly when $N=4$, where we get a solution with certainty. It can, however, be proved in general that

p(N,1) \geq 1 - \frac{1}{N}

for all $N$, so the probability of success goes to $1$ in the limit as $N$ becomes large, as the values above seem to suggest. This is good!

Notice, however, that even a weak bound such as $p(N,1) \geq 1/2$ establishes the utility of Grover's algorithm. For whatever measurement outcome $x$ we obtain from running the procedure, we can always check whether $f(x) = 1$ using a single query to $f$. If a single run of the procedure fails to reveal the unique string $x$ for which $f(x) = 1$ with probability at most $1/2$, then after $m$ independent runs of the procedure we will have failed to obtain this unique string $x$ with probability at most $2^{-m}$. That is, using $O(m \sqrt{N})$ queries to $f$, we'll obtain the unique solution $x$ with probability at least $1 - 2^{-m}$. Using the better bound $p(N,1) \geq 1 - 1/N$ reveals that the probability to find $x\in A_1$ using this method is actually at least $1 - N^{-m}$.
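Here's a sketch of this repeat-and-check strategy, where grover_search and f are hypothetical stand-ins for a run of the algorithm and the function being searched:

```python
def search_with_verification(grover_search, f, m: int):
    """Run the search procedure up to m times, verifying each outcome with one query to f.

    grover_search() is a hypothetical stand-in returning one measurement outcome;
    f is the Boolean function being searched.
    """
    for _ in range(m):
        x = grover_search()  # one run of Grover's algorithm: O(sqrt(N)) queries
        if f(x) == 1:        # one additional verification query
            return x
    return None              # every run failed: probability at most 2**(-m)
```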


Multiple solutions

As the number of elements in $A_1$ varies, so too does the angle $\theta$, which can have a significant effect on the algorithm's probability of success. For the sake of brevity, let's write $s = \vert A_1 \vert$ to denote the number of solutions, and as before we'll assume that $s\geq 1$.

As a motivating example, let's imagine that we have $s = 4$ solutions rather than a single solution, as we considered above. This means that

\theta = \sin^{-1}\biggl( \sqrt{\frac{4}{N}} \biggr),

which is approximately double the angle we had in the $\vert A_1 \vert = 1$ case when $N$ is large. Suppose that we didn't know any better, and selected the same value of $t$ as in the unique solution setting:

t = \Biggl\lfloor \frac{\pi}{4\sin^{-1}\bigl(1/\sqrt{N}\bigr)}\Biggr\rfloor.

The effect will be catastrophic, as the following table of probabilities reveals.

\begin{array}{ll}
N & \text{Success probability}\\
\hline
4 & 1.0000000000\\
8 & 0.5000000000\\
16 & 0.2500000000\\
32 & 0.0122070313\\
64 & 0.0203807689\\
128 & 0.0144530758\\
256 & 0.0000705058\\
512 & 0.0019310741\\
1024 & 0.0023009083\\
2048 & 0.0000077506\\
4096 & 0.0002301502\\
8192 & 0.0003439882\\
16384 & 0.0000007053\\
32768 & 0.0000533810\\
65536 & 0.0000472907
\end{array}

This time the probability of success goes to $0$ as $N$ goes to infinity. This happens because we're effectively rotating twice as fast as we did when there was a unique solution, so we end up zooming past the target $\vert A_1\rangle$ and landing near $-\vert A_0\rangle$.

However, if instead we use the recommended choice of $t$, which is

t = \Bigl\lfloor \frac{\pi}{4\theta}\Bigr\rfloor

for

\theta = \sin^{-1}\biggl( \sqrt{\frac{s}{N}} \biggr),

then the performance will be better. To be more precise, using this choice of $t$ leads to success with high probability.

\begin{array}{ll}
N & p(N,4)\\
\hline
4 & 1.0000000000\\
8 & 0.5000000000\\
16 & 1.0000000000\\
32 & 0.9453125000\\
64 & 0.9613189697\\
128 & 0.9991823155\\
256 & 0.9965856808\\
512 & 0.9956198657\\
1024 & 0.9999470421\\
2048 & 0.9994480262\\
4096 & 0.9994612447\\
8192 & 0.9999968478\\
16384 & 0.9999453461\\
32768 & 0.9999157752\\
65536 & 0.9999997811
\end{array}
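Both tables can be generated with a few lines of Python (a sketch based on the formulas above, with $s = 4$ solutions):

```python
import math

def success_probability(N: int, s: int, t: int) -> float:
    """Probability that t Grover iterations yield a solution when s of N items are solutions."""
    theta = math.asin(math.sqrt(s / N))
    return math.sin((2 * t + 1) * theta) ** 2

for n in range(2, 17):
    N = 2 ** n
    t_wrong = math.floor(math.pi / (4 * math.asin(math.sqrt(1 / N))))  # assumes s = 1
    t_right = math.floor(math.pi / (4 * math.asin(math.sqrt(4 / N))))  # uses the true s = 4
    print(N, success_probability(N, 4, t_wrong), success_probability(N, 4, t_right))
```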

Generalizing what was claimed earlier, it can be proved that

p(N,s) \geq 1 - \frac{s}{N},

where we're using the notation suggested earlier: $p(N,s)$ denotes the probability that Grover's algorithm run for $t$ iterations reveals a solution when there are $s$ solutions in total out of $N$ possibilities.

This lower bound of $1 - s/N$ on the probability of success is slightly peculiar, in that more solutions implies a worse lower bound — but under the assumption that $s$ is significantly smaller than $N$, we nevertheless conclude that the probability of success is reasonably high. As before, the mere fact that $p(N,s)$ is reasonably large implies the algorithm's usefulness.

It also happens to be the case that

p(N,s) \geq \frac{s}{N}.

This lower bound equals the probability that a string $x\in\Sigma^n$ selected uniformly at random is a solution — so Grover's algorithm always does at least as well as random guessing. (In fact, when $t=0$, Grover's algorithm is random guessing.)
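Indeed, setting $t = 0$ in the success probability formula recovers the random-guessing probability exactly:

\sin^2\bigl((2\cdot 0 + 1)\theta\bigr) = \sin^2(\theta) = \Bigl(\sqrt{\frac{s}{N}}\Bigr)^2 = \frac{s}{N}.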

Now let's take a look at the number of iterations (and hence the number of queries)

t = \Bigl\lfloor \frac{\pi}{4\theta}\Bigr\rfloor,

for

\theta = \sin^{-1}\biggl(\sqrt{\frac{s}{N}}\biggr).

For every $\alpha \in [0,1]$, it is the case that $\sin^{-1}(\alpha)\geq \alpha$, and so

\theta = \sin^{-1}\left(\sqrt{\frac{s}{N}}\right) \geq \sqrt{\frac{s}{N}}.

This implies that

t \leq \frac{\pi}{4\theta} \leq \frac{\pi}{4}\sqrt{\frac{N}{s}}.

This translates to a savings in the number of queries as $s$ grows. In particular, the number of queries required is

O\biggl(\sqrt{\frac{N}{s}}\biggr).
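For example, fixing $N = 2^{20}$ and varying $s$, a quick computation (a sketch, reusing the formulas above) shows how the iteration count shrinks as solutions become more plentiful:

```python
import math

N = 2 ** 20
for s in (1, 4, 16, 64, 256):
    theta = math.asin(math.sqrt(s / N))
    t = math.floor(math.pi / (4 * theta))
    bound = (math.pi / 4) * math.sqrt(N / s)  # upper bound (pi/4) * sqrt(N/s)
    print(f"s = {s:3d}:  t = {t:4d}  (bound {bound:.1f})")
```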

Unknown number of solutions

If the number of solutions $s = \vert A_1 \vert$ is unknown, then a different approach is required, for in this situation we have no knowledge of $s$ to inform our choice of $t$. There are, in fact, multiple approaches.

One simple approach is to choose

t \in \Bigl\{ 1,\ldots,\bigl\lfloor\pi\sqrt{N}/4\bigr\rfloor \Bigr\}

uniformly at random. Selecting $t$ in this way finds a solution (assuming one exists) with probability greater than 40%, regardless of $N$ and $s$, though this is not obvious and requires an analysis that will not be included here. It does make sense, however, particularly when we think about the geometric picture: rotating the state of $\mathsf{Q}$ a random number of times like this is not unlike choosing a random unit vector in the space spanned by $\vert A_0\rangle$ and $\vert A_1\rangle$, for which it is likely that the coefficient of $\vert A_1\rangle$ is reasonably large. By repeating this procedure and checking the outcome in the same way as described before, the probability to find a solution can be made very close to $1$.
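While the full analysis is beyond this lesson, the claim is easy to check numerically for particular values of $N$ and $s$; here's a sketch that averages the success probability over the random choice of $t$:

```python
import math

def average_success(N: int, s: int) -> float:
    """Average of sin^2((2t+1)*theta) over t uniform in {1, ..., floor(pi*sqrt(N)/4)}."""
    theta = math.asin(math.sqrt(s / N))
    T = math.floor(math.pi * math.sqrt(N) / 4)
    return sum(math.sin((2 * t + 1) * theta) ** 2 for t in range(1, T + 1)) / T

for s in (1, 2, 5, 50, 500):
    print(f"s = {s:3d}:  average success probability = {average_success(2**12, s):.4f}")
```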

There is a refined method that finds a solution when one exists using $O(\sqrt{N/s})$ queries, even when the number of solutions $s$ is not known, and requires $O(\sqrt{N})$ queries to determine that there are no solutions when $s=0$.

The basic idea is to choose $t$ uniformly at random from the set $\{1,\ldots,T\}$ iteratively, for increasing values of $T$. In particular, we can start with $T = 1$ and increase it exponentially, always terminating the process as soon as a solution is found and capping $T$ so as not to waste queries when there isn't a solution. The process takes advantage of the fact that fewer queries are required when more solutions exist. Some care is required, however, to balance the rate of growth of $T$ with the probability of success for each iteration. (Taking $T \leftarrow \lceil \frac{5}{4}T\rceil$ works, for instance, as an analysis reveals. Doubling $T$, however, does not — this turns out to be too rapid an increase.)
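Here's a sketch of this strategy in Python, where grover_search(t) and f are hypothetical stand-ins for running $t$ iterations of the algorithm and for the function being searched, and where we assume a cap on $T$ of $\lceil\pi\sqrt{N}/4\rceil$:

```python
import math
import random

def search_unknown_solutions(grover_search, f, N: int):
    """Search when the number of solutions is unknown.

    grover_search(t) is a hypothetical stand-in that runs t Grover iterations and
    returns one measurement outcome; f is the Boolean function being searched.
    Returns a solution, or None if none was found (suggesting there are no solutions).
    """
    T = 1
    T_max = math.ceil(math.pi * math.sqrt(N) / 4)  # cap on the iteration count
    while T <= T_max:
        t = random.randint(1, T)  # t chosen uniformly from {1, ..., T}
        x = grover_search(t)
        if f(x) == 1:             # verify the outcome with one query
            return x
        T = math.ceil(5 * T / 4)  # grow T slowly; doubling turns out to be too fast
    return None
```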


The trivial cases

Throughout the analysis we've just gone through, we've assumed that the number of solutions is non-zero. Indeed, by referring to the vectors

\begin{aligned}
\vert A_0\rangle &= \frac{1}{\sqrt{\vert A_0\vert}} \sum_{x\in A_0} \vert x\rangle \\
\vert A_1\rangle &= \frac{1}{\sqrt{\vert A_1\vert}} \sum_{x\in A_1} \vert x\rangle
\end{aligned}

we have implicitly assumed that $A_0$ and $A_1$ are both nonempty. Here we will briefly consider what happens when one of these sets is empty.

Before we bother with an analysis, let's observe the obvious: if every string $x\in\Sigma^n$ is a solution, then we'll see a solution when we measure; and when there aren't any solutions, we won't see one. In some sense there's no need to go deeper than this.

We can, however, quickly verify the mathematics for these trivial cases. The situation where one of $A_0$ and $A_1$ is empty happens when $f$ is constant; $A_1$ is empty when $f(x) = 0$ for every $x\in\Sigma^n$, and $A_0$ is empty when $f(x) = 1$ for every $x\in\Sigma^n$. This means that

Z_f \vert u\rangle = \pm \vert u\rangle,

and therefore

\begin{aligned}
G \vert u \rangle & = \bigl( 2 \vert u\rangle \langle u \vert - \mathbb{I}\bigr) Z_f\vert u\rangle \\
& = \pm \bigl( 2 \vert u\rangle \langle u \vert - \mathbb{I}\bigr) \vert u\rangle \\
& = \pm \vert u\rangle.
\end{aligned}

So, irrespective of the number of iterations $t$ we perform in these cases, the measurements will always reveal a uniform random string $x\in\Sigma^n$.
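This behavior can be confirmed numerically for a small instance; the following sketch takes $n = 3$ and the constant function $f(x) = 1$:

```python
import numpy as np

n = 3
N = 2 ** n
u = np.full(N, 1 / np.sqrt(N))             # the uniform superposition |u>
Zf = -np.eye(N)                            # Z_f for the constant function f(x) = 1
G = (2 * np.outer(u, u) - np.eye(N)) @ Zf  # the Grover iteration G

state = u.copy()
for _ in range(5):                         # any number of iterations gives the same result
    state = G @ state
print(np.abs(state) ** 2)                  # each of the N outcomes has probability 1/N
```

Every printed probability is $1/N = 1/8$, regardless of the number of iterations performed.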
