Skip to main content
IBM Quantum Platform

Shor's algorithm

Now we'll turn our attention to the integer factorization problem, and see how it can be solved efficiently on a quantum computer using phase estimation. The algorithm we'll obtain is Shor's algorithm for integer factorization. Shor didn't describe his algorithm specifically in terms of phase estimation, but it is a natural and intuitive way to explain how it works.

We'll begin by discussing an intermediate problem known as the order-finding problem and see how phase estimation provides a solution to this problem. We'll then see how an efficient solution to the order-finding problem gives us an efficient solution to the integer factorization problem. (When a solution to one problem provides a solution to another problem like this, we say that the second problem reduces to the first — so in this case we're reducing integer factorization to order finding.) This second part of Shor's algorithm doesn't make use of quantum computing at all; it's completely classical. Quantum computing is only needed to solve order finding.


The order-finding problem

Some basic number theory

To explain the order-finding problem and how it can be solved using phase estimation, it will be helpful to begin with a couple of basic number theory concepts, and to introduce some handy notation along the way.

To begin, for any given positive integer N,N, define the set ZN\mathbb{Z}_N like this.

ZN={0,1,,N1}\mathbb{Z}_N = \{0,1,\ldots,N-1\}

For instance, Z1={0},  \mathbb{Z}_1 = \{0\},\; Z2={0,1},  \mathbb{Z}_2 = \{0,1\},\; Z3={0,1,2},  \mathbb{Z}_3 = \{0,1,2\},\; and so on.

These are sets of numbers, but we can think of them as more than sets. In particular, we can think about arithmetic operations on ZN\mathbb{Z}_N such as addition and multiplication — and if we agree to always take our answers modulo NN (i.e., divide by NN and take the remainder as the result), we'll always stay within this set when we perform these operations. The two specific operations of addition and multiplication, both taken modulo N,N, turn ZN\mathbb{Z}_N into a ring, which is a fundamentally important type of object in algebra.

For example, 33 and 55 are elements of Z7,\mathbb{Z}_7, and if we multiply them together we get 35=15,3\cdot 5 = 15, which leaves a remainder of 11 when divided by 7.7. Sometimes we express this as follows.

351  (mod 7)3 \cdot 5 \equiv 1 \; (\textrm{mod } 7)

But we can also simply write 35=1,3 \cdot 5 = 1, provided that it's been made clear that we're working in Z7,\mathbb{Z}_7, just to keep our notation as simple as possible.

As an example, here are the addition and multiplication tables for Z6.\mathbb{Z}_6.

+012345001234511234502234501334501244501235501234012345000000010123452024024303030340420425054321\begin{array}{c|cccccc} + & 0 & 1 & 2 & 3 & 4 & 5 \\\hline 0 & 0 & 1 & 2 & 3 & 4 & 5 \\ 1 & 1 & 2 & 3 & 4 & 5 & 0 \\ 2 & 2 & 3 & 4 & 5 & 0 & 1 \\ 3 & 3 & 4 & 5 & 0 & 1 & 2 \\ 4 & 4 & 5 & 0 & 1 & 2 & 3 \\ 5 & 5 & 0 & 1 & 2 & 3 & 4 \\ \end{array} \qquad \begin{array}{c|cccccc} \cdot & 0 & 1 & 2 & 3 & 4 & 5 \\\hline 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 1 & 2 & 3 & 4 & 5 \\ 2 & 0 & 2 & 4 & 0 & 2 & 4 \\ 3 & 0 & 3 & 0 & 3 & 0 & 3 \\ 4 & 0 & 4 & 2 & 0 & 4 & 2 \\ 5 & 0 & 5 & 4 & 3 & 2 & 1 \\ \end{array}

Among the NN elements of ZN,\mathbb{Z}_N, the elements aZNa\in\mathbb{Z}_N that satisfy gcd(a,N)=1\gcd(a,N) = 1 are special. Frequently the set containing these elements is denoted with a star like so.

ZN={aZN:gcd(a,N)=1}\mathbb{Z}_N^{\ast} = \{a\in \mathbb{Z}_N : \gcd(a,N) = 1\}

If we focus our attention on the operation of multiplication, the set ZN\mathbb{Z}_N^{\ast} forms a group — specifically an abelian group — which is another important type of object in algebra. It's a basic fact about these sets (and finite groups in general), that if we pick any element aZNa\in\mathbb{Z}_N^{\ast} and repeatedly multiply aa to itself, we'll always eventually get the number 1.1.

For a first example, let's take N=6.N=6. We have that 5Z65\in\mathbb{Z}_6^{\ast} because gcd(5,6)=1,\gcd(5,6) = 1, and if we multiply 55 to itself we get 1,1, as the table above confirms.

52=1(working within Z6)5^2 = 1 \quad \text{(working within $\mathbb{Z}_6$)}

As a second example, let's take N=21.N = 21. If we go through the numbers from 00 to 20,20, the ones having GCD equal to 11 with 2121 are as follows.

Z21={1,2,4,5,8,10,11,13,16,17,19,20}\mathbb{Z}_{21}^{\ast} = \{1,2,4,5,8,10,11,13,16,17,19,20\}

For each of these elements, it is possible to raise that number to a positive integer power to get 1.1. Here are the smallest powers for which this works:

11=182=1163=126=1106=1176=143=1116=1196=156=1132=1202=1\begin{array}{ccc} 1^{1} = 1 \quad & 8^{2} = 1 \quad & 16^{3} = 1 \\[1mm] 2^{6} = 1 \quad & 10^{6} = 1 \quad & 17^{6} = 1 \\[1mm] 4^{3} = 1 \quad & 11^{6} = 1 \quad & 19^{6} = 1 \\[1mm] 5^{6} = 1 \quad & 13^{2} = 1 \quad & 20^{2} = 1 \end{array}

Naturally we're working within Z21\mathbb{Z}_{21} for all of these equations, which we haven't bothered to write — we take it to be implicit to avoid cluttering things up. We'll continue to do that throughout the rest of the lesson.

Problem statement and connection to phase estimation

Now we can state the order-finding problem.

Order finding
Input: positive integers NN and aa satisfying gcd(N,a)=1\gcd(N,a) = 1
Output: the smallest positive integer rr such that ar1a^r \equiv 1 (mod N)(\textrm{mod } N)

Alternatively, in terms of the notation we just introduced above, we're given aZN,a \in \mathbb{Z}_N^{\ast}, and we're looking for the smallest positive integer rr such that ar=1.a^r = 1. This number rr is called the order of aa modulo N.N.

To connect the order-finding problem to phase estimation, let's think about the operation defined on a system whose classical states correspond to ZN,\mathbb{Z}_N, where we multiply by a fixed element aZN.a\in\mathbb{Z}_N^{\ast}.

Max=ax(for each xZN)M_a \vert x\rangle = \vert ax \rangle \qquad \text{(for each $x\in\mathbb{Z}_N$)}

To be clear, we're doing the multiplication in ZN,\mathbb{Z}_N, so it's implicit that we're taking the product modulo NN inside of the ket on the right-hand side of the equation.

For example, if we take N=15N = 15 and a=2,a=2, then the action of M2M_2 on the standard basis {0,,14}\{\vert 0\rangle,\ldots,\vert 14\rangle\} is as follows.

M20=0M25=10M210=5M21=2M26=12M211=7M22=4M27=14M212=9M23=6M28=1M213=11M24=8M29=3M214=13\begin{array}{ccc} M_{2} \vert 0 \rangle = \vert 0\rangle \quad & M_{2} \vert 5 \rangle = \vert 10\rangle \quad & M_{2} \vert 10 \rangle = \vert 5\rangle \\[1mm] M_{2} \vert 1 \rangle = \vert 2\rangle \quad & M_{2} \vert 6 \rangle = \vert 12\rangle \quad & M_{2} \vert 11 \rangle = \vert 7\rangle \\[1mm] M_{2} \vert 2 \rangle = \vert 4\rangle \quad & M_{2} \vert 7 \rangle = \vert 14\rangle \quad & M_{2} \vert 12 \rangle = \vert 9\rangle \\[1mm] M_{2} \vert 3 \rangle = \vert 6\rangle \quad & M_{2} \vert 8 \rangle = \vert 1\rangle \quad & M_{2} \vert 13 \rangle = \vert 11\rangle \\[1mm] M_{2} \vert 4 \rangle = \vert 8\rangle \quad & M_{2} \vert 9 \rangle = \vert 3\rangle \quad & M_{2} \vert 14 \rangle = \vert 13\rangle \end{array}

This is a unitary operation provided that gcd(a,N)=1;\gcd(a,N)=1; it shuffles the elements of the standard basis {0,,N1},\{\vert 0\rangle,\ldots,\vert N-1\rangle\}, so as a matrix it's a permutation matrix. It's evident from its definition that this operation is deterministic, and a simple way to see that it's invertible is to think about the order rr of aa modulo N,N, and to recognize that the inverse of MaM_a is Mar1.M_a^{r-1}.

Mar1Ma=Mar=Mar=M1=IM_a^{r-1} M_a = M_a^r = M_{a^r} = M_1 = \mathbb{I}

There's another way to think about the inverse that doesn't require any knowledge of rr (which, after all, is what we're trying to compute). For every element aZNa\in\mathbb{Z}_N^{\ast} there's always a unique element bZNb\in\mathbb{Z}_N^{\ast} that satisfies ab=1.ab=1. We denote this element bb by a1,a^{-1}, and it can be computed efficiently; an extension of Euclid's GCD algorithm does it with cost quadratic in lg(N).\operatorname{lg}(N). And thus

Ma1Ma=Ma1a=M1=I.M_{a^{-1}} M_a = M_{a^{-1}a} = M_1 = \mathbb{I}.

So, the operation MaM_a is both deterministic and invertible. That implies that it's described by a permutation matrix, and is therefore unitary.

Now let's think about the eigenvectors and eigenvalues of the operation Ma,M_a, assuming that aZN.a\in\mathbb{Z}_N^{\ast}. As was just argued, this assumption tells us that MaM_a is unitary.

There are NN eigenvalues of Ma,M_a, possibly including the same eigenvalue repeated multiple times, and in general there's some freedom in selecting corresponding eigenvectors — but we won't need to worry about all of the possibilities. Let's start simply and identify just one eigenvector of Ma.M_a.

ψ0=1+a++ar1r\vert \psi_0 \rangle = \frac{\vert 1 \rangle + \vert a \rangle + \cdots + \vert a^{r-1} \rangle}{\sqrt{r}}

The number rr is the order of aa modulo N,N, here and throughout the remainder of the lesson. The eigenvalue associated with this eigenvector is 11 because it isn't changed when we multiply by a.a.

Maψ0=a++ar1+arr=a++ar1+1r=ψ0M_a \vert \psi_0 \rangle = \frac{\vert a \rangle + \cdots + \vert a^{r-1} \rangle + \vert a^r \rangle}{\sqrt{r}} = \frac{\vert a \rangle + \cdots + \vert a^{r-1} \rangle + \vert 1 \rangle}{\sqrt{r}} = \vert \psi_0 \rangle

This happens because ar=1,a^r = 1, so each standard basis state ak\vert a^k \rangle gets shifted to ak+1\vert a^{k+1} \rangle for kr1,k\leq r-1, and ar1\vert a^{r-1} \rangle gets shifted back to 1.\vert 1\rangle. Informally speaking, it's like we're slowly stirring ψ0,\vert \psi_0 \rangle, but it's already completely stirred so nothing changes.

Here's another example of an eigenvector of Ma.M_a. This one happens to be more interesting in the context of order finding and phase estimation.

ψ1=1+ωr1a++ωr(r1)ar1r\vert \psi_1 \rangle = \frac{\vert 1 \rangle + \omega_r^{-1} \vert a \rangle + \cdots + \omega_r^{-(r-1)}\vert a^{r-1} \rangle}{\sqrt{r}}

Alternatively, we can write this vector using a summation as follows.

ψ1=1rk=0r1ωrkak\vert \psi_1 \rangle = \frac{1}{\sqrt{r}} \sum_{k = 0}^{r-1} \omega_r^{-k} \vert a^k \rangle

Here we're seeing the complex number ωr=e2πi/r\omega_r = e^{2\pi i/r} showing up naturally, due to the way that multiplication by aa works modulo N.N. This time the corresponding eigenvalue is ωr.\omega_r. To see this, we can first compute as follows.

Maψ1=k=0r1ωrkMaak=k=0r1ωrkak+1=k=1rωr(k1)ak=ωrk=1rωrkakM_a \vert \psi_1 \rangle = \sum_{k = 0}^{r-1} \omega_r^{-k} M_a\vert a^k \rangle = \sum_{k = 0}^{r-1} \omega_r^{-k} \vert a^{k+1} \rangle = \sum_{k = 1}^{r} \omega_r^{-(k - 1)} \vert a^{k} \rangle = \omega_r \sum_{k = 1}^{r} \omega_r^{-k} \vert a^{k} \rangle

Then, because ωrr=1=ωr0\omega_r^{-r} = 1 = \omega_r^0 and ar=1=a0,\vert a^r \rangle = \vert 1\rangle = \vert a^0\rangle, we see that

k=1rωrkak=k=0r1ωrkak=ψ1,\sum_{k = 1}^{r} \omega_r^{-k} \vert a^{k} \rangle = \sum_{k = 0}^{r-1} \omega_r^{-k} \vert a^k \rangle = \vert\psi_1\rangle,

so Maψ1=ωrψ1.M_a \vert\psi_1\rangle = \omega_r \vert\psi_1\rangle.

Using the same reasoning, we can identify additional eigenvector/eigenvalue pairs for Ma.M_a. For any choice of j{0,,r1}j\in\{0,\ldots,r-1\} we have that

ψj=1rk=0r1ωrjkak\vert \psi_j \rangle = \frac{1}{\sqrt{r}} \sum_{k = 0}^{r-1} \omega_r^{-jk} \vert a^k \rangle

is an eigenvector of MaM_a whose corresponding eigenvalue is ωrj.\omega_r^j.

Maψj=ωrjψjM_a \vert \psi_j \rangle = \omega_r^j \vert \psi_j \rangle

There are other eigenvectors of Ma,M_a, but we don't need to concern ourselves with them — we'll focus solely on the eigenvectors ψ0,,ψr1\vert\psi_0\rangle,\ldots,\vert\psi_{r-1}\rangle that we've just identified.


Order finding through phase estimation

To solve the order-finding problem for a given choice of aZN,a\in\mathbb{Z}_N^{\ast}, we can apply the phase-estimation procedure to the operation Ma.M_a.

To do this, we need to implement not only MaM_a efficiently with a quantum circuit, but also Ma2,M_a^2, Ma4,M_a^4, Ma8,M_a^8, and so on, going as far as needed to obtain a precise enough estimate from the phase estimation procedure. Here we'll explain how this can be done, and we'll figure out exactly how much precision is needed later.

Let's start with the operation MaM_a by itself. Naturally, because we're working with the quantum circuit model, we'll use binary notation to encode the numbers between 00 and N1.N-1. The largest number we need to encode is N1,N-1, so the number of bits we need is

n=lg(N1)=log(N1)+1.n = \operatorname{lg}(N-1) = \lfloor \log(N-1) \rfloor + 1.

For example, if N=21N = 21 we have n=lg(N1)=5.n = \operatorname{lg}(N-1) = 5. Here's what the encoding of elements of Z21\mathbb{Z}_{21} as binary strings of length 55 looks like.

0000001000012010100\begin{gathered} 0 \mapsto 00000\\[1mm] 1 \mapsto 00001\\[1mm] \vdots\\[1mm] 20 \mapsto 10100 \end{gathered}

And now, here's a precise definition of how MaM_a is defined as an nn-qubit operation.

Max={ax  (mod  N)0x<NxNx<2nM_a \vert x\rangle = \begin{cases} \vert ax \; (\textrm{mod}\;N)\rangle & 0\leq x < N\\[1mm] \vert x\rangle & N\leq x < 2^n \end{cases}

The point is that although we only care about how MaM_a works for 0,,N1,\vert 0\rangle,\ldots,\vert N-1\rangle, we do have to specify how it works for the remaining 2nN2^n - N standard basis states — and we need to do this in a way that still gives us a unitary operation. Defining MaM_a so that it does nothing to the remaining standard basis states accomplishes this.

Using the algorithms for integer multiplication and division discussed in the previous lesson, together with the methodology for reversible, garbage-free implementations of them, we can build a quantum circuit that performs Ma,M_a, for any choice of aZN,a\in\mathbb{Z}_N^{\ast}, at cost O(n2).O(n^2). Here is one way this can be done.

  1. Build a circuit for performing the operation

    xyxyfa(x)\vert x \rangle \vert y \rangle \mapsto \vert x \rangle \vert y \oplus f_a(x)\rangle

    where

    fa(x)={ax  (mod  N)0x<NxNx<2nf_a(x) = \begin{cases} ax \; (\textrm{mod}\;N) & 0\leq x < N\\[1mm] x & N\leq x < 2^n \end{cases}

    using the method described in the previous lesson. This gives us a circuit of size O(n2).O(n^2).

  2. Swap the two nn-qubit systems using nn swap gates to swap the qubits individually.

  3. Along similar lines to the first step, build a circuit for the operation

    xyxyfa1(x)\vert x \rangle \vert y \rangle \mapsto \vert x \rangle \bigl\vert y \oplus f_{a^{-1}}(x)\bigr\rangle

    where a1a^{-1} is the inverse of aa in ZN.\mathbb{Z}_N^{\ast}.

By initializing the bottom nn qubits and composing the three steps, we obtain this transformation:

x0nstep 1xfa(x)step 2fa(x)xstep 3fa(x)xfa1(fa(x))=fa(x)0n\vert x \rangle \vert 0^n \rangle \stackrel{\text{step 1}}{\mapsto} \vert x \rangle \vert f_a(x)\rangle \stackrel{\text{step 2}}{\mapsto} \vert f_a(x)\rangle \vert x \rangle \stackrel{\text{step 3}}{\mapsto} \vert f_a(x)\rangle \bigl\vert x \oplus f_{a^{-1}}(f_a(x)) \bigr\rangle = \vert f_a(x)\rangle\vert 0^n \rangle

The method requires workspace qubits, but they're returned to their initialized state at the end, which allows us to use these circuits for phase estimation. The total cost of the circuit we obtain is O(n2).O(n^2).

To perform Ma2,M_a^2, Ma4,M_a^4, Ma8,M_a^8, and so on, we can use exactly the same method, except that we replace aa with a2,a^2, a4,a^4, a8,a^8, and so on, as elements of ZN.\mathbb{Z}_N^{\ast}. That is, for any power kk we choose, we can create a circuit for MakM_a^k not by iterating kk times the circuit for Ma,M_a, but instead by computing b=akZNb = a^k \in \mathbb{Z}_N^{\ast} and then using the circuit for Mb.M_b.

The computation of powers akZNa^k \in \mathbb{Z}_N is the modular exponentiation problem mentioned in the previous lesson. This computation can be done classically, using the algorithm for modular exponentiation mentioned in the previous lesson (often called the power algorithm in computational number theory). In fact, we only require power-of-2 powers of a,a, in particular a2,a4,a2m1ZN,a^2, a^4, \ldots a^{2^{m-1}} \in \mathbb{Z}_N^{\ast}, and we can obtain these powers by iteratively squaring m1m-1 times. Each squaring can be performed by a Boolean circuit of size O(n2).O(n^2).

In essence, what we're effectively doing here is offloading the problem of iterating MaM_a as many as 2m12^{m-1} times to an efficient classical computation. And it's good fortune that this is possible! For an arbitrary choice of a quantum circuit in the phase estimation problem, this is not likely to be possible — and in that case the resulting cost for phase estimation grows exponentially in the number of control qubits m.m.

Solution given a convenient eigenvector

To understand how we can solve the order-finding problem using phase estimation, let's start by supposing that we run the phase estimation procedure on the operation MaM_a using the eigenvector ψ1.\vert\psi_1\rangle. Getting our hands on this eigenvector isn't easy, as it turns out, so this won't be the end of the story — but it's helpful to start here.

The eigenvalue of MaM_a corresponding to the eigenvector ψ1\vert \psi_1\rangle is

ωr=e2πi1r.\omega_r = e^{2\pi i \frac{1}{r}}.

That is, ωr=e2πiθ\omega_r = e^{2\pi i \theta} for θ=1/r.\theta = 1/r. So, if we run the phase estimation procedure on MaM_a using the eigenvector ψ1,\vert\psi_1\rangle, we'll get an approximation to 1/r.1/r. By computing the reciprocal we'll be able to learn rr — provided that our approximation is good enough.

In more detail, when we run the phase-estimation procedure using mm control qubits, what we obtain is a number y{0,,2m1}.y\in\{0,\ldots,2^m-1\}. We then take y/2my/2^m as a guess for θ,\theta, which is 1/r1/r in the case at hand. To figure out what rr is from this approximation, the natural thing to do is to compute the reciprocal of our approximation and round to the nearest integer.

2my+12\left\lfloor \frac{2^m}{y} + \frac{1}{2} \right\rfloor

For example, let's suppose r=6r = 6 and we perform phase estimation on MaM_a with the eigenvector ψ1\vert\psi_1\rangle using m=5m = 5 control bits. The best 55-bit approximation to 1/r=1/61/r = 1/6 is 5/32,5/32, and we have a pretty good chance (about 68%68\% in this case) to obtain the outcome y=5y=5 from phase estimation. We have

2my=325=6.4,\frac{2^m}{y} = \frac{32}{5} = 6.4,

and rounding to the nearest integer gives 6,6, which is the correct answer.

On the other hand, if we don't use enough precision, we might not get the right answer. For instance, if we take m=4m = 4 control qubits in phase estimation, we might obtain the best 44-bit approximation to 1/r=1/6,1/r = 1/6, which is 3/16.3/16. Taking the reciprocal yields

2my=163=5.333\frac{2^m}{y} = \frac{16}{3} = 5.333 \cdots

and rounding to the nearest integer gives an incorrect answer of 5.5.

So how much precision do we need to get the right answer? We know that the order rr is an integer, and intuitively speaking what we need is enough precision to distinguish 1/r1/r from nearby possibilities, including 1/(r+1)1/(r+1) and 1/(r1).1/(r-1). The closest number to 1/r1/r that we need to be concerned with is 1/(r+1),1/(r+1), and the distance between these two numbers is

1r1r+1=1r(r+1).\frac{1}{r} - \frac{1}{r+1} = \frac{1}{r(r+1)}.

So, if we want to make sure that we don't mistake 1/r1/r for 1/(r+1),1/(r+1), it suffices to use enough precision to guarantee that a best approximation y/2my/2^m to 1/r1/r is closer to 1/r1/r than it is to 1/(r+1).1/(r+1). If we use enough precision so that

y2m1r<12r(r+1),\left\vert \frac{y}{2^m} - \frac{1}{r} \right\vert < \frac{1}{2 r (r+1)},

so that the error is less than half of the distance between 1/r1/r and 1/(r+1),1/(r+1), then y/2my/2^m will be closer to 1/r1/r than to any other possibility, including 1/(r+1)1/(r+1) and 1/(r1).1/(r-1).

We can double-check this as follows. Suppose that

y2m=1r+ε\frac{y}{2^m} = \frac{1}{r} + \varepsilon

for ε\varepsilon satisfying

ε<12r(r+1).\vert\varepsilon\vert < \frac{1}{2 r (r+1)}.

When we take the reciprocal we obtain

2my=11r+ε=r1+εr=rεr21+εr.\frac{2^m}{y} = \frac{1}{\frac{1}{r} + \varepsilon} = \frac{r}{1+\varepsilon r} = r - \frac{\varepsilon r^2}{1+\varepsilon r}.

By maximizing in the numerator and minimizing in the denominator, we can bound how far away we are from rr as follows.

εr21+εrr22r(r+1)1r2r(r+1)=r2r+1<12\left\vert \frac{\varepsilon r^2}{1+\varepsilon r} \right\vert \leq \frac{ \frac{r^2}{2 r(r+1)}}{1 - \frac{r}{2r(r+1)}} %= \frac{r^2}{2 r (r+1) - r} = \frac{r}{2 r + 1} < \frac{1}{2}

We're less than 1/21/2 away from r,r, so as expected we'll get rr when we round.

Unfortunately, because we don't yet know what rr is, we can't use it to tell us how much accuracy we need. What we can do instead is to use the fact that rr must be smaller than NN to ensure that we use enough precision. In particular, if we use enough accuracy to guarantee that the best approximation y/2my/2^m to 1/r1/r satisfies

y2m1r12N2,\left\vert \frac{y}{2^m} - \frac{1}{r} \right\vert \leq \frac{1}{2N^2},

then we'll have enough precision to correctly determine rr when we take the reciprocal. Taking m=2lg(N)+1m = 2\operatorname{lg}(N)+1 ensures that we have a high chance to obtain an estimation with this precision using the method described previously. (Taking m=2lg(N)m = 2\operatorname{lg}(N) is good enough if we're comfortable with a lower bound of 40% on the probability of success.)

General solution

As we just saw, if we have the eigenvector ψ1\vert \psi_1 \rangle of Ma,M_a, we can learn rr through phase estimation, so long as we use enough control qubits to do this with sufficient precision. Unfortunately, it's not easy to get our hands on the eigenvector ψ1,\vert\psi_1\rangle, so we need to figure out how to proceed.

Let's suppose momentarily that we proceed just like above, except with the eigenvector ψk\vert\psi_k\rangle in place of ψ1,\vert\psi_1\rangle, for any choice of k{0,,r1}k\in\{0,\ldots,r-1\} that we choose to think about. The result we get from the phase estimation procedure will be an approximation

y2mkr.\frac{y}{2^m} \approx \frac{k}{r}.

Working under the assumption that we don't know either kk or r,r, this might or might not allow us to identify r.r. For example, if k=0k = 0 we'll get an approximation y/2my/2^m to 0,0, which unfortunately tells us nothing. This, however, is an unusual case; for other values of k,k, we'll at least be able to learn something about r.r.

We can use an algorithm known as the continued fraction algorithm to turn our approximation y/2my/2^m into nearby fractions — including k/rk/r if the approximation is good enough. We won't explain the continued fraction algorithm here. Instead, here's a statement of a known fact about this algorithm.

Fact. Given an integer N2N\geq 2 and a real number α(0,1),\alpha\in(0,1), there is at most one choice of integers u,v{0,,N1}u,v\in\{0,\ldots,N-1\} with v0v\neq 0 and gcd(u,v)=1\gcd(u,v)=1 satisfying αu/v<12N2.\vert \alpha - u/v\vert < \frac{1}{2N^2}. Given α\alpha and N,N, the continued fraction algorithm finds uu and v,v, or reports that they don't exist. This algorithm can be implemented as a Boolean circuit having size O((lg(N))3).O((\operatorname{lg}(N))^3).

If we have a very close approximation y/2my/2^m to k/r,k/r, and we run the continued fraction algorithm for NN and α=y/2m,\alpha = y/2^m, we'll get uu and v,v, as they're described in the fact. An analysis of the fact allows us to conclude that

uv=kr.\frac{u}{v} = \frac{k}{r}.

Notice in particular that we don't necessarily learn kk and r,r, we only learn k/rk/r in lowest terms.

For example, and as we've already noticed, we're not going to learn anything from k=0.k=0. But that's the only value of kk where that happens. When kk is nonzero, it might have common factors with r,r, but the number vv we obtain from the continued fraction algorithm must at least divide r.r.

It's far from obvious, but it is true that if we have the ability to learn uu and vv for u/v=k/ru/v = k/r for k{0,,r1}k\in\{0,\ldots,r-1\} chosen uniformly at random, then we're very likely to be able to recover rr after just a few samples. In particular, if our guess for rr is the least common multiple of all the values for the denominator vv that we observe, we'll be right with high probability. Intuitively speaking, some values of kk aren't good because they share common factors with r,r, and those common factors are hidden from us when we learn uu and v.v. But random choices of kk aren't likely to hide factors of rr for long, and the probability that we don't guess rr correctly by taking the least common multiple of the denominators we observe drops exponentially in the number of samples.

It remains to address the issue of how we get our hands on an eigenvector ψk\vert\psi_k\rangle of MaM_a on which to run the phase estimation procedure. As it turns out, we don't actually need to create them!

What we will do instead is to run the phase estimation procedure on the state 1,\vert 1\rangle, by which we mean the nn-bit binary encoding of the number 1,1, in place of an eigenvector ψ\vert\psi\rangle of Ma.M_a. So far, we've only talked about running the phase estimation procedure on a particular eigenvector, but nothing prevents us from running the procedure on an input state that isn't an eigenvector of Ma,M_a, and that's what we're doing here with the state 1.\vert 1\rangle. (This isn't an eigenvector of MaM_a unless a=1,a=1, which isn't a choice we'll be interested in.)

The rationale for choosing the state 1\vert 1\rangle in place of an eigenvector of MaM_a is that the following equation is true.

1=1rk=0r1ψk\vert 1\rangle = \frac{1}{\sqrt{r}} \sum_{k = 0}^{r-1} \vert \psi_k\rangle

One way to verify this equation is to compare the inner products of the two sides with each standard basis state, using formulas mentioned previously in the lesson to help to evaluate the results for the right-hand side. As a consequence, we will obtain precisely the same measurement results as if we had chosen k{0,,r1}k\in\{0,\ldots,r-1\} uniformly at random and used ψk\vert\psi_k\rangle as an eigenvector.

In greater detail, let's imagine that we run the phase estimation procedure with the state 1\vert 1\rangle in place of one of the eigenvectors ψk.\vert\psi_k\rangle. After the inverse quantum Fourier transform is performed, this leaves us with the state

1rk=0r1ψkγk,\frac{1}{\sqrt{r}} \sum_{k = 0}^{r-1} \vert \psi_k\rangle \vert \gamma_k\rangle,

where

γk=12my=02m1x=02m1e2πix(k/ry/2m)y.\vert\gamma_k\rangle = \frac{1}{2^m} \sum_{y=0}^{2^m - 1} \sum_{x=0}^{2^m-1} e^{2\pi i x (k/r - y/2^m)} \vert y\rangle.

The vector γk\vert\gamma_k\rangle represents the state of the top mm qubits after the inverse of the quantum Fourier transform has been performed on them.

So, by virtue of the fact that {ψ0,,ψr1}\{\vert\psi_0\rangle,\ldots,\vert\psi_{r-1}\rangle\} is an orthonormal set, we find that a measurement of the top mm qubits yields an approximation y/2my/2^m to the value k/rk/r where k{0,,r1}k\in\{0,\ldots,r-1\} is chosen uniformly at random. As we've already discussed, this allows us to learn rr with a high degree of confidence after several independent runs, which was our goal.

Total cost

The cost to implement each controlled-unitary MakM_a^k is O(n2).O(n^2). There are mm controlled-unitary operations, and we have m=O(n),m = O(n), so the total cost for the controlled-unitary operations is O(n3).O(n^3). In addition, we have mm Hadamard gates (which contribute O(n)O(n) to the cost), and the inverse quantum Fourier transform contributes O(n2)O(n^2) to the cost. Thus, the cost of the controlled-unitary operations dominates the cost of the entire procedure — which is therefore O(n3).O(n^3).

In addition to the quantum circuit itself, there are a few classical computations that need to be performed along the way. This includes computing the powers aka^k in ZN\mathbb{Z}_N for k=2,4,8,,2m1,k = 2, 4, 8, \ldots, 2^{m-1}, which are needed to create the controlled-unitary gates, as well as the continued fraction algorithm that converts approximations of θ\theta into fractions. These computations can be performed by Boolean circuits with a total cost of O(n3).O(n^3).

As is typical, all of these bounds can be improved using asymptotically fast algorithms; these bounds assume we're using standard algorithms for basic arithmetic operations.


Factoring by order finding

The very last thing we need to discuss is how solving the order-finding problem helps us to factor. This part is completely classical — it has nothing specifically to do with quantum computing.

Here's the basic idea. We want to factorize the number N,N, and we can do this recursively. Specifically, we can focus on the task of splitting N,N, which means finding any two integers b,c2b,c\geq 2 for which N=bc.N = bc. This isn't possible if NN is a prime number, but we can efficiently test to see if NN is prime using a primality testing algorithm first, and if NN isn't prime we'll try to split it. Once we split N,N, we can simply recurse on bb and cc until all of our factors are prime and we obtain the prime factorization of N.N.

Splitting even integers is easy: we just output 22 and N/2.N/2.

It's also easy to split perfect powers, meaning numbers of the form N=sjN = s^j for integers s,j2,s,j\geq 2, just by approximating the roots N1/2,N^{1/2}, N1/3,N^{1/3}, N1/4,N^{1/4}, etc., and checking nearby integers as suspects for s.s. We don't need to go further than log(N)\log(N) steps into this sequence, because at that point the root drops below 22 and won't reveal additional candidates.

It's good that we can do both of these things because order finding won't help us to factor even numbers or for prime powers, where the number ss happens to be prime. If NN is odd and not a prime power, however, order finding allows us to split N.N.

Probabilistic algorithm to split an odd, composite integer N that is not a prime power

  1. Randomly choose a{2,,N1}.a\in\{2,\ldots,N-1\}.

  2. Compute d=gcd(a,N).d=\gcd(a,N).

  3. If d>1d > 1 then output b=db = d and c=N/dc = N/d and stop. Otherwise continue to the next step knowing that aZN.a\in\mathbb{Z}_N^{\ast}.

  4. Let rr be the order of aa modulo N.N. (Here's where we need order finding.)

  5. If rr is even:

    5.1 Compute x=ar/21x = a^{r/2} - 1 modulo NN
    5.2 Compute d=gcd(x,N).d = \gcd(x,N).
    5.3 If d>1d>1 then output b=db=d and c=N/dc = N/d and stop.

  6. If this point is reached, the algorithm has failed to find a factor of N.N.

A run of this algorithm may fail to find a factor of N.N. Specifically, this happens in two situations:

  • The order of aa modulo NN is odd.
  • The order of aa modulo NN is even and gcd(ar/21,N)=1.\gcd\bigl(a^{r/2} - 1, N\bigr) = 1.

Using basic number theory it can be proved that, for a random choice of a,a, with probability at least 1/21/2 neither of these events happen. In fact, the probability that either event happens is at most 2(m1)2^{-(m-1)} for mm being the number of distinct prime factors of N,N, which is why the assumption that NN is not a prime power is needed. (The assumption that NN is odd is also required for this fact to be true.)

This means that each run has at least a 50% chance to split N.N. Therefore, if we run the algorithm tt times, randomly choosing aa each time, we'll succeed in splitting NN with probability at least 12t.1 - 2^{-t}.

The basic idea behind the algorithm is as follows. If we have a choice of aa for which the order rr of aa modulo NN is even, then r/2r/2 is an integer and we can consider the numbers

ar/21  (mod  N)andar/2+1  (mod  N).a^{r/2} - 1\; (\textrm{mod}\; N) \quad \text{and} \quad a^{r/2} + 1\; (\textrm{mod}\; N).

Using the formula Z21=(Z+1)(Z1),Z^2 - 1 = (Z+1)(Z-1), we conclude that

(ar/21)(ar/2+1)=ar1.\bigl(a^{r/2} - 1\bigr) \bigl(a^{r/2} + 1\bigr) = a^r - 1.

Now, we know that ar  (mod  N)=1a^r \; (\textrm{mod}\; N) = 1 by the definition of the order — which is another way of saying that NN evenly divides ar1.a^r - 1. That means that NN evenly divides the product

(ar/21)(ar/2+1).\bigl(a^{r/2} - 1\bigr) \bigl(a^{r/2} + 1\bigr).

For this to be true, all of the prime factors of NN must also be prime factors of ar/21a^{r/2} - 1 or ar/2+1a^{r/2} + 1 (or both) — and for a random selection of aa it turns out to be unlikely that all of the prime factors of NN will divide one of the terms and none will divide the other. Otherwise, so long as some of the prime factors of NN divide the first term and some divide the second term, we'll be able to find a non-trivial factor of NN by computing the GCD with the first term.

Was this page helpful?
Report a bug or request content on GitHub.