
Simon's algorithm

Simon's algorithm is a quantum query algorithm for a problem known as Simon's problem. This is a promise problem with a flavor similar to the Deutsch-Jozsa and Bernstein-Vazirani problems, but the specifics are different.

Simon's algorithm is significant because it provides an exponential advantage of quantum over classical (including probabilistic) algorithms, and the technique it uses inspired Peter Shor's discovery of an efficient quantum algorithm for integer factorization.


Simon's problem

The input function for Simon's problem takes the form

$$f:\Sigma^n \rightarrow \Sigma^m$$

for positive integers $n$ and $m$. We could restrict our attention to the case $m = n$ in the interest of simplicity, but there's little to be gained in making this assumption — Simon's algorithm and its analysis are basically the same either way.

Simon's problem
Input: a function $f:\Sigma^n \rightarrow \Sigma^m$
Promise: there exists a string $s\in\Sigma^n$ such that $[f(x) = f(y)] \Leftrightarrow [(x = y) \vee (x \oplus s = y)]$ for all $x,y\in\Sigma^n$
Output: the string $s$

We'll unpack the promise to better understand what it says momentarily, but first let's be clear that it requires that $f$ has a very special structure — so most functions won't satisfy this promise. It's also fitting to acknowledge that this problem isn't intended to have practical importance. Rather, it's a somewhat artificial problem tailor-made to be easy for quantum computers and hard for classical computers.

There are two main cases: the first case is that $s$ is the all-zero string $0^n$, and the second case is that $s$ is not the all-zero string.

  • Case 1: $s=0^n$. If $s$ is the all-zero string, then we can simplify the if and only if statement in the promise so that it reads $[f(x) = f(y)] \Leftrightarrow [x = y]$. This is equivalent to $f$ being a one-to-one function.

  • Case 2: $s\neq 0^n$. If $s$ is not the all-zero string, then the promise being satisfied for this string implies that $f$ is two-to-one, meaning that for every possible output string of $f$, there are exactly two input strings that cause $f$ to output that string. Moreover, these two input strings must take the form $w$ and $w \oplus s$ for some string $w$.

It's important to recognize that there can only be one string $s$ that works if the promise is met, so there's always a unique correct answer for functions that satisfy the promise. (In particular, when $s\neq 0^n$, each output string has exactly two preimages, $w$ and $w\oplus s$, and their XOR pins down $s$; no other nonzero string can pair up the same preimages.)

Here's an example of a function taking the form $f:\Sigma^3 \rightarrow \Sigma^5$ that satisfies the promise for the string $s = 011$.

$$\begin{aligned} f(000) & = 10011 \\ f(001) & = 00101 \\ f(010) & = 00101 \\ f(011) & = 10011 \\ f(100) & = 11010 \\ f(101) & = 00001 \\ f(110) & = 00001 \\ f(111) & = 11010 \end{aligned}$$

There are $8$ different input strings and $4$ different output strings, each of which occurs twice — so this is a two-to-one function. Moreover, for any two different input strings that produce the same output string, we see that the bitwise XOR of these two input strings is equal to $011$, which is equivalent to saying that either one of them equals the other XORed with $s$.

Notice that the only thing that matters about the actual output strings is whether they're the same or different for different choices of input strings. For instance, in the example above, there are four strings ($10011$, $00101$, $00001$, and $11010$) that appear as outputs of $f$. We could replace these four strings with different strings, so long as they're all distinct, and the correct solution $s = 011$ would not change.
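For a small function like this one, the promise can be checked mechanically. The following plain-Python sketch (the dictionary `F` and function names are our own) encodes the table above and brute-forces the unique string satisfying the promise.

```python
# Truth table of the example function f : {0,1}^3 -> {0,1}^5 from above,
# with input and output strings encoded as Python strings of bits.
F = {
    "000": "10011", "001": "00101", "010": "00101", "011": "10011",
    "100": "11010", "101": "00001", "110": "00001", "111": "11010",
}

def xor_strings(x, y):
    """Bitwise XOR of two equal-length binary strings."""
    return "".join(str(int(a) ^ int(b)) for a, b in zip(x, y))

def find_hidden_string(f):
    """Brute-force search for a string s satisfying Simon's promise,
    or None if the promise is not met."""
    inputs = sorted(f)
    n = len(inputs[0])
    for s in [format(v, f"0{n}b") for v in range(2**n)]:
        if all((f[x] == f[y]) == (x == y or xor_strings(x, s) == y)
               for x in inputs for y in inputs):
            return s
    return None

print(find_hidden_string(F))  # prints "011"
```

Replacing the four output strings with any other four distinct strings leaves the result unchanged, as described above.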


Algorithm description

Here's a quantum circuit diagram representing Simon's algorithm.

[Figure: quantum circuit diagram for Simon's algorithm]

To be clear, there are $n$ qubits on the top that are acted upon by Hadamard gates and $m$ qubits on the bottom that go directly into the query gate. It looks very similar to the algorithms we've already discussed in the lesson, but this time there's no phase kickback; the bottom $m$ qubits all go into the query gate in the state $\vert 0\rangle$.

To solve Simon's problem using this circuit will actually require several independent runs of it followed by a classical post-processing step, which will be described later after the behavior of the circuit is analyzed.
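One way to see the circuit in action without real hardware is to simulate the state directly. The following NumPy sketch (our own construction, using the example function from earlier, for which $s = 011$) prepares the state after the query gate, applies the second layer of Hadamard gates to the top $n$ qubits, and prints the resulting measurement probabilities for the top register.

```python
import numpy as np

# Example function from earlier, f : {0,1}^3 -> {0,1}^5, with s = 011.
n, m = 3, 5
f = {0b000: 0b10011, 0b001: 0b00101, 0b010: 0b00101, 0b011: 0b10011,
     0b100: 0b11010, 0b101: 0b00001, 0b110: 0b00001, 0b111: 0b11010}

# State after the query gate, as a 2^m x 2^n array of amplitudes:
# A[z, x] is the amplitude of |z>|x>, so A[f(x), x] = 1/sqrt(2^n).
A = np.zeros((2**m, 2**n))
for x, z in f.items():
    A[z, x] = 1 / np.sqrt(2**n)

# n-qubit Hadamard transform: Hn[x, y] = (-1)^(x.y) / sqrt(2^n).
H1 = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
Hn = H1
for _ in range(n - 1):
    Hn = np.kron(Hn, H1)

# Apply the second layer of Hadamard gates to the top n qubits only.
A = A @ Hn

# Probability of measuring y on the top register: sum over the bottom register.
p = (np.abs(A)**2).sum(axis=0)
for y in range(2**n):
    print(f"p({y:0{n}b}) = {p[y]:.3f}")
```

The output shows probability $1/4$ on exactly the four strings $y$ satisfying $y \cdot s = 0$, and $0$ elsewhere, in agreement with the analysis that follows.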


Analysis

The analysis of Simon's algorithm begins along similar lines to the Deutsch-Jozsa algorithm. After the first layer of Hadamard gates is performed on the top $n$ qubits, the state becomes

$$\frac{1}{\sqrt{2^n}} \sum_{x\in\Sigma^n} \vert 0^m \rangle \vert x\rangle.$$

When the query gate $U_f$ is performed, the output of the function $f$ is XORed onto the all-zero state of the bottom $m$ qubits, so the state becomes

$$\frac{1}{\sqrt{2^n}} \sum_{x\in\Sigma^n} \vert f(x) \rangle \vert x\rangle.$$

When the second layer of Hadamard gates is performed, we obtain the following state by using the same formula for the action of a layer of Hadamard gates as before.

$$\frac{1}{2^n} \sum_{x\in\Sigma^n} \sum_{y\in\Sigma^n} (-1)^{x\cdot y} \vert f(x) \rangle \vert y\rangle$$

At this point, the analysis diverges from the ones for the previous algorithms in this lesson.

We're interested in the probability for the measurements to result in each possible string $y\in\Sigma^n$. Through the rules for analyzing measurements described in the Multiple systems lesson of the Basics of quantum information course, we find that the probability $p(y)$ to obtain the string $y$ is equal to

$$p(y) = \left\|\frac{1}{2^n} \sum_{x\in\Sigma^n} (-1)^{x\cdot y} \vert f(x) \rangle \right\|^2.$$

To get a better handle on these probabilities, we'll need just a bit more notation and terminology. First, the range of the function $f$ is the set containing all of its output strings.

$$\operatorname{range}(f) = \{ f(x) : x\in \Sigma^n \}$$

Second, for each string $z\in\operatorname{range}(f)$, we can express the set of all input strings that cause the function to evaluate to this output string $z$ as $f^{-1}(\{z\})$.

$$f^{-1}(\{z\}) = \{ x\in\Sigma^n : f(x) = z \}$$

The set $f^{-1}(\{z\})$ is known as the preimage of $\{z\}$ under $f$. We can define the preimage under $f$ of any set in place of $\{z\}$ in an analogous way — it's the set of all elements that $f$ maps to that set. (This notation should not be confused with the inverse of the function $f$, which may not exist. The fact that the argument on the left-hand side is the set $\{z\}$ rather than the element $z$ is the clue that allows us to avoid this confusion.)

Using this notation, we can split up the sum in our expression for the probabilities above to obtain

$$p(y) = \left\| \frac{1}{2^n} \sum_{z\in\operatorname{range}(f)} \Biggl(\sum_{x\in f^{-1}(\{z\})} (-1)^{x\cdot y}\Biggr) \vert z \rangle \right\|^2.$$

Every string $x\in\Sigma^n$ is represented exactly once by the two summations — we're basically just putting these strings into separate buckets depending on which output string $z = f(x)$ they produce when we evaluate the function $f$, and then summing separately over all the buckets.

We can now evaluate the Euclidean norm squared to obtain

$$p(y) = \frac{1}{2^{2n}} \sum_{z\in\operatorname{range}(f)} \left\vert \sum_{x\in f^{-1}(\{z\})} (-1)^{x\cdot y} \right\vert^2.$$

To simplify these probabilities further, let's take a look at the value

$$\left\vert \sum_{x\in f^{-1}(\{z\})} (-1)^{x\cdot y} \right\vert^2 \tag{1}$$

for an arbitrary selection of $z\in\operatorname{range}(f)$.

If it happens to be the case that $s = 0^n$, then $f$ is a one-to-one function and there's always just a single element $x\in f^{-1}(\{z\})$, for every $z\in\operatorname{range}(f)$. The value of the expression $(1)$ is $1$ in this case.

If, on the other hand, $s\neq 0^n$, then there are exactly two strings in the set $f^{-1}(\{z\})$. To be precise, if we choose $w\in f^{-1}(\{z\})$ to be any one of these two strings, then the other string must be $w \oplus s$ by the promise in Simon's problem. Using this observation we can simplify $(1)$ as follows.

$$\begin{aligned} \left\vert \sum_{x\in f^{-1}(\{z\})} (-1)^{x\cdot y} \right\vert^2 & = \Bigl\vert (-1)^{w\cdot y} + (-1)^{(w\oplus s)\cdot y} \Bigr\vert^2 \\ & = \Bigl\vert (-1)^{w\cdot y} \Bigl(1 + (-1)^{s\cdot y}\Bigr) \Bigr\vert^2 \\ & = \Bigl\vert 1 + (-1)^{y\cdot s} \Bigr\vert^2 \\ & = \begin{cases} 4 & y \cdot s = 0\\ 0 & y \cdot s = 1 \end{cases} \end{aligned}$$

So, it turns out that the value $(1)$ is independent of the specific choice of $z\in\operatorname{range}(f)$ in both cases.

We can now finish off the analysis by looking at the same two cases as before separately.

  • Case 1: $s = 0^n$. In this case the function $f$ is one-to-one, so there are $2^n$ strings $z\in\operatorname{range}(f)$, and we obtain

    $$p(y) = \frac{1}{2^{2n}} \cdot 2^n = \frac{1}{2^n}.$$

    In words, the measurements result in a string $y\in\Sigma^n$ chosen uniformly at random.

  • Case 2: $s \neq 0^n$. In this case $f$ is two-to-one, so there are $2^{n-1}$ elements in $\operatorname{range}(f)$. Using the formula from above we conclude that the probability to measure each $y\in\Sigma^n$ is

    $$p(y) = \frac{1}{2^{2n}} \sum_{z\in\operatorname{range}(f)} \Biggl\vert \sum_{x\in f^{-1}(\{z\})} (-1)^{x\cdot y} \Biggr\vert^2 = \begin{cases} \frac{1}{2^{n-1}} & y \cdot s = 0\\ 0 & y \cdot s = 1 \end{cases}$$

    In words, we obtain a string chosen uniformly at random from the set $\{y\in\Sigma^n : y \cdot s = 0\}$, which contains $2^{n-1}$ strings. (Because $s\neq 0^n$, exactly half of the binary strings of length $n$ have binary dot product $1$ with $s$ and the other half have binary dot product $0$ with $s$, as we already observed in the analysis of the Deutsch-Jozsa algorithm for the Bernstein-Vazirani problem.)
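These two cases can be checked numerically. The sketch below (plain Python, reusing the example function with $s = 011$) evaluates the probability expression bucket by bucket and compares it against the closed-form answer.

```python
from collections import defaultdict

n = 3
s = 0b011
f = {0b000: 0b10011, 0b001: 0b00101, 0b010: 0b00101, 0b011: 0b10011,
     0b100: 0b11010, 0b101: 0b00001, 0b110: 0b00001, 0b111: 0b11010}

def dot(x, y):
    """Binary dot product of two n-bit integers."""
    return bin(x & y).count("1") % 2

# Group inputs into preimages f^{-1}({z}), one bucket per output string z.
preimages = defaultdict(list)
for x, z in f.items():
    preimages[z].append(x)

for y in range(2**n):
    # p(y) = (1/2^(2n)) * sum over z of |sum over preimage of (-1)^(x.y)|^2
    p = sum(sum((-1)**dot(x, y) for x in xs)**2
            for xs in preimages.values()) / 2**(2*n)
    closed_form = 0 if dot(y, s) == 1 else 1 / 2**(n - 1)
    assert p == closed_form
    print(f"p({y:03b}) = {p}")
```

Each bucket contributes $4$ when $y \cdot s = 0$ and $0$ when $y \cdot s = 1$, exactly as the derivation of $(1)$ predicts.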

Classical post-processing

We now know what the probabilities are for the possible measurement outcomes when we run the quantum circuit for Simon's algorithm. Is this enough information to determine $s$?

The answer is yes, provided that we're willing to repeat the process several times and accept that it could fail with some probability, which we can make very small by running the circuit enough times. The essential idea is that each execution of the circuit provides us with statistical evidence concerning $s$, and we can use that evidence to find $s$ with very high probability if we run the circuit sufficiently many times.

Let's suppose that we run the circuit independently $k$ times, for $k = n + 10$. There's nothing special about this particular number of iterations — we could take $k$ to be larger (or smaller) depending on the probability of failure we're willing to tolerate, as we will see. Choosing $k = n + 10$ will ensure that we have greater than a 99.9% chance of recovering $s$.

By running the circuit $k$ times, we obtain strings $y^1,\ldots,y^{k} \in \Sigma^n$. To be clear, the superscripts here are part of the names of these strings, not exponents or indexes to their bits, so we have

$$\begin{aligned} y^1 & = y^1_{n-1} \cdots y^1_{0}\\ y^2 & = y^2_{n-1} \cdots y^2_{0}\\ & \;\; \vdots\\ y^{k} & = y^{k}_{n-1} \cdots y^{k}_{0} \end{aligned}$$

We now form a matrix $M$ having $k$ rows and $n$ columns by taking the bits of these strings as binary-valued entries.

$$M = \begin{pmatrix} y^1_{n-1} & \cdots & y^1_{0}\\ y^2_{n-1} & \cdots & y^2_{0}\\ \vdots & \ddots & \vdots \\ y^{k}_{n-1} & \cdots & y^{k}_{0} \end{pmatrix}$$

Now, we don't know what $s$ is at this point — our goal is to find this string. But imagine for a moment that we do know the string $s$, and we form a column vector $v$ from the bits of the string $s = s_{n-1} \cdots s_0$ as follows.

$$v = \begin{pmatrix} s_{n-1}\\ \vdots\\ s_0 \end{pmatrix}$$

If we perform the matrix-vector multiplication $M v$ modulo $2$ — meaning that we perform the multiplication as usual and then take the remainder of the entries of the result after dividing by $2$ — we obtain the all-zero vector.

$$M v = \begin{pmatrix} y^1 \cdot s\\ y^2 \cdot s\\ \vdots\\ y^{k} \cdot s \end{pmatrix} = \begin{pmatrix} 0\\ 0\\ \vdots\\ 0 \end{pmatrix}$$

That is, treated as a column vector $v$ as just described, the string $s$ will always be an element of the null space of the matrix $M$, provided that we do the arithmetic modulo $2$. This is true in both the case that $s = 0^n$ and $s\neq 0^n$. To be more precise, the all-zero vector is always in the null space of $M$, and it's joined by the vector whose entries are the bits of $s$ in case $s\neq 0^n$.

The question remaining is whether there will be any other vectors in the null space of $M$ besides the ones corresponding to $0^n$ and $s$. The answer is that this becomes increasingly unlikely as $k$ increases — and if we choose $k = n + 10$, the null space of $M$ will contain no other vectors in addition to those corresponding to $0^n$ and $s$ with greater than a 99.9% chance. More generally, if we replace $k = n + 10$ with $k = n + r$ for an arbitrary choice of a positive integer $r$, the probability that the vectors corresponding to $0^n$ and $s$ are alone in the null space of $M$ is at least $1 - 2^{-r}$.
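Here is a brief sketch of where this bound comes from (a standard union-bound argument; the details aren't needed for the rest of the lesson). For any fixed string $t\notin\{0^n, s\}$, each run of the circuit yields an outcome $y$ with $y\cdot t = 0$ with probability exactly $1/2$, independently across runs, because $y\mapsto y\cdot t$ is balanced on the set of possible outcomes. There are fewer than $2^n$ such strings $t$, so

```latex
% Each fixed t outside {0^n, s} survives all k runs with probability 2^{-k};
% a union bound over the fewer than 2^n such strings t gives
\Pr\bigl[\exists\, t \notin \{0^n, s\} \text{ with } Mt = 0\bigr]
  \;\le\; \sum_{t \notin \{0^n, s\}} \Pr[Mt = 0]
  \;<\; 2^n \cdot 2^{-k}
  \;=\; 2^n \cdot 2^{-(n+r)}
  \;=\; 2^{-r}.
```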

Using linear algebra, it is possible to efficiently calculate a description of the null space of $M$ modulo $2$. Specifically, it can be done using Gaussian elimination, which works the same way when arithmetic is done modulo $2$ as it does with real or complex numbers. So long as the vectors corresponding to $0^n$ and $s$ are alone in the null space of $M$, which happens with high probability, we can deduce $s$ from the results of this computation.
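Here is a minimal sketch of that post-processing step in plain Python (the function name and matrix encoding are our own). Gaussian elimination modulo $2$ produces a basis for the null space of $M$; if that basis consists of a single nonzero vector, its entries are the bits of $s$.

```python
def null_space_mod2(M, n):
    """Return a basis (as lists of n bits) for the null space, modulo 2,
    of a matrix M given as a list of rows, each row a list of n bits."""
    rows = [r[:] for r in M]
    pivot_cols = []
    r = 0
    for c in range(n):
        # Find a row with a 1 in column c, at or below row r.
        pivot = next((i for i in range(r, len(rows)) if rows[i][c]), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # Eliminate column c from every other row (row XOR = addition mod 2).
        for i in range(len(rows)):
            if i != r and rows[i][c]:
                rows[i] = [a ^ b for a, b in zip(rows[i], rows[r])]
        pivot_cols.append(c)
        r += 1
    # One basis vector per free (non-pivot) column.
    basis = []
    for free in (c for c in range(n) if c not in pivot_cols):
        v = [0] * n
        v[free] = 1
        for i, c in enumerate(pivot_cols):
            v[c] = rows[i][free]
        basis.append(v)
    return basis

# Hypothetical measurement results for n = 3 and s = 011: each row is a
# string y with y . s = 0, written as bits (y_2, y_1, y_0) as in M above.
M = [[0, 1, 1], [1, 0, 0], [1, 1, 1]]
print(null_space_mod2(M, 3))  # prints [[0, 1, 1]], the bits of s
```

If the basis comes back empty, the null space is trivial and we conclude $s = 0^n$; if it contains two or more vectors, the runs were unlucky and we can simply collect more samples.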

Classical difficulty

How many queries does a classical query algorithm need to solve Simon's problem? The answer is: a lot, in general.

There are different precise statements that can be made about the classical difficulty of this problem, and here's just one of them. If we have any probabilistic query algorithm, and that algorithm makes fewer than $2^{n/2 - 1} - 1$ queries, which is a number of queries that's exponential in $n$, then that algorithm will fail to solve Simon's problem with probability at least $1/2$.

Sometimes, proving impossibility results like this can be very challenging, but this one isn't too difficult to prove through an elementary probabilistic analysis. Here, however, we'll only briefly examine the basic intuition behind it.

We're trying to find the hidden string $s$, but so long as we don't query the function on two strings having the same output value, we'll get very limited information about $s$. Intuitively speaking, all we'll learn is that the hidden string $s$ is not the exclusive-OR of any two distinct strings we've queried. And if we query fewer than $2^{n/2 - 1} - 1$ strings, then there will still be a lot of choices for $s$ that we haven't ruled out because there aren't enough pairs of strings to do this. This isn't a formal proof, it's just the basic idea.
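For a rough feel for this birthday-style behavior, here's a small simulation (our own construction, not part of the formal proof, taking $m = n$ for simplicity). It builds a random function satisfying the promise for a hidden string $s$, then queries uniformly random inputs until two distinct ones collide; the XOR of the colliding inputs is then guaranteed to equal $s$.

```python
import random

def random_promise_function(n, s, rng):
    """Build a random two-to-one function satisfying Simon's promise for a
    nonzero s: pair the inputs as {w, w XOR s}, one distinct output per pair."""
    f = {}
    outputs = rng.sample(range(2**n), 2**(n - 1))
    it = iter(outputs)
    for w in range(2**n):
        if w not in f:
            z = next(it)
            f[w] = z
            f[w ^ s] = z
    return f

def classical_collision_search(f, n, rng):
    """Query uniformly random inputs until two distinct ones share an output;
    return the recovered s and the number of queries used."""
    seen = {}  # output value -> an input that produced it
    queries = 0
    while True:
        x = rng.randrange(2**n)
        z = f[x]
        queries += 1
        if z in seen and seen[z] != x:
            return seen[z] ^ x, queries
        seen[z] = x

rng = random.Random(7)
n, s = 10, 0b1000110011
f = random_promise_function(n, s, rng)
recovered, queries = classical_collision_search(f, n, rng)
print(recovered == s, queries)  # True, with a query count on the order of 2^(n/2)
```

Repeating this for increasing $n$ shows the query count growing roughly like $2^{n/2}$, while Simon's algorithm needs only around $n$ circuit runs.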

So, in summary, Simon's algorithm provides us with a striking advantage of quantum over classical algorithms within the query model. In particular, Simon's algorithm solves Simon's problem with a number of queries that's linear in the number of input bits $n$ of our function, whereas any classical algorithm, even if it's probabilistic, needs to make a number of queries that's exponential in $n$ in order to solve Simon's problem with a reasonable probability of success.
