
Quantum state discrimination and tomography

In the last part of the lesson, we'll briefly consider two tasks associated with measurements: quantum state discrimination and quantum state tomography.

  1. Quantum state discrimination

    For quantum state discrimination, we have a known collection of quantum states $\rho_0,\ldots,\rho_{m-1}$, along with probabilities $p_0,\ldots,p_{m-1}$ associated with these states. A succinct way of expressing this is to say that we have an ensemble

    $$\{(p_0,\rho_0),\ldots,(p_{m-1},\rho_{m-1})\}$$

    of quantum states.

    A number $a\in\{0,\ldots,m-1\}$ is chosen randomly according to the probabilities $(p_0,\ldots,p_{m-1})$ and the system $\mathsf{X}$ is prepared in the state $\rho_a$. The goal is to determine, by means of a measurement of $\mathsf{X}$ alone, which value of $a$ was chosen.

    Thus, we have a finite number of alternatives, along with a prior (our knowledge of the probability for each $a$ to be selected), and the goal is to determine which alternative actually happened. This may be easy for some choices of states and probabilities, and for others it may not be possible without some chance of making an error.

  2. Quantum state tomography

    For quantum state tomography, we have an unknown quantum state of a system — so unlike in quantum state discrimination there's typically no prior or any information about possible alternatives.

    This time, however, it's not a single copy of the state that's made available; rather, many independent copies are provided. That is, $N$ identical systems $\mathsf{X}_1,\ldots,\mathsf{X}_N$ are each independently prepared in the state $\rho$ for some (possibly large) number $N$. The goal is to find an approximation of the unknown state, as a density matrix, by measuring the systems.


Discriminating between two states

The simplest case for quantum state discrimination is that there are two states, $\rho_0$ and $\rho_1$, that are to be discriminated.

Imagine a situation in which a bit $a$ is chosen randomly: $a = 0$ with probability $p$ and $a = 1$ with probability $1 - p$. A system $\mathsf{X}$ is prepared in the state $\rho_a$, meaning $\rho_0$ or $\rho_1$ depending on the value of $a$, and given to us. Our goal is to correctly guess the value of $a$ by means of a measurement on $\mathsf{X}$. To be precise, we shall aim to maximize the probability that our guess is correct.

An optimal measurement

An optimal way to solve this problem begins with a spectral decomposition of a weighted difference between $\rho_0$ and $\rho_1$, where the weights are the corresponding probabilities.

$$p \rho_0 - (1-p) \rho_1 = \sum_{k = 0}^{n-1} \lambda_k \vert \psi_k \rangle \langle \psi_k \vert$$

Notice that we have a minus sign rather than a plus sign in this expression: this is a weighted difference, not a weighted sum.

We can maximize the probability of a correct guess by selecting a projective measurement $\{\Pi_0,\Pi_1\}$ as follows. First let's partition the elements of $\{0,\ldots,n-1\}$ into two disjoint sets $S_0$ and $S_1$ depending upon whether the corresponding eigenvalue of the weighted difference is nonnegative or negative.

$$\begin{gathered} S_0 = \{k\in\{0,\ldots,n-1\} : \lambda_k \geq 0 \}\\[2mm] S_1 = \{k\in\{0,\ldots,n-1\} : \lambda_k < 0 \} \end{gathered}$$

We can then choose a projective measurement as follows.

$$\Pi_0 = \sum_{k \in S_0} \vert \psi_k \rangle \langle \psi_k \vert \quad\text{and}\quad \Pi_1 = \sum_{k \in S_1} \vert \psi_k \rangle \langle \psi_k \vert$$

(It doesn't actually matter in which set, $S_0$ or $S_1$, we include the values of $k$ for which $\lambda_k = 0$. Here we're choosing arbitrarily to include these values in $S_0$.)

This is an optimal measurement in the situation at hand that minimizes the probability of an incorrect determination of the selected state.
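The construction above can be sketched numerically. The following is a minimal illustration, assuming NumPy is available; the function name `helstrom_measurement` and the example states $\vert 0\rangle$ and $\vert +\rangle$ with equal priors are choices made here for illustration and do not come from the lesson.

```python
import numpy as np

def helstrom_measurement(rho0, rho1, p):
    """Build (Pi0, Pi1) from the spectral decomposition of p*rho0 - (1-p)*rho1."""
    diff = p * rho0 - (1 - p) * rho1          # Hermitian weighted difference
    eigvals, eigvecs = np.linalg.eigh(diff)   # columns of eigvecs are the |psi_k>
    n = diff.shape[0]
    pi0 = np.zeros((n, n), dtype=complex)
    pi1 = np.zeros((n, n), dtype=complex)
    for lam, psi in zip(eigvals, eigvecs.T):
        proj = np.outer(psi, psi.conj())      # |psi_k><psi_k|
        if lam >= 0:                          # k in S_0 (zero eigenvalues land here)
            pi0 += proj
        else:                                 # k in S_1
            pi1 += proj
    return pi0, pi1

# Illustrative example (an assumption of this sketch): |0> versus |+>, p = 1/2.
ket0 = np.array([1, 0], dtype=complex)
ketp = np.array([1, 1], dtype=complex) / np.sqrt(2)
rho0 = np.outer(ket0, ket0.conj())
rho1 = np.outer(ketp, ketp.conj())

pi0, pi1 = helstrom_measurement(rho0, rho1, 0.5)
success = 0.5 * np.trace(pi0 @ rho0).real + 0.5 * np.trace(pi1 @ rho1).real
print(success)
```

For this particular pair of states the printed success probability should equal $\cos^2(\pi/8)$, roughly 0.854. In floating-point arithmetic an eigenvalue that is exactly zero in theory may come out slightly negative, which only changes which set it is assigned to and not the success probability.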

Correctness probability

Now we will determine the probability of correctness for the measurement $\{\Pi_0,\Pi_1\}$.

To begin we don't really need to be concerned with the specific choice we've made for $\Pi_0$ and $\Pi_1$, though it may be helpful to keep it in mind. For any measurement $\{P_0,P_1\}$ (not necessarily projective) we can write the correctness probability as follows.

$$p \operatorname{Tr}(P_0 \rho_0) + (1 - p) \operatorname{Tr}(P_1 \rho_1)$$

Using the fact that $\{P_0,P_1\}$ is a measurement, so $P_1 = \mathbb{I} - P_0$, we can rewrite this expression as follows.

$$\begin{aligned} & p \operatorname{Tr}(P_0 \rho_0) + (1 - p) \operatorname{Tr}((\mathbb{I} - P_0) \rho_1)\\[1mm] & \quad = p \operatorname{Tr}(P_0 \rho_0) - (1 - p) \operatorname{Tr}(P_0 \rho_1) + (1-p) \operatorname{Tr}(\rho_1)\\[1mm] & \quad = \operatorname{Tr}\bigl( P_0 (p \rho_0 - (1-p)\rho_1) \bigr) + 1 - p \end{aligned}$$

On the other hand, we could have made the substitution $P_0 = \mathbb{I} - P_1$ instead. That wouldn't change the value but it does give us an alternative expression.

$$\begin{aligned} & p \operatorname{Tr}((\mathbb{I} - P_1) \rho_0) + (1 - p) \operatorname{Tr}(P_1 \rho_1)\\[1mm] & \quad = p \operatorname{Tr}(\rho_0) - p \operatorname{Tr}(P_1 \rho_0) + (1 - p) \operatorname{Tr}(P_1 \rho_1)\\[1mm] & \quad = p - \operatorname{Tr}\bigl( P_1 (p \rho_0 - (1-p)\rho_1) \bigr) \end{aligned}$$

The two expressions have the same value, so we can average them to give yet another expression for this value. (Averaging the two expressions is just a trick to simplify the resulting expression.)

$$\begin{aligned} & \frac{1}{2} \bigl(\operatorname{Tr}\bigl( P_0 (p \rho_0 - (1-p)\rho_1) \bigr) + 1-p\bigr) + \frac{1}{2} \bigl(p - \operatorname{Tr}\bigl( P_1 (p \rho_0 - (1-p)\rho_1) \bigr)\bigr)\\[1mm] & \quad = \frac{1}{2} \operatorname{Tr}\bigl( (P_0-P_1) (p \rho_0 - (1-p)\rho_1)\bigr) + \frac{1}{2} \end{aligned}$$

Now we can see why it makes sense to choose the projections $\Pi_0$ and $\Pi_1$ (as specified above) for $P_0$ and $P_1$, respectively: that's how we can make the trace in the final expression as large as possible. In particular,

$$(\Pi_0-\Pi_1) (p \rho_0 - (1-p)\rho_1) = \sum_{k = 0}^{n-1} \vert\lambda_k\vert \cdot \vert \psi_k \rangle \langle \psi_k \vert.$$

So, when we take the trace, we obtain the sum of the absolute values of the eigenvalues, which is equal to what's known as the trace norm of the weighted difference.

$$\operatorname{Tr}\bigl( (\Pi_0-\Pi_1) (p \rho_0 - (1-p)\rho_1)\bigr) = \sum_{k = 0}^{n-1} \vert\lambda_k\vert = \bigl\| p \rho_0 - (1-p)\rho_1 \bigr\|_1$$

Thus, the probability that the measurement $\{\Pi_0,\Pi_1\}$ leads to a correct discrimination of $\rho_0$ and $\rho_1$, given with probabilities $p$ and $1-p$, respectively, is as follows.

$$\frac{1}{2} + \frac{1}{2} \bigl\| p \rho_0 - (1-p)\rho_1 \bigr\|_1$$

The fact that this is the optimal probability for a correct discrimination of $\rho_0$ and $\rho_1$, given with probabilities $p$ and $1-p$, is commonly referred to as the Helstrom–Holevo theorem (or sometimes just Helstrom's theorem).
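As a quick numerical check of this formula, the sketch below evaluates the trace norm as the sum of absolute eigenvalues. It assumes NumPy; the function name `optimal_success_probability` and the two test cases are illustrative choices, not part of the lesson.

```python
import numpy as np

def optimal_success_probability(rho0, rho1, p):
    """Evaluate 1/2 + 1/2 * || p*rho0 - (1-p)*rho1 ||_1 for density matrices."""
    diff = p * rho0 - (1 - p) * rho1
    # For a Hermitian matrix the trace norm is the sum of |eigenvalues|.
    trace_norm = np.sum(np.abs(np.linalg.eigvalsh(diff)))
    return 0.5 + 0.5 * trace_norm

rho_a = np.array([[1, 0], [0, 0]], dtype=complex)   # |0><0|
rho_b = np.array([[0, 0], [0, 1]], dtype=complex)   # |1><1|

# Orthogonal states can be discriminated perfectly.
perfect = optimal_success_probability(rho_a, rho_b, 0.5)
# Identical states carry no information, so the best strategy is to guess
# the more likely alternative, which succeeds with probability max(p, 1-p).
blind = optimal_success_probability(rho_a, rho_a, 0.7)
print(perfect, blind)
```

The two cases bracket the general behavior: the formula gives 1 for orthogonal states and $\max(p, 1-p)$ when the states coincide.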


Discriminating three or more states

For quantum state discrimination when there are three or more states, there is no known closed-form solution for an optimal measurement, although it is possible to formulate the problem as a semidefinite program — which allows for efficient numerical approximations of optimal measurements with the help of a computer.
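The lesson does not write the semidefinite program out. One standard way to state it, given here as a sketch in the notation of this section, is to optimize the success probability over all measurements $\{P_0,\ldots,P_{m-1}\}$:

```latex
\begin{aligned}
\text{maximize} \quad & \sum_{a=0}^{m-1} p_a \operatorname{Tr}(P_a \rho_a)\\
\text{subject to} \quad & P_0 + \cdots + P_{m-1} = \mathbb{I},\\
& P_a \geq 0 \quad \text{for each } a\in\{0,\ldots,m-1\}.
\end{aligned}
```

The objective is linear in the operators $P_a$, and the constraints say exactly that $\{P_0,\ldots,P_{m-1}\}$ is a valid measurement.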

It is also possible to verify (or falsify) optimality of a given measurement in a state discrimination task through a condition known as the Holevo-Yuen-Kennedy-Lax condition. In particular, for the state discrimination task defined by the ensemble

$$\{(p_0,\rho_0),\ldots,(p_{m-1},\rho_{m-1})\},$$

the measurement $\{P_0,\ldots,P_{m-1}\}$ is optimal if and only if the matrix

$$Q_a = \sum_{b = 0}^{m-1} p_b \rho_b P_b - p_a \rho_a$$

is positive semidefinite for every $a\in\{0,\ldots,m-1\}$.

For example, consider the quantum state discrimination task in which one of the four tetrahedral states $\vert\phi_0\rangle,\ldots,\vert\phi_3\rangle$ is selected uniformly at random. The tetrahedral measurement $\{P_0,P_1,P_2,P_3\}$ succeeds with probability

$$\frac{1}{4} \operatorname{Tr}(P_0 \vert\phi_0\rangle\langle \phi_0 \vert) + \frac{1}{4} \operatorname{Tr}(P_1 \vert\phi_1\rangle\langle \phi_1 \vert) + \frac{1}{4} \operatorname{Tr}(P_2 \vert\phi_2\rangle\langle \phi_2 \vert) + \frac{1}{4} \operatorname{Tr}(P_3 \vert\phi_3\rangle\langle \phi_3 \vert) = \frac{1}{2}.$$

This is optimal by the Holevo-Yuen-Kennedy-Lax condition, as a calculation reveals that

$$Q_a = \frac{1}{4}(\mathbb{I} - \vert\phi_a\rangle\langle\phi_a\vert) \geq 0$$

for $a = 0,1,2,3$.
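The calculation for the tetrahedral example can be reproduced numerically. The following sketch assumes NumPy and uses the tetrahedral state vectors as they are defined in the tetrahedral tomography section of this lesson; the variable names are illustrative.

```python
import numpy as np

w = np.exp(2j * np.pi / 3)
# Tetrahedral states |phi_0>, ..., |phi_3>.
phis = [
    np.array([1, 0], dtype=complex),
    np.array([1 / np.sqrt(3), np.sqrt(2 / 3)], dtype=complex),
    np.array([1 / np.sqrt(3), np.sqrt(2 / 3) * w], dtype=complex),
    np.array([1 / np.sqrt(3), np.sqrt(2 / 3) * np.conj(w)], dtype=complex),
]
rhos = [np.outer(v, v.conj()) for v in phis]   # |phi_a><phi_a|
Ps = [r / 2 for r in rhos]                     # tetrahedral measurement
priors = [0.25] * 4                            # uniform selection

# Success probability: sum_a p_a Tr(P_a rho_a).
success = sum(p * np.trace(P @ r).real for p, P, r in zip(priors, Ps, rhos))

# Q_a = sum_b p_b rho_b P_b - p_a rho_a, then its smallest eigenvalue.
total = sum((p * r) @ P for p, r, P in zip(priors, rhos, Ps))
Qs = [total - p * r for p, r in zip(priors, rhos)]
min_eigs = [np.linalg.eigvalsh(Q).min() for Q in Qs]

print(success, min_eigs)
```

Each smallest eigenvalue should be zero up to floating-point rounding, which is consistent with $Q_a$ being positive semidefinite but not positive definite.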


Quantum state tomography

Finally, we'll briefly discuss the problem of quantum state tomography. For this problem, we're given a large number $N$ of independent copies of an unknown quantum state $\rho$, and the goal is to reconstruct an approximation $\tilde{\rho}$ of $\rho$. To be clear, this means that we wish to find a classical description of a density matrix $\tilde{\rho}$ that is as close as possible to $\rho$.

We can alternatively describe the set-up in the following way. An unknown density matrix $\rho$ is selected, and we're given access to $N$ quantum systems $\mathsf{X}_1,\ldots,\mathsf{X}_N$, each of which has been independently prepared in the state $\rho$. Thus, the state of the compound system $(\mathsf{X}_1,\ldots,\mathsf{X}_N)$ is

$$\rho^{\otimes N} = \rho \otimes \rho \otimes \cdots \otimes \rho \quad \text{($N$ times).}$$

The goal is to perform measurements on the systems $\mathsf{X}_1,\ldots,\mathsf{X}_N$ and, based on the outcomes of those measurements, to compute a density matrix $\tilde{\rho}$ that closely approximates $\rho$. This turns out to be a fascinating problem and there is ongoing research on it.

Different types of strategies for approaching the problem may be considered. For example, we can imagine a strategy where each of the systems $\mathsf{X}_1,\ldots,\mathsf{X}_N$ is measured separately, in turn, producing a sequence of measurement outcomes. Different specific choices for which measurements are performed can be made, including adaptive and non-adaptive selections. In other words, the choice of what measurement is performed on a particular system might or might not depend on the outcomes of prior measurements. Based on the sequence of measurement outcomes, a guess $\tilde{\rho}$ for the state $\rho$ is derived, and again there are different methodologies for doing this.

An alternative approach is to perform a single joint measurement of the entire collection, where we think about $(\mathsf{X}_1,\ldots,\mathsf{X}_N)$ as a single system and select a single measurement whose output is a guess $\tilde{\rho}$ for the state $\rho$. This can lead to an improved estimate over what is possible for separate measurements of the individual systems, although a joint measurement on all of the systems together is likely to be much more difficult to implement.

Qubit tomography using Pauli measurements

We'll now consider quantum state tomography in the simple case where $\rho$ is a qubit density matrix. We assume that we're given qubits $\mathsf{X}_1,\ldots,\mathsf{X}_N$ that are each independently in the state $\rho$, and our goal is to compute an approximation $\tilde{\rho}$ that is close to $\rho$.

Our strategy will be to divide the $N$ qubits $\mathsf{X}_1,\ldots,\mathsf{X}_N$ into three roughly equal-size collections, one for each of the three Pauli matrices $\sigma_x$, $\sigma_y$, and $\sigma_z$. Each qubit is then measured independently as follows.

  1. For each of the qubits in the collection associated with $\sigma_x$ we perform a $\sigma_x$ measurement. This means that the qubit is measured with respect to the basis $\{\vert + \rangle, \vert -\rangle\}$, which is an orthonormal basis of eigenvectors of $\sigma_x$, and the corresponding measurement outcomes are the eigenvalues associated with the two eigenvectors: $+1$ for the state $\vert + \rangle$ and $-1$ for the state $\vert -\rangle$. By averaging together the outcomes over all of the qubits in the collection associated with $\sigma_x$, we obtain an approximation of the expectation value

    $$\langle + \vert \rho \vert + \rangle - \langle - \vert \rho \vert - \rangle = \operatorname{Tr}(\sigma_x \rho).$$
  2. For each of the qubits in the collection associated with $\sigma_y$ we perform a $\sigma_y$ measurement. Such a measurement is similar to a $\sigma_x$ measurement, except that the measurement basis is $\{\vert\! +\!i \rangle, \vert\! -\!i \rangle\}$, the eigenvectors of $\sigma_y$. Averaging the outcomes over all of the qubits in the collection associated with $\sigma_y$, we obtain an approximation of the expectation value

    $$\langle +i \vert \rho \vert \!+\!i \rangle - \langle -i \vert \rho \vert \!-\!i \rangle = \operatorname{Tr}(\sigma_y \rho).$$
  3. For each of the qubits in the collection associated with $\sigma_z$ we perform a $\sigma_z$ measurement. This time the measurement basis is the standard basis $\{\vert 0\rangle, \vert 1 \rangle\}$, the eigenvectors of $\sigma_z$. Averaging the outcomes over all of the qubits in the collection associated with $\sigma_z$, we obtain an approximation of the expectation value

    $$\langle 0 \vert \rho \vert 0 \rangle - \langle 1 \vert \rho \vert 1 \rangle = \operatorname{Tr}(\sigma_z \rho).$$

Once we have obtained approximations

$$\alpha_x \approx \operatorname{Tr}(\sigma_x \rho),\; \alpha_y \approx \operatorname{Tr}(\sigma_y \rho),\; \alpha_z \approx \operatorname{Tr}(\sigma_z \rho)$$

by averaging the measurement outcomes for each collection, we can approximate $\rho$ as

$$\tilde{\rho} = \frac{\mathbb{I} + \alpha_x \sigma_x + \alpha_y \sigma_y + \alpha_z \sigma_z}{2} \approx \frac{\mathbb{I} + \operatorname{Tr}(\sigma_x \rho) \sigma_x + \operatorname{Tr}(\sigma_y \rho) \sigma_y + \operatorname{Tr}(\sigma_z \rho) \sigma_z}{2} = \rho.$$

In the limit as $N$ approaches infinity, this approximation converges in probability to the true density matrix $\rho$ by the law of large numbers, and well-known statistical bounds (such as Hoeffding's inequality) can be used to bound the probability that the approximation $\tilde{\rho}$ deviates from $\rho$ by varying amounts.
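The Pauli-measurement strategy can be simulated classically for a known test state. The sketch below assumes NumPy; the function name `pauli_tomography`, the test state, the seed, and the number of copies are all illustrative choices. Rather than simulating the measurement bases explicitly, it samples each outcome directly, using the fact that a $\pm 1$-valued measurement with expectation value $\operatorname{Tr}(\sigma\rho)$ gives $+1$ with probability $(1 + \operatorname{Tr}(\sigma\rho))/2$.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

def pauli_tomography(rho, n_total, rng):
    """Simulate Pauli measurements on copies of rho and return the estimate."""
    n_each = n_total // 3                     # leftover copies are simply unused
    alphas = []
    for sigma in (SX, SY, SZ):
        expectation = np.trace(sigma @ rho).real
        # Probability of outcome +1, clamped to guard against rounding error.
        prob_plus = min(max((1 + expectation) / 2, 0.0), 1.0)
        outcomes = rng.choice([1, -1], size=n_each, p=[prob_plus, 1 - prob_plus])
        alphas.append(outcomes.mean())        # sample average approximates Tr(sigma rho)
    ax, ay, az = alphas
    return (I2 + ax * SX + ay * SY + az * SZ) / 2

rng = np.random.default_rng(seed=7)
rho = np.array([[0.75, 0.25], [0.25, 0.25]], dtype=complex)  # hypothetical test state
rho_est = pauli_tomography(rho, n_total=30_000, rng=rng)
print(np.round(rho_est, 3))
```

With 10,000 copies per Pauli matrix, each sample average has a standard error of at most 0.01, so the entries of the estimate are typically within about 0.01 of those of the true state.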

An important thing to recognize, however, is that the matrix $\tilde{\rho}$ obtained in this way may fail to be a density matrix. In particular, although it will always have trace equal to $1$, it may fail to be positive semidefinite. There are different known strategies for "rounding" such an approximation $\tilde{\rho}$ to a density matrix, one of them being to compute a spectral decomposition, replace any negative eigenvalues with $0$, and then renormalize (by dividing the matrix we obtain by its trace).
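The rounding strategy just described can be sketched in a few lines. This assumes NumPy; the function name `round_to_density_matrix` and the example matrix are illustrative, and this is only one of the possible rounding strategies the text alludes to.

```python
import numpy as np

def round_to_density_matrix(m):
    """Clip negative eigenvalues of a Hermitian, trace-one matrix and renormalize."""
    m = (m + m.conj().T) / 2                      # enforce exact Hermiticity
    eigvals, eigvecs = np.linalg.eigh(m)          # spectral decomposition
    clipped = np.clip(eigvals, 0, None)           # replace negative eigenvalues with 0
    rounded = (eigvecs * clipped) @ eigvecs.conj().T
    return rounded / np.trace(rounded).real       # divide by the trace

# Hypothetical estimate (I + 0.9*sigma_x + 0.6*sigma_z)/2: its Bloch vector has
# length greater than 1, so one eigenvalue is negative.
estimate = np.array([[0.8, 0.45], [0.45, 0.2]], dtype=complex)
rounded = round_to_density_matrix(estimate)

print(np.linalg.eigvalsh(estimate))
print(np.linalg.eigvalsh(rounded))
```

A matrix that is already a density matrix passes through unchanged, since no eigenvalue is clipped and the trace is already one.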

Qubit tomography using the tetrahedral measurement

Another option for performing qubit tomography is to measure every qubit $\mathsf{X}_1,\ldots,\mathsf{X}_N$ using the tetrahedral measurement $\{P_0,P_1,P_2,P_3\}$ described earlier. That is,

$$P_0 = \frac{\vert \phi_0 \rangle \langle \phi_0 \vert}{2}, \quad P_1 = \frac{\vert \phi_1 \rangle \langle \phi_1 \vert}{2}, \quad P_2 = \frac{\vert \phi_2 \rangle \langle \phi_2 \vert}{2}, \quad P_3 = \frac{\vert \phi_3 \rangle \langle \phi_3 \vert}{2}$$

for

$$\begin{aligned} \vert \phi_0 \rangle & = \vert 0 \rangle\\ \vert \phi_1 \rangle & = \frac{1}{\sqrt{3}} \vert 0 \rangle + \sqrt{\frac{2}{3}} \vert 1 \rangle\\ \vert \phi_2 \rangle & = \frac{1}{\sqrt{3}} \vert 0 \rangle + \sqrt{\frac{2}{3}} e^{2\pi i/3} \vert 1 \rangle\\ \vert \phi_3 \rangle & = \frac{1}{\sqrt{3}} \vert 0 \rangle + \sqrt{\frac{2}{3}} e^{-2\pi i/3} \vert 1 \rangle. \end{aligned}$$

Each outcome is obtained some number of times, which we will denote as $n_a$ for each $a\in\{0,1,2,3\}$, so that $n_0 + n_1 + n_2 + n_3 = N$. The ratio of each of these numbers to $N$ provides an estimate of the probability associated with the corresponding outcome:

$$\frac{n_a}{N} \approx \operatorname{Tr}(P_a \rho).$$

Finally, we shall make use of the following remarkable formula:

$$\rho = \sum_{a=0}^3 \Bigl( 3 \operatorname{Tr}(P_a \rho) - \frac{1}{2}\Bigr) \vert \phi_a \rangle \langle \phi_a \vert.$$

To establish this formula, we can use the following equation for the absolute values squared of inner products of tetrahedral states, which can be checked through direct calculations.

$$\bigl\vert \langle \phi_a \vert \phi_b \rangle \bigr\vert^2 = \begin{cases} 1 & a=b\\ \frac{1}{3} & a\neq b. \end{cases}$$

The four matrices

$$\begin{aligned} \vert\phi_0\rangle \langle \phi_0 \vert & = \begin{pmatrix} 1 & 0\\[2mm] 0 & 0\end{pmatrix}\\[3mm] \vert\phi_1\rangle \langle \phi_1 \vert & = \begin{pmatrix} \frac{1}{3} & \frac{\sqrt{2}}{3}\\[2mm] \frac{\sqrt{2}}{3} & \frac{2}{3}\end{pmatrix}\\[3mm] \vert\phi_2\rangle \langle \phi_2 \vert & = \begin{pmatrix} \frac{1}{3} & \frac{\sqrt{2}}{3}e^{-2\pi i/3}\\[2mm] \frac{\sqrt{2}}{3}e^{2\pi i/3} & \frac{2}{3}\end{pmatrix}\\[3mm] \vert\phi_3\rangle \langle \phi_3 \vert & = \begin{pmatrix} \frac{1}{3} & \frac{\sqrt{2}}{3}e^{2\pi i/3}\\[2mm] \frac{\sqrt{2}}{3}e^{-2\pi i/3} & \frac{2}{3}\end{pmatrix} \end{aligned}$$

are linearly independent, and therefore span the space of all $2\times 2$ complex matrices. If we write the constant $\frac{1}{2}$ as $\operatorname{Tr}(\rho)/2$, which is the same thing for a density matrix, then both sides of the formula are linear in $\rho$, so it suffices to prove that the formula is true when $\rho = \vert\phi_b\rangle\langle\phi_b\vert$ for $b = 0,1,2,3$. In particular,

$$3 \operatorname{Tr}(P_a \vert\phi_b\rangle\langle\phi_b\vert) - \frac{1}{2} = \frac{3}{2} \vert \langle \phi_a \vert \phi_b \rangle \vert^2 - \frac{1}{2} = \begin{cases} 1 & a=b\\ 0 & a\neq b \end{cases}$$

and therefore

$$\sum_{a=0}^3 \biggl( 3 \operatorname{Tr}(P_a \vert\phi_b\rangle\langle\phi_b\vert) - \frac{\operatorname{Tr}(\vert\phi_b\rangle\langle\phi_b\vert)}{2}\biggr) \vert \phi_a \rangle \langle \phi_a \vert = \vert \phi_b\rangle\langle \phi_b \vert.$$
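Both the inner-product identity and the reconstruction formula can be checked numerically. The sketch below assumes NumPy; the randomly generated density matrix and the seed are illustrative choices.

```python
import numpy as np

w = np.exp(2j * np.pi / 3)
# Tetrahedral states |phi_0>, ..., |phi_3>.
phis = [
    np.array([1, 0], dtype=complex),
    np.array([1 / np.sqrt(3), np.sqrt(2 / 3)], dtype=complex),
    np.array([1 / np.sqrt(3), np.sqrt(2 / 3) * w], dtype=complex),
    np.array([1 / np.sqrt(3), np.sqrt(2 / 3) * np.conj(w)], dtype=complex),
]
projs = [np.outer(v, v.conj()) for v in phis]   # |phi_a><phi_a|

# Matrix of |<phi_a|phi_b>|^2: expect 1 on the diagonal and 1/3 elsewhere.
gram = np.array([[abs(np.vdot(a, b)) ** 2 for b in phis] for a in phis])

# A random qubit density matrix: g g^dagger is positive semidefinite,
# and dividing by its trace normalizes it.
rng = np.random.default_rng(seed=1)
g = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
rho = g @ g.conj().T
rho = rho / np.trace(rho).real

# Right-hand side of the formula, with P_a = |phi_a><phi_a| / 2.
rebuilt = sum(
    (3 * np.trace((proj / 2) @ rho).real - 0.5) * proj for proj in projs
)
print(np.allclose(rebuilt, rho))
```

A single random state does not prove the formula, but it is a useful sanity check alongside the argument given in the text.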

We arrive at an approximation of $\rho$:

$$\tilde{\rho} = \sum_{a=0}^3 \Bigl( \frac{3 n_a}{N} - \frac{1}{2}\Bigr) \vert \phi_a \rangle \langle \phi_a \vert.$$

This approximation will always be a Hermitian matrix having trace equal to one, but it may fail to be positive semidefinite. In this case, the approximation must be "rounded" to a density matrix, similar to the strategy involving Pauli measurements.
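The tetrahedral strategy can also be simulated for a known test state. The sketch below assumes NumPy; the function name `tetrahedral_tomography`, the test state, the seed, and the number of copies are illustrative. The outcome counts $n_0,\ldots,n_3$ are drawn from a multinomial distribution with probabilities $\operatorname{Tr}(P_a\rho)$.

```python
import numpy as np

w = np.exp(2j * np.pi / 3)
# Tetrahedral states |phi_0>, ..., |phi_3>.
phis = [
    np.array([1, 0], dtype=complex),
    np.array([1 / np.sqrt(3), np.sqrt(2 / 3)], dtype=complex),
    np.array([1 / np.sqrt(3), np.sqrt(2 / 3) * w], dtype=complex),
    np.array([1 / np.sqrt(3), np.sqrt(2 / 3) * np.conj(w)], dtype=complex),
]
projs = [np.outer(v, v.conj()) for v in phis]   # |phi_a><phi_a|

def tetrahedral_tomography(rho, n_copies, rng):
    """Simulate the tetrahedral measurement on n_copies of rho and estimate rho."""
    # Outcome probabilities Tr(P_a rho) with P_a = |phi_a><phi_a| / 2.
    probs = np.array([np.trace((proj / 2) @ rho).real for proj in projs])
    probs = np.clip(probs, 0, None)
    probs = probs / probs.sum()                  # guard against rounding error
    counts = rng.multinomial(n_copies, probs)    # n_0, n_1, n_2, n_3
    return sum(
        (3 * n_a / n_copies - 0.5) * proj for n_a, proj in zip(counts, projs)
    )

rng = np.random.default_rng(seed=11)
rho = np.array([[0.75, 0.25], [0.25, 0.25]], dtype=complex)  # hypothetical test state
rho_est = tetrahedral_tomography(rho, n_copies=30_000, rng=rng)
print(np.round(rho_est, 3))
```

The trace of the estimate is exactly one because the coefficients sum to $3 - 2 = 1$, regardless of the counts. Positive semidefiniteness is not guaranteed, so in general the result would still need to be rounded as described above.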
