Quantum state discrimination and tomography

In the last part of the lesson, we'll briefly consider two tasks associated with measurements: quantum state discrimination and quantum state tomography.

Quantum state discrimination

For quantum state discrimination, we have a known collection of quantum states $\rho_0,\ldots,\rho_{m-1},$ along with probabilities $p_0,\ldots,p_{m-1}$ associated with these states. A succinct way of expressing this is to say that we have an ensemble
$\{(p_0,\rho_0),\ldots,(p_{m-1},\rho_{m-1})\}$
of quantum states.

A number $a\in\{0,\ldots,m-1\}$ is chosen randomly according to the probabilities $(p_0,\ldots,p_{m-1})$ and the system $\mathsf{X}$ is prepared in the state $\rho_a.$ The goal is to determine, by means of a measurement of $\mathsf{X}$ alone, which value of $a$ was chosen.

Thus, we have a finite number of alternatives, along with a prior — which is our knowledge of the probability for each $a$ to be selected — and the goal is to determine which alternative actually happened. This may be easy for some choices of states and probabilities, and for others it may not be possible without some chance of making an error.
Quantum state tomography

For quantum state tomography, we have an unknown quantum state of a system — so unlike in quantum state discrimination there's typically no prior or any information about possible alternatives.

This time, however, it's not a single copy of the state that's made available, but rather many independent copies are made available. That is, $N$ identical systems $\mathsf{X}_1,\ldots,\mathsf{X}_N$ are each independently prepared in the state $\rho$ for some (possibly large) number $N.$ The goal is to find an approximation of the unknown state, as a density matrix, by measuring the systems.

Discriminating between two states

The simplest case for quantum state discrimination is that there are two states, $\rho_0$ and $\rho_1,$ that are to be discriminated.

Imagine a situation in which a bit $a$ is chosen randomly: $a = 0$ with probability $p$ and $a = 1$ with probability $1 - p.$ A system $\mathsf{X}$ is prepared in the state $\rho_a,$ meaning $\rho_0$ or $\rho_1$ depending on the value of $a,$ and given to us. Our goal is to correctly guess the value of $a$ by means of a measurement on $\mathsf{X}.$ To be precise, we shall aim to maximize the probability that our guess is correct.

An optimal measurement

An optimal way to solve this problem begins with a spectral decomposition of a weighted difference between $\rho_0$ and $\rho_1,$ where the weights are the corresponding probabilities.

p \rho_0 - (1-p) \rho_1 = \sum_{k = 0}^{n-1} \lambda_k \vert \psi_k \rangle \langle \psi_k \vert

Notice that we have a minus sign rather than a plus sign in this expression: this is a weighted difference not a weighted sum.

We can maximize the probability of a correct guess by selecting a projective measurement $\{\Pi_0,\Pi_1\}$ as follows. First let's partition the elements of $\{0,\ldots,n-1\}$ into two disjoint sets $S_0$ and $S_1$ depending upon whether the corresponding eigenvalue of the weighted difference is nonnegative or negative.

\begin{gathered} S_0 = \{k\in\{0,\ldots,n-1\} : \lambda_k \geq 0 \}\\[2mm] S_1 = \{k\in\{0,\ldots,n-1\} : \lambda_k < 0 \} \end{gathered}

We can then choose a projective measurement as follows.

\Pi_0 = \sum_{k \in S_0} \vert \psi_k \rangle \langle \psi_k \vert \quad\text{and}\quad \Pi_1 = \sum_{k \in S_1} \vert \psi_k \rangle \langle \psi_k \vert

(It doesn't actually matter in which set $S_0$ or $S_1$ we include the values of $k$ for which $\lambda_k = 0.$ Here we're choosing arbitrarily to include these values in $S_0.$ )

This is an optimal measurement in the situation at hand that minimizes the probability of an incorrect determination of the selected state.

Correctness probability

Now we will determine the probability of correctness for the measurement $\{\Pi_0,\Pi_1\}.$

To begin we don't really need to be concerned with the specific choice we've made for $\Pi_0$ and $\Pi_1,$ though it may be helpful to keep it in mind. For any measurement $\{P_0,P_1\}$ (not necessarily projective) we can write the correctness probability as follows.

p \operatorname{Tr}(P_0 \rho_0) + (1 - p) \operatorname{Tr}(P_1 \rho_1)

Using the fact that $\{P_0,P_1\}$ is a measurement, so $P_1 = \mathbb{I} - P_0,$ we can rewrite this expression as follows.

p \operatorname{Tr}(P_0 \rho_0) + (1 - p) \operatorname{Tr}((\mathbb{I} - P_0) \rho_1)\hspace*{3cm}\\[1mm] \begin{aligned} & = p \operatorname{Tr}(P_0 \rho_0) - (1 - p) \operatorname{Tr}(P_0 \rho_1) + (1-p) \operatorname{Tr}(\rho_1)\\[1mm] & = \operatorname{Tr}\bigl( P_0 (p \rho_0 - (1-p)\rho_1) \bigr) + 1 - p \end{aligned}

On the other hand, we could have made the substitution $P_0 = \mathbb{I} - P_1$ instead. That wouldn't change the value but it does give us an alternative expression.

p \operatorname{Tr}((\mathbb{I} - P_1) \rho_0) + (1 - p) \operatorname{Tr}(P_1 \rho_1)\hspace*{3cm}\\[1mm] \begin{aligned} & = p \operatorname{Tr}(\rho_0) - p \operatorname{Tr}(P_1 \rho_0) + (1 - p) \operatorname{Tr}(P_1 \rho_1)\\[1mm] & = p - \operatorname{Tr}\bigl( P_1 (p \rho_0 - (1-p)\rho_1) \bigr) \end{aligned}

The two expressions have the same value, so we can average them to give yet another expression for this value. (Averaging the two expressions is just a trick to simplify the resulting expression.)

\frac{1}{2} \bigl(\operatorname{Tr}\bigl( P_0 (p \rho_0 - (1-p)\rho_1) \bigr) + 1-p\bigr) + \frac{1}{2} \bigl(p - \operatorname{Tr}\bigl( P_1 (p \rho_0 - (1-p)\rho_1) \bigr)\bigr)\\ = \frac{1}{2} \operatorname{Tr}\bigl( (P_0-P_1) (p \rho_0 - (1-p)\rho_1)\bigr) + \frac{1}{2}

Now we can see why it makes sense to choose the projections $\Pi_0$ and $\Pi_1$ (as specified above) for $P_0$ and $P_1,$ respectively — because that's how we can make the trace in the final expression as large as possible. In particular,

(\Pi_0-\Pi_1) (p \rho_0 - (1-p)\rho_1) = \sum_{k = 0}^{n-1} \vert\lambda_k\vert \cdot \vert \psi_k \rangle \langle \psi_k \vert.

So, when we take the trace, we obtain the sum of the absolute values of the eigenvalues — which is equal to what's known as the trace norm of the weighted difference.

\operatorname{Tr}\bigl( (\Pi_0-\Pi_1) (p \rho_0 - (1-p)\rho_1)\bigr) = \sum_{k = 0}^{n-1} \vert\lambda_k\vert = \bigl\| p \rho_0 - (1-p)\rho_1 \bigr\|_1

Thus, the probability that the measurement $\{\Pi_0,\Pi_1\}$ leads to a correct discrimination of $\rho_0$ and $\rho_1,$ given with probabilities $p$ and $1-p,$ respectively, is as follows.

\frac{1}{2} + \frac{1}{2} \bigl\| p \rho_0 - (1-p)\rho_1 \bigr\|_1

The fact that this is the optimal probability for a correct discrimination of $\rho_0$ and $\rho_1,$ given with probabilities $p$ and $1-p,$ is commonly referred to as the Helstrom–Holevo theorem (or sometimes just Helstrom's theorem).

Discriminating three or more states

For quantum state discrimination when there are three or more states, there is no known closed-form solution for an optimal measurement, although it is possible to formulate the problem as a semidefinite program — which allows for efficient numerical approximations of optimal measurements with the help of a computer.

It is also possible to verify (or falsify) optimality of a given measurement in a state discrimination task through a condition known as the Holevo-Yuen-Kennedy-Lax condition. In particular, for the state discrimination task defined by the ensemble

\{(p_0,\rho_0),\ldots,(p_{m-1},\rho_{m-1})\},

the measurement $\{P_0,\ldots,P_{m-1}\}$ is optimal if and only if the matrix

Q_a = \sum_{b = 0}^{m-1} p_b \rho_b P_b - p_a \rho_a

is positive semidefinite for every $a\in\{0,\ldots,m-1\}.$

For example, consider the quantum state discrimination task in which one of the four tetrahedral states $\vert\phi_0\rangle,\ldots,\vert\phi_3\rangle$ is selected uniformly at random. The tetrahedral measurement $\{P_0,P_1,P_2,P_3\}$ succeeds with probability

\frac{1}{4} \operatorname{Tr}(P_0 \vert\phi_0\rangle\langle \phi_0 \vert) + \frac{1}{4} \operatorname{Tr}(P_1 \vert\phi_1\rangle\langle \phi_1 \vert) + \frac{1}{4} \operatorname{Tr}(P_2 \vert\phi_2\rangle\langle \phi_2 \vert) + \frac{1}{4} \operatorname{Tr}(P_3 \vert\phi_3\rangle\langle \phi_3 \vert) = \frac{1}{2}.

This is optimal by the Holevo-Yuen-Kennedy-Lax condition, as a calculation reveals that

Q_a = \frac{1}{4}(\mathbb{I} - \vert\phi_a\rangle\langle\phi_a\vert) \geq 0

for $a = 0,1,2,3.$

Quantum state tomography

Finally, we'll briefly discuss the problem of quantum state tomography. For this problem, we're given a large number $N$ of independent copies of an unknown quantum state $\rho,$ and the goal is to reconstruct an approximation $\tilde{\rho}$ of $\rho.$ To be clear, this means that we wish to find a classical description of a density matrix $\tilde{\rho}$ that is as close as possible to $\rho.$

We can alternatively describe the set-up in the following way. An unknown density matrix $\rho$ is selected, and we're given access to $N$ quantum systems $\mathsf{X}_1,\ldots,\mathsf{X}_N,$ each of which has been independently prepared in the state $\rho.$ Thus, the state of the compound system $(\mathsf{X}_1,\ldots,\mathsf{X}_N)$ is

\rho^{\otimes N} = \rho \otimes \rho \otimes \cdots \otimes \rho \quad \text{($N$ times)}

The goal is to perform measurements on the systems $\mathsf{X}_1,\ldots,\mathsf{X}_N$ and, based on the outcomes of those measurements, to compute a density matrix $\tilde{\rho}$ that closely approximates $\rho.$ This turns out to be a fascinating problem and there is ongoing research on it.

Different types of strategies for approaching the problem may be considered. For example, we can imagine a strategy where each of the systems $\mathsf{X}_1,\ldots,\mathsf{X}_N$ is measured separately, in turn, producing a sequence of measurement outcomes. Different specific choices for which measurements are performed can be made, including adaptive and non-adaptive selections. In other words, the choice of what measurement is performed on a particular system might or might not depend on the outcomes of prior measurements. Based on the sequence of measurement outcomes, a guess $\tilde{\rho}$ for the state $\rho$ is derived — and again there are different methodologies for doing this.

An alternative approach is to perform a single joint measurement of the entire collection, where we think about $(\mathsf{X}_1,\ldots,\mathsf{X}_N)$ as a single system and select a single measurement whose output is a guess $\tilde{\rho}$ for the state $\rho.$ This can lead to an improved estimate over what is possible for separate measurements of the individual systems, although a joint measurement on all of the systems together is likely to be much more difficult to implement.

Qubit tomography using Pauli measurements

We'll now consider quantum state tomography in the simple case where $\rho$ is a qubit density matrix. We assume that we're given qubits $\mathsf{X}_1,\ldots,\mathsf{X}_N$ that are each independently in the state $\rho,$ and our goal is to compute an approximation $\tilde{\rho}$ that is close to $\rho.$

Our strategy will be to divide the $N$ qubits $\mathsf{X}_1,\ldots,\mathsf{X}_N$ into three roughly equal-size collections, one for each of the three Pauli matrices $\sigma_x,$ $\sigma_y,$ and $\sigma_z.$ Each qubit is then measured independently as follows.

For each of the qubits in the collection associated with $\sigma_x$ we perform a $\sigma_x$ measurement. This means that the qubit is measured with respect to the basis $\{\vert + \rangle, \vert -\rangle\},$ which is an orthonormal basis of eigenvectors of $\sigma_x,$ and the corresponding measurement outcomes are the eigenvalues associated with the two eigenvectors: $+1$ for the state $\vert + \rangle$ and $-1$ for the state $\vert -\rangle.$ By averaging together the outcomes over all of the states in the collection associated with $\sigma_x,$ we obtain an approximation of the expectation value
$\langle + \vert \rho \vert + \rangle - \langle - \vert \rho \vert - \rangle = \operatorname{Tr}(\sigma_x \rho).$
For each of the qubits in the collection associated with $\sigma_y$ we perform a $\sigma_y$ measurement. Such a measurement is similar to a $\sigma_x$ measurement, except that the measurement basis is $\{\vert\! +\!i \rangle, \vert\! -\!i \rangle\},$ the eigenvectors of $\sigma_y.$ Averaging the outcomes over all of the states in the collection associated with $\sigma_y,$ we obtain an approximation of the expectation value
$\langle +i \vert \rho \vert \!+\!i \rangle - \langle -i \vert \rho \vert \!-\!i \rangle = \operatorname{Tr}(\sigma_y \rho).$
For each of the qubits in the collection associated with $\sigma_z$ we perform a $\sigma_z$ measurement. This time the measurement basis is the standard basis $\{\vert 0\rangle, \vert 1 \rangle\},$ the eigenvectors of $\sigma_z.$ Averaging the outcomes over all of the states in the collection associated with $\sigma_z,$ we obtain an approximation of the expectation value
$\langle 0 \vert \rho \vert 0 \rangle - \langle 1 \vert \rho \vert 1 \rangle = \operatorname{Tr}(\sigma_z \rho).$

Once we have obtained approximations

\alpha_x \approx \operatorname{Tr}(\sigma_x \rho),\; \alpha_y \approx \operatorname{Tr}(\sigma_y \rho),\; \alpha_z \approx \operatorname{Tr}(\sigma_z \rho)

by averaging the measurement outcomes for each collection, we can approximate $\rho$ as

\tilde{\rho} = \frac{\mathbb{I} + \alpha_x \sigma_x + \alpha_y \sigma_y + \alpha_z \sigma_z}{2} \approx \frac{\mathbb{I} + \operatorname{Tr}(\sigma_x \rho) \sigma_x + \operatorname{Tr}(\sigma_y \rho) \sigma_y + \operatorname{Tr}(\sigma_z \rho) \sigma_z}{2} = \rho.

In the limit as $N$ approaches infinity, this approximation converges in probability to the true density matrix $\rho$ by the law of large numbers, and well-known statistical bounds (such as Hoeffding's inequality) can be used to bound the probability that the approximation $\tilde{\rho}$ deviates from $\rho$ by varying amounts.

An important thing to recognize, however, is that the matrix $\tilde{\rho}$ obtained in this way may fail to be a density matrix. In particular, although it will always have trace equal to $1,$ it may fail to be positive semidefinite. There are different known strategies for "rounding" such an approximation $\tilde{\rho}$ to a density matrix, one of them being to compute a spectral decomposition, replace any negative eigenvalues with $0,$ and then renormalize (by dividing the matrix we obtain by its trace).

Qubit tomography using the tetrahedral measurement

Another option for performing qubit tomography is to measure every qubit $\mathsf{X}_1,\ldots,\mathsf{X}_N$ using the tetrahedral measurement $\{P_0,P_1,P_2,P_3\}$ described earlier. That is,

P_0 = \frac{\vert \phi_0 \rangle \langle \phi_0 \vert}{2}, \quad P_1 = \frac{\vert \phi_1 \rangle \langle \phi_1 \vert}{2}, \quad P_2 = \frac{\vert \phi_2 \rangle \langle \phi_2 \vert}{2}, \quad P_3 = \frac{\vert \phi_3 \rangle \langle \phi_3 \vert}{2}

for

\begin{aligned} \vert \phi_0 \rangle & = \vert 0 \rangle\\ \vert \phi_1 \rangle & = \frac{1}{\sqrt{3}} \vert 0 \rangle + \sqrt{\frac{2}{3}} \vert 1 \rangle\\ \vert \phi_2 \rangle & = \frac{1}{\sqrt{3}} \vert 0 \rangle + \sqrt{\frac{2}{3}} e^{2\pi i/3} \vert 1 \rangle\\ \vert \phi_3 \rangle & = \frac{1}{\sqrt{3}} \vert 0 \rangle + \sqrt{\frac{2}{3}} e^{-2\pi i/3} \vert 1 \rangle. \end{aligned}

Each outcome is obtained some number of times, which we will denote as $n_a$ for each $a\in\{0,1,2,3\},$ so that $n_0 + n_1 + n_2 + n_3 = N.$ The ratio of these numbers with $N$ provides an estimate of the probability associated with each possible outcome:

\frac{n_a}{N} \approx \operatorname{Tr}(P_a \rho).

Finally, we shall make use of the following remarkable formula:

\rho = \sum_{a=0}^3 \Bigl( 3 \operatorname{Tr}(P_a \rho) - \frac{1}{2}\Bigr) \vert \phi_a \rangle \langle \phi_a \vert.

To establish this formula, we can use the following equation for the absolute values squared of inner products of tetrahedral states, which can be checked through direct calculations.

\bigl\vert \langle \phi_a \vert \phi_b \rangle \bigr\vert^2 = \begin{cases} 1 & a=b\\ \frac{1}{3} & a\neq b. \end{cases}

The four matrices

\begin{aligned} \vert\phi_0\rangle \langle \phi_0 \vert & = \begin{pmatrix} 1 & 0\\[2mm] 0 & 0\end{pmatrix}\\[3mm] \vert\phi_1\rangle \langle \phi_1 \vert & = \begin{pmatrix} \frac{1}{3} & \frac{\sqrt{2}}{3}\\[2mm] \frac{\sqrt{2}}{3} & \frac{2}{3}\end{pmatrix}\\[3mm] \vert\phi_2\rangle \langle \phi_2 \vert & = \begin{pmatrix} \frac{1}{3} & \frac{\sqrt{2}}{3}e^{-2\pi i/3}\\[2mm] \frac{\sqrt{2}}{3}e^{2\pi i/3} & \frac{2}{3}\end{pmatrix}\\[3mm] \vert\phi_3\rangle \langle \phi_3 \vert & = \begin{pmatrix} \frac{1}{3} & \frac{\sqrt{2}}{3}e^{2\pi i/3}\\[2mm] \frac{\sqrt{2}}{3}e^{-2\pi i/3} & \frac{2}{3}\end{pmatrix} \end{aligned}

are linearly independent, so it suffices to prove that the formula is true when $\rho = \vert\phi_b\rangle\langle\phi_b\vert$ for $b = 0,1,2,3.$ In particular,

3 \operatorname{Tr}(P_a \vert\phi_b\rangle\langle\phi_b\vert) - \frac{1}{2} = \frac{3}{2} \vert \langle \phi_a \vert \phi_b \rangle \vert^2 - \frac{1}{2} = \begin{cases} 1 & a=b\\ 0 & a\neq b \end{cases}

and therefore

\sum_{a=0}^3 \biggl( 3 \operatorname{Tr}(P_a \vert\phi_b\rangle\langle\phi_b\vert) - \frac{\operatorname{Tr}(\vert\phi_b\rangle\langle\phi_b\vert)}{2}\biggr) \vert \phi_a \rangle \langle \phi_a \vert = \vert \phi_b\rangle\langle \phi_b \vert.

We arrive at an approximation of $\rho:$

\tilde{\rho} = \sum_{a=0}^3 \Bigl( \frac{3 n_a}{N} - \frac{1}{2}\Bigr) \vert \phi_a \rangle \langle \phi_a \vert.

This approximation will always be a Hermitian matrix having trace equal to one, but it may fail to be positive semidefinite. In this case, the approximation must be "rounded" to a density matrix, similar to the strategy involving Pauli measurements.

Was this page helpful?

Report a bug or request content on GitHub.