Naimark's theorem

Naimark's theorem is a fundamental fact concerning measurements. It states that every general measurement can be implemented in a simple way that's reminiscent of Stinespring representations of channels:

The system to be measured is first combined with an initialized workspace system, forming a compound system.
A unitary operation is then performed on the compound system.
Finally, the workspace system is measured with respect to a standard basis measurement, yielding the outcome of the original general measurement.

Theorem statement and proof

Let $\mathsf{X}$ be a system and let $\{P_0,\ldots,P_{m-1}\}$ be a collection of positive semidefinite matrices satisfying

P_0 + \cdots + P_{m-1} = \mathbb{I}_{\mathsf{X}},

which is to say that they describe a measurement of $\mathsf{X}.$ Also let $\mathsf{Y}$ be a system whose classical state set is $\{0,\ldots,m-1\},$ which is the set of possible outcomes of this measurement.

Naimark's theorem states that there exists a unitary operation $U$ on the compound system $(\mathsf{Y},\mathsf{X})$ so that the implementation suggested by the following figure yields measurement outcomes that agree with the given measurement $\{P_0,\ldots,P_{m-1}\},$ meaning that the probabilities for the different possible measurement outcomes are precisely in agreement.

An implementation of a general measurement as in Naimark's theorem

To be clear, the system $\mathsf{X}$ starts out in some arbitrary state $\rho$ while $\mathsf{Y}$ is initialized to the $\vert 0\rangle$ state. The unitary operation $U$ is applied to $(\mathsf{Y},\mathsf{X})$ and then the system $\mathsf{Y}$ is measured with a standard basis measurement, yielding some outcome $a\in\{0,\ldots,m-1\}.$

The system $\mathsf{X}$ is pictured as part of the output of the circuit, but for now we won't concern ourselves with the state of $\mathsf{X}$ after $U$ is performed, and can imagine that it is traced out. We'll be interested in the state of $\mathsf{X}$ after $U$ is performed later in the lesson, though.

An implementation of a measurement in this way is clearly reminiscent of a Stinespring representation of a channel, and the mathematical underpinnings are similar as well. The difference here is that the workspace system is measured rather than being traced out like in the case of a Stinespring representation.

The fact that every measurement can be implemented in this way is pretty simple to prove, but we're going to need a fact concerning positive semidefinite matrices first.

Fact. Suppose $P$ is an $n \times n$ positive semidefinite matrix. There exists a unique $n\times n$ positive semidefinite matrix $Q$ for which $Q^2 = P.$ This unique positive semidefinite matrix is called the square root of $P$ and is denoted $\sqrt{P}.$

One way to find the square root of a positive semidefinite matrix is to first compute a spectral decomposition.

P = \sum_{k=0}^{n-1} \lambda_k \vert \psi_k \rangle \langle \psi_k \vert

Because $P$ is positive semidefinite, its eigenvalues must be nonnegative real numbers, and by replacing them with their square roots we obtain an expression for the square root of $P.$

\sqrt{P} = \sum_{k=0}^{n-1} \sqrt{\lambda_k} \vert \psi_k \rangle \langle \psi_k \vert

With this concept in hand, we're ready to prove Naimark's theorem. Under the assumption that $\mathsf{X}$ has $n$ classical states, a unitary operation $U$ on the pair $(\mathsf{Y},\mathsf{X})$ can be represented by an $nm\times nm$ matrix, which we can view as an $m\times m$ block matrix whose blocks are $n\times n.$ The key to the proof is to take $U$ to be any unitary matrix that matches the following pattern.

U = \begin{pmatrix} \sqrt{P_0} & \fbox{?} & \cdots & \fbox{?} \\[1mm] \sqrt{P_1} & \fbox{?} & \cdots & \fbox{?} \\[1mm] \vdots & \vdots & \ddots & \vdots\\[1mm] \sqrt{P_{m-1}} & \fbox{?} & \cdots & \fbox{?} \end{pmatrix}

For it to be possible to fill in the blocks marked with a question mark so that $U$ is unitary, it's both necessary and sufficient that the first $n$ columns, which are formed by the blocks $\sqrt{P_0},\ldots,\sqrt{P_{m-1}},$ are orthonormal. We can then use the Gram-Schmidt orthogonalization process to fill in the remaining columns, just like we encountered in the previous lesson.

The first $n$ columns of $U$ can be expressed as vectors in the following way, where $c = 0,\ldots,n-1$ refers to the column number starting from $0.$

\vert\gamma_c\rangle = \sum_{a = 0}^{m-1} \vert a \rangle \otimes \sqrt{P_a} \vert c\rangle

We can compute the inner product between any two of them as follows.

\langle \gamma_c \vert \gamma_d \rangle = \sum_{a,b = 0}^{m-1} \langle a \vert b \rangle \cdot \langle c \vert \sqrt{P_a}\sqrt{P_b}\, \vert d\rangle = \langle c \vert \Biggl(\sum_{a = 0}^{m-1} P_a \Biggr) \vert d\rangle = \langle c \vert d\rangle

This shows that these columns are in fact orthonormal, so we can fill in the remaining columns of $U$ in a way that guarantees the entire matrix is unitary.

It remains to check that the measurement outcome probabilities for the simulation are consistent with the original measurement. For a given initial state $\rho$ of $\mathsf{X},$ the measurement described by the collection $\{P_0,\ldots,P_{m-1}\}$ results in each outcome $a\in\{0,\ldots,m-1\}$ with probability $\operatorname{Tr}(P_a \rho).$

To obtain the outcome probabilities for the simulation, let's first give the name $\sigma$ to the state of $(\mathsf{Y},\mathsf{X})$ after $U$ has been performed. This state can be expressed as follows.

\sigma = U \bigl(\vert 0\rangle \langle 0 \vert \otimes \rho\bigr) U^{\dagger} = \sum_{a,b=0}^{m-1} \vert a\rangle \langle b \vert \otimes \sqrt{P_a} \rho \sqrt{P_b}

Equivalently, in a block matrix form, we have the following equation.

\begin{aligned} \sigma & = \begin{pmatrix} \sqrt{P_0} & \fbox{?} & \cdots & \fbox{?} \\[1mm] \sqrt{P_1} & \fbox{?} & \cdots & \fbox{?} \\[1mm] \vdots & \vdots & \ddots & \vdots\\[1mm] \sqrt{P_{m-1}} & \fbox{?} & \cdots & \fbox{?} \end{pmatrix} \begin{pmatrix} \rho & 0 & \cdots & 0 \\[1mm] 0 & 0 & \cdots & 0 \\[1mm] \vdots & \vdots & \ddots & \vdots\\[1mm] 0 & 0 & \cdots & 0 \end{pmatrix} \begin{pmatrix} \sqrt{P_0} & \sqrt{P_1} & \cdots & \sqrt{P_{m-1}} \\[1mm] \fbox{?} & \fbox{?} & \cdots & \fbox{?} \\[1mm] \vdots & \vdots & \ddots & \vdots\\[1mm] \fbox{?} & \fbox{?} & \cdots & \fbox{?} \end{pmatrix}\\[5mm] & = \begin{pmatrix} \sqrt{P_0}\rho\sqrt{P_0} & \cdots & \sqrt{P_0}\rho\sqrt{P_{m-1}} \\[1mm] \vdots & \ddots & \vdots\\[1mm] \sqrt{P_{m-1}}\rho\sqrt{P_0} & \cdots & \sqrt{P_{m-1}}\rho\sqrt{P_{m-1}} \end{pmatrix} \end{aligned}

Notice that the entries of $U$ falling into the blocks marked with a question mark have no influence on the outcome by virtue of the fact that we're conjugating a matrix of the form $\vert 0 \rangle \langle 0 \vert \otimes \rho$ — so the question mark entries are always multiplied by zero entries of $\vert 0 \rangle \langle 0 \vert \otimes \rho$ when the matrix product is computed.

Now we can analyze what happens when a standard basis measurement is performed on $\mathsf{Y}.$ The probabilities of the possible outcomes are given by the diagonal entries of the reduced state $\sigma_{\mathsf{Y}}$ of $\mathsf{Y}.$

\sigma_{\mathsf{Y}} = \sum_{a,b=0}^{m-1} \operatorname{Tr}\Bigl(\sqrt{P_a} \rho \sqrt{P_b}\Bigr) \vert a\rangle \langle b \vert

In particular, using the cyclic property of the trace, we see that the probability to obtain a given outcome $a\in\{0,\ldots,m-1\}$ is as follows.

\langle a \vert \sigma_{\mathsf{Y}} \vert a \rangle = \operatorname{Tr}\Bigl(\sqrt{P_a} \rho \sqrt{P_a}\Bigr) = \operatorname{Tr}(P_a \rho)

This matches with the original measurement, establishing the correctness of the simulation.

Non-destructive measurements

So far in the lesson, we've concerned ourselves with destructive measurements, where the output consists of the classical measurement result alone and there is no specification of the post-measurement quantum state of the system that was measured.

Non-destructive measurements, on the other hand, do precisely this. Specifically, non-destructive measurements describe not only the classical measurement outcome probabilities, but also the state of the system that was measured conditioned on each possible measurement outcome. Note that the term non-destructive refers to the system being measured but not necessarily its state, which could change significantly as a result of the measurement.

In general, for a given destructive measurement, there will be multiple (in fact infinitely many) non-destructive measurements that are compatible with the given destructive measurement, meaning that the classical measurement outcome probabilities match precisely with the destructive measurement. So, there isn't a unique way to define the post-measurement quantum state of a system for a given measurement.

It is, in fact, possible to generalize non-destructive measurements even further, so that they produce a classical measurement outcome along with a quantum state output of a system that isn't necessarily the same as the input system.

The notion of a non-destructive measurement is an interesting and useful abstraction. It should, however, be recognized that non-destructive measurements can always be described as compositions of channels and destructive measurements — so there is a sense in which the notion of a destructive measurement is the more fundamental one.

From Naimark's theorem

Consider the simulation of a general measurement like we have in Naimark's theorem. A simple way to obtain a non-destructive measurement from this simulation is revealed by the figure from before, where the system $\mathsf{X}$ is not traced out, but is part of the output. This yields both a classical measurement outcome $a\in\{0,\ldots,m-1\}$ as well as a post-measurement quantum state of $\mathsf{X}.$

Let's describe these states in mathematical terms. We're assuming that the initial state of $\mathsf{X}$ is $\rho,$ so that after the initialized system $\mathsf{Y}$ is introduced and $U$ is performed, we have that $(\mathsf{Y},\mathsf{X})$ is in the state

\sigma = U \bigl(\vert 0\rangle \langle 0 \vert \otimes \rho\bigr) U^{\dagger} = \sum_{a,b=0}^{m-1} \vert a\rangle \langle b \vert \otimes \sqrt{P_a} \rho \sqrt{P_b}.

The probabilities for the different classical outcomes to appear are the same as before — they can't change as a result of us deciding to ignore or not ignore $\mathsf{X}.$ That is, we obtain each $a\in\{0,\ldots,m-1\}$ with probability $\operatorname{Tr}(P_a \rho).$

Conditioned upon having obtained a particular measurement outcome $a,$ the resulting state of $\mathsf{X}$ is given by this expression.

\frac{\sqrt{P_a} \rho \sqrt{P_a}}{\operatorname{Tr}(P_a \rho)}

One way to see this is to represent a standard basis measurement of $\mathsf{Y}$ by the completely dephasing channel $\Delta_m,$ where the channel output describes classical measurement outcomes as (diagonal) density matrices. An expression of the state we obtain is as follows.

\sum_{a,b=0}^{m-1} \Delta_m(\vert a\rangle \langle b \vert) \otimes \sqrt{P_a} \rho \sqrt{P_b} = \sum_{a=0}^{m-1} \vert a\rangle \langle a \vert \otimes \sqrt{P_a} \rho \sqrt{P_a}.

We can then write this state as a convex combination of product states,

\sum_{a=0}^{m-1} \operatorname{Tr}(P_a \rho)\, \vert a\rangle \langle a \vert \otimes \frac{\sqrt{P_a} \rho \sqrt{P_a}}{\operatorname{Tr}(P_a \rho)},

which is consistent with the expression we've obtained for the state of $\mathsf{X}$ conditioned on each possible measurement outcome.

From a Kraus representation

There are alternative selections for $U$ in the context of Naimark's theorem that produce the same measurement outcome probabilities but give entirely different output states of $\mathsf{X}.$

For instance, one option is to substitute $(\mathbb{I}_{\mathsf{Y}} \otimes V) U$ for $U,$ where $V$ is any unitary operation on $\mathsf{X}.$ The application of $V$ to $\mathsf{X}$ commutes with the measurement of $\mathsf{Y}$ so the classical outcome probabilities do not change, but now the state of $\mathsf{X}$ conditioned on the outcome $a$ becomes

\frac{V \sqrt{P_a} \rho \sqrt{P_a}V^{\dagger}}{\operatorname{Tr}(P_a \rho)}.

More generally, we could replace $U$ by the unitary matrix

\Biggl(\sum_{a=0}^{m-1} \vert a\rangle\langle a \vert \otimes V_a\Biggr) U

for any choice of unitary operations $V_0,\ldots,V_{m-1}$ on $\mathsf{X}.$ Again, the classical outcome probabilities are unchanged, but now the state of $\mathsf{X}$ conditioned on the outcome $a$ becomes

\frac{V_a \sqrt{P_a} \rho \sqrt{P_a}V_a^{\dagger}}{\operatorname{Tr}(P_a \rho)}.

An equivalent way to express this freedom is connected with Kraus representations. That is, we can describe an $m$ -outcome non-destructive measurement of a system having $n$ classical states by a selection of $n\times n$ Kraus matrices $A_0,\ldots,A_{m-1}$ satisfying the typical condition for Kraus matrices.

\sum_{a = 0}^{m-1} A_a^{\dagger} A_a = \mathbb{I}_{\mathsf{X}} \tag{1}

Assuming that the initial state of $\mathsf{X}$ is $\rho,$ the classical measurement outcome is $a$ with probability

\operatorname{Tr}\bigl(A_a \rho A_a^{\dagger}\bigr) = \operatorname{Tr}\bigl(A_a^{\dagger} A_a \rho \bigr)

and conditioned upon the outcome being $a$ the state of $\mathsf{X}$ becomes

\frac{A_a \rho A_a^{\dagger}}{\operatorname{Tr}(A_a^{\dagger}A_a \rho)}.

Note that this is equivalent to choosing the unitary operation $U$ in Naimark's theorem as follows.

U = \begin{pmatrix} A_{0} & \fbox{?} & \cdots & \fbox{?} \\[1mm] A_{1} & \fbox{?} & \cdots & \fbox{?} \\[1mm] \vdots & \vdots & \ddots & \vdots\\[1mm] A_{m-1} & \fbox{?} & \cdots & \fbox{?} \end{pmatrix}

In the previous lesson we observed that the columns formed by the blocks $A_0,\ldots,A_{m-1}$ are necessarily orthogonal, by virtue of the condition $(1).$

Generalizations

There are even more general ways to formulate non-destructive measurements than the ways we've discussed. The notion of a quantum instrument (which won't be described here) represents one way to do this.

Was this page helpful?

Report a bug or request content on GitHub.