Skip to main content
IBM Quantum Platform

Naimark's theorem

Naimark's theorem is a fundamental fact concerning measurements. It states that every general measurement can be implemented in a simple way that's reminiscent of Stinespring representations of channels:

  1. The system to be measured is first combined with an initialized workspace system, forming a compound system.
  2. A unitary operation is then performed on the compound system.
  3. Finally, the workspace system is measured with respect to a standard basis measurement, yielding the outcome of the original general measurement.

Theorem statement and proof

Let X\mathsf{X} be a system and let {P0,,Pm1}\{P_0,\ldots,P_{m-1}\} be a collection of positive semidefinite matrices satisfying

P0++Pm1=IX,P_0 + \cdots + P_{m-1} = \mathbb{I}_{\mathsf{X}},

which is to say that they describe a measurement of X.\mathsf{X}. Also let Y\mathsf{Y} be a system whose classical state set is {0,,m1},\{0,\ldots,m-1\}, which is the set of possible outcomes of this measurement.

Naimark's theorem states that there exists a unitary operation UU on the compound system (Y,X)(\mathsf{Y},\mathsf{X}) so that the implementation suggested by the following figure yields measurement outcomes that agree with the given measurement {P0,,Pm1},\{P_0,\ldots,P_{m-1}\}, meaning that the probabilities for the different possible measurement outcomes are precisely in agreement.

An implementation of a general measurement as in Naimark's theorem

To be clear, the system X\mathsf{X} starts out in some arbitrary state ρ\rho while Y\mathsf{Y} is initialized to the 0\vert 0\rangle state. The unitary operation UU is applied to (Y,X)(\mathsf{Y},\mathsf{X}) and then the system Y\mathsf{Y} is measured with a standard basis measurement, yielding some outcome a{0,,m1}.a\in\{0,\ldots,m-1\}.

The system X\mathsf{X} is pictured as part of the output of the circuit, but for now we won't concern ourselves with the state of X\mathsf{X} after UU is performed, and can imagine that it is traced out. We'll be interested in the state of X\mathsf{X} after UU is performed later in the lesson, though.

An implementation of a measurement in this way is clearly reminiscent of a Stinespring representation of a channel, and the mathematical underpinnings are similar as well. The difference here is that the workspace system is measured rather than being traced out like in the case of a Stinespring representation.

The fact that every measurement can be implemented in this way is pretty simple to prove, but we're going to need a fact concerning positive semidefinite matrices first.

Fact. Suppose PP is an n×nn \times n positive semidefinite matrix. There exists a unique n×nn\times n positive semidefinite matrix QQ for which Q2=P.Q^2 = P. This unique positive semidefinite matrix is called the square root of PP and is denoted P.\sqrt{P}.

One way to find the square root of a positive semidefinite matrix is to first compute a spectral decomposition.

P=k=0n1λkψkψkP = \sum_{k=0}^{n-1} \lambda_k \vert \psi_k \rangle \langle \psi_k \vert

Because PP is positive semidefinite, its eigenvalues must be nonnegative real numbers, and by replacing them with their square roots we obtain an expression for the square root of P.P.

P=k=0n1λkψkψk\sqrt{P} = \sum_{k=0}^{n-1} \sqrt{\lambda_k} \vert \psi_k \rangle \langle \psi_k \vert

With this concept in hand, we're ready to prove Naimark's theorem. Under the assumption that X\mathsf{X} has nn classical states, a unitary operation UU on the pair (Y,X)(\mathsf{Y},\mathsf{X}) can be represented by an nm×nmnm\times nm matrix, which we can view as an m×mm\times m block matrix whose blocks are n×n.n\times n. The key to the proof is to take UU to be any unitary matrix that matches the following pattern.

U=(P0??P1??Pm1??)U = \begin{pmatrix} \sqrt{P_0} & \fbox{?} & \cdots & \fbox{?} \\[1mm] \sqrt{P_1} & \fbox{?} & \cdots & \fbox{?} \\[1mm] \vdots & \vdots & \ddots & \vdots\\[1mm] \sqrt{P_{m-1}} & \fbox{?} & \cdots & \fbox{?} \end{pmatrix}

For it to be possible to fill in the blocks marked with a question mark so that UU is unitary, it's both necessary and sufficient that the first nn columns, which are formed by the blocks P0,,Pm1,\sqrt{P_0},\ldots,\sqrt{P_{m-1}}, are orthonormal. We can then use the Gram-Schmidt orthogonalization process to fill in the remaining columns, just like we encountered in the previous lesson.

The first nn columns of UU can be expressed as vectors in the following way, where c=0,,n1c = 0,\ldots,n-1 refers to the column number starting from 0.0.

γc=a=0m1aPac\vert\gamma_c\rangle = \sum_{a = 0}^{m-1} \vert a \rangle \otimes \sqrt{P_a} \vert c\rangle

We can compute the inner product between any two of them as follows.

γcγd=a,b=0m1abcPaPbd=c(a=0m1Pa)d=cd\langle \gamma_c \vert \gamma_d \rangle = \sum_{a,b = 0}^{m-1} \langle a \vert b \rangle \cdot \langle c \vert \sqrt{P_a}\sqrt{P_b}\, \vert d\rangle = \langle c \vert \Biggl(\sum_{a = 0}^{m-1} P_a \Biggr) \vert d\rangle = \langle c \vert d\rangle

This shows that these columns are in fact orthonormal, so we can fill in the remaining columns of UU in a way that guarantees the entire matrix is unitary.

It remains to check that the measurement outcome probabilities for the simulation are consistent with the original measurement. For a given initial state ρ\rho of X,\mathsf{X}, the measurement described by the collection {P0,,Pm1}\{P_0,\ldots,P_{m-1}\} results in each outcome a{0,,m1}a\in\{0,\ldots,m-1\} with probability Tr(Paρ).\operatorname{Tr}(P_a \rho).

To obtain the outcome probabilities for the simulation, let's first give the name σ\sigma to the state of (Y,X)(\mathsf{Y},\mathsf{X}) after UU has been performed. This state can be expressed as follows.

σ=U(00ρ)U=a,b=0m1abPaρPb\sigma = U \bigl(\vert 0\rangle \langle 0 \vert \otimes \rho\bigr) U^{\dagger} = \sum_{a,b=0}^{m-1} \vert a\rangle \langle b \vert \otimes \sqrt{P_a} \rho \sqrt{P_b}

Equivalently, in a block matrix form, we have the following equation.

σ=(P0??P1??Pm1??)(ρ00000000)(P0P1Pm1??????)=(P0ρP0P0ρPm1Pm1ρP0Pm1ρPm1)\begin{aligned} \sigma & = \begin{pmatrix} \sqrt{P_0} & \fbox{?} & \cdots & \fbox{?} \\[1mm] \sqrt{P_1} & \fbox{?} & \cdots & \fbox{?} \\[1mm] \vdots & \vdots & \ddots & \vdots\\[1mm] \sqrt{P_{m-1}} & \fbox{?} & \cdots & \fbox{?} \end{pmatrix} \begin{pmatrix} \rho & 0 & \cdots & 0 \\[1mm] 0 & 0 & \cdots & 0 \\[1mm] \vdots & \vdots & \ddots & \vdots\\[1mm] 0 & 0 & \cdots & 0 \end{pmatrix} \begin{pmatrix} \sqrt{P_0} & \sqrt{P_1} & \cdots & \sqrt{P_{m-1}} \\[1mm] \fbox{?} & \fbox{?} & \cdots & \fbox{?} \\[1mm] \vdots & \vdots & \ddots & \vdots\\[1mm] \fbox{?} & \fbox{?} & \cdots & \fbox{?} \end{pmatrix}\\[5mm] & = \begin{pmatrix} \sqrt{P_0}\rho\sqrt{P_0} & \cdots & \sqrt{P_0}\rho\sqrt{P_{m-1}} \\[1mm] \vdots & \ddots & \vdots\\[1mm] \sqrt{P_{m-1}}\rho\sqrt{P_0} & \cdots & \sqrt{P_{m-1}}\rho\sqrt{P_{m-1}} \end{pmatrix} \end{aligned}

Notice that the entries of UU falling into the blocks marked with a question mark have no influence on the outcome by virtue of the fact that we're conjugating a matrix of the form 00ρ\vert 0 \rangle \langle 0 \vert \otimes \rho — so the question mark entries are always multiplied by zero entries of 00ρ\vert 0 \rangle \langle 0 \vert \otimes \rho when the matrix product is computed.

Now we can analyze what happens when a standard basis measurement is performed on Y.\mathsf{Y}. The probabilities of the possible outcomes are given by the diagonal entries of the reduced state σY\sigma_{\mathsf{Y}} of Y.\mathsf{Y}.

σY=a,b=0m1Tr(PaρPb)ab\sigma_{\mathsf{Y}} = \sum_{a,b=0}^{m-1} \operatorname{Tr}\Bigl(\sqrt{P_a} \rho \sqrt{P_b}\Bigr) \vert a\rangle \langle b \vert

In particular, using the cyclic property of the trace, we see that the probability to obtain a given outcome a{0,,m1}a\in\{0,\ldots,m-1\} is as follows.

aσYa=Tr(PaρPa)=Tr(Paρ)\langle a \vert \sigma_{\mathsf{Y}} \vert a \rangle = \operatorname{Tr}\Bigl(\sqrt{P_a} \rho \sqrt{P_a}\Bigr) = \operatorname{Tr}(P_a \rho)

This matches with the original measurement, establishing the correctness of the simulation.


Non-destructive measurements

So far in the lesson, we've concerned ourselves with destructive measurements, where the output consists of the classical measurement result alone and there is no specification of the post-measurement quantum state of the system that was measured.

Non-destructive measurements, on the other hand, do precisely this. Specifically, non-destructive measurements describe not only the classical measurement outcome probabilities, but also the state of the system that was measured conditioned on each possible measurement outcome. Note that the term non-destructive refers to the system being measured but not necessarily its state, which could change significantly as a result of the measurement.

In general, for a given destructive measurement, there will be multiple (in fact infinitely many) non-destructive measurements that are compatible with the given destructive measurement, meaning that the classical measurement outcome probabilities match precisely with the destructive measurement. So, there isn't a unique way to define the post-measurement quantum state of a system for a given measurement.

It is, in fact, possible to generalize non-destructive measurements even further, so that they produce a classical measurement outcome along with a quantum state output of a system that isn't necessarily the same as the input system.

The notion of a non-destructive measurement is an interesting and useful abstraction. It should, however, be recognized that non-destructive measurements can always be described as compositions of channels and destructive measurements — so there is a sense in which the notion of a destructive measurement is the more fundamental one.

From Naimark's theorem

Consider the simulation of a general measurement like we have in Naimark's theorem. A simple way to obtain a non-destructive measurement from this simulation is revealed by the figure from before, where the system X\mathsf{X} is not traced out, but is part of the output. This yields both a classical measurement outcome a{0,,m1}a\in\{0,\ldots,m-1\} as well as a post-measurement quantum state of X.\mathsf{X}.

Let's describe these states in mathematical terms. We're assuming that the initial state of X\mathsf{X} is ρ,\rho, so that after the initialized system Y\mathsf{Y} is introduced and UU is performed, we have that (Y,X)(\mathsf{Y},\mathsf{X}) is in the state

σ=U(00ρ)U=a,b=0m1abPaρPb.\sigma = U \bigl(\vert 0\rangle \langle 0 \vert \otimes \rho\bigr) U^{\dagger} = \sum_{a,b=0}^{m-1} \vert a\rangle \langle b \vert \otimes \sqrt{P_a} \rho \sqrt{P_b}.

The probabilities for the different classical outcomes to appear are the same as before — they can't change as a result of us deciding to ignore or not ignore X.\mathsf{X}. That is, we obtain each a{0,,m1}a\in\{0,\ldots,m-1\} with probability Tr(Paρ).\operatorname{Tr}(P_a \rho).

Conditioned upon having obtained a particular measurement outcome a,a, the resulting state of X\mathsf{X} is given by this expression.

PaρPaTr(Paρ)\frac{\sqrt{P_a} \rho \sqrt{P_a}}{\operatorname{Tr}(P_a \rho)}

One way to see this is to represent a standard basis measurement of Y\mathsf{Y} by the completely dephasing channel Δm,\Delta_m, where the channel output describes classical measurement outcomes as (diagonal) density matrices. An expression of the state we obtain is as follows.

a,b=0m1Δm(ab)PaρPb=a=0m1aaPaρPa.\sum_{a,b=0}^{m-1} \Delta_m(\vert a\rangle \langle b \vert) \otimes \sqrt{P_a} \rho \sqrt{P_b} = \sum_{a=0}^{m-1} \vert a\rangle \langle a \vert \otimes \sqrt{P_a} \rho \sqrt{P_a}.

We can then write this state as a convex combination of product states,

a=0m1Tr(Paρ)aaPaρPaTr(Paρ),\sum_{a=0}^{m-1} \operatorname{Tr}(P_a \rho)\, \vert a\rangle \langle a \vert \otimes \frac{\sqrt{P_a} \rho \sqrt{P_a}}{\operatorname{Tr}(P_a \rho)},

which is consistent with the expression we've obtained for the state of X\mathsf{X} conditioned on each possible measurement outcome.

From a Kraus representation

There are alternative selections for UU in the context of Naimark's theorem that produce the same measurement outcome probabilities but give entirely different output states of X.\mathsf{X}.

For instance, one option is to substitute (IYV)U(\mathbb{I}_{\mathsf{Y}} \otimes V) U for U,U, where VV is any unitary operation on X.\mathsf{X}. The application of VV to X\mathsf{X} commutes with the measurement of Y\mathsf{Y} so the classical outcome probabilities do not change, but now the state of X\mathsf{X} conditioned on the outcome aa becomes

VPaρPaVTr(Paρ).\frac{V \sqrt{P_a} \rho \sqrt{P_a}V^{\dagger}}{\operatorname{Tr}(P_a \rho)}.

More generally, we could replace UU by the unitary matrix

(a=0m1aaVa)U\Biggl(\sum_{a=0}^{m-1} \vert a\rangle\langle a \vert \otimes V_a\Biggr) U

for any choice of unitary operations V0,,Vm1V_0,\ldots,V_{m-1} on X.\mathsf{X}. Again, the classical outcome probabilities are unchanged, but now the state of X\mathsf{X} conditioned on the outcome aa becomes

VaPaρPaVaTr(Paρ).\frac{V_a \sqrt{P_a} \rho \sqrt{P_a}V_a^{\dagger}}{\operatorname{Tr}(P_a \rho)}.

An equivalent way to express this freedom is connected with Kraus representations. That is, we can describe an mm-outcome non-destructive measurement of a system having nn classical states by a selection of n×nn\times n Kraus matrices A0,,Am1A_0,\ldots,A_{m-1} satisfying the typical condition for Kraus matrices.

a=0m1AaAa=IX(1)\sum_{a = 0}^{m-1} A_a^{\dagger} A_a = \mathbb{I}_{\mathsf{X}} \tag{1}

Assuming that the initial state of X\mathsf{X} is ρ,\rho, the classical measurement outcome is aa with probability

Tr(AaρAa)=Tr(AaAaρ)\operatorname{Tr}\bigl(A_a \rho A_a^{\dagger}\bigr) = \operatorname{Tr}\bigl(A_a^{\dagger} A_a \rho \bigr)

and conditioned upon the outcome being aa the state of X\mathsf{X} becomes

AaρAaTr(AaAaρ).\frac{A_a \rho A_a^{\dagger}}{\operatorname{Tr}(A_a^{\dagger}A_a \rho)}.

Note that this is equivalent to choosing the unitary operation UU in Naimark's theorem as follows.

U=(A0??A1??Am1??)U = \begin{pmatrix} A_{0} & \fbox{?} & \cdots & \fbox{?} \\[1mm] A_{1} & \fbox{?} & \cdots & \fbox{?} \\[1mm] \vdots & \vdots & \ddots & \vdots\\[1mm] A_{m-1} & \fbox{?} & \cdots & \fbox{?} \end{pmatrix}

In the previous lesson we observed that the columns formed by the blocks A0,,Am1A_0,\ldots,A_{m-1} are necessarily orthogonal, by virtue of the condition (1).(1).

Generalizations

There are even more general ways to formulate non-destructive measurements than the ways we've discussed. The notion of a quantum instrument (which won't be described here) represents one way to do this.

Was this page helpful?
Report a bug or request content on GitHub.