
Mathematical formulations of measurements

The lesson begins with two equivalent mathematical descriptions of measurements:

  1. General measurements can be described by collections of matrices, one for each measurement outcome, in a way that generalizes the description of projective measurements.
  2. General measurements can be described as channels whose outputs are always classical states (represented by diagonal density matrices).

We'll restrict our attention to measurements having finitely many possible outcomes. Although it is possible to define measurements with infinitely many possible outcomes, they're encountered much less often in the context of computation and information processing, and they also require some additional mathematics (namely measure theory) to be properly formalized.

Our initial focus will be on so-called destructive measurements, where the output of the measurement is a classical measurement outcome alone — with no specification of the post-measurement quantum state of whatever system was measured. Intuitively speaking, we can imagine that such a measurement destroys the quantum system itself, or that the system is immediately discarded once the measurement is made. Later in the lesson we'll broaden our view and consider non-destructive measurements, where there's both a classical measurement outcome and a post-measurement quantum state of the measured system.


Measurements as collections of matrices

Suppose $\mathsf{X}$ is a system that is to be measured, and assume for simplicity that the classical state set of $\mathsf{X}$ is $\{0,\ldots, n-1\}$ for some positive integer $n$, so that density matrices representing quantum states of $\mathsf{X}$ are $n\times n$ matrices. We won't actually have much need to refer to the classical states of $\mathsf{X}$, but it will be convenient to refer to $n$, the number of classical states of $\mathsf{X}$. We'll also assume that the possible outcomes of the measurement are the integers $0,\ldots,m-1$ for some positive integer $m$.

Note that we're just using these names to keep things simple; it's straightforward to generalize everything that follows to other finite sets of classical states and measurement outcomes, renaming them as desired.

Projective measurements

Recall that a projective measurement is described by a collection of projection matrices that sum to the identity matrix. In symbols,

$$
\{\Pi_0,\ldots,\Pi_{m-1}\}
$$

describes a projective measurement of $\mathsf{X}$ if each $\Pi_a$ is an $n\times n$ projection matrix and the following condition is met.

$$
\Pi_0 + \cdots + \Pi_{m-1} = \mathbb{I}_{\mathsf{X}}
$$

When such a measurement is performed on a system $\mathsf{X}$ while it's in a state described by some quantum state vector $\vert\psi\rangle$, each outcome $a$ is obtained with probability equal to $\|\Pi_a\vert\psi\rangle\|^2$. We also have that the post-measurement state of $\mathsf{X}$ is obtained by normalizing the vector $\Pi_a\vert\psi\rangle$, but we're ignoring the post-measurement state for now.

If the state of $\mathsf{X}$ is described by a density matrix $\rho$ rather than a quantum state vector $\vert\psi\rangle$, then we can alternatively express the probability to obtain the outcome $a$ as $\operatorname{Tr}(\Pi_a \rho)$.

If $\rho = \vert \psi\rangle\langle\psi\vert$ is a pure state, then the two expressions are equal:

$$
\operatorname{Tr}(\Pi_a \rho) = \operatorname{Tr}(\Pi_a \vert \psi\rangle\langle\psi \vert) = \langle \psi \vert \Pi_a \vert \psi \rangle = \langle \psi \vert \Pi_a \Pi_a \vert \psi \rangle = \|\Pi_a\vert\psi\rangle\|^2.
$$

Here we're using the cyclic property of the trace for the second equality, and for the third equality we're using the fact that each $\Pi_a$ is a projection matrix, and therefore satisfies $\Pi_a^2 = \Pi_a$.

In general, if $\rho$ is a convex combination

$$
\rho = \sum_{k = 0}^{N-1} p_k \vert \psi_k\rangle\langle \psi_k \vert
$$

of pure states, then the expression $\operatorname{Tr}(\Pi_a \rho)$ coincides with the average probability for the outcome $a$, owing to the fact that this expression is linear in $\rho$.

$$
\operatorname{Tr}(\Pi_a \rho) = \sum_{k = 0}^{N-1} p_k \operatorname{Tr}(\Pi_a \vert \psi_k\rangle\langle\psi_k\vert) = \sum_{k = 0}^{N-1} p_k \|\Pi_a\vert\psi_k\rangle\|^2
$$
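The agreement between the trace formula and the averaged pure-state probabilities can be checked numerically. The following is a minimal sketch using NumPy; the plus/minus basis measurement, the two pure states, and the weights are illustrative choices that do not come from the lesson.

```python
import numpy as np

# Projective measurement of a qubit in the plus/minus basis (illustrative choice).
plus = np.array([1, 1]) / np.sqrt(2)
minus = np.array([1, -1]) / np.sqrt(2)
projections = [np.outer(plus, plus.conj()), np.outer(minus, minus.conj())]

# Two pure states and the weights of their convex combination (also illustrative).
psi_0 = np.array([1, 0], dtype=complex)
psi_1 = np.array([1, 1j]) / np.sqrt(2)
states = [psi_0, psi_1]
weights = [0.25, 0.75]

# rho = sum_k p_k |psi_k><psi_k|
rho = sum(p * np.outer(psi, psi.conj()) for p, psi in zip(weights, states))

# Outcome probabilities computed from the density matrix as Tr(Pi_a rho) ...
probs_trace = [np.trace(Pi @ rho).real for Pi in projections]

# ... and as the weighted average of the pure-state probabilities ||Pi_a psi_k||^2.
probs_average = [
    sum(p * np.linalg.norm(Pi @ psi) ** 2 for p, psi in zip(weights, states))
    for Pi in projections
]

print(probs_trace)
print(probs_average)
```

The two printed lists should agree up to floating-point rounding, which reflects the linearity of the trace in $\rho$.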

General measurements

A mathematical description for general measurements is obtained by relaxing the definition of projective measurements. Specifically, we allow the matrices in the collection describing the measurement to be arbitrary positive semidefinite matrices rather than projections. (Projections are always positive semidefinite; they can alternatively be defined as positive semidefinite matrices whose eigenvalues are all either 0 or 1.)

In particular, a general measurement of a system $\mathsf{X}$ having outcomes $0,\ldots,m-1$ is specified by a collection of positive semidefinite matrices $\{P_0,\ldots,P_{m-1}\}$ whose rows and columns correspond to the classical states of $\mathsf{X}$ and that meet the condition

$$
P_0 + \cdots + P_{m-1} = \mathbb{I}_{\mathsf{X}}.
$$

If the system $\mathsf{X}$ is measured while it is in a state described by the density matrix $\rho$, then each outcome $a\in\{0,\ldots,m-1\}$ appears with probability $\operatorname{Tr}(P_a \rho)$.

As we must naturally demand, the vector of outcome probabilities

$$
\bigl(\operatorname{Tr}(P_0 \rho),\ldots,\operatorname{Tr}(P_{m-1} \rho)\bigr)
$$

of a general measurement always forms a probability vector, for any choice of a density matrix $\rho$. The following two observations establish that this is the case.

  1. Each value $\operatorname{Tr}(P_a \rho)$ must be nonnegative, owing to the fact that the trace of the product of any two positive semidefinite matrices is always nonnegative:

    $$
    Q, R \geq 0 \; \Rightarrow \: \operatorname{Tr}(QR) \geq 0.
    $$

    One way to argue this fact is to use spectral decompositions of $Q$ and $R$ together with the cyclic property of the trace to express the trace of the product $QR$ as a sum of nonnegative real numbers, which must therefore be nonnegative.

  2. The condition $P_0 + \cdots + P_{m-1} = \mathbb{I}_{\mathsf{X}}$ together with the linearity of the trace ensures that the probabilities sum to $1$.

    $$
    \sum_{a = 0}^{m-1} \operatorname{Tr}(P_a \rho) = \operatorname{Tr}\Biggl(\sum_{a = 0}^{m-1} P_a \rho\Biggr) = \operatorname{Tr}(\mathbb{I}\rho) = \operatorname{Tr}(\rho) = 1
    $$
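The two requirements on a general measurement, and the resulting probability vector, can be sketched in code as follows. The helper names `is_measurement` and `outcome_probabilities`, the three-outcome example matrices, and the random state are all hypothetical and chosen only for illustration.

```python
import numpy as np

def is_measurement(matrices, tol=1e-9):
    """Return True if the matrices are positive semidefinite and sum to the identity."""
    n = matrices[0].shape[0]
    for P in matrices:
        if not np.allclose(P, P.conj().T, atol=tol):
            return False  # not Hermitian
        if np.linalg.eigvalsh(P).min() < -tol:
            return False  # has a negative eigenvalue
    return bool(np.allclose(sum(matrices), np.eye(n), atol=tol))

def outcome_probabilities(matrices, rho):
    """Return the vector (Tr(P_0 rho), ..., Tr(P_{m-1} rho))."""
    return np.array([np.trace(P @ rho).real for P in matrices])

# Hypothetical three-outcome qubit measurement, chosen only for illustration.
P = [np.diag([0.5, 0.1]), np.diag([0.3, 0.3]), np.diag([0.2, 0.6])]

# A random density matrix: A A^dagger is positive semidefinite; divide by its trace.
rng = np.random.default_rng(seed=0)
A = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
rho = A @ A.conj().T
rho = rho / np.trace(rho).real

probs = outcome_probabilities(P, rho)
print(is_measurement(P), probs, probs.sum())
```

The tolerance `tol` is an assumption needed for floating-point comparisons; it has no counterpart in the exact mathematical definition.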

Example 1: any projective measurement

Projections are always positive semidefinite, so every projective measurement is an example of a general measurement.

For example, a standard basis measurement of a qubit can be represented by $\{P_0,P_1\}$ where

$$
P_0 = \vert 0\rangle\langle 0\vert = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \quad\text{and}\quad P_1 = \vert 1\rangle\langle 1\vert = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}.
$$

Measuring a qubit in the state $\rho$ results in outcome probabilities as follows.

$$
\begin{aligned}
\operatorname{Prob}(\text{outcome} = 0) & = \operatorname{Tr}(P_0 \rho) = \operatorname{Tr}\bigl(\vert 0\rangle\langle 0\vert \rho\bigr) = \langle 0\vert \rho \vert 0 \rangle \\[1mm]
\operatorname{Prob}(\text{outcome} = 1) & = \operatorname{Tr}(P_1 \rho) = \operatorname{Tr}\bigl(\vert 1\rangle\langle 1\vert\rho\bigr) = \langle 1 \vert \rho \vert 1 \rangle
\end{aligned}
$$

Example 2: a non-projective qubit measurement

Suppose $\mathsf{X}$ is a qubit, and define two matrices as follows.

$$
P_0 = \begin{pmatrix} \frac{2}{3} & \frac{1}{3}\\[2mm] \frac{1}{3} & \frac{1}{3} \end{pmatrix} \qquad P_1 = \begin{pmatrix} \frac{1}{3} & -\frac{1}{3}\\[2mm] -\frac{1}{3} & \frac{2}{3} \end{pmatrix}
$$

These are both positive semidefinite matrices: they're Hermitian, and in both cases the eigenvalues happen to be $1/2 \pm \sqrt{5}/6$, which are both positive. We also have that $P_0 + P_1 = \mathbb{I}$, and therefore $\{P_0,P_1\}$ describes a measurement.

If the state of $\mathsf{X}$ is described by a density matrix $\rho$ and we perform this measurement, then the probability of obtaining the outcome $0$ is $\operatorname{Tr}(P_0 \rho)$ and the probability of obtaining the outcome $1$ is $\operatorname{Tr}(P_1 \rho)$. For instance, if $\rho = \vert + \rangle \langle + \vert$ then the probabilities for the two outcomes $0$ and $1$ are as follows.

$$
\begin{aligned}
\operatorname{Tr}(P_0 \rho) & = \operatorname{Tr}\left( \begin{pmatrix} \frac{2}{3} & \frac{1}{3}\\[2mm] \frac{1}{3} & \frac{1}{3} \end{pmatrix} \begin{pmatrix} \frac{1}{2} & \frac{1}{2}\\[2mm] \frac{1}{2} & \frac{1}{2} \end{pmatrix} \right)\\[4mm]
& = \biggl(\frac{2}{3} \cdot \frac{1}{2} + \frac{1}{3} \cdot \frac{1}{2}\biggr) + \biggl(\frac{1}{3}\cdot\frac{1}{2} + \frac{1}{3}\cdot\frac{1}{2}\biggr)\\
& = \frac{1}{2} + \frac{1}{3} = \frac{5}{6}\\[4mm]
\operatorname{Tr}(P_1 \rho) & = \operatorname{Tr}\left( \begin{pmatrix} \frac{1}{3} & -\frac{1}{3}\\[2mm] -\frac{1}{3} & \frac{2}{3} \end{pmatrix} \begin{pmatrix} \frac{1}{2} & \frac{1}{2}\\[2mm] \frac{1}{2} & \frac{1}{2} \end{pmatrix} \right)\\[4mm]
& = \biggl(\frac{1}{3} \cdot \frac{1}{2} - \frac{1}{3} \cdot \frac{1}{2}\biggr) + \biggl(-\frac{1}{3}\cdot\frac{1}{2} + \frac{2}{3}\cdot\frac{1}{2}\biggr)\\
& = 0 + \frac{1}{6} = \frac{1}{6}
\end{aligned}
$$
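The numbers in this example are easy to reproduce numerically. The sketch below, which assumes NumPy, builds the two matrices of Example 2, computes their eigenvalues, and evaluates both outcome probabilities for the plus state; the variable names are illustrative.

```python
import numpy as np

# The two matrices from Example 2.
P0 = np.array([[2, 1], [1, 1]]) / 3
P1 = np.array([[1, -1], [-1, 2]]) / 3

# eigvalsh returns the eigenvalues of a Hermitian matrix in ascending order.
eigenvalues_P0 = np.linalg.eigvalsh(P0)
eigenvalues_P1 = np.linalg.eigvalsh(P1)

# Density matrix of the plus state.
plus = np.array([1, 1]) / np.sqrt(2)
rho = np.outer(plus, plus)

p0 = np.trace(P0 @ rho)
p1 = np.trace(P1 @ rho)

print(eigenvalues_P0, eigenvalues_P1)
print(p0, p1)
```

Up to rounding, the eigenvalues should match $1/2 \pm \sqrt{5}/6$ and the probabilities should match $5/6$ and $1/6$.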

Example 3: tetrahedral measurement

Define four single-qubit quantum state vectors as follows.

$$
\begin{aligned}
\vert\phi_0\rangle & = \vert 0 \rangle\\
\vert\phi_1\rangle & = \frac{1}{\sqrt{3}}\vert 0 \rangle + \sqrt{\frac{2}{3}} \vert 1\rangle \\
\vert\phi_2\rangle & = \frac{1}{\sqrt{3}}\vert 0 \rangle + \sqrt{\frac{2}{3}} e^{2\pi i/3} \vert 1\rangle \\
\vert\phi_3\rangle & = \frac{1}{\sqrt{3}}\vert 0 \rangle + \sqrt{\frac{2}{3}} e^{-2\pi i/3} \vert 1\rangle
\end{aligned}
$$

These four states are sometimes known as tetrahedral states because they're vertices of a regular tetrahedron inscribed within the Bloch sphere.

Illustration of a tetrahedron inscribed in the Bloch sphere

The Cartesian coordinates of these four states on the Bloch sphere are

$$
\begin{gathered}
(0,0,1),\\[2mm]
\left( \frac{2\sqrt{2}}{3} , 0 , -\frac{1}{3} \right),\\[1mm]
\left( -\frac{\sqrt{2}}{3} , \sqrt{\frac{2}{3}} , -\frac{1}{3} \right),\\[1mm]
\left( -\frac{\sqrt{2}}{3} , -\sqrt{\frac{2}{3}} , -\frac{1}{3} \right),
\end{gathered}
$$

which can be verified by expressing the density matrix representations of these states as linear combinations of Pauli matrices.

$$
\vert \phi_0 \rangle\langle \phi_0 \vert = \begin{pmatrix} 1 & 0\\[1mm] 0 & 0 \end{pmatrix} = \frac{\mathbb{I} + \sigma_z}{2}
$$

$$
\vert \phi_1 \rangle\langle \phi_1 \vert = \begin{pmatrix} \frac{1}{3} & \frac{\sqrt{2}}{3} \\[2mm] \frac{\sqrt{2}}{3} & \frac{2}{3} \end{pmatrix} = \frac{\mathbb{I} + \frac{2\sqrt{2}}{3} \sigma_x - \frac{1}{3}\sigma_z}{2}
$$

$$
\vert \phi_2 \rangle\langle \phi_2 \vert = \begin{pmatrix} \frac{1}{3} & -\frac{1}{3\sqrt{2}} - \frac{i}{\sqrt{6}} \\[2mm] -\frac{1}{3\sqrt{2}} + \frac{i}{\sqrt{6}} & \frac{2}{3} \end{pmatrix} = \frac{\mathbb{I} - \frac{\sqrt{2}}{3} \sigma_x + \sqrt{\frac{2}{3}} \sigma_y - \frac{1}{3}\sigma_z}{2}
$$

$$
\vert \phi_3 \rangle\langle \phi_3 \vert = \begin{pmatrix} \frac{1}{3} & -\frac{1}{3\sqrt{2}} + \frac{i}{\sqrt{6}} \\[2mm] -\frac{1}{3\sqrt{2}} - \frac{i}{\sqrt{6}} & \frac{2}{3} \end{pmatrix} = \frac{\mathbb{I} - \frac{\sqrt{2}}{3} \sigma_x - \sqrt{\frac{2}{3}} \sigma_y - \frac{1}{3}\sigma_z}{2}
$$

These four states are perfectly spread out on the Bloch sphere, each one equidistant from the other three and with the angles between any two of them always being the same.

Now let us define a measurement $\{P_0,P_1,P_2,P_3\}$ of a qubit by setting $P_a$ as follows for each $a=0,\ldots,3$.

$$
P_a = \frac{\vert\phi_a\rangle\langle\phi_a\vert}{2}
$$

We can verify that this is a valid measurement as follows.

  1. Each $P_a$ is evidently positive semidefinite, being a pure state multiplied by one-half. That is, each one is a Hermitian matrix and has one eigenvalue equal to $1/2$ and all other eigenvalues zero.
  2. The sum of these matrices is the identity matrix: $P_0 + P_1 + P_2 + P_3 = \mathbb{I}$. The expressions of these matrices as linear combinations of Pauli matrices make this straightforward to verify.
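Both verification steps, along with the tetrahedral geometry, can be checked numerically. This is a sketch assuming NumPy; the names `phis`, `bloch`, and `overlaps` are illustrative. The Bloch coordinates are computed as $\operatorname{Tr}(\sigma \vert\phi_a\rangle\langle\phi_a\vert)$ for each Pauli matrix $\sigma$.

```python
import numpy as np

w = np.exp(2j * np.pi / 3)

# The four tetrahedral state vectors.
phis = [
    np.array([1, 0], dtype=complex),
    np.array([1 / np.sqrt(3), np.sqrt(2 / 3)]),
    np.array([1 / np.sqrt(3), np.sqrt(2 / 3) * w]),
    np.array([1 / np.sqrt(3), np.sqrt(2 / 3) * np.conj(w)]),
]

# Measurement matrices P_a = |phi_a><phi_a| / 2.
P = [np.outer(phi, phi.conj()) / 2 for phi in phis]
total = sum(P)

# Bloch vector of each state: (Tr(sigma_x rho), Tr(sigma_y rho), Tr(sigma_z rho)).
sigma_x = np.array([[0, 1], [1, 0]])
sigma_y = np.array([[0, -1j], [1j, 0]])
sigma_z = np.array([[1, 0], [0, -1]])
bloch = np.array([
    [np.trace(s @ np.outer(phi, phi.conj())).real for s in (sigma_x, sigma_y, sigma_z)]
    for phi in phis
])

# Inner products between distinct Bloch vectors; equal values mean equal angles.
overlaps = [bloch[i] @ bloch[j] for i in range(4) for j in range(i + 1, 4)]

print(np.round(total, 12))
print(np.round(bloch, 4))
print(np.round(overlaps, 4))
```

All six pairwise inner products of the Bloch vectors should equal $-1/3$, which is the value for the vertices of a regular tetrahedron inscribed in the unit sphere.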

Measurements as channels

A second way to describe measurements in mathematical terms is as channels.

Classical information can be viewed as a special case of quantum information, insofar as we can identify probabilistic states with diagonal density matrices. So, in operational terms, we can think about measurements as being channels whose inputs are matrices describing states of whatever system is being measured and whose outputs are diagonal density matrices describing the resulting distribution of measurement outcomes.

We'll see shortly that any channel having this property can always be written in a simple, canonical form that ties directly to the description of measurements as collections of positive semidefinite matrices. Conversely, given an arbitrary measurement as a collection of matrices, there's always a valid channel having the diagonal output property that describes the given measurement as suggested in the previous paragraph. Putting these observations together, we find that the two descriptions of general measurements are equivalent.

Before proceeding further, let's be more precise about the measurement, how we're viewing it as a channel, and what assumptions we're making about it.

As before, we'll suppose that $\mathsf{X}$ is the system to be measured, and that the possible outcomes of the measurement are the integers $0,\ldots,m-1$ for some positive integer $m$. We let $\mathsf{Y}$ be the system that stores measurement outcomes, so its classical state set is $\{0,\ldots,m-1\}$, and we represent the measurement as a channel named $\Phi$ from $\mathsf{X}$ to $\mathsf{Y}$. Our assumption is that $\mathsf{Y}$ is classical, which is to say that no matter what state we start with for $\mathsf{X}$, the state of $\mathsf{Y}$ we obtain is represented by a diagonal density matrix.

We can express in mathematical terms that the output of $\Phi$ is always diagonal in the following way. First define the completely dephasing channel $\Delta_m$ on $\mathsf{Y}$.

$$
\Delta_m(\sigma) = \sum_{a = 0}^{m-1} \langle a \vert \sigma \vert a\rangle \,\vert a\rangle\langle a\vert
$$

This channel is analogous to the completely dephasing qubit channel $\Delta$ from the previous lesson. As a linear mapping, it zeros out all of the off-diagonal entries of an input matrix and leaves the diagonal alone.

And now, a simple way to express that a given density matrix $\sigma$ is diagonal is by the equation $\sigma = \Delta_m(\sigma)$. In words, zeroing out all of the off-diagonal entries of a density matrix has no effect if and only if the off-diagonal entries were all zero to begin with. The channel $\Phi$ therefore satisfies our assumption (that $\mathsf{Y}$ is classical) if and only if

$$
\Phi(\rho) = \Delta_m(\Phi(\rho))
$$

for every density matrix $\rho$ representing a state of $\mathsf{X}$.
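The diagonality test based on the completely dephasing channel can be sketched as follows, assuming NumPy. The function names `dephase` and `is_classical` and the tolerance are illustrative assumptions, not part of the lesson.

```python
import numpy as np

def dephase(sigma):
    """Completely dephasing channel: sum over a of <a|sigma|a> |a><a|."""
    m = sigma.shape[0]
    basis = np.eye(m)
    return sum(sigma[a, a] * np.outer(basis[a], basis[a]) for a in range(m))

def is_classical(sigma, tol=1e-12):
    """A density matrix is diagonal exactly when dephasing leaves it unchanged."""
    return bool(np.allclose(sigma, dephase(sigma), atol=tol))

# The plus state has nonzero off-diagonal entries, so it is not classical.
plus = np.array([1, 1]) / np.sqrt(2)
coherent = np.outer(plus, plus)

# A diagonal density matrix represents a probabilistic (classical) state.
classical = np.diag([0.25, 0.75])

print(is_classical(coherent), is_classical(classical))
print(dephase(coherent))
```

Writing `dephase` as an explicit sum mirrors the formula for $\Delta_m$; in practice `np.diag(np.diag(sigma))` computes the same matrix.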


Equivalence of the formulations

Channels to matrices

Suppose that we have a channel from $\mathsf{X}$ to $\mathsf{Y}$ with the property that

$$
\Phi(\rho) = \Delta_m(\Phi(\rho))
$$

for every density matrix $\rho$. This may alternatively be expressed as follows.

$$
\Phi(\rho) = \sum_{a = 0}^{m-1} \langle a \vert \Phi(\rho) \vert a\rangle\, \vert a\rangle\langle a \vert \tag{1}
$$

As with any channel, we can express $\Phi$ in Kraus form for some choice of Kraus matrices $A_0,\ldots,A_{N-1}$.

$$
\Phi(\rho) = \sum_{k = 0}^{N-1} A_k \rho A_k^{\dagger}
$$

This provides us with an alternative expression for the diagonal entries of $\Phi(\rho)$:

$$
\begin{aligned}
\langle a \vert \Phi(\rho) \vert a\rangle & = \sum_{k = 0}^{N-1} \langle a \vert A_k \rho A_k^{\dagger} \vert a\rangle \\
& = \sum_{k = 0}^{N-1} \operatorname{Tr}\bigl( A_k^{\dagger} \vert a\rangle\langle a \vert A_k \rho\bigr)\\
& = \operatorname{Tr}\bigl(P_a\rho\bigr)
\end{aligned}
$$

for

$$
P_a = \sum_{k = 0}^{N-1} A_k^{\dagger} \vert a\rangle\langle a \vert A_k.
$$

Thus, for these same matrices $P_0,\ldots,P_{m-1}$ we can express the channel $\Phi$ as follows.

$$
\Phi(\rho) = \sum_{a = 0}^{m-1} \operatorname{Tr}(P_a \rho) \vert a\rangle\langle a\vert
$$

This expression is consistent with our description of general measurements in terms of matrices, as we see each measurement outcome appearing with probability $\operatorname{Tr}(P_a \rho)$.

Now let's observe that the two properties required of the collection of matrices $\{P_0,\ldots,P_{m-1}\}$ to describe a general measurement are indeed satisfied. The first property is that they're all positive semidefinite matrices. One way to see this is to observe that, for every vector $\vert \psi\rangle$ having entries in correspondence with the classical states of $\mathsf{X}$, we have

$$
\langle \psi \vert P_a \vert \psi\rangle = \sum_{k = 0}^{N-1} \langle \psi \vert A_k^{\dagger} \vert a\rangle\langle a \vert A_k\vert \psi\rangle = \sum_{k = 0}^{N-1} \bigl\vert\langle a \vert A_k\vert \psi\rangle\bigr\vert^2 \geq 0.
$$

The second property is that if we sum these matrices we get the identity matrix.

$$
\begin{aligned}
\sum_{a = 0}^{m-1} P_a & = \sum_{a = 0}^{m-1} \sum_{k = 0}^{N-1} A_k^{\dagger} \vert a\rangle\langle a \vert A_k \\
& = \sum_{k = 0}^{N-1} A_k^{\dagger} \Biggl(\sum_{a = 0}^{m-1} \vert a\rangle\langle a \vert\Biggr) A_k \\
& = \sum_{k = 0}^{N-1} A_k^{\dagger} A_k \\
& = \mathbb{I}_{\mathsf{X}}
\end{aligned}
$$

The last equality follows from the fact that $\Phi$ is a channel, so its Kraus matrices must satisfy this condition.
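The passage from Kraus matrices to measurement matrices can be illustrated with a concrete channel. In the sketch below, which assumes NumPy, the channel is an amplitude damping channel followed by complete dephasing; this choice, the damping parameter `gamma`, and the helper names are assumptions made only to have a channel with diagonal outputs to work with.

```python
import numpy as np

gamma = 0.3  # illustrative damping parameter

# Kraus matrices of a qubit amplitude damping channel.
B = [
    np.array([[1, 0], [0, np.sqrt(1 - gamma)]]),
    np.array([[0, np.sqrt(gamma)], [0, 0]]),
]

# Following it with complete dephasing gives Kraus matrices |a><a| B_k.
basis = np.eye(2)
kraus = [np.outer(basis[a], basis[a]) @ Bk for a in range(2) for Bk in B]

def apply_channel(kraus, rho):
    """Phi(rho) = sum_k A_k rho A_k^dagger."""
    return sum(A @ rho @ A.conj().T for A in kraus)

def measurement_from_kraus(kraus, m):
    """P_a = sum_k A_k^dagger |a><a| A_k for a = 0, ..., m-1."""
    basis = np.eye(m)
    return [
        sum(A.conj().T @ np.outer(basis[a], basis[a]) @ A for A in kraus)
        for a in range(m)
    ]

P = measurement_from_kraus(kraus, 2)

# Compare Phi(rho) with sum_a Tr(P_a rho) |a><a| on the plus state.
plus = np.array([1, 1]) / np.sqrt(2)
rho = np.outer(plus, plus)
output = apply_channel(kraus, rho)
from_matrices = np.diag([np.trace(Pa @ rho) for Pa in P])

print(np.round(P[0], 4), np.round(P[1], 4))
print(np.round(output, 4))
```

For this particular channel the matrices work out to $P_0 = \operatorname{diag}(1, \gamma)$ and $P_1 = \operatorname{diag}(0, 1-\gamma)$, and the two ways of computing the output should agree.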

Matrices to channels

Now let's verify that for any collection $\{P_0,\ldots,P_{m-1}\}$ of positive semidefinite matrices satisfying $P_0 + \cdots + P_{m-1} = \mathbb{I}_{\mathsf{X}}$, the mapping defined by

$$
\Phi(\rho) = \sum_{a = 0}^{m-1} \operatorname{Tr}(P_a \rho) \vert a \rangle\langle a\vert
$$

is indeed a valid channel from $\mathsf{X}$ to $\mathsf{Y}$.

One way to do this is to compute the Choi representation of this mapping.

$$
\begin{aligned}
J(\Phi) & = \sum_{b,c = 0}^{n-1} \vert b \rangle \langle c \vert \otimes \Phi(\vert b \rangle \langle c \vert)\\[1mm]
& = \sum_{b,c = 0}^{n-1} \sum_{a = 0}^{m-1} \vert b \rangle \langle c \vert \otimes \operatorname{Tr}(P_a \vert b \rangle \langle c \vert) \vert a \rangle\langle a\vert\\[1mm]
& = \sum_{b,c = 0}^{n-1} \sum_{a = 0}^{m-1} \vert b \rangle \langle b \vert P_a^T \vert c \rangle \langle c \vert \otimes \vert a \rangle\langle a\vert\\[1mm]
& = \sum_{a = 0}^{m-1} P_a^T \otimes \vert a \rangle\langle a\vert
\end{aligned}
$$

The transpose of each $P_a$ is introduced for the third equality because

$$
\langle c \vert P_a \vert b\rangle = \langle b \vert P_a^T \vert c\rangle.
$$

This allows for the expressions $\vert b \rangle \langle b \vert$ and $\vert c \rangle \langle c \vert$ to appear, which simplify to the identity matrix upon summing over $b$ and $c$, respectively.

By the assumption that $P_0,\ldots,P_{m-1}$ are positive semidefinite, so too are $P_0^{T},\ldots,P_{m-1}^{T}$. In particular, transposing a Hermitian matrix results in another Hermitian matrix, and the eigenvalues of any square matrix and its transpose always agree. It follows that $J(\Phi)$ is positive semidefinite. Tracing out the output system $\mathsf{Y}$ (which is the system on the right) yields

$$
\operatorname{Tr}_{\mathsf{Y}} (J(\Phi)) = \sum_{a = 0}^{m-1} P_a^T = \mathbb{I}_{\mathsf{X}}^T = \mathbb{I}_{\mathsf{X}},
$$

and so we conclude that $\Phi$ is a channel.
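The Choi representation argument can be checked numerically for a specific measurement. The sketch below, assuming NumPy, uses the tetrahedral measurement from Example 3 because its matrices are not all symmetric, so the transpose in the closed form actually matters. The function and variable names are illustrative.

```python
import numpy as np

# Tetrahedral measurement matrices P_a = |phi_a><phi_a| / 2 from Example 3.
w = np.exp(2j * np.pi / 3)
phis = [
    np.array([1, 0], dtype=complex),
    np.array([1 / np.sqrt(3), np.sqrt(2 / 3)]),
    np.array([1 / np.sqrt(3), np.sqrt(2 / 3) * w]),
    np.array([1 / np.sqrt(3), np.sqrt(2 / 3) * np.conj(w)]),
]
P = [np.outer(phi, phi.conj()) / 2 for phi in phis]
n, m = 2, len(P)

def measurement_channel(xi):
    """Phi(xi) = sum_a Tr(P_a xi) |a><a|, extended linearly to any matrix xi."""
    return np.diag([np.trace(Pa @ xi) for Pa in P])

# Choi representation: sum over b, c of |b><c| tensor Phi(|b><c|).
J = np.zeros((n * m, n * m), dtype=complex)
for b in range(n):
    for c in range(n):
        E = np.zeros((n, n))
        E[b, c] = 1
        J += np.kron(E, measurement_channel(E))

# Closed form from the derivation: sum over a of P_a^T tensor |a><a|.
basis = np.eye(m)
J_closed_form = sum(np.kron(P[a].T, np.outer(basis[a], basis[a])) for a in range(m))

# Channel conditions: J is positive semidefinite and Tr_Y(J) is the identity.
min_eigenvalue = np.linalg.eigvalsh(J).min()
traced_out_Y = np.trace(J.reshape(n, m, n, m), axis1=1, axis2=3)

print(np.allclose(J, J_closed_form), min_eigenvalue)
print(np.round(traced_out_Y, 12))
```

The partial trace over $\mathsf{Y}$ is computed by reshaping $J(\Phi)$ into a four-index array and tracing over the two indices belonging to $\mathsf{Y}$.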


Partial measurements

Suppose that we have multiple systems that are collectively in a quantum state, and a general measurement is performed on one of the systems. This results in one of the measurement outcomes, selected at random according to probabilities determined by the measurement and the state of the system prior to the measurement. The resulting state of the remaining systems will then, in general, depend on which measurement outcome was obtained.

Let's examine how this works for a pair of systems $(\mathsf{X},\mathsf{Z})$ when the system $\mathsf{X}$ is measured. (We're naming the system on the right $\mathsf{Z}$ because we'll take $\mathsf{Y}$ to be a system representing the classical output of the measurement when we view it as a channel.) We can then easily generalize to the situation in which the systems are swapped as well as to three or more systems.

Suppose the state of $(\mathsf{X},\mathsf{Z})$ prior to the measurement is described by a density matrix $\rho$, which we can write as follows.

$$
\rho = \sum_{b,c = 0}^{n-1} \vert b\rangle\langle c\vert \otimes \rho_{b,c}
$$

In this expression we're assuming the classical states of $\mathsf{X}$ are $0,\ldots,n-1$.

We'll assume that the measurement itself is described by the collection of matrices $\{P_0,\ldots,P_{m-1}\}$. This measurement may alternatively be described as a channel $\Phi$ from $\mathsf{X}$ to $\mathsf{Y}$, where $\mathsf{Y}$ is a new system having classical state set $\{0,\ldots,m-1\}$. Specifically, the action of this channel can be expressed as follows.

$$
\Phi(\xi) = \sum_{a = 0}^{m-1} \operatorname{Tr}(P_a \xi)\, \vert a \rangle \langle a \vert
$$

Outcome probabilities

We're considering a measurement of the system $\mathsf{X}$, so the probabilities with which different measurement outcomes are obtained can depend only on $\rho_{\mathsf{X}}$, the reduced state of $\mathsf{X}$. In particular, the probability for each outcome $a\in\{0,\ldots,m-1\}$ to appear can be expressed in three equivalent ways.

$$
\operatorname{Tr}\bigl( P_a \rho_{\mathsf{X}}\bigr) = \operatorname{Tr}\bigl( P_a \operatorname{Tr}_{\mathsf{Z}}(\rho)\bigr) = \operatorname{Tr}\bigl( (P_a \otimes \mathbb{I}_{\mathsf{Z}}) \rho \bigr)
$$

The first expression naturally represents the probability to obtain the outcome $a$ based on what we already know about measurements of a single system. To get the second expression we're simply using the definition $\rho_{\mathsf{X}} = \operatorname{Tr}_{\mathsf{Z}}(\rho)$.

Obtaining the third expression requires more thought, and learners are encouraged to convince themselves that it is true. Here's a hint: the equivalence between the second and third expressions does not depend on $\rho$ being a density matrix or on each $P_a$ being positive semidefinite. Try showing it first for tensor products of the form $\rho = M\otimes N$ and then conclude that it must be true in general by linearity.

While the equivalence of the first and third expressions in the previous equation may not be immediate, it does make sense. Starting from a measurement on $\mathsf{X}$, we're effectively defining a measurement of $(\mathsf{X},\mathsf{Z})$, where we simply throw away $\mathsf{Z}$ and measure $\mathsf{X}$. Like all measurements, this new measurement can be described by a collection of matrices, and it's not surprising that this measurement is described by the collection

$$
\{P_0\otimes\mathbb{I}_{\mathsf{Z}}, \ldots, P_{m-1}\otimes\mathbb{I}_{\mathsf{Z}}\}.
$$
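A numerical check is no substitute for the linearity argument in the hint, but it can make the equivalence more concrete. The sketch below, assuming NumPy, compares $\operatorname{Tr}(P_a \operatorname{Tr}_{\mathsf{Z}}(\rho))$ with $\operatorname{Tr}((P_a \otimes \mathbb{I}_{\mathsf{Z}})\rho)$ on a randomly generated state. The dimensions, the random seed, and the use of the matrices from Example 2 are illustrative choices.

```python
import numpy as np

n, d = 2, 3  # dimensions of X and Z (illustrative)

# Random density matrix on (X, Z): A A^dagger normalized to unit trace.
rng = np.random.default_rng(seed=1)
A = rng.normal(size=(n * d, n * d)) + 1j * rng.normal(size=(n * d, n * d))
rho = A @ A.conj().T
rho = rho / np.trace(rho).real

# Reduced state of X: reshape to indices (x, z, x', z') and trace over z, z'.
rho_X = np.trace(rho.reshape(n, d, n, d), axis1=1, axis2=3)

# Measurement matrices from Example 2.
P = [np.array([[2, 1], [1, 1]]) / 3, np.array([[1, -1], [-1, 2]]) / 3]

# Probabilities from the reduced state and from the lifted matrices P_a (x) I_Z.
probs_reduced = [np.trace(Pa @ rho_X).real for Pa in P]
probs_lifted = [np.trace(np.kron(Pa, np.eye(d)) @ rho).real for Pa in P]

print(probs_reduced)
print(probs_lifted)
```

The two lists should coincide up to rounding and each should sum to one.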

States conditioned on measurement outcomes

If we want to determine not only the probabilities for the different outcomes but also the resulting state of $\mathsf{Z}$ conditioned on each measurement outcome, we can look to the channel description of the measurement. In particular, let's examine the state we get when we apply $\Phi$ to $\mathsf{X}$ and do nothing to $\mathsf{Z}$.

$$
\begin{aligned}
(\Phi\otimes\operatorname{Id}_{\mathsf{Z}})(\rho) & = \sum_{b,c = 0}^{n-1} \Phi(\vert b\rangle\langle c\vert) \otimes \rho_{b,c}\\
& = \sum_{a = 0}^{m-1} \sum_{b,c = 0}^{n-1} \operatorname{Tr}(P_a \vert b\rangle\langle c\vert) \,\vert a\rangle \langle a \vert \otimes \rho_{b,c}\\
& = \sum_{a = 0}^{m-1} \vert a\rangle \langle a \vert \otimes \sum_{b,c = 0}^{n-1} \operatorname{Tr}(P_a \vert b\rangle\langle c\vert) \rho_{b,c}\\
& = \sum_{a = 0}^{m-1} \vert a\rangle \langle a \vert \otimes \sum_{b,c = 0}^{n-1} \operatorname{Tr}_{\mathsf{X}}\bigl((P_a\otimes\mathbb{I}_{\mathsf{Z}}) (\vert b\rangle\langle c\vert\otimes\rho_{b,c})\bigr)\\
& = \sum_{a = 0}^{m-1} \vert a\rangle \langle a \vert \otimes \operatorname{Tr}_{\mathsf{X}}\bigl((P_a \otimes \mathbb{I}_{\mathsf{Z}}) \rho\bigr)
\end{aligned}
$$

Note that this is a density matrix by virtue of the fact that $\Phi$ is a channel, so each matrix $\operatorname{Tr}_{\mathsf{X}}\bigl((P_a \otimes \mathbb{I}_{\mathsf{Z}}) \rho\bigr)$ is necessarily positive semidefinite.

One final step transforms this expression into one that reveals what we're looking for.

$$
\sum_{a = 0}^{m-1} \operatorname{Tr}\bigl((P_a \otimes \mathbb{I}_{\mathsf{Z}}) \rho\bigr)\, \vert a\rangle \langle a \vert \otimes \frac{\operatorname{Tr}_{\mathsf{X}}\bigl((P_a \otimes \mathbb{I}_{\mathsf{Z}}) \rho\bigr)}{\operatorname{Tr}\bigl((P_a \otimes \mathbb{I}_{\mathsf{Z}}) \rho\bigr)}
$$

This is an example of a classical-quantum state,

$$
\sum_{a = 0}^{m-1} p(a)\, \vert a\rangle\langle a\vert \otimes \sigma_a,
$$

like we saw in the Density matrices lesson. For each measurement outcome $a\in\{0,\ldots,m-1\}$, we have with probability

$$
p(a) = \operatorname{Tr}\bigl((P_a \otimes \mathbb{I}_{\mathsf{Z}}) \rho\bigr)
$$

that $\mathsf{Y}$ is in the classical state $\vert a \rangle \langle a \vert$ and $\mathsf{Z}$ is in the state

$$
\sigma_a = \frac{\operatorname{Tr}_{\mathsf{X}}\bigl((P_a \otimes \mathbb{I}_{\mathsf{Z}}) \rho\bigr)}{\operatorname{Tr}\bigl((P_a \otimes \mathbb{I}_{\mathsf{Z}}) \rho\bigr)}. \tag{2}
$$

That is, this is the density matrix we obtain by normalizing

$$
\operatorname{Tr}_{\mathsf{X}}\bigl((P_a \otimes \mathbb{I}_{\mathsf{Z}}) \rho\bigr)
$$

by dividing it by its trace. (Formally speaking, the state $\sigma_a$ is only defined when the probability $p(a)$ is nonzero; when $p(a) = 0$ this state is irrelevant, for it refers to a discrete event that occurs with probability zero.)

Naturally, the outcome probabilities are consistent with our previous observations.

In summary, this is what happens when the measurement $\{P_0,\ldots,P_{m-1}\}$ is performed on $\mathsf{X}$ when $(\mathsf{X},\mathsf{Z})$ is in the state $\rho$.

  1. Each outcome $a$ appears with probability $p(a) = \operatorname{Tr}\bigl((P_a \otimes \mathbb{I}_{\mathsf{Z}}) \rho\bigr)$.
  2. Conditioned on obtaining the outcome $a$, the state of $\mathsf{Z}$ is then represented by the density matrix $\sigma_a$ shown in equation $(2)$, which is obtained by normalizing $\operatorname{Tr}_{\mathsf{X}}\bigl((P_a \otimes \mathbb{I}_{\mathsf{Z}}) \rho\bigr)$.
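The two steps of this summary can be sketched as a small function. The name `measure_first_system`, the cutoff used to detect zero-probability outcomes, and the example (the measurement from Example 2 applied to one half of a $\vert\phi^+\rangle$ Bell state) are assumptions made for illustration, using NumPy.

```python
import numpy as np

def measure_first_system(P, rho, n, d):
    """Return a list of (p(a), sigma_a) for a measurement on X of a state rho of (X, Z).

    n and d are the dimensions of X and Z. sigma_a is None when p(a) is
    numerically zero, since the conditional state is undefined in that case.
    """
    results = []
    for Pa in P:
        M = np.kron(Pa, np.eye(d)) @ rho
        # Partial trace over X: reshape to indices (x, z, x', z'), trace over x, x'.
        unnormalized = np.trace(M.reshape(n, d, n, d), axis1=0, axis2=2)
        p = np.trace(unnormalized).real
        sigma = unnormalized / p if p > 1e-12 else None
        results.append((p, sigma))
    return results

# Bell state (|00> + |11>) / sqrt(2) on the pair (X, Z).
bell = np.array([1, 0, 0, 1]) / np.sqrt(2)
rho = np.outer(bell, bell)

# Measurement matrices from Example 2.
P = [np.array([[2, 1], [1, 1]]) / 3, np.array([[1, -1], [-1, 2]]) / 3]

results = measure_first_system(P, rho, n=2, d=2)
for a, (p, sigma) in enumerate(results):
    print(a, p)
    print(np.round(sigma, 4))
```

For this particular input state, $\operatorname{Tr}_{\mathsf{X}}\bigl((P_a \otimes \mathbb{I}_{\mathsf{Z}}) \rho\bigr) = P_a^T/2$, so each outcome should appear with probability $1/2$ and the conditional state of $\mathsf{Z}$ should equal $P_a^T$, which coincides with $P_a$ here because these matrices are real and symmetric.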

Generalization

We can adapt this description to other situations, such as when the ordering of the systems is reversed or when there are three or more systems. Conceptually it is straightforward, although it can become cumbersome to write down the formulas.

In general, if we have $r$ systems $\mathsf{X}_1,\ldots,\mathsf{X}_r$, the state of the compound system $(\mathsf{X}_1,\ldots,\mathsf{X}_r)$ is $\rho$, and the measurement $\{P_0,\ldots,P_{m-1}\}$ is performed on $\mathsf{X}_k$, the following happens.

  1. Each outcome $a$ appears with probability

    $$
    p(a) = \operatorname{Tr}\bigl((\mathbb{I}_{\mathsf{X}_1}\otimes \cdots \otimes\mathbb{I}_{\mathsf{X}_{k-1}} \otimes P_a \otimes \mathbb{I}_{\mathsf{X}_{k+1}} \otimes \cdots \otimes\mathbb{I}_{\mathsf{X}_r}) \rho\bigr).
    $$

  2. Conditioned on obtaining the outcome $a$, the state of $(\mathsf{X}_1,\ldots,\mathsf{X}_{k-1},\mathsf{X}_{k+1},\ldots,\mathsf{X}_r)$ is then represented by the following density matrix.

    $$
    \frac{\operatorname{Tr}_{\mathsf{X}_k}\bigl((\mathbb{I}_{\mathsf{X}_1}\otimes \cdots \otimes\mathbb{I}_{\mathsf{X}_{k-1}} \otimes P_a \otimes \mathbb{I}_{\mathsf{X}_{k+1}} \otimes \cdots \otimes\mathbb{I}_{\mathsf{X}_r}) \rho\bigr)}{\operatorname{Tr}\bigl((\mathbb{I}_{\mathsf{X}_1}\otimes \cdots \otimes\mathbb{I}_{\mathsf{X}_{k-1}} \otimes P_a \otimes \mathbb{I}_{\mathsf{X}_{k+1}} \otimes \cdots \otimes\mathbb{I}_{\mathsf{X}_r}) \rho\bigr)}
    $$
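The general formulas can be sketched in code as follows, assuming NumPy. The function name `measure_subsystem` is hypothetical, and the index `k` is 0-based in the code even though the text numbers the systems from $1$. The example measures the middle qubit of a three-qubit GHZ state in the standard basis, which is an illustrative choice.

```python
import numpy as np
from functools import reduce

def measure_subsystem(P, rho, dims, k):
    """Measure system k (0-indexed) of a compound system with local dimensions dims.

    Returns a list of (p(a), conditional state of the remaining systems);
    the state is None when p(a) is numerically zero.
    """
    r = len(dims)
    rest = int(np.prod(dims)) // dims[k]
    results = []
    for Pa in P:
        # Identity on every system except system k, where P_a acts.
        factors = [np.eye(dim) for dim in dims]
        factors[k] = Pa
        lifted = reduce(np.kron, factors)
        # Reshape to one row index and one column index per system, then
        # trace over the row and column indices belonging to system k.
        M = (lifted @ rho).reshape(list(dims) + list(dims))
        unnormalized = np.trace(M, axis1=k, axis2=r + k).reshape(rest, rest)
        p = np.trace(unnormalized).real
        results.append((p, unnormalized / p if p > 1e-12 else None))
    return results

# Three-qubit GHZ state (|000> + |111>) / sqrt(2).
ghz = np.zeros(8)
ghz[0] = ghz[7] = 1 / np.sqrt(2)
rho = np.outer(ghz, ghz)

# Standard basis measurement on the middle qubit.
P = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]
results = measure_subsystem(P, rho, dims=(2, 2, 2), k=1)
for a, (p, sigma) in enumerate(results):
    print(a, p)
    print(np.round(sigma, 4))
```

In this example each outcome should appear with probability $1/2$, leaving the remaining two qubits in the state $\vert 00\rangle\langle 00\vert$ for outcome $0$ and $\vert 11\rangle\langle 11\vert$ for outcome $1$.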