{/* cspell:ignore operatorname */}

# Purifications

## Definition of purifications

Let us begin with a precise mathematical definition for purifications.

> **Definition.** Suppose $\mathsf{X}$ is a system in a state represented by a density matrix $\rho,$ and $\vert\psi\rangle$ is a quantum state vector of a pair $(\mathsf{X},\mathsf{Y})$ that leaves $\rho$ when $\mathsf{Y}$ is traced out:
>
> $$
> \rho = \operatorname{Tr}_{\mathsf{Y}} \bigl( \vert \psi\rangle\langle\psi\vert\bigr).
> $$
>
> The state vector $\vert\psi\rangle$ is then said to be a *purification* of $\rho.$

The pure state $\vert\psi\rangle\langle\psi\vert,$ expressed as a density matrix rather than a quantum state vector, is also commonly referred to as a purification of $\rho$ when the equation in the definition is true, but we'll generally use the term to refer to a quantum state vector.
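
As a quick numerical illustration of the definition, the following NumPy sketch checks that the Bell state $(\vert 00\rangle + \vert 11\rangle)/\sqrt{2}$ is a purification of the completely mixed state of a qubit. The helper name `reduced_state_x` and the convention that $\mathsf{X}$ is the first tensor factor are assumptions made for this sketch.

```python
import numpy as np

def reduced_state_x(psi, dim_x, dim_y):
    """Return Tr_Y(|psi><psi|) for a state vector psi of the pair (X, Y)."""
    # Arrange the amplitudes so that rows index X and columns index Y.
    m = np.asarray(psi, dtype=complex).reshape(dim_x, dim_y)
    # Summing over the Y index of |psi><psi| is the matrix product m m^dagger.
    return m @ m.conj().T

# The Bell state (|00> + |11>)/sqrt(2) of two qubits.
phi_plus = np.array([1, 0, 0, 1]) / np.sqrt(2)
rho = reduced_state_x(phi_plus, 2, 2)
print(np.round(rho.real, 3))
```

The reshape trick works because tracing out $\mathsf{Y}$ amounts to summing over the column index of the amplitude matrix.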

The term *purification* is also used more generally when the ordering of the systems is reversed, when the names of the systems and states are different (of course), and when there are more than two systems.
For instance, if $\vert \psi \rangle$ is a quantum state vector representing a pure state of a compound system $(\mathsf{A},\mathsf{B},\mathsf{C}),$ and the equation

$$
\rho = \operatorname{Tr}_{\mathsf{B}} \bigl(\vert\psi\rangle\langle\psi\vert\bigr)
$$

is true for a density matrix $\rho$ representing a state of the system $(\mathsf{A},\mathsf{C}),$ then $\vert\psi\rangle$ is still referred to as a purification of $\rho.$

For the purposes of this lesson, however, we'll focus on the specific form described in the definition.
Properties and facts concerning purifications, according to this definition, can typically be generalized to more than two systems by re-ordering and partitioning the systems into two compound systems, one playing the role of $\mathsf{X}$ and the other playing the role of $\mathsf{Y}.$

## Existence of purifications

Suppose that $\mathsf{X}$ and $\mathsf{Y}$ are any two systems and $\rho$ is a given state of $\mathsf{X}.$
We will prove that there exists a quantum state vector $\vert\psi\rangle$ of $(\mathsf{X},\mathsf{Y})$ that *purifies* $\rho$ — which is another way of saying that $\vert\psi\rangle$ is a purification of $\rho$ — provided that the system $\mathsf{Y}$ is large enough.
In particular, if $\mathsf{Y}$ has at least as many classical states as $\mathsf{X},$ then a purification of this form necessarily exists for every state $\rho.$
Fewer classical states of $\mathsf{Y}$ are required for some states $\rho;$
in general, $\operatorname{rank}(\rho)$ classical states of $\mathsf{Y}$ are necessary and sufficient for the existence of a quantum state vector of $(\mathsf{X},\mathsf{Y})$ that purifies $\rho.$

Consider first any expression of $\rho$ as a convex combination of $n$ pure states, for any positive integer $n.$

$$
\rho = \sum_{a = 0}^{n-1} p_a \vert\phi_a\rangle\langle\phi_a\vert
$$

In this expression, $(p_0,\ldots,p_{n-1})$ is a probability vector and $\vert\phi_0\rangle,\ldots,\vert\phi_{n-1}\rangle$ are quantum state vectors of $\mathsf{X}.$

One way to obtain such an expression is through the spectral theorem, in which case $n$ is the number of classical states of $\mathsf{X},$ $p_0,\ldots,p_{n-1}$ are the eigenvalues of $\rho,$ and $\vert\phi_0\rangle,\ldots,\vert\phi_{n-1}\rangle$ are orthonormal eigenvectors corresponding to these eigenvalues.

There's actually no need to include the terms corresponding to the zero eigenvalues of $\rho$ in the sum, which allows us to alternatively choose $n = \operatorname{rank}(\rho)$ and $p_0,\ldots,p_{n-1}$ to be the nonzero eigenvalues of $\rho.$
This is the minimum value of $n$ for which an expression of $\rho$ taking the form above exists.

To be clear, it is *not necessary* that the chosen expression of $\rho,$ as a convex combination of pure states, comes from the spectral theorem — this is just one way to obtain such an expression.
In particular, $n$ could be any positive integer, the unit vectors $\vert\phi_0\rangle,\ldots,\vert\phi_{n-1}\rangle$ need not be orthogonal, and the probabilities $p_0,\ldots,p_{n-1}$ need not be eigenvalues of $\rho.$

We can now identify a purification of $\rho$ as follows.

$$
\vert\psi\rangle = \sum_{a = 0}^{n-1} \sqrt{p_a} \, \vert\phi_a\rangle \otimes \vert a \rangle
$$

Here we're making the assumption that the classical states of $\mathsf{Y}$ include $0,\ldots,n-1.$
If they do not, an arbitrary choice for $n$ distinct classical states of $\mathsf{Y}$ can be substituted for $0,\ldots,n-1.$
Verifying that this is indeed a purification of $\rho$ is a simple matter of computing the partial trace, which can be done in the following two equivalent ways.

$$
\operatorname{Tr}_{\mathsf{Y}} \bigl(\vert\psi\rangle\langle\psi\vert\bigr) =
\sum_{a = 0}^{n-1} (\mathbb{I}_{\mathsf{X}} \otimes \langle a\vert) \vert\psi\rangle\langle\psi\vert
(\mathbb{I}_{\mathsf{X}} \otimes \vert a\rangle) = \sum_{a = 0}^{n-1} p_a \vert\phi_a\rangle\langle\phi_a\vert = \rho
$$

$$
\operatorname{Tr}_{\mathsf{Y}} \bigl(\vert\psi\rangle\langle\psi\vert\bigr) =
\sum_{a,b = 0}^{n-1} \sqrt{p_a} \sqrt{p_b} \, \vert\phi_a\rangle\langle \phi_b\vert
\, \operatorname{Tr}(\vert a \rangle \langle b \vert)
= \sum_{a = 0}^{n-1} p_a \, \vert\phi_a\rangle\langle \phi_a\vert = \rho
$$

More generally, for any orthonormal set of vectors $\{\vert\gamma_0\rangle,\ldots,\vert\gamma_{n-1}\rangle\},$ the quantum state vector

$$
\vert\psi\rangle = \sum_{a = 0}^{n-1} \sqrt{p_a} \, \vert\phi_a\rangle \otimes \vert \gamma_a \rangle
$$

is a purification of $\rho.$
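
The construction can be sketched numerically as follows. This is a minimal illustration, assuming NumPy, an illustrative density matrix chosen only for this sketch, and a hypothetical helper named `purify`. It takes the pairs $(p_a, \vert\phi_a\rangle)$ from a spectral decomposition and uses the standard basis of $\mathsf{Y}$ for the vectors $\vert a\rangle.$

```python
import numpy as np

def purify(probs, states):
    """Return sum_a sqrt(p_a) |phi_a> (x) |a> as a vector of the pair (X, Y)."""
    n = len(probs)
    dim_x = len(states[0])
    psi = np.zeros(dim_x * n, dtype=complex)
    for a in range(n):
        basis_a = np.zeros(n)
        basis_a[a] = 1.0  # standard basis vector |a> of Y
        psi += np.sqrt(probs[a]) * np.kron(states[a], basis_a)
    return psi

# An illustrative density matrix (an assumption for this sketch).
rho = np.array([[0.5, 0.25],
                [0.25, 0.5]])

# Spectral decomposition: eigenvalues p_a, eigenvectors as columns.
eigvals, eigvecs = np.linalg.eigh(rho)
psi = purify(eigvals, eigvecs.T)

# Trace out Y and compare with rho.
m = psi.reshape(2, 2)
reduced = m @ m.conj().T
print(np.allclose(reduced, rho))
```

Nothing in `purify` relies on the states being orthogonal, so the same helper accepts any convex combination of pure states.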

### Example

Suppose that $\mathsf{X}$ and $\mathsf{Y}$ are both qubits and

$$
\rho = \begin{pmatrix}
\frac{3}{4} & \frac{1}{4}\\[2mm]
\frac{1}{4} & \frac{1}{4}
\end{pmatrix}
$$

is a density matrix representing a state of $\mathsf{X}.$

We can use the spectral theorem to express $\rho$ as

$$
\rho =
\cos^2(\pi/8) \vert \psi_{\pi/8}\rangle\langle\psi_{\pi/8}\vert +
\sin^2(\pi/8) \vert \psi_{5\pi/8}\rangle\langle\psi_{5\pi/8}\vert,
$$

where $\vert \psi_{\theta} \rangle = \cos(\theta) \vert 0\rangle + \sin(\theta)\vert 1\rangle.$
The quantum state vector

$$
\cos(\pi/8) \vert \psi_{\pi/8}\rangle \otimes \vert 0\rangle +
\sin(\pi/8) \vert \psi_{5\pi/8}\rangle \otimes \vert 1\rangle
$$

which describes a pure state of the pair $(\mathsf{X},\mathsf{Y}),$ is therefore a purification of $\rho.$

Alternatively, we can write

$$
\rho = \frac{1}{2} \vert 0\rangle\langle 0\vert + \frac{1}{2} \vert +\rangle\langle +\vert.
$$

This is a convex combination of pure states but not a spectral decomposition because $\vert 0\rangle$ and $\vert +\rangle$ are not orthogonal and $1/2$ is not an eigenvalue of $\rho.$
Nevertheless, the quantum state vector

$$
\frac{1}{\sqrt{2}} \vert 0 \rangle \otimes \vert 0\rangle +
\frac{1}{\sqrt{2}} \vert + \rangle \otimes \vert 1\rangle
$$

is a purification of $\rho.$
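
Both purifications in this example can be checked numerically. The sketch below assumes NumPy and uses hypothetical helper names (`psi_theta`, `reduced_x`); it builds the two state vectors exactly as written above and traces out $\mathsf{Y}.$

```python
import numpy as np

def psi_theta(theta):
    """The qubit state cos(theta)|0> + sin(theta)|1>."""
    return np.array([np.cos(theta), np.sin(theta)])

def reduced_x(psi):
    """Tr_Y(|psi><psi|) for a state vector of two qubits (X, Y)."""
    m = psi.reshape(2, 2)
    return m @ m.conj().T

rho = np.array([[3 / 4, 1 / 4],
                [1 / 4, 1 / 4]])

zero = np.array([1.0, 0.0])
one = np.array([0.0, 1.0])
plus = (zero + one) / np.sqrt(2)

# Purification obtained from the spectral decomposition.
psi_spectral = (np.cos(np.pi / 8) * np.kron(psi_theta(np.pi / 8), zero)
                + np.sin(np.pi / 8) * np.kron(psi_theta(5 * np.pi / 8), one))

# Purification obtained from the mixture of |0> and |+>.
psi_mixture = (np.kron(zero, zero) + np.kron(plus, one)) / np.sqrt(2)

print(np.allclose(reduced_x(psi_spectral), rho))
print(np.allclose(reduced_x(psi_mixture), rho))
```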

## Schmidt decompositions

Next, we will discuss *Schmidt decompositions*, which are expressions of quantum state vectors of *pairs* of systems that take a certain form.
Schmidt decompositions are closely connected with purifications, and they're very useful in their own right.
Indeed, when reasoning about a given quantum state vector $\vert\psi\rangle$ of a pair of systems, the first step is often to identify or consider a Schmidt decomposition of this state.

> **Definition.**
> Let $\vert \psi\rangle$ be a given quantum state vector of a pair of systems $(\mathsf{X},\mathsf{Y}).$ A *Schmidt decomposition* of $\vert\psi\rangle$ is an expression of the form
>
> $$
> \vert \psi\rangle = \sum_{a = 0}^{r-1} \sqrt{p_a}\, \vert x_a\rangle \otimes \vert y_a \rangle,
> $$
>
> where $p_0,\ldots,p_{r-1}$ are positive real numbers summing to $1$ and *both* of the sets $\{\vert x_0\rangle,\ldots,\vert x_{r-1}\rangle\}$ and $\{\vert y_0\rangle,\ldots,\vert y_{r-1}\rangle\}$ are orthonormal.

The values

$$
\sqrt{p_0},\ldots,\sqrt{p_{r-1}}
$$

in a Schmidt decomposition of $\vert\psi\rangle$ are known as its *Schmidt coefficients*, which are uniquely determined (up to their ordering) — they're the only positive real numbers that can appear in such an expression of $\vert\psi\rangle.$
The sets

$$
\{\vert x_0\rangle,\ldots,\vert x_{r-1}\rangle\} \quad\text{and}\quad
\{\vert y_0\rangle,\ldots,\vert y_{r-1}\rangle\},
$$

on the other hand, are not uniquely determined, and the freedom one has in choosing these sets of vectors will be clarified in the explanation that follows.

We'll now verify that a given quantum state vector $\vert\psi\rangle$ does indeed have a Schmidt decomposition, and in the process, we'll learn how to find one.

Consider first an arbitrary (not necessarily orthogonal) basis $\{\vert x_0\rangle, \ldots, \vert x_{n-1}\rangle\}$ of the vector space corresponding to the system $\mathsf{X}.$
Because this is a basis, there will always exist a uniquely determined selection of vectors $\vert z_0\rangle,\ldots,\vert z_{n-1}\rangle$ for which the following equation is true.

$$
\vert \psi\rangle = \sum_{a = 0}^{n-1} \vert x_a\rangle \otimes \vert z_a \rangle
\tag{1}
$$

For example, suppose $\{\vert x_0\rangle,\ldots,\vert x_{n-1}\rangle\}$ is the standard basis associated with $\mathsf{X}.$
Assuming the classical state set of $\mathsf{X}$ is $\{0,\ldots,n-1\},$ this means that $\vert x_a\rangle = \vert a\rangle$ for each $a\in\{0,\ldots,n-1\},$ and we find that

$$
\vert\psi\rangle = \sum_{a = 0}^{n-1} \vert a\rangle \otimes \vert z_a\rangle
$$

when

$$
\vert z_a \rangle = ( \langle a \vert \otimes \mathbb{I}_{\mathsf{Y}}) \vert \psi\rangle
$$

for each $a\in\{0,\ldots,n-1\}.$
We frequently consider expressions like this when contemplating a standard basis measurement of $\mathsf{X}.$

It's important to note that the formula

$$
\vert z_a \rangle = ( \langle a \vert \otimes \mathbb{I}_{\mathsf{Y}}) \vert \psi\rangle
$$

for the vectors $\vert z_0\rangle,\ldots,\vert z_{n-1}\rangle$ in this example only works because $\{\vert 0\rangle,\ldots,\vert n-1\rangle\}$ is an *orthonormal* basis.
In general, if $\{\vert x_0\rangle,\ldots,\vert x_{n-1}\rangle\}$ is a basis that is not necessarily orthonormal, then the vectors $\vert z_0\rangle,\ldots,\vert z_{n-1}\rangle$ are still uniquely determined by the equation $(1),$ but a different formula is needed.
One way to find them is first to identify vectors $\vert w_0\rangle,\ldots,\vert w_{n-1}\rangle$ so that the equation

$$
\langle w_a \vert x_b \rangle = \begin{cases} 1 & a=b\\ 0 & a\neq b \end{cases}
$$

is satisfied for all $a,b\in\{0,\ldots,n-1\},$ at which point we have

$$
\vert z_a \rangle = (\langle w_a \vert \otimes \mathbb{I}_{\mathsf{Y}}) \vert \psi\rangle.
$$
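
For a concrete feel for the vectors $\vert w_a\rangle,$ here is a small NumPy sketch under assumptions chosen for illustration: a non-orthonormal basis of a qubit and a Bell state for $\vert\psi\rangle.$ When the basis vectors are placed in the columns of a matrix, the rows of its inverse give the entries of $\langle w_0\vert,\ldots,\langle w_{n-1}\vert.$

```python
import numpy as np

# Columns are the basis vectors |x_0> = |0> and |x_1> = |0> + |1>,
# a basis of a qubit that is neither orthogonal nor normalized.
basis = np.array([[1.0, 1.0],
                  [0.0, 1.0]])

# Row a of the inverse matrix holds the entries of <w_a|, because
# (inverse @ basis)[a, b] = <w_a|x_b> must be 1 if a == b and 0 otherwise.
dual_rows = np.linalg.inv(basis)

# Example state of (X, Y): the Bell state (|00> + |11>)/sqrt(2).
psi = np.array([1, 0, 0, 1]) / np.sqrt(2)
m = psi.reshape(2, 2)  # m[c, b] is the amplitude of |c>|b>

# Row a of z is |z_a> = (<w_a| (x) I_Y)|psi>.
z = dual_rows @ m

# Reassemble sum_a |x_a> (x) |z_a> and compare with |psi>.
rebuilt = sum(np.kron(basis[:, a], z[a]) for a in range(2))
print(np.allclose(rebuilt, psi))
```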

For a given basis $\{\vert x_0\rangle,\ldots,\vert x_{n-1}\rangle\}$ of the vector space corresponding to $\mathsf{X},$ the uniquely determined vectors $\vert z_0\rangle,\ldots,\vert z_{n-1}\rangle$ for which the equation $(1)$ is satisfied won't necessarily satisfy any special properties, even if $\{\vert x_0\rangle,\ldots,\vert x_{n-1}\rangle\}$ happens to be an orthonormal basis.
If, however, we choose $\{\vert x_0\rangle, \ldots, \vert x_{n-1}\rangle\}$ to be an orthonormal basis of *eigenvectors* of the reduced state

$$
\rho = \operatorname{Tr}_{\mathsf{Y}} \bigl( \vert \psi\rangle \langle \psi \vert \bigr),
$$

then something interesting happens.
Specifically, for the uniquely determined collection $\{\vert z_0\rangle,\ldots,\vert z_{n-1}\rangle\}$ for which the equation $(1)$ is true, we find that this collection must be *orthogonal.*

In greater detail, consider a spectral decomposition of $\rho.$

$$
\rho = \sum_{a = 0}^{n-1} p_a \vert x_a \rangle \langle x_a \vert
$$

Here we're denoting the eigenvalues of $\rho$ by $p_0,\ldots,p_{n-1}$ in recognition of the fact that $\rho$ is a density matrix — so the vector of eigenvalues $(p_0,\ldots,p_{n-1})$ forms a probability vector — while $\{\vert x_0\rangle,\ldots,\vert x_{n-1}\rangle\}$ is an orthonormal basis of eigenvectors corresponding to these eigenvalues.
To see that the unique collection $\{\vert z_0\rangle,\ldots,\vert z_{n-1}\rangle\}$ for which the equation $(1)$ is true is necessarily orthogonal, we can begin by computing the partial trace.

$$
\begin{aligned}
\operatorname{Tr}_{\mathsf{Y}} (\vert\psi\rangle\langle\psi\vert)
& = \sum_{a,b = 0}^{n-1} \vert x_a\rangle\langle x_b\vert \operatorname{Tr}(\vert z_a\rangle\langle z_b\vert)\\
& = \sum_{a,b = 0}^{n-1} \langle z_b\vert z_a\rangle \, \vert x_a\rangle\langle x_b\vert.
\end{aligned}
$$

This expression must agree with the spectral decomposition of $\rho.$
Because $\{\vert x_0\rangle,\ldots,\vert x_{n-1}\rangle\}$ is a basis, we conclude that the set of matrices

$$
\bigl\{ \vert x_a\rangle\langle x_b\vert \,:\, a,b\in\{0,\ldots,n-1\} \bigr\}
$$

is linearly independent, and so it follows that

$$
\langle z_b \vert z_a\rangle =
\begin{cases}
  p_a & a=b\\[1mm]
  0 & a\neq b,
\end{cases}
$$

establishing that $\{\vert z_0\rangle,\ldots,\vert z_{n-1}\rangle\}$ is orthogonal.

We've nearly obtained a Schmidt decomposition of $\vert\psi\rangle.$
It remains to discard those terms in $(1)$ for which $p_a = 0$ and then write $\vert z_a\rangle = \sqrt{p_a}\vert y_a\rangle$ for a unit vector $\vert y_a\rangle$ for each of the remaining terms.

A convenient way to do this begins with the observation that we're free to number the eigenvalue/eigenvector pairs in a spectral decomposition of the reduced state $\rho$ however we wish — so we may assume that the eigenvalues are sorted in decreasing order:

$$
p_0 \geq p_1 \geq \cdots \geq p_{n-1}.
$$

Letting $r = \operatorname{rank}(\rho),$ we find that $p_0,\ldots,p_{r-1} > 0$ and $p_r = \cdots = p_{n-1} = 0.$
So, we have

$$
\rho = \sum_{a = 0}^{r-1} p_a \vert x_a \rangle \langle x_a \vert,
$$

and we can write the quantum state vector $\vert \psi \rangle$ as

$$
\vert\psi\rangle = \sum_{a = 0}^{r-1} \vert x_a\rangle \otimes \vert z_a\rangle.
$$

Given that

$$
\| \vert z_a \rangle \|^2 = \langle z_a \vert z_a \rangle = p_a > 0
$$

for $a=0,\ldots,r-1,$ we can define unit vectors $\vert y_0 \rangle,\ldots,\vert y_{r-1}\rangle$ as

$$
\vert y_a\rangle = \frac{\vert z_a\rangle}{\|\vert z_a\rangle\|} = \frac{\vert z_a\rangle}{\sqrt{p_a}},
$$

so that $\vert z_a\rangle = \sqrt{p_a}\vert y_a\rangle$ for each $a\in\{0,\ldots,r-1\}.$
Because the vectors $\{\vert z_0\rangle, \ldots, \vert z_{r-1}\rangle\}$ are orthogonal and nonzero, it follows that
$\{\vert y_0\rangle, \ldots, \vert y_{r-1}\rangle\}$ is an *orthonormal* set, and so we have obtained a Schmidt decomposition of $\vert\psi\rangle.$

$$
\vert \psi\rangle = \sum_{a = 0}^{r-1} \sqrt{p_a}\, \vert x_a\rangle \otimes \vert y_a \rangle
$$
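
The procedure just described translates fairly directly into code. The following is a minimal sketch, assuming NumPy and a hypothetical function name `schmidt_decomposition`. It diagonalizes the reduced state of $\mathsf{X},$ discards zero eigenvalues (up to a numerical tolerance), and normalizes the vectors $\vert z_a\rangle.$

```python
import numpy as np

def schmidt_decomposition(psi, dim_x, dim_y, tol=1e-12):
    """Return a list of (sqrt(p_a), |x_a>, |y_a>) for a state vector of (X, Y)."""
    m = np.asarray(psi, dtype=complex).reshape(dim_x, dim_y)
    rho = m @ m.conj().T                    # reduced state Tr_Y(|psi><psi|)
    eigvals, eigvecs = np.linalg.eigh(rho)  # eigenvalues in ascending order
    terms = []
    for a in reversed(range(dim_x)):        # visit eigenvalues in decreasing order
        p = eigvals[a]
        if p <= tol:
            continue                        # discard terms with p_a = 0
        x = eigvecs[:, a]
        z = x.conj() @ m                    # |z_a> = (<x_a| (x) I_Y)|psi>
        terms.append((np.sqrt(p), x, z / np.sqrt(p)))
    return terms

# Example: the state (|0>|0> + |+>|1>)/sqrt(2) of two qubits.
zero = np.array([1.0, 0.0])
one = np.array([0.0, 1.0])
plus = (zero + one) / np.sqrt(2)
psi = (np.kron(zero, zero) + np.kron(plus, one)) / np.sqrt(2)

terms = schmidt_decomposition(psi, 2, 2)
coefficients = [s for s, _, _ in terms]
rebuilt = sum(s * np.kron(x, y) for s, x, y in terms)
print(np.round(coefficients, 4))
```

For this example state the Schmidt coefficients are $\cos(\pi/8)$ and $\sin(\pi/8),$ the square roots of the eigenvalues of its reduced state on $\mathsf{X}.$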

Concerning the choice of the vectors
$\{\vert x_0\rangle,\ldots,\vert x_{r-1}\rangle\}$ and
$\{\vert y_0\rangle,\ldots,\vert y_{r-1}\rangle\},$
we can select $\{\vert x_0\rangle,\ldots,\vert x_{r-1}\rangle\}$ to be any orthonormal set of eigenvectors corresponding to the nonzero eigenvalues of the reduced state $\operatorname{Tr}_{\mathsf{Y}}(\vert\psi\rangle\langle\psi\vert)$ (as we have done above), in which case the vectors $\{\vert y_0\rangle,\ldots,\vert y_{r-1}\rangle\}$ are uniquely determined.

The situation is symmetric between the two systems, so we can alternatively choose $\{\vert y_0\rangle,\ldots,\vert y_{r-1}\rangle\}$ to be any orthonormal set of eigenvectors corresponding to the nonzero eigenvalues of the reduced state $\operatorname{Tr}_{\mathsf{X}}(\vert\psi\rangle\langle\psi\vert),$ in which case the vectors $\{\vert x_0\rangle,\ldots,\vert x_{r-1}\rangle\}$ will be uniquely determined.

Notice, however, that once one of the sets is selected, as a set of eigenvectors of the corresponding reduced state as just described, the other is determined — so they cannot be chosen independently.

Although it won't come up again in this series, it is noteworthy that the nonzero eigenvalues $p_0,\ldots,p_{r-1}$ of the reduced state $\operatorname{Tr}_{\mathsf{X}}(\vert\psi\rangle\langle\psi\vert)$ must always agree with the nonzero eigenvalues of the reduced state $\operatorname{Tr}_{\mathsf{Y}}(\vert\psi\rangle\langle\psi\vert)$ for any pure state $\vert\psi\rangle$ of a pair of systems $(\mathsf{X},\mathsf{Y}).$

Intuitively speaking, the reduced states of $\mathsf{X}$ and $\mathsf{Y}$ have exactly the same amount of randomness in them when the pair $(\mathsf{X},\mathsf{Y})$ is in a pure state.
This fact is revealed by the Schmidt decomposition: in both cases the eigenvalues of the reduced states must agree with the squares of the Schmidt coefficients of the pure state.
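
This agreement is easy to observe numerically. The sketch below assumes NumPy and an arbitrarily chosen state of a qubit $\mathsf{X}$ and a three-state system $\mathsf{Y};$ it also uses the standard fact (not derived in this lesson) that the Schmidt coefficients are the singular values of the matrix of amplitudes.

```python
import numpy as np

# An arbitrary illustrative state of (X, Y) with dim X = 2 and dim Y = 3.
psi = np.array([3, 1, 0, 1, 2, 1], dtype=complex)
psi /= np.linalg.norm(psi)

m = psi.reshape(2, 3)      # rows index X, columns index Y
rho_x = m @ m.conj().T     # Tr_Y(|psi><psi|), a 2x2 matrix
rho_y = m.T @ m.conj()     # Tr_X(|psi><psi|), a 3x3 matrix

# Eigenvalues of both reduced states, sorted in decreasing order.
eig_x = np.sort(np.linalg.eigvalsh(rho_x))[::-1]
eig_y = np.sort(np.linalg.eigvalsh(rho_y))[::-1]

# Singular values of the amplitude matrix (returned in decreasing order).
schmidt = np.linalg.svd(m, compute_uv=False)

print(np.round(eig_x, 6))
print(np.round(eig_y, 6))
print(np.round(schmidt ** 2, 6))
```

The larger system $\mathsf{Y}$ simply picks up an extra zero eigenvalue; the nonzero ones coincide with the squared Schmidt coefficients.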

## Unitary equivalence of purifications

We can use Schmidt decompositions to establish a fundamentally important fact concerning purifications known as the *unitary equivalence of purifications*.

> **Theorem** (Unitary equivalence of purifications). Suppose that $\mathsf{X}$ and $\mathsf{Y}$ are systems, and $\vert\psi\rangle$ and $\vert\phi\rangle$ are quantum state vectors of $(\mathsf{X},\mathsf{Y})$ that both purify the same state of $\mathsf{X}.$ In symbols,
>
> $$
> \operatorname{Tr}_{\mathsf{Y}} (\vert\psi\rangle\langle\psi\vert) = \rho = \operatorname{Tr}_{\mathsf{Y}} (\vert\phi\rangle\langle\phi\vert)
> $$
>
> for some density matrix $\rho$ representing a state of $\mathsf{X}.$
> There must then exist a unitary operation $U$ on $\mathsf{Y}$ alone that transforms the first purification into the second:
>
> $$
> (\mathbb{I}_{\mathsf{X}} \otimes U) \vert\psi\rangle = \vert\phi\rangle.
> $$

We'll discuss a few implications of this theorem as the lesson continues, but first let's see how it follows from our previous discussion of Schmidt decompositions.

Our assumption is that $\vert\psi\rangle$ and $\vert\phi\rangle$ are quantum state vectors
of a pair of systems $(\mathsf{X},\mathsf{Y})$ that satisfy the equation

$$
\operatorname{Tr}_{\mathsf{Y}} (\vert\psi\rangle\langle\psi\vert) = \rho =
\operatorname{Tr}_{\mathsf{Y}} (\vert\phi\rangle\langle\phi\vert)
$$

for some density matrix $\rho$ representing a state of $\mathsf{X}.$

Consider a spectral decomposition of $\rho.$

$$
\rho = \sum_{a = 0}^{n-1} p_a \vert x_a\rangle\langle x_a\vert
$$

Here $\{\vert x_0\rangle,\ldots,\vert x_{n-1}\rangle\}$ is an orthonormal basis of eigenvectors of $\rho.$
By following the prescription described previously we can obtain Schmidt decompositions for both $\vert\psi\rangle$ and $\vert\phi\rangle$ having the following form.

$$
\begin{aligned}
\vert\psi\rangle & = \sum_{a = 0}^{r-1} \sqrt{p_a} \, \vert x_a\rangle \otimes \vert u_a\rangle\\[1mm]
\vert\phi\rangle & = \sum_{a = 0}^{r-1} \sqrt{p_a} \, \vert x_a\rangle \otimes \vert v_a\rangle
\end{aligned}
$$

In these expressions $r$ is the rank of $\rho$ and
$\{\vert u_0\rangle,\ldots,\vert u_{r-1}\rangle\}$ and
$\{\vert v_0\rangle,\ldots,\vert v_{r-1}\rangle\}$ are orthonormal sets of vectors in the space corresponding to $\mathsf{Y}.$

For any two orthonormal sets in the same space that have the same number of elements, there's always a unitary matrix that transforms the first set into the second, so we can choose a unitary matrix $U$ so that $U \vert u_a\rangle = \vert v_a\rangle$ for $a = 0,\ldots,r-1.$
In particular, to find such a matrix $U$ we can first use the Gram-Schmidt orthogonalization process to extend our orthonormal sets to orthonormal bases
$\{\vert u_0\rangle,\ldots,\vert u_{m-1}\rangle\}$ and
$\{\vert v_0\rangle,\ldots,\vert v_{m-1}\rangle\},$ where $m$ is the dimension of the space corresponding to $\mathsf{Y},$ and then take

$$
U = \sum_{a = 0}^{m-1} \vert v_a\rangle\langle u_a\vert.
$$

We now find that

$$
\begin{aligned}
(\mathbb{I}_{\mathsf{X}} \otimes U) \vert\psi\rangle
& = \sum_{a = 0}^{r-1} \sqrt{p_a} \, \vert x_a\rangle \otimes U \vert u_a\rangle\\
& = \sum_{a = 0}^{r-1} \sqrt{p_a} \, \vert x_a\rangle \otimes \vert v_a\rangle\\
& = \vert\phi\rangle,
\end{aligned}
$$

which completes the proof.
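
The proof is constructive, and it can be followed step by step in code. Here is a minimal sketch, assuming NumPy and hypothetical helper names. One deviation from the text: in place of the Gram-Schmidt process it uses a full singular value decomposition to extend the orthonormal sets to bases, which serves the same purpose.

```python
import numpy as np

def y_side_vectors(state, eigvals, eigvecs, tol=1e-12):
    """Columns are unit vectors |u_a> with (<x_a| (x) I)|state> = sqrt(p_a)|u_a>."""
    dim_x = len(eigvals)
    m = state.reshape(dim_x, -1)
    cols = [(eigvecs[:, a].conj() @ m) / np.sqrt(eigvals[a])
            for a in range(dim_x) if eigvals[a] > tol]
    return np.array(cols).T

def complete_to_basis(cols):
    """Extend orthonormal columns to an orthonormal basis of the whole space."""
    # The trailing columns of the full SVD span the orthogonal complement.
    full, _, _ = np.linalg.svd(cols, full_matrices=True)
    return np.hstack([cols, full[:, cols.shape[1]:]])

def connecting_unitary(psi, phi, dim_x):
    """Return U on Y with (I (x) U)|psi> = |phi>, assuming equal reduced states on X."""
    m = psi.reshape(dim_x, -1)
    eigvals, eigvecs = np.linalg.eigh(m @ m.conj().T)
    u_basis = complete_to_basis(y_side_vectors(psi, eigvals, eigvecs))
    v_basis = complete_to_basis(y_side_vectors(phi, eigvals, eigvecs))
    return v_basis @ u_basis.conj().T  # U = sum_a |v_a><u_a|

# X is a qubit and Y has three classical states (an illustrative choice).
zero = np.array([1.0, 0.0])
plus = np.array([1.0, 1.0]) / np.sqrt(2)
e = np.eye(3)
psi = (np.kron(zero, e[0]) + np.kron(plus, e[1])) / np.sqrt(2)
phi = (np.kron(zero, e[2]) + np.kron(plus, e[0])) / np.sqrt(2)

U = connecting_unitary(psi, phi, 2)
mapped = np.kron(np.eye(2), U) @ psi
print(np.allclose(mapped, phi))
```

Using the same eigenvectors of $\rho$ for both state vectors is what makes the construction work, even when $\rho$ has repeated eigenvalues.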

Here are just a few of many interesting examples and implications connected with the unitary equivalence of purifications.
We'll see another critically important one later in the lesson, in the context of fidelity, known as *Uhlmann's theorem*.

### Superdense coding

In the superdense coding protocol, Alice and Bob share an e-bit, meaning that Alice holds a qubit $\mathsf{A},$ Bob holds a qubit $\mathsf{B},$ and together the pair $(\mathsf{A},\mathsf{B})$ is in the $\vert\phi^{+}\rangle$ Bell state.
The protocol describes how Alice can transform this shared state into any one of the four Bell states, $\vert\phi^+\rangle,$ $\vert\phi^-\rangle,$ $\vert\psi^+\rangle,$ and
$\vert\psi^-\rangle,$ by applying a unitary operation to her qubit $\mathsf{A}.$
Once she has done this, she sends $\mathsf{A}$ to Bob, and then Bob performs a measurement on the pair $(\mathsf{A},\mathsf{B})$ to see which Bell state he holds.

For all four Bell states, the reduced state of Bob's qubit $\mathsf{B}$ is the completely mixed state.

$$
\operatorname{Tr}_{\mathsf{A}}(\vert\phi^+\rangle\langle\phi^+\vert) =
\operatorname{Tr}_{\mathsf{A}}(\vert\phi^-\rangle\langle\phi^-\vert) =
\operatorname{Tr}_{\mathsf{A}}(\vert\psi^+\rangle\langle\psi^+\vert) =
\operatorname{Tr}_{\mathsf{A}}(\vert\psi^-\rangle\langle\psi^-\vert) =
\frac{\mathbb{I}}{2}
$$

By the unitary equivalence of purifications, we immediately conclude that for each Bell state there must exist a unitary operation on Alice's qubit $\mathsf{A}$ alone that transforms $\vert\phi^+\rangle$ into the chosen Bell state.
Although this does not reveal the precise details of the protocol, the unitary equivalence of purifications does immediately imply that superdense coding is possible.
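
For the qubit case the unitary operations can be written down explicitly. The sketch below, assuming NumPy and the convention that Alice's qubit $\mathsf{A}$ is the first tensor factor, checks one valid choice built from Pauli matrices; other choices differing by a global phase work equally well.

```python
import numpy as np

I2 = np.eye(2)
X = np.array([[0, 1], [1, 0]])   # Pauli X (bit flip)
Z = np.array([[1, 0], [0, -1]])  # Pauli Z (phase flip)

s = 1 / np.sqrt(2)
bell = {
    "phi+": s * np.array([1, 0, 0, 1]),
    "phi-": s * np.array([1, 0, 0, -1]),
    "psi+": s * np.array([0, 1, 1, 0]),
    "psi-": s * np.array([0, 1, -1, 0]),
}

# One operation on Alice's qubit for each target Bell state.
alice_ops = {"phi+": I2, "phi-": Z, "psi+": X, "psi-": Z @ X}

# Apply each operation to A only, starting from |phi+>.
results = {name: np.kron(op, I2) @ bell["phi+"] for name, op in alice_ops.items()}

for name, vector in results.items():
    print(name, np.allclose(vector, bell[name]))
```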

We can also conclude that generalizations of superdense coding to larger systems are always possible, provided that we replace the Bell states with any orthonormal basis of purifications of the completely mixed state.

### Cryptographic implications

The unitary equivalence of purifications has implications concerning the implementation of cryptographic primitives using quantum information.
For instance, the unitary equivalence of purifications reveals that it is impossible to implement an ideal form of *bit commitment* using quantum information.

The bit commitment primitive involves two participants, Alice and Bob (who don't trust one another), and has two phases.

*   The first phase is the *commit* phase, through which Alice commits to a binary value $b\in\{0,1\}.$
    This commitment must be *binding*, which means that Alice cannot change her mind, as well as *concealing*, which means that Bob can't tell which value Alice has committed to.
*   The second phase is the *reveal* phase, in which the bit committed by Alice becomes known to Bob, who should then be convinced that it was truly the committed value that was revealed.

In intuitive, operational terms, the first phase of bit commitment should function as if Alice writes a binary value on a piece of paper, locks the paper inside of a safe, and gives the safe to Bob while keeping the key for herself.
Alice has committed to the binary value written on the paper because the safe is in Bob's possession (so it's binding), but because Bob can't open the safe he can't tell which value Alice committed to (so it's concealing).
The second phase should work as if Alice hands the key to the safe to Bob, so that he can open the safe to reveal the value to which Alice committed.

As it turns out, it is impossible to implement a perfect bit commitment protocol by means of quantum information alone, for this contradicts the unitary equivalence of purifications.
Here is a high-level summary of an argument that establishes this.

To begin, we can assume Alice and Bob only perform unitary operations or introduce new initialized systems as the protocol is executed.
The fact that every channel has a Stinespring representation allows us to make this assumption.

At the end of the commit phase of the protocol, Bob holds in his possession some compound system that must be in one of two quantum states: $\rho_0$ if Alice committed to the value $0$ and $\rho_1$ if Alice committed to the value $1.$
In order for the protocol to be perfectly concealing, Bob should not be able to tell the difference between these two states — so it must be that $\rho_0 = \rho_1.$
(Otherwise there would be a measurement that distinguishes these states with probability strictly greater than $1/2,$ giving Bob partial information about the committed value.)

However, because Alice and Bob have only used unitary operations, all of the systems involved in the protocol, taken together, must be in a pure state after the commit phase.
In particular, suppose that $\vert\psi_0\rangle$ is the pure state of all of the systems involved in the protocol when Alice commits to $0,$ and $\vert\psi_1\rangle$ is the pure state of all of the systems involved in the protocol when Alice commits to $1.$
If we write $\mathsf{A}$ and $\mathsf{B}$ to denote Alice's and Bob's (possibly compound) systems, then

$$
\begin{aligned}
\rho_0 & = \operatorname{Tr}_{\mathsf{A}}(\vert\psi_0\rangle\langle\psi_0\vert)\\[1mm]
\rho_1 & = \operatorname{Tr}_{\mathsf{A}}(\vert\psi_1\rangle\langle\psi_1\vert).
\end{aligned}
$$

Given the requirement that $\rho_0 = \rho_1$ for a perfectly concealing protocol, we find that $\vert\psi_0\rangle$ and $\vert\psi_1\rangle$ are purifications of the same state — and so, by the unitary equivalence of purifications, there must exist a unitary operation $U$ on $\mathsf{A}$ alone such that

$$
(U\otimes\mathbb{I}_{\mathsf{B}})\vert\psi_0\rangle = \vert\psi_1\rangle.
$$

Alice is therefore free to change her commitment from $0$ to $1$ by applying $U$ to $\mathsf{A},$
or from $1$ to $0$ by applying $U^{\dagger},$ and so the hypothetical protocol being considered completely fails to be binding.
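
As a toy illustration of this argument (a hypothetical one-message protocol invented only for this example), suppose Alice commits to $0$ by preparing a pair of qubits $(\mathsf{A},\mathsf{B})$ in the state $\vert\phi^+\rangle$ and to $1$ by preparing it in the state $\vert\psi^+\rangle,$ and in either case sends $\mathsf{B}$ to Bob. Bob's qubit is completely mixed either way, so this protocol is perfectly concealing.

$$
\operatorname{Tr}_{\mathsf{A}}(\vert\phi^+\rangle\langle\phi^+\vert) = \frac{\mathbb{I}}{2} =
\operatorname{Tr}_{\mathsf{A}}(\vert\psi^+\rangle\langle\psi^+\vert)
$$

However, writing $\sigma_x$ for the Pauli bit-flip operation, we have

$$
(\sigma_x \otimes \mathbb{I}_{\mathsf{B}}) \vert\phi^+\rangle
= \frac{1}{\sqrt{2}} \vert 1\rangle\vert 0\rangle + \frac{1}{\sqrt{2}} \vert 0\rangle\vert 1\rangle
= \vert\psi^+\rangle,
$$

so Alice can switch her commitment at any time by applying $\sigma_x$ to the qubit she kept, and the toy protocol is not binding at all.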

### Hughston-Jozsa-Wootters theorem

The last implication of the unitary equivalence of purifications that we'll discuss in this portion of the lesson is the following theorem known as the Hughston-Jozsa-Wootters theorem.
(This is, in fact, a slightly simplified statement of the theorem known by this name.)

> **Theorem** (Hughston-Jozsa-Wootters).
> Let $\mathsf{X}$ and $\mathsf{Y}$ be systems and let $\vert\phi\rangle$ be a quantum state vector of the pair $(\mathsf{X},\mathsf{Y}).$
> Also let $N$ be an arbitrary positive integer, let $(p_0,\ldots,p_{N-1})$ be a probability vector, and let $\vert\psi_0\rangle,\ldots,\vert\psi_{N-1}\rangle$ be quantum state vectors representing states of $\mathsf{X}$ such that
>
> $$
> \operatorname{Tr}_{\mathsf{Y}}\bigl(\vert\phi\rangle\langle\phi\vert\bigr) = \sum_{a = 0}^{N-1} p_a \vert\psi_a\rangle\langle\psi_a\vert.
> $$
>
> There exists a (general) measurement $\{P_0,\ldots,P_{N-1}\}$ on $\mathsf{Y}$ such that the following two statements are true whenever this measurement is performed on $\mathsf{Y}$ while $(\mathsf{X},\mathsf{Y})$ is in the state $\vert\phi\rangle:$
>
> 1.  Each measurement outcome $a\in\{0,\ldots,N-1\}$ appears with probability $p_a.$
> 2.  Conditioned on obtaining the measurement outcome $a,$ the state of $\mathsf{X}$ becomes $\vert\psi_a\rangle.$

Intuitively speaking, this theorem says that as long as we have a pure state of two systems, then for *any* way of thinking about the reduced state of the first system as a convex combination of pure states, there is a measurement of the second system that effectively makes this way of thinking about the first system a reality.
Notice that the number $N$ is not necessarily bounded by the number of classical states of $\mathsf{X}$ or $\mathsf{Y}.$
For instance, it could be that $N = 1{,}000{,}000$ while $\mathsf{X}$ and $\mathsf{Y}$ are qubits.

We shall prove this theorem using the unitary equivalence of purifications, beginning with the introduction of a new system $\mathsf{Z}$ whose classical state set is $\{0,\ldots,N-1\}.$
Consider the following two quantum state vectors of the triple $(\mathsf{X},\mathsf{Y},\mathsf{Z}).$

$$
\begin{aligned}
\vert\gamma_0\rangle & = \vert\phi\rangle_{\mathsf{XY}}\otimes\vert 0\rangle_{\mathsf{Z}}\\[1mm]
\vert\gamma_1\rangle & = \sum_{a = 0}^{N-1} \sqrt{p_a}\, \vert\psi_a\rangle_{\mathsf{X}} \otimes \vert 0\rangle_{\mathsf{Y}} \otimes \vert a\rangle_{\mathsf{Z}}
\end{aligned}
$$

The first vector $\vert\gamma_0\rangle$ is simply the given quantum state vector $\vert\phi\rangle$ tensored with $\vert 0\rangle$ for the new system $\mathsf{Z}.$
For the second vector $\vert\gamma_1\rangle,$ we essentially have a quantum state vector that would make the theorem trivial — at least if $\mathsf{Y}$ were replaced by $\mathsf{Z}$ — because a standard basis measurement performed on $\mathsf{Z}$ clearly yields each outcome $a$ with probability $p_a,$ and conditioned on obtaining this outcome the state of $\mathsf{X}$ becomes $\vert\psi_a\rangle.$

By thinking about the pair $(\mathsf{Y},\mathsf{Z})$ as a single, compound system that can be traced out to leave $\mathsf{X},$ we find that we have identified two different purifications of the state

$$
\rho = \sum_{a = 0}^{N-1} p_a \vert\psi_a\rangle\langle\psi_a\vert.
$$

Specifically, for the first one we have

$$
\operatorname{Tr}_{\mathsf{YZ}} (\vert\gamma_0\rangle\langle\gamma_0\vert)
= \operatorname{Tr}_{\mathsf{Y}} (\vert\phi\rangle\langle\phi\vert) = \rho
$$

and for the second one we have

$$
\begin{aligned}
\operatorname{Tr}_{\mathsf{YZ}} (\vert\gamma_1\rangle\langle\gamma_1\vert)
& = \sum_{a,b = 0}^{N-1} \sqrt{p_a}\sqrt{p_b} \, \vert\psi_a\rangle\langle\psi_b\vert
\operatorname{Tr}(\vert 0\rangle\langle 0\vert \otimes \vert a\rangle\langle b\vert)\\
& = \sum_{a = 0}^{N-1} p_a \, \vert\psi_a\rangle\langle\psi_a\vert\\
& = \rho.
\end{aligned}
$$

There must therefore exist a unitary operation $U$ on $(\mathsf{Y},\mathsf{Z})$ satisfying

$$
(\mathbb{I}_{\mathsf{X}} \otimes U) \vert \gamma_0 \rangle = \vert\gamma_1\rangle
$$

by the unitary equivalence of purifications.

Using this unitary operation $U,$ we can implement a measurement that satisfies the requirements of the theorem as the following diagram illustrates.
In words, we introduce the new system $\mathsf{Z}$ initialized to the $\vert 0\rangle$ state, apply $U$ to $(\mathsf{Y},\mathsf{Z}),$ which transforms the state of $(\mathsf{X},\mathsf{Y},\mathsf{Z})$ from $\vert\gamma_0\rangle$ into $\vert\gamma_1\rangle,$ and then measure $\mathsf{Z}$ with a standard basis measurement, which we've already observed gives the desired behavior.

![A quantum circuit implementation of a measurement for the HJW theorem](/learning/images/courses/general-formulation-of-quantum-information/purifications-and-fidelity/HSW-measurement.svg)

The dotted rectangle in the figure represents an implementation of this measurement, which can be described as a collection of positive semidefinite matrices $\{P_0,\ldots,P_{N-1}\}$ as follows.

$$
P_a = (\mathbb{I}_{\mathsf{Y}} \otimes \langle 0\vert) U^{\dagger}
(\mathbb{I}_{\mathsf{Y}} \otimes \vert a\rangle\langle a \vert)U (\mathbb{I}_{\mathsf{Y}} \otimes
\vert 0\rangle)
$$
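
The whole construction can be followed numerically. The following is a minimal sketch, assuming NumPy, hypothetical helper names, and the tensor-factor ordering $(\mathsf{X},\mathsf{Y},\mathsf{Z}).$ It builds $\vert\gamma_0\rangle$ and $\vert\gamma_1\rangle,$ finds a unitary $U$ on $(\mathsf{Y},\mathsf{Z})$ connecting them (using a full singular value decomposition to complete the orthonormal sets to bases), and then forms the matrices $P_a$ from the formula above.

```python
import numpy as np

def side_vectors(state, eigvals, eigvecs, tol=1e-12):
    """Columns are unit vectors |u_a> with (<x_a| (x) I)|state> = sqrt(p_a)|u_a>."""
    dim_x = len(eigvals)
    m = state.reshape(dim_x, -1)
    cols = [(eigvecs[:, a].conj() @ m) / np.sqrt(eigvals[a])
            for a in range(dim_x) if eigvals[a] > tol]
    return np.array(cols).T

def complete_to_basis(cols):
    """Extend orthonormal columns to an orthonormal basis of the whole space."""
    full, _, _ = np.linalg.svd(cols, full_matrices=True)
    return np.hstack([cols, full[:, cols.shape[1]:]])

def hjw_measurement(phi, probs, states, dim_x, dim_y):
    """Return [P_0, ..., P_{N-1}] on Y realizing the given ensemble on X."""
    n = len(probs)
    e = np.eye(n)
    y_zero = np.eye(dim_y)[0]
    gamma0 = np.kron(phi, e[0])  # |phi>_XY (x) |0>_Z
    gamma1 = sum(np.sqrt(probs[a]) * np.kron(np.kron(states[a], y_zero), e[a])
                 for a in range(n))
    m = phi.reshape(dim_x, dim_y)
    eigvals, eigvecs = np.linalg.eigh(m @ m.conj().T)
    u_basis = complete_to_basis(side_vectors(gamma0, eigvals, eigvecs))
    v_basis = complete_to_basis(side_vectors(gamma1, eigvals, eigvecs))
    U = v_basis @ u_basis.conj().T             # (I_X (x) U)|gamma0> = |gamma1>
    embed = np.kron(np.eye(dim_y), e[:, [0]])  # I_Y (x) |0>
    ops = []
    for a in range(n):
        project = np.kron(np.eye(dim_y), np.outer(e[a], e[a]))  # I_Y (x) |a><a|
        ops.append(embed.conj().T @ U.conj().T @ project @ U @ embed)
    return ops

def psi_theta(theta):
    return np.array([np.cos(theta), np.sin(theta)])

# |phi>: a purification of rho = [[3/4, 1/4], [1/4, 1/4]] on two qubits.
zero = np.array([1.0, 0.0])
one = np.array([0.0, 1.0])
plus = (zero + one) / np.sqrt(2)
phi = (np.cos(np.pi / 8) * np.kron(psi_theta(np.pi / 8), zero)
       + np.sin(np.pi / 8) * np.kron(psi_theta(5 * np.pi / 8), one))

# Target ensemble: |0> and |+>, each with probability 1/2.
probs = [0.5, 0.5]
states = [zero, plus]
ops = hjw_measurement(phi, probs, states, 2, 2)

# Outcome probabilities and resulting states of X when Y is measured.
m = phi.reshape(2, 2)
unnormalized = [m @ P.T @ m.conj().T for P in ops]  # Tr_Y((I (x) P_a)|phi><phi|)
outcome_probs = [np.trace(w).real for w in unnormalized]
post_states = [w / p for w, p in zip(unnormalized, outcome_probs)]
print(np.round(outcome_probs, 6))
```

Here each entry of `post_states` is the density matrix of $\mathsf{X}$ conditioned on the corresponding outcome, which should match $\vert\psi_a\rangle\langle\psi_a\vert$ for the chosen ensemble.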



© IBM Corp., 2017-2025