{/* cspell:ignore operatorname */}

{/* cspell:ignore dephases */}

{/* cspell:ignore mapsto */}

# Quantum channel basics

In mathematical terms, channels are linear mappings from density matrices to density matrices that satisfy certain requirements.
Throughout this lesson we'll use uppercase Greek letters, including $\Phi$ and $\Psi,$ as well as some other letters in specific cases, to refer to channels.

Every channel $\Phi$ has an input system and an output system, and we'll typically use the name $\mathsf{X}$ to refer to the input system and $\mathsf{Y}$ to refer to the output system.
It's common that the output system of a channel is the same as the input system, and in this case we can use the same letter $\mathsf{X}$ to refer to both.

## Channels are linear mappings

Channels are described by *linear* mappings, just like probabilistic operations in the standard formulation of classical information and unitary operations in the simplified formulation of quantum information.

If a channel $\Phi$ is performed on an input system $\mathsf{X}$ whose state is described by a density matrix $\rho,$ then the output system of the channel is described by the density matrix $\Phi(\rho).$
In the situation in which the output system of $\Phi$ is also $\mathsf{X},$ we can simply view the channel as representing a change in the state of $\mathsf{X},$ from $\rho$ to $\Phi(\rho).$
When the output system of $\Phi$ is a different system, $\mathsf{Y},$ rather than $\mathsf{X},$ it should be understood that $\mathsf{Y}$ is a new system that is created by the process of applying the channel, and that the input system, $\mathsf{X},$ is no longer available once the channel is applied. It's as if the channel itself transformed $\mathsf{X}$ into $\mathsf{Y},$ leaving $\mathsf{Y}$ in the state $\Phi(\rho).$

The assumption that channels are described by *linear* mappings can be viewed as being an axiom — or in other words, a basic postulate of the theory rather than something that is proved.
We can, however, see the need for channels to act linearly on convex combinations of density matrix inputs in order for them to be consistent with probability theory and what we've already learned about density matrices.

To be more specific, suppose that we have a channel $\Phi$ and we apply it to a system when it's in one of the two states represented by the density matrices $\rho$ and $\sigma.$
If we apply the channel to $\rho$ we obtain the density matrix $\Phi(\rho)$ and if we apply it to $\sigma$ we obtain the density matrix $\Phi(\sigma).$
Thus, if we randomly choose the input state of $\mathsf{X}$ to be $\rho$ with probability $p$ and $\sigma$ with probability $1-p,$ we'll obtain the output state $\Phi(\rho)$ with probability $p,$ and $\Phi(\sigma)$ with probability $1-p,$ which we represent by a weighted average of density matrices as $p\Phi(\rho) + (1-p)\Phi(\sigma).$

On the other hand, we could think about the input state of the channel as being represented by the weighted average $p\rho + (1-p)\sigma,$ in which case the output is $\Phi(p\rho + (1-p)\sigma).$
It's the same state regardless of how we choose to think about it, so we must have

$$
\Phi(p\rho + (1-p)\sigma) = p\Phi(\rho) + (1-p)\Phi(\sigma).
$$

Whenever we have a mapping that satisfies this condition for every choice of density matrices $\rho$ and $\sigma$ and scalars $p\in [0,1],$ there's always a unique way to extend that mapping to every matrix input (i.e., not just density matrix inputs) so that it's linear.
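
As a quick numerical sanity check of the equation above, here is a minimal NumPy sketch. The example channel (conjugation by the Hadamard matrix), the helper names `random_density_matrix` and `hadamard_channel`, and the value $p = 0.3$ are all assumptions chosen for illustration; any valid channel could be substituted.

```python
import numpy as np

def random_density_matrix(dim, rng):
    """Return a random dim x dim density matrix (positive semidefinite, unit trace)."""
    a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    m = a @ a.conj().T            # positive semidefinite by construction
    return m / np.trace(m).real   # normalize so that the trace is 1

def hadamard_channel(rho):
    """Illustrative channel: conjugation by the Hadamard matrix."""
    h = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
    return h @ rho @ h.conj().T

rng = np.random.default_rng(0)
rho = random_density_matrix(2, rng)
sigma = random_density_matrix(2, rng)
p = 0.3

# Average the inputs first, then apply the channel ...
mixed_then_applied = hadamard_channel(p * rho + (1 - p) * sigma)
# ... or apply the channel to each input, then average the outputs.
applied_then_mixed = p * hadamard_channel(rho) + (1 - p) * hadamard_channel(sigma)

linear_on_mixture = np.allclose(mixed_then_applied, applied_then_mixed)
print(linear_on_mixture)
```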

## Channels transform density matrices into density matrices

Naturally, in addition to being linear mappings, channels must also transform density matrices into density matrices.
If a channel $\Phi$ is applied to an input system while this system is in a state represented by a density matrix $\rho,$ then we obtain a system whose state is represented by $\Phi(\rho),$ which must be a valid density matrix in order for us to interpret it as a state.

It is critically important, though, that we consider a more general situation, where a channel $\Phi$ transforms a system $\mathsf{X}$ into a system $\mathsf{Y}$ in the presence of an additional system $\mathsf{Z}$ to which nothing happens.
That is, if we start with the pair of systems $(\mathsf{Z},\mathsf{X})$ in a state described by some density matrix, and then apply $\Phi$ just to $\mathsf{X},$ transforming it into $\mathsf{Y},$ we must obtain a density matrix describing a state of the pair $(\mathsf{Z},\mathsf{Y}).$

We can describe in mathematical terms how a channel $\Phi,$ having an input system $\mathsf{X}$ and an output system $\mathsf{Y},$ transforms a state of the pair $(\mathsf{Z},\mathsf{X})$ into a state of $(\mathsf{Z},\mathsf{Y})$ when nothing is done to $\mathsf{Z}.$
To keep things simple, we'll assume that the classical state set of $\mathsf{Z}$ is $\{0,\ldots,m-1\}.$
This allows us to write an arbitrary density matrix $\rho,$ representing a state of $(\mathsf{Z},\mathsf{X}),$ in the following form.

$$
\rho = \sum_{a,b = 0}^{m-1} \vert a\rangle\langle b\vert \otimes \rho_{a,b}
= \begin{pmatrix}
\rho_{0,0} & \rho_{0,1} & \cdots & \rho_{0,m-1} \\[1mm]
\rho_{1,0} & \rho_{1,1} & \cdots & \rho_{1,m-1} \\[1mm]
\vdots & \vdots & \ddots & \vdots\\[1mm]
\rho_{m-1,0} & \rho_{m-1,1} & \cdots & \rho_{m-1,m-1}
\end{pmatrix}
$$

On the right-hand side of this equation we have a block matrix, which we can think of as a matrix of matrices except that the inner parentheses are removed.
This leaves us with an ordinary matrix that can alternatively be described using Dirac notation as we have in the middle expression.
Each matrix $\rho_{a,b}$ has rows and columns corresponding to the classical states of $\mathsf{X},$ and these matrices can be determined by a simple formula.

$$
\rho_{a,b} = \bigl(\langle a \vert \otimes \mathbb{I}_{\mathsf{X}} \bigr) \rho \bigl(\vert b \rangle \otimes \mathbb{I}_{\mathsf{X}} \bigr)
$$

Note that these are not density matrices in general — it's only when they're arranged together to form $\rho$ that we obtain a density matrix.

The following equation describes the state of $(\mathsf{Z},\mathsf{Y})$ that is obtained when $\Phi$ is applied to $\mathsf{X}.$

$$
\sum_{a,b = 0}^{m-1} \vert a\rangle\langle b\vert \otimes \Phi(\rho_{a,b})
= \begin{pmatrix}
\Phi(\rho_{0,0}) & \Phi(\rho_{0,1}) & \cdots & \Phi(\rho_{0,m-1}) \\[1mm]
\Phi(\rho_{1,0}) & \Phi(\rho_{1,1}) & \cdots & \Phi(\rho_{1,m-1}) \\[1mm]
\vdots & \vdots & \ddots & \vdots\\[1mm]
\Phi(\rho_{m-1,0}) & \Phi(\rho_{m-1,1}) & \cdots & \Phi(\rho_{m-1,m-1})
\end{pmatrix}
$$

Notice that, in order to evaluate this expression for a given choice of $\Phi$ and $\rho,$ we must understand how $\Phi$ works as a linear mapping on non-density matrix inputs, as each $\rho_{a,b}$ generally won't be a density matrix on its own.
The equation is consistent with the expression $(\operatorname{Id}_{\mathsf{Z}} \otimes \,\Phi)(\rho),$
in which $\operatorname{Id}_{\mathsf{Z}}$ denotes the *identity channel* on the system $\mathsf{Z}.$
This presumes that we've extended the notion of a tensor product to linear mappings from matrices to matrices, which is straightforward — but it isn't really essential to the lesson and won't be explained further.
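
The block formula can be sketched in NumPy as follows. This is only an illustration under stated assumptions: the helper names `apply_to_right` and `flip` are hypothetical, and the example map (conjugation by the Pauli $X$ matrix) and the input state (a Bell state) are chosen just to have something concrete to check against.

```python
import numpy as np

def apply_to_right(phi, rho, m):
    """Apply a linear map phi to X, for a matrix rho on (Z, X) where Z has m classical states.

    Each block rho_{a,b} is a consecutive n x n submatrix of rho, and the output
    is the block matrix whose (a, b) block is phi(rho_{a,b}).
    """
    n = rho.shape[0] // m  # number of classical states of X
    blocks = [[phi(rho[a * n:(a + 1) * n, b * n:(b + 1) * n]) for b in range(m)]
              for a in range(m)]
    return np.block(blocks)

X = np.array([[0, 1], [1, 0]], dtype=complex)

def flip(matrix):
    """Illustrative linear map: conjugation by the Pauli X matrix."""
    return X @ matrix @ X.conj().T

# Input state of (Z, X): the Bell state (|00> + |11>)/sqrt(2) as a density matrix.
phi_plus = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
rho = np.outer(phi_plus, phi_plus.conj())

result = apply_to_right(flip, rho, m=2)

# Cross-check against acting with the identity on Z and X on the right-hand system.
big_x = np.kron(np.eye(2), X)
expected = big_x @ rho @ big_x.conj().T
matches = np.allclose(result, expected)
print(matches)
```

Note that `flip` is evaluated on blocks such as $\rho_{0,1},$ which are not density matrices, so the map must make sense on arbitrary matrix inputs.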

Reiterating a statement made above, in order for a linear mapping $\Phi$ to be a valid channel it must be the case that, for every choice of $\mathsf{Z}$ and every density matrix $\rho$ of the pair $(\mathsf{Z},\mathsf{X}),$ we always obtain a density matrix when $\Phi$ is applied to $\mathsf{X}.$
In mathematical terms, the properties a mapping must possess to be a channel are that it must be *trace-preserving* — so that the matrix we obtain by applying the channel has trace equal to one — as well as *completely positive* — so that the resulting matrix is positive semidefinite.
These are both important properties that can be considered and studied separately, but it isn't critical for the sake of this lesson to consider these properties in isolation.

There are, in fact, linear mappings that always output a density matrix when given a density matrix as input, but fail to map density matrices to density matrices for compound systems, so we do eliminate some linear mappings from the class of channels in this way.
(The linear mapping given by matrix transposition is the simplest example.)
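
The transposition example can be checked numerically. The sketch below is illustrative only: the particular qubit state and the variable names are assumptions made here. It transposes a single-qubit density matrix, which leaves the eigenvalues unchanged, and then transposes just the right-hand qubit of a Bell state, which produces a negative eigenvalue.

```python
import numpy as np

# On a single qubit, transposition maps a density matrix to a density matrix.
qubit_state = np.array([[0.7, 0.2 - 0.1j],
                        [0.2 + 0.1j, 0.3]])
single_min = np.linalg.eigvalsh(qubit_state.T).min()

# Bell state (|00> + |11>)/sqrt(2) of a pair of qubits, as a density matrix.
phi_plus = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
rho = np.outer(phi_plus, phi_plus.conj())

# Transpose only the right-hand qubit: view rho with indices
# (left row, right row, left column, right column) and swap the two right indices.
partial_t = rho.reshape(2, 2, 2, 2).transpose(0, 3, 2, 1).reshape(4, 4)
compound_min = np.linalg.eigvalsh(partial_t).min()

print(single_min)    # positive: still a valid single-qubit state
print(compound_min)  # negative: not a density matrix
```

The trace is still equal to one after the partial transpose, so it is positive semidefiniteness that fails in this case.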

We have a formula analogous to the one above in the case that the two systems $\mathsf{X}$ and $\mathsf{Z}$ are swapped, so that $\Phi$ is applied to the system on the left rather than the right.

$$
\bigl(\Phi\otimes\operatorname{Id}_{\mathsf{Z}}\bigr)(\rho)
= \sum_{a,b = 0}^{m-1} \Phi(\rho_{a,b}) \otimes \vert a\rangle\langle b\vert
$$

This assumes that $\rho$ is a state of $(\mathsf{X},\mathsf{Z})$ rather than $(\mathsf{Z},\mathsf{X}).$
This time the block matrix description doesn't work because the matrices $\rho_{a,b}$ don't fall into consecutive rows and columns in $\rho,$ but it's the same underlying mathematical structure.
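
To make the point about non-consecutive rows and columns concrete, here is a minimal NumPy sketch. The helper names `apply_to_left` and `hadamard_map`, the random input state, and the choice of the Hadamard matrix are assumptions for illustration. The matrices $\rho_{a,b}$ are picked out by reshaping $\rho$ so that each system gets its own index.

```python
import numpy as np

def apply_to_left(phi, rho, m):
    """Apply a linear map phi to X, for a matrix rho on (X, Z) where Z has m classical states."""
    n = rho.shape[0] // m             # number of classical states of X
    t = rho.reshape(n, m, n, m)       # indices: (X row, Z row, X column, Z column)
    terms = []
    for a in range(m):
        for b in range(m):
            rho_ab = t[:, a, :, b]    # entries of rho spread over non-consecutive rows/columns
            e_ab = np.zeros((m, m))
            e_ab[a, b] = 1            # the matrix |a><b|
            terms.append(np.kron(phi(rho_ab), e_ab))
    return sum(terms)

H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)

def hadamard_map(matrix):
    """Illustrative linear map: conjugation by the Hadamard matrix."""
    return H @ matrix @ H.conj().T

# A random two-qubit density matrix to use as the state of (X, Z).
rng = np.random.default_rng(7)
a = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
rho = a @ a.conj().T
rho = rho / np.trace(rho).real

result = apply_to_left(hadamard_map, rho, m=2)

# Cross-check against acting with H on the left-hand system and the identity on Z.
big_h = np.kron(H, np.eye(2))
expected = big_h @ rho @ big_h.conj().T
matches = np.allclose(result, expected)
print(matches)
```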

Any linear mapping that satisfies the requirement that it always transforms density matrices into density matrices, even when it's applied to just one part of a compound system, represents a valid channel.
So, in an abstract sense, the notion of a channel is determined by the notion of a density matrix, together with the assumption that channels act linearly.
In this regard, channels are analogous to unitary operations in the simplified formulation of quantum information, which are precisely the linear mappings that always transform quantum state vectors to quantum state vectors for a given system; as well as to probabilistic operations (represented by stochastic matrices) in the standard formulation of classical information, which are precisely the linear mappings that always transform probability vectors into probability vectors.

## Unitary operations as channels

Suppose $\mathsf{X}$ is a system and $U$ is a unitary matrix representing an operation on $\mathsf{X}.$
The channel $\Phi$ that describes this operation on density matrices is defined as follows for every
density matrix $\rho$ representing a quantum state of $\mathsf{X}.$

$$
\Phi(\rho) = U \rho U^{\dagger}
\tag{1}
$$

This action, where we multiply by $U$ on the left and $U^{\dagger}$ on the right, is commonly referred to as *conjugation* by the matrix $U.$

This description is consistent with the fact that the density matrix that represents a given
quantum state vector $\vert\psi\rangle$ is $\vert\psi\rangle\langle\psi\vert.$
In particular, if the unitary operation $U$ is performed on $\vert\psi\rangle,$ then the output state is
represented by the vector $U\vert\psi\rangle,$ and so the density matrix describing this state is equal to

$$
(U \vert \psi \rangle )( U \vert \psi \rangle )^{\dagger} = U \vert\psi\rangle\langle\psi\vert U^{\dagger}.
$$

Once we know that, as a channel, the operation $U$ has the
action $\vert\psi\rangle\langle \psi\vert \mapsto U \vert\psi\rangle\langle\psi\vert U^{\dagger}$ on pure
states, we can conclude by linearity that it must act as specified by equation $(1)$ above for every
density matrix $\rho,$ because every density matrix is a convex combination of pure states.

The particular channel we obtain when we take $U = \mathbb{I}$ is the *identity channel*$\;\operatorname{Id},$ which we can also give a subscript (such as $\operatorname{Id}_{\mathsf{Z}},$ as we've already encountered) when we wish to indicate explicitly what system this channel acts on.
Its output is always equal to its input: $\operatorname{Id}(\rho) = \rho.$
This might not seem like an interesting channel, but it's actually a very important one — and it's fitting that this is our first example.
The identity channel is the *perfect* channel in some contexts, representing an ideal memory or a perfect, noiseless transmission of information from a sender to a receiver.

Every channel defined by a unitary operation in this way is indeed a valid channel:
conjugation by a matrix $U$ gives us a linear map; and if $\rho$ is a density matrix of a system $(\mathsf{Z},\mathsf{X})$ and $U$ is unitary, then the result, which we can express as

$$
(\mathbb{I}_{\mathsf{Z}} \otimes U) \rho (\mathbb{I}_{\mathsf{Z}} \otimes U^{\dagger}),
$$

is also a density matrix.
Specifically, this matrix must be positive semidefinite, for if $\rho = M^{\dagger} M$ then

$$
(\mathbb{I}_{\mathsf{Z}} \otimes U) \rho (\mathbb{I}_{\mathsf{Z}} \otimes U^{\dagger}) = K^{\dagger} K
$$

for $K = M (\mathbb{I}_{\mathsf{Z}} \otimes U^{\dagger}),$
and it must have unit trace by the cyclic property of the trace.

$$
\operatorname{Tr}\bigl((\mathbb{I}_{\mathsf{Z}} \otimes U) \rho (\mathbb{I}_{\mathsf{Z}} \otimes U^{\dagger})\bigr)
= \operatorname{Tr}\bigl((\mathbb{I}_{\mathsf{Z}} \otimes U^{\dagger})(\mathbb{I}_{\mathsf{Z}} \otimes U) \rho \bigr)
= \operatorname{Tr}\bigl((\mathbb{I}_{\mathsf{Z}} \otimes \mathbb{I}_{\mathsf{X}}) \rho \bigr)
= \operatorname{Tr}(\rho) = 1
$$
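
The two properties argued above can also be observed numerically. The following is a small sketch, not a proof: the random two-qubit state and the choice of the Hadamard matrix for $U$ are assumptions made for illustration.

```python
import numpy as np

# A random density matrix of the pair (Z, X), with both systems being qubits.
rng = np.random.default_rng(1)
a = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
rho = a @ a.conj().T              # positive semidefinite by construction
rho = rho / np.trace(rho).real    # unit trace

# U acts on X only; Z is left alone.
U = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
big_u = np.kron(np.eye(2), U)
out = big_u @ rho @ big_u.conj().T

trace_out = np.trace(out).real
min_eig = np.linalg.eigvalsh(out).min()
same_spectrum = np.allclose(np.linalg.eigvalsh(out), np.linalg.eigvalsh(rho))

print(trace_out)      # equal to 1 up to rounding error
print(min_eig)        # nonnegative up to rounding error
print(same_spectrum)  # conjugation by a unitary matrix preserves eigenvalues
```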

## Convex combinations of channels

Suppose we have two channels, $\Phi_0$ and $\Phi_1,$ that share the same input system and the same output system.
For any real number $p\in[0,1],$ we could decide to apply $\Phi_0$ with probability $p$ and
$\Phi_1$ with probability $1-p,$ which gives us a new channel that can be written as $p \Phi_0 + (1-p) \Phi_1.$
Explicitly, the way that this channel acts on a given density matrix is specified by the following simple equation.

$$
(p \Phi_0 + (1-p) \Phi_1)(\rho) = p \Phi_0(\rho) + (1-p) \Phi_1(\rho)
$$

More generally, if we have channels $\Phi_{0},\ldots,\Phi_{m-1}$ and a probability vector
$(p_0,\ldots, p_{m-1}),$ then we can average these channels together to obtain a new channel.

$$
\sum_{k = 0}^{m-1} p_k \Phi_k
$$

This is a *convex combination* of channels, and we always obtain a valid channel through this process.
A simple way to say this in mathematical terms is that, for a given choice of an input and output system, the set of all channels is a *convex set*.

As an example, we could choose to apply one of a collection of *unitary* operations to a certain system.
We obtain what's known as a *mixed unitary* channel, which is a channel that can be expressed in the following form.

$$
\Phi(\rho) = \sum_{k=0}^{m-1} p_k U_k \rho U_k^{\dagger}
$$

Mixed unitary channels for which all of the unitary operations are Pauli matrices (or tensor products of Pauli matrices) are called *Pauli channels*, and are commonly encountered in quantum computing.
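
As an illustration, a single-qubit Pauli channel can be sketched in a few lines of NumPy. The function name `pauli_channel`, the dictionary-based interface, and the example probabilities are assumptions made here, not part of any particular library.

```python
import numpy as np

PAULIS = {
    "I": np.eye(2, dtype=complex),
    "X": np.array([[0, 1], [1, 0]], dtype=complex),
    "Y": np.array([[0, -1j], [1j, 0]]),
    "Z": np.array([[1, 0], [0, -1]], dtype=complex),
}

def pauli_channel(rho, probs):
    """Apply the mixed unitary channel sum_k p_k P_k rho P_k^dagger.

    probs maps Pauli labels ("I", "X", "Y", "Z") to probabilities summing to 1.
    """
    if not np.isclose(sum(probs.values()), 1.0):
        raise ValueError("probabilities must sum to 1")
    return sum(p * PAULIS[k] @ rho @ PAULIS[k].conj().T for k, p in probs.items())

# Example: apply X with probability 0.1 and do nothing otherwise.
ket0_bra0 = np.array([[1, 0], [0, 0]], dtype=complex)
flipped = pauli_channel(ket0_bra0, {"I": 0.9, "X": 0.1})
print(flipped.real)
```

For the input $\vert 0\rangle\langle 0\vert,$ the output is the diagonal density matrix with entries $0.9$ and $0.1.$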

## Examples of qubit channels

Now we'll take a look at a few specific examples of channels that aren't unitary.
For all of these examples, the input and output systems are both single qubits, which is to say that these are examples of *qubit channels*.

### The qubit reset channel

This channel does something very simple: it resets a qubit to the $\vert 0\rangle$ state.
As a linear mapping this channel can be expressed as follows for every qubit density matrix $\rho.$

$$
\Lambda(\rho) = \operatorname{Tr}(\rho) \vert 0\rangle\langle 0\vert
$$

Although the trace of every density matrix $\rho$ is equal to $1,$ writing the channel in this way makes it
clear that it's a linear mapping that could be applied to any $2\times 2$ matrix, not just a density matrix.
As we already observed, we need to understand how channels work as linear mappings on non-density matrix inputs to describe what happens when they're applied to just one part of a compound system.

For example, suppose that $\mathsf{A}$ and $\mathsf{B}$ are qubits and together the pair $(\mathsf{A},\mathsf{B})$ is in the Bell state $\vert \phi^+\rangle.$
As a density matrix, this state is given by

$$
\vert \phi^+\rangle\langle \phi^+ \vert =
\begin{pmatrix}
\frac{1}{2} & 0 & 0 & \frac{1}{2} \\[1mm]
0 & 0 & 0 & 0 \\[1mm]
0 & 0 & 0 & 0 \\[1mm]
\frac{1}{2} & 0 & 0 & \frac{1}{2}
\end{pmatrix}.
$$

Using Dirac notation we can alternatively express this state as follows.

$$
\vert \phi^+\rangle\langle \phi^+ \vert =
\frac{1}{2} \vert 0 \rangle \langle 0 \vert \otimes \vert 0 \rangle \langle 0 \vert +
\frac{1}{2} \vert 0 \rangle \langle 1 \vert \otimes \vert 0 \rangle \langle 1 \vert +
\frac{1}{2} \vert 1 \rangle \langle 0 \vert \otimes \vert 1 \rangle \langle 0 \vert +
\frac{1}{2} \vert 1 \rangle \langle 1 \vert \otimes \vert 1 \rangle \langle 1 \vert
$$

By applying the qubit reset channel to $\mathsf{A}$ and doing nothing to $\mathsf{B}$ we obtain the following state.

$$
\begin{aligned}
\frac{1}{2} \Lambda(\vert 0 \rangle \langle 0 \vert) \otimes \vert 0 \rangle \langle 0 \vert +
\frac{1}{2} \Lambda(\vert 0 \rangle \langle 1 \vert) \otimes \vert 0 \rangle \langle 1 \vert +
\frac{1}{2} \Lambda(\vert 1 \rangle \langle 0 \vert) \otimes \vert 1 \rangle \langle 0 \vert +
\frac{1}{2} \Lambda(\vert 1 \rangle \langle 1 \vert) \otimes \vert 1 \rangle \langle 1 \vert \qquad &
\\[1mm]
= \frac{1}{2} \vert 0 \rangle \langle 0 \vert \otimes \vert 0 \rangle \langle 0 \vert
+ \frac{1}{2} \vert 0 \rangle \langle 0 \vert \otimes \vert 1 \rangle \langle 1 \vert
= \vert 0\rangle \langle 0\vert \otimes \frac{\mathbb{I}}{2} &
\end{aligned}
$$

It might be tempting to say that resetting $\mathsf{A}$ has had an effect on $\mathsf{B},$ causing it to become completely mixed — but in some sense it's actually the opposite.
Before $\mathsf{A}$ was reset, the reduced state of $\mathsf{B}$ was the completely mixed state, and that doesn't change as a result of resetting $\mathsf{A}.$
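
This calculation can be reproduced with a short NumPy sketch. The helper names `reset` and `reduced_b` are hypothetical, and the reshape-based indexing is just one of several ways to apply a map to the left-hand qubit.

```python
import numpy as np

KET0_BRA0 = np.array([[1, 0], [0, 0]], dtype=complex)

def reset(matrix):
    """The qubit reset map as a linear map on 2 x 2 matrices: Tr(M) |0><0|."""
    return np.trace(matrix) * KET0_BRA0

def reduced_b(rho_ab):
    """Reduced state of B, obtained by tracing out A from a matrix on (A, B)."""
    return np.einsum("acad->cd", rho_ab.reshape(2, 2, 2, 2))

# The Bell state |phi+> of (A, B) as a density matrix.
phi_plus = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
rho = np.outer(phi_plus, phi_plus.conj())

# Apply the reset map to A and nothing to B.
t = rho.reshape(2, 2, 2, 2)   # indices: (A row, B row, A column, B column)
after = np.zeros((4, 4), dtype=complex)
for c in range(2):
    for d in range(2):
        e_cd = np.zeros((2, 2))
        e_cd[c, d] = 1        # the matrix |c><d| on B
        after += np.kron(reset(t[:, c, :, d]), e_cd)

print(after.real)             # |0><0| tensor I/2
print(reduced_b(rho).real)    # reduced state of B before the reset
print(reduced_b(after).real)  # reduced state of B after the reset
```

Both reduced states of $\mathsf{B}$ come out as $\mathbb{I}/2,$ in agreement with the discussion above.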

### The completely dephasing channel

Here's an example of a qubit channel called $\Delta,$ described by its action on $2\times 2$ matrices:

$$
\Delta
\begin{pmatrix}
   \alpha_{00} & \alpha_{01}\\[1mm]
   \alpha_{10} & \alpha_{11}
\end{pmatrix}
= \begin{pmatrix}
   \alpha_{00} & 0\\[1mm]
   0 & \alpha_{11}
\end{pmatrix}.
$$

In words, $\Delta$ zeros out the off-diagonal entries of a $2\times 2$ matrix.
This example can be generalized to arbitrary systems, as opposed to just qubits: for whatever density matrix is
input, the channel zeros out all of the off-diagonal entries and leaves the diagonal alone.

This channel is called the *completely dephasing channel*, and it can be thought of as representing an
extreme form of the process known as *decoherence* — which essentially ruins quantum superpositions and
turns them into classical probabilistic states.

Another way to think about this channel is that it describes a standard basis measurement on a qubit, where
an input qubit is measured and then discarded, and where the output is a density matrix describing the measurement
outcome.
Alternatively, but equivalently, we can imagine that the measurement outcome is discarded, leaving the qubit in its post-measurement state.

Let us again consider an e-bit, and see what happens when $\Delta$ is applied to just one of the two qubits.
Specifically, we have qubits $\mathsf{A}$ and $\mathsf{B}$ for which $(\mathsf{A},\mathsf{B})$ is in the state $\vert\phi^+\rangle,$ and this time let's apply the channel to the second qubit.
Here's the state we obtain.

$$
\begin{aligned}
\frac{1}{2} \vert 0 \rangle \langle 0 \vert \otimes \Delta(\vert 0 \rangle \langle 0 \vert) +
\frac{1}{2} \vert 0 \rangle \langle 1 \vert \otimes \Delta(\vert 0 \rangle \langle 1 \vert) +
\frac{1}{2} \vert 1 \rangle \langle 0 \vert \otimes \Delta(\vert 1 \rangle \langle 0 \vert) +
\frac{1}{2} \vert 1 \rangle \langle 1 \vert \otimes \Delta(\vert 1 \rangle \langle 1 \vert) \qquad & \\[1mm]
= \frac{1}{2} \vert 0 \rangle \langle 0 \vert \otimes \vert 0 \rangle \langle 0 \vert
+ \frac{1}{2} \vert 1 \rangle \langle 1 \vert \otimes \vert 1 \rangle \langle 1 \vert &
\end{aligned}
$$

Alternatively we can express this equation using block matrices.

$$
\begin{pmatrix}
\Delta\begin{pmatrix}
\frac{1}{2} & 0\\[1mm]
0 & 0
\end{pmatrix}
& \Delta\begin{pmatrix}
0 & \frac{1}{2}\\[1mm]
0 & 0
\end{pmatrix} \\[4mm]
\Delta\begin{pmatrix}
0 & 0\\[1mm]
\frac{1}{2} & 0
\end{pmatrix}
& \Delta\begin{pmatrix}
0 & 0\\[1mm]
0 & \frac{1}{2}
\end{pmatrix}
\end{pmatrix}
= \begin{pmatrix}
\frac{1}{2} & 0 & 0 & 0\\[1mm]
0 & 0 & 0 & 0\\[1mm]
0 & 0 & 0 & 0\\[1mm]
0 & 0 & 0 & \frac{1}{2}
\end{pmatrix}
$$
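
The same block-matrix computation can be carried out numerically. In this minimal sketch the function name `dephase` is an assumption, and the blocks are taken as consecutive $2\times 2$ submatrices because the channel is applied to the right-hand qubit.

```python
import numpy as np

def dephase(matrix):
    """Completely dephasing map: keep the diagonal, zero out the off-diagonal entries."""
    return np.diag(np.diag(matrix))

# The Bell state |phi+> of (A, B) as a density matrix.
phi_plus = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
rho = np.outer(phi_plus, phi_plus.conj())

# Apply the map to B (the right-hand qubit) block by block.
blocks = [[dephase(rho[2 * a:2 * a + 2, 2 * b:2 * b + 2]) for b in range(2)]
          for a in range(2)]
after = np.block(blocks)

print(after.real)
```

The off-diagonal blocks have zero diagonals, so they are mapped to zero, and only the entries $\frac{1}{2}$ in the top-left and bottom-right corners survive.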

We can also consider a qubit channel that only slightly dephases a qubit, as opposed to completely dephasing it, which is a less extreme form of decoherence than what is represented by the completely dephasing channel.
In particular, suppose that $\varepsilon \in (0,1)$ is a small but nonzero real number.
We can define a channel

$$
\Delta_{\varepsilon} = (1 - \varepsilon) \operatorname{Id} + \varepsilon \Delta,
$$

which transforms a given qubit density matrix $\rho$ like this:

$$
\Delta_{\varepsilon}(\rho) = (1 - \varepsilon) \rho + \varepsilon \Delta(\rho).
$$

That is, nothing happens with probability $1-\varepsilon,$ and with probability $\varepsilon,$ the qubit dephases.
In terms of matrices, this action can be expressed as follows, where the diagonal entries are left alone and the off-diagonal entries are multiplied by $1-\varepsilon.$

$$
\rho =
\begin{pmatrix}
\langle 0\vert \rho \vert 0 \rangle & \langle 0\vert \rho \vert 1 \rangle \\[1mm]
\langle 1\vert \rho \vert 0 \rangle & \langle 1\vert \rho \vert 1 \rangle
\end{pmatrix}
\mapsto
\begin{pmatrix}
\langle 0\vert \rho \vert 0 \rangle & (1-\varepsilon) \langle 0\vert \rho \vert 1 \rangle \\[1mm]
(1-\varepsilon) \langle 1\vert \rho \vert 0 \rangle & \langle 1\vert \rho \vert 1 \rangle
\end{pmatrix}
$$
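
Here is a small NumPy sketch of this action on the state $\vert + \rangle\langle + \vert,$ with $\varepsilon = 0.1$ chosen arbitrarily. The function names are hypothetical. The sketch also checks an observation that is not made in the text above: since $\Delta(\rho) = \frac{1}{2}\rho + \frac{1}{2} Z\rho Z$ for the Pauli matrix $Z,$ the channel $\Delta_{\varepsilon}$ coincides with the Pauli channel that applies $Z$ with probability $\varepsilon/2.$

```python
import numpy as np

def dephase(matrix):
    """Completely dephasing map: zero out the off-diagonal entries."""
    return np.diag(np.diag(matrix))

def partial_dephase(rho, eps):
    """Do nothing with probability 1 - eps, completely dephase with probability eps."""
    return (1 - eps) * rho + eps * dephase(rho)

# The state |+><+| has the largest possible off-diagonal entries for a qubit.
plus = np.array([[0.5, 0.5], [0.5, 0.5]], dtype=complex)
eps = 0.1
out = partial_dephase(plus, eps)
print(out.real)  # diagonal unchanged, off-diagonal entries scaled by 1 - eps

# Equivalent form as a Pauli channel that applies Z with probability eps / 2.
Z = np.array([[1, 0], [0, -1]], dtype=complex)
as_pauli_channel = (1 - eps / 2) * plus + (eps / 2) * Z @ plus @ Z
same = np.allclose(out, as_pauli_channel)
print(same)
```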

### The completely depolarizing channel

Here's another example of a qubit channel called $\Omega.$

$$
\Omega(\rho) = \operatorname{Tr}(\rho) \frac{\mathbb{I}}{2}
$$

Here $\mathbb{I}$ denotes the $2\times 2$ identity matrix.
In words, for any density matrix input $\rho,$ the channel $\Omega$ outputs the completely mixed state.
It doesn't get any noisier than this!
This channel is called the *completely depolarizing channel*, and like the completely dephasing channel it can
be generalized to arbitrary systems in place of qubits.

We can also consider a less extreme variant of this channel where depolarizing happens with probability $\varepsilon,$ similar to what we saw for the dephasing channel.

$$
\Omega_{\varepsilon}(\rho) = (1 - \varepsilon) \rho + \varepsilon \Omega(\rho).
$$
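
As with dephasing, this can be sketched numerically. The function names, the example states, and the value $\varepsilon = 0.2$ are assumptions for illustration. The final check relies on a fact not stated above: applying one of the four Pauli matrices $\mathbb{I}, X, Y, Z$ uniformly at random yields the completely mixed state, so $\Omega$ is itself a Pauli channel.

```python
import numpy as np

def depolarize(matrix):
    """Completely depolarizing map on 2 x 2 matrices: Tr(M) I/2."""
    return np.trace(matrix) * np.eye(2) / 2

def partial_depolarize(rho, eps):
    """Do nothing with probability 1 - eps, completely depolarize with probability eps."""
    return (1 - eps) * rho + eps * depolarize(rho)

ket0_bra0 = np.array([[1, 0], [0, 0]], dtype=complex)
out = partial_depolarize(ket0_bra0, eps=0.2)
print(out.real)  # diagonal entries 0.9 and 0.1

# Uniform average over the four Pauli matrices, checked on an arbitrary qubit state.
paulis = [
    np.eye(2, dtype=complex),
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]]),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
rho = np.array([[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]])
pauli_average = sum(P @ rho @ P.conj().T for P in paulis) / 4
same = np.allclose(pauli_average, depolarize(rho))
print(same)
```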



© IBM Corp., 2017-2025