Convex combinations of density matrices
Probabilistic selections of density matrices
A key aspect of density matrices is that probabilistic selections of quantum states are represented by convex combinations of their associated density matrices.
For example, if we have two density matrices, $\rho$ and $\sigma$, representing quantum states of a system $\mathsf{X}$, and we prepare the system in the state $\rho$ with probability $p$ and $\sigma$ with probability $1-p$, then the resulting quantum state is represented by the density matrix
$$p\rho + (1-p)\sigma.$$
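More generally, if we have $m$ quantum states represented by density matrices $\rho_0, \ldots, \rho_{m-1}$, and a system is prepared in the state $\rho_k$ with probability $p_k$ for some probability vector $(p_0, \ldots, p_{m-1})$, the resulting state is represented by the density matrix
$$\sum_{k=0}^{m-1} p_k \rho_k.$$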
This is a convex combination of the density matrices $\rho_0, \ldots, \rho_{m-1}$.
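It follows that if we have $m$ quantum state vectors $\vert\psi_0\rangle, \ldots, \vert\psi_{m-1}\rangle$, and we prepare a system in the state $\vert\psi_k\rangle$ with probability $p_k$ for each $k \in \{0, \ldots, m-1\}$, the state we obtain is represented by the density matrix
$$\sum_{k=0}^{m-1} p_k \vert\psi_k\rangle\langle\psi_k\vert.$$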
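For example, if a qubit is prepared in the state $\vert 0\rangle$ with probability $1/2$ and in the state $\vert +\rangle$ with probability $1/2$, the density matrix representation of the state we obtain is given by
$$\frac{1}{2}\vert 0\rangle\langle 0\vert + \frac{1}{2}\vert +\rangle\langle +\vert = \frac{1}{2}\begin{pmatrix} 1 & 0\\ 0 & 0 \end{pmatrix} + \frac{1}{2}\begin{pmatrix} \frac{1}{2} & \frac{1}{2}\\ \frac{1}{2} & \frac{1}{2} \end{pmatrix} = \begin{pmatrix} \frac{3}{4} & \frac{1}{4}\\ \frac{1}{4} & \frac{1}{4} \end{pmatrix}.$$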
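As a quick numerical check, here is a minimal NumPy sketch of such a convex combination, taking the two states to be $\vert 0\rangle$ and $\vert +\rangle$, each chosen with probability $1/2$. The helper name `density` is an illustrative choice and not part of the lesson.

```python
import numpy as np

def density(psi):
    """Return the density matrix |psi><psi| of a state vector (hypothetical helper)."""
    psi = np.asarray(psi, dtype=complex)
    return np.outer(psi, psi.conj())

ket0 = np.array([1, 0])
ket_plus = np.array([1, 1]) / np.sqrt(2)

# Prepare |0> with probability 1/2 and |+> with probability 1/2.
rho = 0.5 * density(ket0) + 0.5 * density(ket_plus)

print(rho.real)            # entries close to 3/4, 1/4, 1/4, 1/4
print(np.trace(rho).real)  # close to 1, as required of a density matrix
```

The same pattern extends to any number of states by summing `p_k * density(psi_k)` over a probability vector.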
In the simplified formulation of quantum information, averaging quantum state vectors like this doesn't work. For instance, the vector
$$\frac{1}{2}\vert 0\rangle + \frac{1}{2}\vert +\rangle = \frac{1}{2}\begin{pmatrix} 1\\ 0 \end{pmatrix} + \frac{1}{2}\begin{pmatrix} \frac{1}{\sqrt{2}}\\ \frac{1}{\sqrt{2}} \end{pmatrix} = \begin{pmatrix} \frac{2+\sqrt{2}}{4}\\ \frac{1}{2\sqrt{2}} \end{pmatrix}$$
is not a valid quantum state vector because its Euclidean norm is not equal to $1$. A more extreme example shows the same thing: fix any quantum state vector $\vert\psi\rangle$, and take the state to be $\vert\psi\rangle$ with probability $1/2$ and $-\vert\psi\rangle$ with probability $1/2$. These states differ by a global phase, so they're actually the same state, but averaging the vectors gives us the zero vector, which is not a valid quantum state vector.
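The contrast between averaging vectors and averaging density matrices can be sketched numerically. The snippet below assumes the example states $\vert 0\rangle$ and $\vert +\rangle$, and uses $\vert +\rangle$ as an arbitrary stand-in for the fixed vector $\vert\psi\rangle$; the variable names are illustrative.

```python
import numpy as np

ket0 = np.array([1.0, 0.0])
ket_plus = np.array([1.0, 1.0]) / np.sqrt(2)

# Averaging the vectors themselves does not give a unit vector.
avg = 0.5 * ket0 + 0.5 * ket_plus
avg_norm = np.linalg.norm(avg)  # about 0.92, not 1

# |psi> and -|psi> average to the zero vector ...
psi = ket_plus
zero = 0.5 * psi + 0.5 * (-psi)

# ... but they have the same density matrix, so averaging those is harmless.
rho_a = np.outer(psi, psi.conj())
rho_b = np.outer(-psi, (-psi).conj())
rho_avg = 0.5 * rho_a + 0.5 * rho_b

print(avg_norm, zero, np.allclose(rho_avg, rho_a))
```

The global phase cancels in the outer product, which is why the density matrix picture does not run into the problem that the vector picture does.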
The completely mixed state
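Suppose we set the state of a qubit to be $\vert 0\rangle$ or $\vert 1\rangle$ randomly, each with probability $1/2$. The density matrix representing the resulting state is as follows.
$$\frac{1}{2}\vert 0\rangle\langle 0\vert + \frac{1}{2}\vert 1\rangle\langle 1\vert = \frac{1}{2}\begin{pmatrix} 1 & 0\\ 0 & 0 \end{pmatrix} + \frac{1}{2}\begin{pmatrix} 0 & 0\\ 0 & 1 \end{pmatrix} = \begin{pmatrix} \frac{1}{2} & 0\\ 0 & \frac{1}{2} \end{pmatrix} = \frac{1}{2}\mathbb{I}$$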
(In this equation the symbol $\mathbb{I}$ denotes the $2\times 2$ identity matrix.) This is a special state known as the completely mixed state. It represents complete uncertainty about the state of a qubit, similar to a uniform random bit in the probabilistic setting.
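Now suppose that we change the procedure: in place of the states $\vert 0\rangle$ and $\vert 1\rangle$ we'll use the states $\vert +\rangle$ and $\vert -\rangle$. We can compute the density matrix that describes the resulting state in a similar way.
$$\frac{1}{2}\vert +\rangle\langle +\vert + \frac{1}{2}\vert -\rangle\langle -\vert = \frac{1}{2}\begin{pmatrix} \frac{1}{2} & \frac{1}{2}\\ \frac{1}{2} & \frac{1}{2} \end{pmatrix} + \frac{1}{2}\begin{pmatrix} \frac{1}{2} & -\frac{1}{2}\\ -\frac{1}{2} & \frac{1}{2} \end{pmatrix} = \begin{pmatrix} \frac{1}{2} & 0\\ 0 & \frac{1}{2} \end{pmatrix} = \frac{1}{2}\mathbb{I}$$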
It's the same density matrix as before, even though we changed the states. In fact, we would again obtain the same result, the completely mixed state, by substituting any two orthogonal qubit state vectors for $\vert 0\rangle$ and $\vert 1\rangle$.
This is a feature, not a bug! We do in fact obtain exactly the same state either way. That is, there's no way to distinguish the two procedures by measuring the qubit they produce, even in a statistical sense. Our two different procedures are simply different ways to prepare this state.
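A short numerical sketch can illustrate the claim that any orthonormal pair of qubit states yields the completely mixed state. It assumes NumPy; the helper `uniform_mixture`, the seed, and the use of a QR decomposition to produce a random orthonormal basis are illustrative choices.

```python
import numpy as np

def uniform_mixture(states):
    """Density matrix for choosing one of `states` uniformly at random."""
    return sum(np.outer(s, s.conj()) for s in states) / len(states)

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)
plus = (ket0 + ket1) / np.sqrt(2)
minus = (ket0 - ket1) / np.sqrt(2)

# A random orthonormal basis: the columns of the unitary factor Q
# from a QR decomposition of a random complex matrix.
rng = np.random.default_rng(seed=0)
a = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
q, _ = np.linalg.qr(a)
random_basis = [q[:, 0], q[:, 1]]

mixtures = [uniform_mixture(b) for b in ([ket0, ket1], [plus, minus], random_basis)]
for rho in mixtures:
    print(np.round(rho, 12))  # each is numerically half the identity
```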
We can verify that this makes sense by thinking about what we could hope to learn given a random selection of a state from one of the two possible state sets $\{\vert 0\rangle, \vert 1\rangle\}$ and $\{\vert +\rangle, \vert -\rangle\}$. To keep things simple, let's suppose that we perform a unitary operation $U$ on our qubit and then measure in the standard basis.
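In the first scenario, the state of the qubit is chosen uniformly from the set $\{\vert 0\rangle, \vert 1\rangle\}$. If the state is $\vert 0\rangle$, we obtain the outcomes $0$ and $1$ with probabilities
$$\vert\langle 0\vert U\vert 0\rangle\vert^2 \quad\text{and}\quad \vert\langle 1\vert U\vert 0\rangle\vert^2$$
respectively. If the state is $\vert 1\rangle$, we obtain the outcomes $0$ and $1$ with probabilities
$$\vert\langle 0\vert U\vert 1\rangle\vert^2 \quad\text{and}\quad \vert\langle 1\vert U\vert 1\rangle\vert^2.$$
Because the two possibilities each happen with probability $1/2$, we obtain the outcome $0$ with probability
$$\frac{1}{2}\vert\langle 0\vert U\vert 0\rangle\vert^2 + \frac{1}{2}\vert\langle 0\vert U\vert 1\rangle\vert^2$$
and the outcome $1$ with probability
$$\frac{1}{2}\vert\langle 1\vert U\vert 0\rangle\vert^2 + \frac{1}{2}\vert\langle 1\vert U\vert 1\rangle\vert^2.$$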
Both of these expressions are equal to $1/2$. One way to argue this is to use a fact from linear algebra that can be seen as a generalization of the Pythagorean theorem.
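Theorem. Suppose $\{\vert\psi_1\rangle, \ldots, \vert\psi_n\rangle\}$ is an orthonormal basis of a (real or complex) vector space $\mathcal{V}$. For every vector $\vert\phi\rangle \in \mathcal{V}$ we have
$$\vert\langle\psi_1\vert\phi\rangle\vert^2 + \cdots + \vert\langle\psi_n\vert\phi\rangle\vert^2 = \bigl\|\vert\phi\rangle\bigr\|^2.$$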
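We can apply this theorem to determine the probabilities as follows, using the fact that $\vert\langle a\vert U\vert b\rangle\vert = \vert\langle b\vert U^{\dagger}\vert a\rangle\vert$. The probability to get $0$ is
$$\frac{1}{2}\vert\langle 0\vert U\vert 0\rangle\vert^2 + \frac{1}{2}\vert\langle 0\vert U\vert 1\rangle\vert^2 = \frac{1}{2}\Bigl(\vert\langle 0\vert U^{\dagger}\vert 0\rangle\vert^2 + \vert\langle 1\vert U^{\dagger}\vert 0\rangle\vert^2\Bigr) = \frac{1}{2}\bigl\|U^{\dagger}\vert 0\rangle\bigr\|^2$$
and the probability to get $1$ is
$$\frac{1}{2}\vert\langle 1\vert U\vert 0\rangle\vert^2 + \frac{1}{2}\vert\langle 1\vert U\vert 1\rangle\vert^2 = \frac{1}{2}\Bigl(\vert\langle 0\vert U^{\dagger}\vert 1\rangle\vert^2 + \vert\langle 1\vert U^{\dagger}\vert 1\rangle\vert^2\Bigr) = \frac{1}{2}\bigl\|U^{\dagger}\vert 1\rangle\bigr\|^2.$$
Because $U$ is unitary, we know that $U^{\dagger}$ is unitary as well, implying that both $U^{\dagger}\vert 0\rangle$ and $U^{\dagger}\vert 1\rangle$ are unit vectors. Both probabilities are therefore equal to $1/2$. This means that no matter how we choose $U$, we're just going to get a uniform random bit from the measurement.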
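We can perform a similar verification for any other pair of orthonormal states in place of $\vert 0\rangle$ and $\vert 1\rangle$. For example, because $\{\vert +\rangle, \vert -\rangle\}$ is an orthonormal basis, the probability to obtain the measurement outcome $0$ in the second procedure is
$$\frac{1}{2}\vert\langle 0\vert U\vert +\rangle\vert^2 + \frac{1}{2}\vert\langle 0\vert U\vert -\rangle\vert^2 = \frac{1}{2}\Bigl(\vert\langle +\vert U^{\dagger}\vert 0\rangle\vert^2 + \vert\langle -\vert U^{\dagger}\vert 0\rangle\vert^2\Bigr) = \frac{1}{2}\bigl\|U^{\dagger}\vert 0\rangle\bigr\|^2 = \frac{1}{2}$$
and the probability to get $1$ is
$$\frac{1}{2}\vert\langle 1\vert U\vert +\rangle\vert^2 + \frac{1}{2}\vert\langle 1\vert U\vert -\rangle\vert^2 = \frac{1}{2}\Bigl(\vert\langle +\vert U^{\dagger}\vert 1\rangle\vert^2 + \vert\langle -\vert U^{\dagger}\vert 1\rangle\vert^2\Bigr) = \frac{1}{2}\bigl\|U^{\dagger}\vert 1\rangle\bigr\|^2 = \frac{1}{2}.$$
In particular, we obtain exactly the same output statistics as we did for the states $\vert 0\rangle$ and $\vert 1\rangle$.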
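The argument can be spot-checked numerically for one randomly chosen unitary. This is only a sketch under stated assumptions: the unitary is taken from a QR decomposition of a random complex matrix, the seed is arbitrary, and `outcome_probs` is a hypothetical helper name.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
a = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
U, _ = np.linalg.qr(a)  # the Q factor of a complex matrix is unitary

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)
plus = (ket0 + ket1) / np.sqrt(2)
minus = (ket0 - ket1) / np.sqrt(2)

def outcome_probs(states, U):
    """Standard-basis outcome probabilities, averaged over a uniform choice of state."""
    # |<a|U|psi>|^2 for a = 0, 1 is the squared magnitude of each entry of U|psi>.
    return sum(np.abs(U @ s) ** 2 for s in states) / len(states)

probs_01 = outcome_probs([ket0, ket1], U)
probs_pm = outcome_probs([plus, minus], U)
print(probs_01, probs_pm)  # both numerically (1/2, 1/2)
```

A single random unitary is of course not a proof; the theorem above is what guarantees the result for every choice of unitary.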
Probabilistic states
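Classical states can be represented by density matrices. In particular, for each classical state $a$ of a system $\mathsf{X}$, the density matrix
$$\rho = \vert a\rangle\langle a\vert$$
represents $\mathsf{X}$ being definitively in the classical state $a$. For qubits we have
$$\vert 0\rangle\langle 0\vert = \begin{pmatrix} 1 & 0\\ 0 & 0 \end{pmatrix} \quad\text{and}\quad \vert 1\rangle\langle 1\vert = \begin{pmatrix} 0 & 0\\ 0 & 1 \end{pmatrix},$$
and in general we have a single $1$ on the diagonal in the position corresponding to the classical state we have in mind, with all other entries zero.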
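We can then take convex combinations of these density matrices to represent probabilistic states. Supposing for simplicity that our classical state set is $\{0, \ldots, n-1\}$, if $\mathsf{X}$ is in the state $a$ with probability $p_a$ for each $a \in \{0, \ldots, n-1\}$, then the density matrix we obtain is
$$\rho = \sum_{a=0}^{n-1} p_a \vert a\rangle\langle a\vert = \begin{pmatrix} p_0 & 0 & \cdots & 0\\ 0 & p_1 & \ddots & \vdots\\ \vdots & \ddots & \ddots & 0\\ 0 & \cdots & 0 & p_{n-1} \end{pmatrix}.$$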
Going in the other direction, any diagonal density matrix can naturally be identified with the probabilistic state we obtain by simply reading the probability vector off from the diagonal.
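The correspondence between probability vectors and diagonal density matrices can be sketched in a few lines of NumPy. The probability vector below is an arbitrary example, not one taken from the lesson.

```python
import numpy as np

p = np.array([0.5, 0.3, 0.2])  # example probability vector (an arbitrary choice)
n = len(p)
basis = np.eye(n)              # row a is the standard basis vector |a>

# Convex combination of the classical-state density matrices |a><a|.
rho = sum(p[a] * np.outer(basis[a], basis[a]) for a in range(n))

# Reading the diagonal back off recovers the probability vector.
recovered = np.diag(rho)
print(rho)
print(recovered)
```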
To be clear, when a density matrix is diagonal, it's not necessarily the case that we're talking about a classical system, or that the system must have been prepared through the random selection of a classical state, but rather that the state could have been obtained through the random selection of a classical state.
The fact that probabilistic states are represented by diagonal density matrices is consistent with the intuition suggested at the start of the lesson that off-diagonal entries describe the degree to which the two classical states corresponding to the row and column of that entry are in quantum superposition. Here, all of the off-diagonal entries are zero, so we just have classical randomness and nothing is in quantum superposition.
Density matrices and the spectral theorem
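We've seen that if we take a convex combination of pure states,
$$\rho = \sum_{k=0}^{m-1} p_k \vert\psi_k\rangle\langle\psi_k\vert,$$
we obtain a density matrix. In fact, every density matrix $\rho$ can be expressed as a convex combination of pure states like this. That is, there will always exist a collection of unit vectors $\{\vert\psi_0\rangle, \ldots, \vert\psi_{m-1}\rangle\}$ and a probability vector $(p_0, \ldots, p_{m-1})$ for which the equation above is true.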
We can, moreover, always choose the number $m$ so that it agrees with the number of classical states of the system being considered, and we can select the quantum state vectors to be orthogonal. The spectral theorem, which we encountered in the "Foundations of quantum algorithms" course, allows us to conclude this. Here's a restatement of the spectral theorem for convenience.
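Theorem (spectral theorem). Let $M$ be a normal $n \times n$ complex matrix. There exists an orthonormal basis of $n$-dimensional complex vectors $\{\vert\psi_0\rangle, \ldots, \vert\psi_{n-1}\rangle\}$ along with complex numbers $\lambda_0, \ldots, \lambda_{n-1}$ such that
$$M = \lambda_0 \vert\psi_0\rangle\langle\psi_0\vert + \cdots + \lambda_{n-1} \vert\psi_{n-1}\rangle\langle\psi_{n-1}\vert.$$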
(Recall that a matrix $M$ is normal if it satisfies $M^{\dagger}M = MM^{\dagger}$. In words, normal matrices are matrices that commute with their own conjugate transpose.)
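We can apply the spectral theorem to any given density matrix $\rho$ because density matrices are always Hermitian and therefore normal. This allows us to write
$$\rho = \lambda_0 \vert\psi_0\rangle\langle\psi_0\vert + \cdots + \lambda_{n-1} \vert\psi_{n-1}\rangle\langle\psi_{n-1}\vert$$
for some orthonormal basis $\{\vert\psi_0\rangle, \ldots, \vert\psi_{n-1}\rangle\}$. It remains to verify that $(\lambda_0, \ldots, \lambda_{n-1})$ is a probability vector, which we can then rename to $(p_0, \ldots, p_{n-1})$ if we wish.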
The numbers $\lambda_0, \ldots, \lambda_{n-1}$ are the eigenvalues of $\rho$, and because $\rho$ is positive semidefinite, these numbers must be nonnegative real numbers. We can conclude that $\lambda_0 + \cdots + \lambda_{n-1} = 1$ from the fact that $\rho$ has trace equal to $1$. Going through the details will give us an opportunity to point out the following important and very useful property of the trace.
Theorem (cyclic property of the trace). For any two matrices $A$ and $B$ that give us a square matrix $AB$ by multiplying, the equality $\operatorname{Tr}(AB) = \operatorname{Tr}(BA)$ is true.
Note that this theorem works even if $A$ and $B$ are not themselves square matrices. That is, we may have that $A$ is $n \times m$ and $B$ is $m \times n$ for some choice of positive integers $n$ and $m$, so that $AB$ is an $n \times n$ square matrix and $BA$ is $m \times m$.
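In particular, if we let $A$ be a column vector $\vert\phi\rangle$ and let $B$ be the row vector $\langle\phi\vert$, then we see that
$$\operatorname{Tr}\bigl(\vert\phi\rangle\langle\phi\vert\bigr) = \operatorname{Tr}\bigl(\langle\phi\vert\phi\rangle\bigr) = \langle\phi\vert\phi\rangle.$$
The second equality follows from the fact that $\langle\phi\vert\phi\rangle$ is a scalar, which we can also think of as a $1 \times 1$ matrix whose trace is its single entry. Using this fact, we can conclude that $\lambda_0 + \cdots + \lambda_{n-1} = 1$ by the linearity of the trace function.
$$1 = \operatorname{Tr}(\rho) = \lambda_0 \operatorname{Tr}\bigl(\vert\psi_0\rangle\langle\psi_0\vert\bigr) + \cdots + \lambda_{n-1} \operatorname{Tr}\bigl(\vert\psi_{n-1}\rangle\langle\psi_{n-1}\vert\bigr) = \lambda_0 + \cdots + \lambda_{n-1}$$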
Alternatively, we can reach the same conclusion by using the fact that the trace of a square matrix (even one that isn't normal) is equal to the sum of its eigenvalues.
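The cyclic property of the trace, including the case of non-square factors, can be checked numerically. The sketch below uses arbitrary random matrices (the shapes and seed are illustrative assumptions) and also confirms that the trace of an outer product of a vector with itself equals the corresponding inner product.

```python
import numpy as np

rng = np.random.default_rng(seed=2)
A = rng.normal(size=(3, 5)) + 1j * rng.normal(size=(3, 5))  # 3 x 5
B = rng.normal(size=(5, 3)) + 1j * rng.normal(size=(5, 3))  # 5 x 3

tr_AB = np.trace(A @ B)  # trace of a 3 x 3 matrix
tr_BA = np.trace(B @ A)  # trace of a 5 x 5 matrix

phi = rng.normal(size=4) + 1j * rng.normal(size=4)
tr_outer = np.trace(np.outer(phi, phi.conj()))  # Tr(|phi><phi|)
inner = np.vdot(phi, phi)                       # <phi|phi>; vdot conjugates its first argument

print(np.isclose(tr_AB, tr_BA), np.isclose(tr_outer, inner))
```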
We have therefore concluded that any given density matrix $\rho$ can be expressed as a convex combination of pure states. We also see that we can, moreover, take the pure states to be orthogonal. This means, in particular, that we never need the number $m$ to be larger than the size of the classical state set of $\mathsf{X}$.
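In general, it must be understood that there will be different ways to write a density matrix as a convex combination of pure states, not just the ways that the spectral theorem provides. A previous example illustrates this.
$$\frac{1}{2}\vert 0\rangle\langle 0\vert + \frac{1}{2}\vert +\rangle\langle +\vert = \begin{pmatrix} \frac{3}{4} & \frac{1}{4}\\ \frac{1}{4} & \frac{1}{4} \end{pmatrix}$$
This is not a spectral decomposition of this matrix because $\vert 0\rangle$ and $\vert +\rangle$ are not orthogonal. Here's a spectral decomposition:
$$\begin{pmatrix} \frac{3}{4} & \frac{1}{4}\\ \frac{1}{4} & \frac{1}{4} \end{pmatrix} = \cos^2(\pi/8)\,\vert\psi_{\pi/8}\rangle\langle\psi_{\pi/8}\vert + \sin^2(\pi/8)\,\vert\psi_{5\pi/8}\rangle\langle\psi_{5\pi/8}\vert,$$
where $\vert\psi_{\theta}\rangle = \cos(\theta)\vert 0\rangle + \sin(\theta)\vert 1\rangle$. The eigenvalues are numbers that will likely look familiar:
$$\cos^2(\pi/8) = \frac{2+\sqrt{2}}{4} \approx 0.85 \quad\text{and}\quad \sin^2(\pi/8) = \frac{2-\sqrt{2}}{4} \approx 0.15.$$
The eigenvectors can be written explicitly like this.
$$\vert\psi_{\pi/8}\rangle = \frac{\sqrt{2+\sqrt{2}}}{2}\vert 0\rangle + \frac{\sqrt{2-\sqrt{2}}}{2}\vert 1\rangle, \qquad \vert\psi_{5\pi/8}\rangle = -\frac{\sqrt{2-\sqrt{2}}}{2}\vert 0\rangle + \frac{\sqrt{2+\sqrt{2}}}{2}\vert 1\rangle$$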
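A spectral decomposition like this can be obtained numerically. The sketch below assumes the matrix in question is the equal mixture of $\vert 0\rangle$ and $\vert +\rangle$, with entries $3/4, 1/4, 1/4, 1/4$, and uses NumPy's `numpy.linalg.eigh`, which is intended for Hermitian matrices.

```python
import numpy as np

rho = np.array([[0.75, 0.25],
                [0.25, 0.25]])

# eigh returns eigenvalues in ascending order; the columns of `vecs`
# are corresponding orthonormal eigenvectors.
vals, vecs = np.linalg.eigh(rho)

# Rebuild rho as a convex combination of the orthogonal pure states.
rebuilt = sum(lam * np.outer(v, v.conj()) for lam, v in zip(vals, vecs.T))

print(vals)                       # approximately sin^2(pi/8) and cos^2(pi/8)
print(np.allclose(rebuilt, rho))
```

Eigenvectors are only determined up to a phase, so the columns returned by `eigh` may differ from a hand-derived answer by a sign.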
As another, more general example, suppose $\vert\phi_0\rangle, \ldots, \vert\phi_{99}\rangle$ are quantum state vectors representing states of a single qubit, chosen arbitrarily, so we're not assuming any particular relationships among these vectors. We could then consider the state we obtain by choosing one of these 100 states uniformly at random:
$$\rho = \frac{1}{100}\sum_{k=0}^{99} \vert\phi_k\rangle\langle\phi_k\vert.$$
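Because we're talking about a qubit, the density matrix $\rho$ is $2 \times 2$, so by the spectral theorem we could alternatively write
$$\rho = p\,\vert\psi_0\rangle\langle\psi_0\vert + (1-p)\,\vert\psi_1\rangle\langle\psi_1\vert$$
for some real number $p \in [0,1]$ and an orthonormal basis $\{\vert\psi_0\rangle, \vert\psi_1\rangle\}$. Naturally, the existence of this expression doesn't prohibit us from writing $\rho$ as an average of 100 pure states if we choose to do that.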
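To illustrate, the following sketch averages 100 randomly generated qubit states and then re-expresses the result using only two orthogonal pure states. The random sampling method and seed are assumptions made for the example; any 100 unit vectors would serve.

```python
import numpy as np

rng = np.random.default_rng(seed=3)

# 100 arbitrary qubit state vectors (normalized random complex vectors).
phis = rng.normal(size=(100, 2)) + 1j * rng.normal(size=(100, 2))
phis /= np.linalg.norm(phis, axis=1, keepdims=True)

# Uniform mixture of the 100 pure states.
rho = sum(np.outer(v, v.conj()) for v in phis) / 100

# The spectral theorem gives an equivalent mixture of just two orthogonal states.
vals, vecs = np.linalg.eigh(rho)
rebuilt = (vals[0] * np.outer(vecs[:, 0], vecs[:, 0].conj())
           + vals[1] * np.outer(vecs[:, 1], vecs[:, 1].conj()))

print(vals, np.allclose(rebuilt, rho))
```

Here `vals[1]` plays the role of $p$ and `vals[0]` the role of $1-p$, or the other way around, since the labeling of the two eigenvectors is arbitrary.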