Inner products and projections
To better prepare ourselves to explore the capabilities and limitations of quantum circuits, we now introduce some additional mathematical concepts — namely the inner product between vectors (and its connection to the Euclidean norm), the notions of orthogonality and orthonormality for sets of vectors, and projection matrices, which will allow us to introduce a handy generalization of standard basis measurements.
Inner products
Recall that when we use the Dirac notation to refer to an arbitrary column vector as a ket, such as

$$
\vert \psi \rangle = \begin{pmatrix} \alpha_1 \\ \vdots \\ \alpha_n \end{pmatrix},
$$

the corresponding bra vector is the conjugate transpose of this vector:

$$
\langle \psi \vert = \bigl( \vert \psi \rangle \bigr)^{\dagger} = \begin{pmatrix} \overline{\alpha_1} & \cdots & \overline{\alpha_n} \end{pmatrix}. \tag{1}
$$

Alternatively, if we have some classical state set $\Sigma$ in mind, and we express a column vector as a ket, such as

$$
\vert \psi \rangle = \sum_{a \in \Sigma} \alpha_a \vert a \rangle,
$$

then the corresponding row (or bra) vector is the conjugate transpose

$$
\langle \psi \vert = \sum_{a \in \Sigma} \overline{\alpha_a} \langle a \vert. \tag{2}
$$

We also have that the product of a bra vector and a ket vector, viewed as matrices either having a single row or a single column, results in a scalar. Specifically, if we have two column vectors

$$
\vert \psi \rangle = \begin{pmatrix} \alpha_1 \\ \vdots \\ \alpha_n \end{pmatrix}
\quad \text{and} \quad
\vert \phi \rangle = \begin{pmatrix} \beta_1 \\ \vdots \\ \beta_n \end{pmatrix},
$$

so that the row vector $\langle \psi \vert$ is as in equation (1), then

$$
\langle \psi \vert \phi \rangle
= \langle \psi \vert \, \vert \phi \rangle
= \begin{pmatrix} \overline{\alpha_1} & \cdots & \overline{\alpha_n} \end{pmatrix}
  \begin{pmatrix} \beta_1 \\ \vdots \\ \beta_n \end{pmatrix}
= \overline{\alpha_1} \beta_1 + \cdots + \overline{\alpha_n} \beta_n.
$$

Alternatively, if we have two column vectors that we have written as

$$
\vert \psi \rangle = \sum_{a \in \Sigma} \alpha_a \vert a \rangle
\quad \text{and} \quad
\vert \phi \rangle = \sum_{b \in \Sigma} \beta_b \vert b \rangle,
$$

so that $\langle \psi \vert$ is the row vector (2), we find that

$$
\langle \psi \vert \phi \rangle
= \Biggl( \sum_{a \in \Sigma} \overline{\alpha_a} \langle a \vert \Biggr)
  \Biggl( \sum_{b \in \Sigma} \beta_b \vert b \rangle \Biggr)
= \sum_{a \in \Sigma} \sum_{b \in \Sigma} \overline{\alpha_a} \beta_b \langle a \vert b \rangle
= \sum_{a \in \Sigma} \overline{\alpha_a} \beta_a,
$$

where the last equality follows from the observation that $\langle a \vert a \rangle = 1$ and $\langle a \vert b \rangle = 0$ for classical states $a$ and $b$ satisfying $a \neq b$.

The value $\langle \psi \vert \phi \rangle$ is called the inner product between the vectors $\vert \psi \rangle$ and $\vert \phi \rangle$. Inner products are critically important in quantum information and computation; we would not get far in understanding quantum information at a mathematical level without them.
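As a quick illustration, here is a minimal NumPy sketch (our own, not part of the text) that computes an inner product as the product of a bra and a ket; NumPy's `vdot` conjugates its first argument and therefore computes the same quantity directly.

```python
import numpy as np

# |psi> = (|0> + i|1>)/sqrt(2) and |phi> = |0>, as column vectors.
ket_psi = np.array([1, 1j]) / np.sqrt(2)
ket_phi = np.array([1, 0], dtype=complex)

# The bra <psi| is the conjugate transpose of the ket |psi>.
bra_psi = ket_psi.conj()

# <psi|phi> as the product of a row vector and a column vector.
inner = bra_psi @ ket_phi
print(inner)                      # approximately 0.7071
print(np.vdot(ket_psi, ket_phi))  # same value: vdot conjugates its first argument
```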
Let us now collect together some basic facts about inner products of vectors.
- Relationship to the Euclidean norm. The inner product of any vector

  $$
  \vert \psi \rangle = \sum_{a \in \Sigma} \alpha_a \vert a \rangle
  $$

  with itself is

  $$
  \langle \psi \vert \psi \rangle
  = \sum_{a \in \Sigma} \overline{\alpha_a} \alpha_a
  = \sum_{a \in \Sigma} \vert \alpha_a \vert^2
  = \bigl\| \vert \psi \rangle \bigr\|^2.
  $$

  Thus, the Euclidean norm of a vector may alternatively be expressed as

  $$
  \bigl\| \vert \psi \rangle \bigr\| = \sqrt{\langle \psi \vert \psi \rangle}.
  $$

  Notice that the Euclidean norm of a vector must always be a nonnegative real number. Moreover, the only way the Euclidean norm of a vector can be equal to zero is if every one of the entries is equal to zero, which is to say that the vector is the zero vector.

  We can summarize these observations like this: for every vector $\vert \psi \rangle$ we have

  $$
  \langle \psi \vert \psi \rangle \geq 0,
  $$

  with $\langle \psi \vert \psi \rangle = 0$ if and only if $\vert \psi \rangle = 0$. This property of the inner product is sometimes referred to as positive definiteness.
- Conjugate symmetry. For any two vectors

  $$
  \vert \psi \rangle = \sum_{a \in \Sigma} \alpha_a \vert a \rangle
  \quad \text{and} \quad
  \vert \phi \rangle = \sum_{b \in \Sigma} \beta_b \vert b \rangle,
  $$

  we have

  $$
  \langle \psi \vert \phi \rangle = \sum_{a \in \Sigma} \overline{\alpha_a} \beta_a
  \quad \text{and} \quad
  \langle \phi \vert \psi \rangle = \sum_{a \in \Sigma} \overline{\beta_a} \alpha_a,
  $$

  and therefore

  $$
  \overline{\langle \psi \vert \phi \rangle} = \langle \phi \vert \psi \rangle.
  $$
- Linearity in the second argument (and conjugate linearity in the first). Let us suppose that $\vert \psi \rangle$, $\vert \phi_1 \rangle$, and $\vert \phi_2 \rangle$ are vectors and $\alpha_1$ and $\alpha_2$ are complex numbers. If we define a new vector

  $$
  \vert \phi \rangle = \alpha_1 \vert \phi_1 \rangle + \alpha_2 \vert \phi_2 \rangle,
  $$

  then

  $$
  \langle \psi \vert \phi \rangle
  = \alpha_1 \langle \psi \vert \phi_1 \rangle + \alpha_2 \langle \psi \vert \phi_2 \rangle.
  $$

  That is to say, the inner product is linear in the second argument. This can be verified either through the formulas above or simply by noting that matrix multiplication is linear in each argument (and specifically in the second argument).

  Combining this fact with conjugate symmetry reveals that the inner product is conjugate linear in the first argument. That is, if $\vert \psi_1 \rangle$, $\vert \psi_2 \rangle$, and $\vert \phi \rangle$ are vectors and $\alpha_1$ and $\alpha_2$ are complex numbers, and we define

  $$
  \vert \psi \rangle = \alpha_1 \vert \psi_1 \rangle + \alpha_2 \vert \psi_2 \rangle,
  $$

  then

  $$
  \langle \psi \vert \phi \rangle
  = \overline{\alpha_1} \langle \psi_1 \vert \phi \rangle + \overline{\alpha_2} \langle \psi_2 \vert \phi \rangle.
  $$
- The Cauchy–Schwarz inequality. For every choice of vectors $\vert \phi \rangle$ and $\vert \psi \rangle$ having the same number of entries, we have

  $$
  \bigl\vert \langle \psi \vert \phi \rangle \bigr\vert
  \leq \bigl\| \vert \psi \rangle \bigr\| \, \bigl\| \vert \phi \rangle \bigr\|.
  $$

  This is an incredibly handy inequality that gets used quite extensively in quantum information (and in many other topics of study).
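All of the facts above are easy to test numerically. The following sketch (again our own illustration) checks conjugate symmetry, positive definiteness, and the Cauchy–Schwarz inequality for randomly chosen complex vectors.

```python
import numpy as np

rng = np.random.default_rng(7)

def random_vector(n):
    """A random complex vector with n entries (not normalized)."""
    return rng.normal(size=n) + 1j * rng.normal(size=n)

psi, phi = random_vector(4), random_vector(4)

# Conjugate symmetry: the complex conjugate of <psi|phi> equals <phi|psi>.
assert np.isclose(np.conj(np.vdot(psi, phi)), np.vdot(phi, psi))

# Positive definiteness: <psi|psi> is the squared Euclidean norm, a real number >= 0.
assert np.isclose(np.vdot(psi, psi).real, np.linalg.norm(psi) ** 2)

# Cauchy-Schwarz: |<psi|phi>| <= ||psi|| * ||phi||.
assert abs(np.vdot(psi, phi)) <= np.linalg.norm(psi) * np.linalg.norm(phi)
```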
Orthogonal and orthonormal sets
Two vectors $\vert \phi \rangle$ and $\vert \psi \rangle$ are said to be orthogonal if their inner product is zero:

$$
\langle \psi \vert \phi \rangle = 0.
$$
Geometrically, we can think about orthogonal vectors as vectors at right angles to each other.
A set of vectors $\{ \vert \psi_1 \rangle, \ldots, \vert \psi_m \rangle \}$ is called an orthogonal set if every vector in the set is orthogonal to every other vector in the set. That is, this set is orthogonal if

$$
\langle \psi_j \vert \psi_k \rangle = 0
$$

for all choices of $j, k \in \{1, \ldots, m\}$ for which $j \neq k$.
A set of vectors $\{ \vert \psi_1 \rangle, \ldots, \vert \psi_m \rangle \}$ is called an orthonormal set if it is an orthogonal set and, in addition, every vector in the set is a unit vector. Alternatively, this set is an orthonormal set if we have

$$
\langle \psi_j \vert \psi_k \rangle =
\begin{cases}
1 & \text{if } j = k \\
0 & \text{if } j \neq k
\end{cases}
$$

for all choices of $j, k \in \{1, \ldots, m\}$.
Finally, a set $\{ \vert \psi_1 \rangle, \ldots, \vert \psi_m \rangle \}$ is an orthonormal basis if, in addition to being an orthonormal set, it forms a basis. This is equivalent to $\{ \vert \psi_1 \rangle, \ldots, \vert \psi_m \rangle \}$ being an orthonormal set and $m$ being equal to the dimension of the space from which $\vert \psi_1 \rangle, \ldots, \vert \psi_m \rangle$ are drawn.
For example, for any classical state set $\Sigma$, the set of all standard basis vectors

$$
\{ \vert a \rangle : a \in \Sigma \}
$$

is an orthonormal basis. The set $\{ \vert {+} \rangle, \vert {-} \rangle \}$ is an orthonormal basis for the $2$-dimensional space corresponding to a single qubit, and the Bell basis $\{ \vert \phi^+ \rangle, \vert \phi^- \rangle, \vert \psi^+ \rangle, \vert \psi^- \rangle \}$ is an orthonormal basis for the $4$-dimensional space corresponding to two qubits.
Extending orthonormal sets to orthonormal bases
Suppose that $\vert \psi_1 \rangle, \ldots, \vert \psi_m \rangle$ are vectors that live in an $n$-dimensional space, and assume moreover that $\{ \vert \psi_1 \rangle, \ldots, \vert \psi_m \rangle \}$ is an orthonormal set. Orthonormal sets are always linearly independent sets, so these vectors necessarily span a subspace of dimension $m$. From this we conclude that $m \leq n$, because the dimension of the subspace spanned by these vectors cannot be larger than the dimension of the entire space from which they're drawn.
If it is the case that $m < n$, then it is always possible to choose an additional $n - m$ vectors $\vert \psi_{m+1} \rangle, \ldots, \vert \psi_n \rangle$ so that $\{ \vert \psi_1 \rangle, \ldots, \vert \psi_n \rangle \}$ forms an orthonormal basis. A procedure known as the Gram–Schmidt orthogonalization process can be used to construct these vectors.
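As a sketch of how this can be done in practice, the following NumPy routine (our own simplified version, not an excerpt from the text) runs the Gram–Schmidt process over the standard basis vectors, appending each candidate only if it is linearly independent of the vectors found so far.

```python
import numpy as np

def gram_schmidt(vectors, dim):
    """Extend a given orthonormal list of vectors to an orthonormal basis
    of a dim-dimensional space via the Gram-Schmidt process."""
    basis = [np.asarray(v, dtype=complex) for v in vectors]
    for candidate in np.eye(dim):        # try each standard basis vector in turn
        v = candidate.astype(complex)
        for b in basis:                  # subtract the projection onto each basis vector
            v = v - np.vdot(b, v) * b
        norm = np.linalg.norm(v)
        if norm > 1e-10:                 # keep the remainder only if it is nonzero
            basis.append(v / norm)
        if len(basis) == dim:
            break
    return basis

# Extend the single unit vector |+> to an orthonormal basis of a 2-dimensional space.
plus = np.array([1, 1]) / np.sqrt(2)
basis = gram_schmidt([plus], 2)
print(np.round(np.array(basis), 3))      # second vector is (|0> - |1>)/sqrt(2)
```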
Orthonormal sets and unitary matrices
Orthonormal sets of vectors are closely connected with unitary matrices. One way to express this connection is to say that the following three statements are logically equivalent (meaning that they are all true or all false) for any choice of a square matrix $U$:

- The matrix $U$ is unitary (i.e., $U^{\dagger} U = \mathbb{1} = U U^{\dagger}$).
- The rows of $U$ form an orthonormal set.
- The columns of $U$ form an orthonormal set.
This equivalence is actually pretty straightforward when we think about how matrix multiplication and the conjugate transpose work. Suppose, for instance, that we have a $3 \times 3$ matrix like this:

$$
U = \begin{pmatrix}
u_{1,1} & u_{1,2} & u_{1,3} \\
u_{2,1} & u_{2,2} & u_{2,3} \\
u_{3,1} & u_{3,2} & u_{3,3}
\end{pmatrix}
$$

The conjugate transpose of $U$ looks like this:

$$
U^{\dagger} = \begin{pmatrix}
\overline{u_{1,1}} & \overline{u_{2,1}} & \overline{u_{3,1}} \\
\overline{u_{1,2}} & \overline{u_{2,2}} & \overline{u_{3,2}} \\
\overline{u_{1,3}} & \overline{u_{2,3}} & \overline{u_{3,3}}
\end{pmatrix}
$$

Multiplying the two matrices, with the conjugate transpose on the left-hand side, gives us this matrix:

$$
U^{\dagger} U =
\begin{pmatrix}
\overline{u_{1,1}} u_{1,1} + \overline{u_{2,1}} u_{2,1} + \overline{u_{3,1}} u_{3,1} &
\overline{u_{1,1}} u_{1,2} + \overline{u_{2,1}} u_{2,2} + \overline{u_{3,1}} u_{3,2} &
\overline{u_{1,1}} u_{1,3} + \overline{u_{2,1}} u_{2,3} + \overline{u_{3,1}} u_{3,3} \\
\overline{u_{1,2}} u_{1,1} + \overline{u_{2,2}} u_{2,1} + \overline{u_{3,2}} u_{3,1} &
\overline{u_{1,2}} u_{1,2} + \overline{u_{2,2}} u_{2,2} + \overline{u_{3,2}} u_{3,2} &
\overline{u_{1,2}} u_{1,3} + \overline{u_{2,2}} u_{2,3} + \overline{u_{3,2}} u_{3,3} \\
\overline{u_{1,3}} u_{1,1} + \overline{u_{2,3}} u_{2,1} + \overline{u_{3,3}} u_{3,1} &
\overline{u_{1,3}} u_{1,2} + \overline{u_{2,3}} u_{2,2} + \overline{u_{3,3}} u_{3,2} &
\overline{u_{1,3}} u_{1,3} + \overline{u_{2,3}} u_{2,3} + \overline{u_{3,3}} u_{3,3}
\end{pmatrix}
$$

If we form three vectors from the columns of $U$,

$$
\vert \psi_1 \rangle = \begin{pmatrix} u_{1,1} \\ u_{2,1} \\ u_{3,1} \end{pmatrix},
\quad
\vert \psi_2 \rangle = \begin{pmatrix} u_{1,2} \\ u_{2,2} \\ u_{3,2} \end{pmatrix},
\quad
\vert \psi_3 \rangle = \begin{pmatrix} u_{1,3} \\ u_{2,3} \\ u_{3,3} \end{pmatrix},
$$

then we can alternatively express the product above as follows:

$$
U^{\dagger} U =
\begin{pmatrix}
\langle \psi_1 \vert \psi_1 \rangle & \langle \psi_1 \vert \psi_2 \rangle & \langle \psi_1 \vert \psi_3 \rangle \\
\langle \psi_2 \vert \psi_1 \rangle & \langle \psi_2 \vert \psi_2 \rangle & \langle \psi_2 \vert \psi_3 \rangle \\
\langle \psi_3 \vert \psi_1 \rangle & \langle \psi_3 \vert \psi_2 \rangle & \langle \psi_3 \vert \psi_3 \rangle
\end{pmatrix}
$$

Referring to the orthonormality condition stated earlier, we now see that the condition that this matrix is equal to the identity matrix, $U^{\dagger} U = \mathbb{1}$, is equivalent to the orthonormality of the set $\{ \vert \psi_1 \rangle, \vert \psi_2 \rangle, \vert \psi_3 \rangle \}$.
This argument generalizes to unitary matrices of any size. The fact that the rows of a matrix form an orthonormal basis if and only if the matrix is unitary then follows from the fact that a matrix is unitary if and only if its transpose is unitary.
Given the equivalence described above, together with the fact that every orthonormal set can be extended to form an orthonormal basis, we conclude the following useful fact: Given any orthonormal set of vectors $\{ \vert \psi_1 \rangle, \ldots, \vert \psi_m \rangle \}$ drawn from an $n$-dimensional space, there exists a unitary matrix $U$ whose first $m$ columns are the vectors $\vert \psi_1 \rangle, \ldots, \vert \psi_m \rangle$. Pictorially, we can always find a unitary matrix having this form:

$$
U = \begin{pmatrix}
\vert \psi_1 \rangle & \cdots & \vert \psi_m \rangle & \vert \psi_{m+1} \rangle & \cdots & \vert \psi_n \rangle
\end{pmatrix}
$$

Here, the last $n - m$ columns are filled in with any choice of vectors $\vert \psi_{m+1} \rangle, \ldots, \vert \psi_n \rangle$ that make $\{ \vert \psi_1 \rangle, \ldots, \vert \psi_n \rangle \}$ an orthonormal basis.
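One convenient way to produce such a completion numerically (our own sketch, not a method prescribed by the text) is a QR factorization: start from any invertible matrix whose leading column is the given orthonormal vector, and let QR orthonormalize the rest.

```python
import numpy as np

# Place the Bell state |phi+> as the first column of an otherwise random
# matrix; QR factorization then orthonormalizes the columns.
phi_plus = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)

A = np.random.default_rng(3).normal(size=(4, 4)) + 0j
A[:, 0] = phi_plus                   # pin the first column
U, R = np.linalg.qr(A)               # the columns of U are orthonormal

# QR returns the first column only up to a phase; fix the phase so it
# matches |phi+> exactly (this does not affect unitarity).
U[:, 0] *= R[0, 0] / abs(R[0, 0])

assert np.allclose(U.conj().T @ U, np.eye(4))   # U is unitary
assert np.allclose(U[:, 0], phi_plus)           # first column is |phi+>
```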
Projections and projective measurements
Projection matrices
A square matrix $\Pi$ is called a projection if it satisfies two properties:

1. $\Pi = \Pi^{\dagger}$.
2. $\Pi^2 = \Pi$.

Matrices that satisfy the first condition — that they are equal to their own conjugate transpose — are called Hermitian matrices, and matrices that satisfy the second condition — that squaring them leaves them unchanged — are called idempotent matrices.
As a word of caution, the word projection is sometimes used to refer to any matrix that satisfies just the second condition but not necessarily the first, and when this is done the term orthogonal projection is typically used to refer to matrices satisfying both properties. In the context of quantum information and computation, however, the terms projection and projection matrix more typically refer to matrices satisfying both conditions.
An example of a projection is the matrix

$$
\Pi = \vert \psi \rangle \langle \psi \vert \tag{3}
$$

for any unit vector $\vert \psi \rangle$. We can see that this matrix is Hermitian as follows:

$$
\Pi^{\dagger}
= \bigl( \vert \psi \rangle \langle \psi \vert \bigr)^{\dagger}
= \bigl( \langle \psi \vert \bigr)^{\dagger} \bigl( \vert \psi \rangle \bigr)^{\dagger}
= \vert \psi \rangle \langle \psi \vert
= \Pi.
$$

Here, to obtain the second equality, we have used the formula

$$
(A B)^{\dagger} = B^{\dagger} A^{\dagger},
$$

which is always true, for any two matrices $A$ and $B$ for which the product $AB$ makes sense.

To see that the matrix $\Pi$ in equation (3) is idempotent, we can use the assumption that $\vert \psi \rangle$ is a unit vector, so that it satisfies $\langle \psi \vert \psi \rangle = 1$. Thus, we have

$$
\Pi^2
= \vert \psi \rangle \langle \psi \vert \psi \rangle \langle \psi \vert
= \vert \psi \rangle \langle \psi \vert
= \Pi.
$$
More generally, if $\{ \vert \psi_1 \rangle, \ldots, \vert \psi_m \rangle \}$ is any orthonormal set of vectors, then the matrix

$$
\Pi = \sum_{k=1}^{m} \vert \psi_k \rangle \langle \psi_k \vert \tag{4}
$$

is a projection. Specifically, we have

$$
\Pi^{\dagger}
= \sum_{k=1}^{m} \bigl( \vert \psi_k \rangle \langle \psi_k \vert \bigr)^{\dagger}
= \sum_{k=1}^{m} \vert \psi_k \rangle \langle \psi_k \vert
= \Pi,
$$

and

$$
\Pi^2
= \Biggl( \sum_{j=1}^{m} \vert \psi_j \rangle \langle \psi_j \vert \Biggr)
  \Biggl( \sum_{k=1}^{m} \vert \psi_k \rangle \langle \psi_k \vert \Biggr)
= \sum_{j=1}^{m} \sum_{k=1}^{m} \vert \psi_j \rangle \langle \psi_j \vert \psi_k \rangle \langle \psi_k \vert
= \sum_{k=1}^{m} \vert \psi_k \rangle \langle \psi_k \vert
= \Pi,
$$

where the orthonormality of $\{ \vert \psi_1 \rangle, \ldots, \vert \psi_m \rangle \}$ implies the second-to-last equality.
In fact, this exhausts all of the possibilities: every projection $\Pi$ can be written in the form (4) for some choice of an orthonormal set $\{ \vert \psi_1 \rangle, \ldots, \vert \psi_m \rangle \}$. (Technically speaking, the zero matrix $\Pi = 0$, which is a projection, is a special case. To fit it into the general form (4) we must allow the possibility that the sum is empty, resulting in the zero matrix.)
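Here is a short numerical confirmation (our own) that a matrix of the form (4) is Hermitian and idempotent, using the orthonormal set consisting of $(\vert 0 \rangle + \vert 1 \rangle)/\sqrt{2}$ and $\vert 2 \rangle$ inside a 3-dimensional space.

```python
import numpy as np

# An orthonormal set: |psi_1> = (|0> + |1>)/sqrt(2) and |psi_2> = |2>.
psi_1 = np.array([1, 1, 0], dtype=complex) / np.sqrt(2)
psi_2 = np.array([0, 0, 1], dtype=complex)

# Pi = |psi_1><psi_1| + |psi_2><psi_2|.
Pi = np.outer(psi_1, psi_1.conj()) + np.outer(psi_2, psi_2.conj())

assert np.allclose(Pi, Pi.conj().T)   # Hermitian: Pi equals its conjugate transpose
assert np.allclose(Pi @ Pi, Pi)       # idempotent: squaring Pi leaves it unchanged
```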
Projective measurements
The notion of a measurement of a quantum system is more general than just standard basis measurements. Projective measurements are measurements that are described by a collection of projections whose sum is equal to the identity matrix. In symbols, a collection $\{ \Pi_0, \ldots, \Pi_{m-1} \}$ of projection matrices describes a projective measurement if

$$
\Pi_0 + \cdots + \Pi_{m-1} = \mathbb{1}.
$$
When such a measurement is performed on a system $\mathsf{X}$ while it is in some state $\vert \psi \rangle$, two things happen:

- For each $k \in \{0, \ldots, m-1\}$, the outcome of the measurement is $k$ with probability equal to

  $$
  \operatorname{Pr}(\text{outcome is } k) = \bigl\| \Pi_k \vert \psi \rangle \bigr\|^2.
  $$

- For whichever outcome $k$ the measurement produces, the state of $\mathsf{X}$ becomes

  $$
  \frac{\Pi_k \vert \psi \rangle}{\bigl\| \Pi_k \vert \psi \rangle \bigr\|}.
  $$
We can also choose outcomes other than $0, \ldots, m-1$ for projective measurements if we wish. More generally, for any finite and nonempty set $\Sigma$, if we have a collection of projection matrices

$$
\{ \Pi_a : a \in \Sigma \}
$$

that satisfies the condition

$$
\sum_{a \in \Sigma} \Pi_a = \mathbb{1},
$$

then this collection describes a projective measurement whose possible outcomes coincide with the set $\Sigma$, where the rules are the same as before:

- For each $a \in \Sigma$, the outcome of the measurement is $a$ with probability equal to

  $$
  \operatorname{Pr}(\text{outcome is } a) = \bigl\| \Pi_a \vert \psi \rangle \bigr\|^2.
  $$

- For whichever outcome $a$ the measurement produces, the state of $\mathsf{X}$ becomes

  $$
  \frac{\Pi_a \vert \psi \rangle}{\bigl\| \Pi_a \vert \psi \rangle \bigr\|}.
  $$
For example, standard basis measurements are equivalent to projective measurements, where $\Sigma$ is the set of classical states of whatever system $\mathsf{X}$ we're talking about and our set of projection matrices is $\{ \vert a \rangle \langle a \vert : a \in \Sigma \}$.
Another example of a projective measurement, this time on two qubits $(\mathsf{X}, \mathsf{Y})$, is given by the set $\{ \Pi_0, \Pi_1 \}$, where

$$
\Pi_0 = \vert \phi^+ \rangle \langle \phi^+ \vert + \vert \phi^- \rangle \langle \phi^- \vert + \vert \psi^+ \rangle \langle \psi^+ \vert
\quad \text{and} \quad
\Pi_1 = \vert \psi^- \rangle \langle \psi^- \vert.
$$
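To make the measurement rules concrete, here is a small simulation sketch of this two-outcome measurement. It is our own illustration, and `projective_measurement` is a hypothetical helper we define for it, not a function from any library.

```python
import numpy as np

# Pi_1 projects onto the singlet state |psi->; Pi_0 projects onto the
# orthogonal complement spanned by the other three Bell states.
psi_minus = np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2)
Pi_1 = np.outer(psi_minus, psi_minus.conj())
Pi_0 = np.eye(4) - Pi_1

def projective_measurement(projections, state, rng):
    """Sample an outcome k with probability ||Pi_k|state>||^2 and return it
    together with the normalized post-measurement state Pi_k|state>/||...||."""
    probs = [np.linalg.norm(P @ state) ** 2 for P in projections]
    k = rng.choice(len(projections), p=probs)
    post = projections[k] @ state
    return k, post / np.linalg.norm(post)

# Measuring |01> yields each outcome with probability 1/2; the state
# collapses to |psi+> (outcome 0) or |psi-> (outcome 1).
state_01 = np.array([0, 1, 0, 0], dtype=complex)
rng = np.random.default_rng(1)
print(projective_measurement([Pi_0, Pi_1], state_01, rng))
```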
If we have multiple systems that are jointly in some quantum state and a projective measurement is performed on just one of the systems, the action is similar to what we had for standard basis measurements — and in fact we can now describe this action in much simpler terms than we could before.
To be precise, let us suppose that we have two systems $(\mathsf{X}, \mathsf{Y})$ in a quantum state $\vert \psi \rangle$, and a projective measurement described by a collection $\{ \Pi_a : a \in \Sigma \}$ is performed on the system $\mathsf{X}$, while nothing is done to $\mathsf{Y}$. Doing this is then equivalent to performing the projective measurement described by the collection

$$
\{ \Pi_a \otimes \mathbb{1} : a \in \Sigma \}
$$

on the joint system $(\mathsf{X}, \mathsf{Y})$. Each measurement outcome $a$ results with probability

$$
\bigl\| (\Pi_a \otimes \mathbb{1}) \vert \psi \rangle \bigr\|^2,
$$

and conditioned on the result $a$ appearing, the state of the joint system $(\mathsf{X}, \mathsf{Y})$ becomes

$$
\frac{(\Pi_a \otimes \mathbb{1}) \vert \psi \rangle}{\bigl\| (\Pi_a \otimes \mathbb{1}) \vert \psi \rangle \bigr\|}.
$$
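Numerically, tensoring with the identity is just a Kronecker product. The following short sketch (ours) measures the first qubit of the Bell state $\vert \phi^+ \rangle$ in the standard basis by using the projections $\Pi_a \otimes \mathbb{1}$.

```python
import numpy as np

# Standard basis measurement of one qubit: Pi_0 = |0><0|, Pi_1 = |1><1|.
Pi = [np.diag([1.0 + 0j, 0]), np.diag([0, 1.0 + 0j])]
phi_plus = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)

for a in range(2):
    P = np.kron(Pi[a], np.eye(2))            # Pi_a on X, identity on Y
    prob = np.linalg.norm(P @ phi_plus) ** 2
    print(a, prob)                            # each outcome has probability 1/2
```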
Implementing projective measurements
Arbitrary projective measurements can be implemented using unitary operations, standard basis measurements, and an extra workspace system, as will now be explained.
Let us suppose that $\mathsf{X}$ is a system and $\{ \Pi_0, \ldots, \Pi_{m-1} \}$ is a projective measurement on $\mathsf{X}$. We can easily generalize this discussion to projective measurements having different sets of outcomes, but in the interest of convenience and simplicity we will assume the set of possible outcomes for our measurement is $\{0, \ldots, m-1\}$.

Let us note explicitly that $m$ is not necessarily equal to the number of classical states of $\mathsf{X}$ — we'll let $n$ be the number of classical states of $\mathsf{X}$, which means that each matrix $\Pi_k$ is an $n \times n$ projection matrix.

Because we assume that $\{ \Pi_0, \ldots, \Pi_{m-1} \}$ represents a projective measurement, it is necessarily the case that

$$
\sum_{k=0}^{m-1} \Pi_k = \mathbb{1}_n.
$$
Our goal is to perform a process that has the same effect as performing this projective measurement on $\mathsf{X}$, but to do this using only unitary operations and standard basis measurements.
We will make use of an extra workspace system $\mathsf{Y}$ to do this, and specifically we'll take the classical state set of $\mathsf{Y}$ to be $\{0, \ldots, m-1\}$, which is the same as the set of outcomes of the projective measurement. The idea is that we will perform a standard basis measurement on $\mathsf{Y}$, and interpret the outcome of this measurement as being equivalent to the outcome of the projective measurement on $\mathsf{X}$. We'll need to assume that $\mathsf{Y}$ is initialized to some fixed state, which we'll choose to be $\vert 0 \rangle$. (Any other choice of fixed quantum state vector could be made to work, but choosing $\vert 0 \rangle$ makes the explanation to follow much simpler.)
Of course, in order for a standard basis measurement of $\mathsf{Y}$ to tell us anything about $\mathsf{X}$, we will need to allow $\mathsf{X}$ and $\mathsf{Y}$ to interact somehow before measuring $\mathsf{Y}$, by performing a unitary operation on the joint system $(\mathsf{Y}, \mathsf{X})$. First consider this matrix:

$$
M = \sum_{k=0}^{m-1} \vert k \rangle \langle 0 \vert \otimes \Pi_k.
$$

Expressed explicitly as a so-called block matrix, which is essentially a matrix of matrices that we interpret as a single, larger matrix, $M$ looks like this:

$$
M = \begin{pmatrix}
\Pi_0 & 0 & \cdots & 0 \\
\Pi_1 & 0 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
\Pi_{m-1} & 0 & \cdots & 0
\end{pmatrix}
$$

Here, each $0$ represents an $n \times n$ matrix filled entirely with zeros, so that the entire matrix $M$ is an $nm \times nm$ matrix.
Now, $M$ is certainly not a unitary matrix (unless $m = 1$, in which case $M = \Pi_0 = \mathbb{1}$, which is unitary in this trivial case), because unitary matrices cannot have any columns (or rows) that are entirely $0$: unitary matrices have columns that form orthonormal bases, and the all-zero vector is not a unit vector.
However, it is the case that the first $n$ columns of $M$ are orthonormal, and we get this from the assumption that $\{ \Pi_0, \ldots, \Pi_{m-1} \}$ is a measurement. To verify this claim, notice that for each $j \in \{0, \ldots, n-1\}$, the vector formed by column number $j$ of $M$ is as follows:

$$
\vert \gamma_j \rangle
= M \bigl( \vert 0 \rangle \vert j \rangle \bigr)
= \sum_{k=0}^{m-1} \vert k \rangle \otimes \Pi_k \vert j \rangle.
$$

Note that here we're numbering the columns starting from column $0$. Taking the inner product of column $i$ with column $j$, when $i, j \in \{0, \ldots, n-1\}$, gives

$$
\langle \gamma_i \vert \gamma_j \rangle
= \Biggl( \sum_{k=0}^{m-1} \langle k \vert \otimes \langle i \vert \Pi_k \Biggr)
  \Biggl( \sum_{l=0}^{m-1} \vert l \rangle \otimes \Pi_l \vert j \rangle \Biggr)
= \sum_{k=0}^{m-1} \sum_{l=0}^{m-1} \langle k \vert l \rangle \langle i \vert \Pi_k \Pi_l \vert j \rangle
= \sum_{k=0}^{m-1} \langle i \vert \Pi_k^2 \vert j \rangle
= \sum_{k=0}^{m-1} \langle i \vert \Pi_k \vert j \rangle
= \langle i \vert \mathbb{1}_n \vert j \rangle
= \begin{cases}
1 & i = j \\
0 & i \neq j,
\end{cases}
$$

which is what we needed to show.
Thus, because the first $n$ columns of the matrix $M$ are orthonormal, we can replace all of the remaining zero entries by some different choice of complex number entries so that the entire matrix is unitary:

$$
U = \begin{pmatrix}
\Pi_0 & \mathord{?} & \cdots & \mathord{?} \\
\Pi_1 & \mathord{?} & \cdots & \mathord{?} \\
\vdots & \vdots & \ddots & \vdots \\
\Pi_{m-1} & \mathord{?} & \cdots & \mathord{?}
\end{pmatrix} \tag{5}
$$

If we're given the matrices $\Pi_0, \ldots, \Pi_{m-1}$, we can compute suitable matrices to fill in for the blocks marked $?$ in equation (5) — using the Gram–Schmidt process — but it does not matter specifically what these matrices are for the sake of this discussion.
Finally we can describe the measurement process: we first perform $U$ on the joint system $(\mathsf{Y}, \mathsf{X})$ and then measure $\mathsf{Y}$ with respect to a standard basis measurement. For an arbitrary state $\vert \phi \rangle$ of $\mathsf{X}$, we obtain the state

$$
U \bigl( \vert 0 \rangle \vert \phi \rangle \bigr)
= M \bigl( \vert 0 \rangle \vert \phi \rangle \bigr)
= \sum_{k=0}^{m-1} \vert k \rangle \otimes \Pi_k \vert \phi \rangle,
$$

where the first equality follows from the fact that $U$ and $M$ agree on their first $n$ columns. When we then perform a standard basis measurement on $\mathsf{Y}$, we obtain each outcome $k$ with probability

$$
\bigl\| \Pi_k \vert \phi \rangle \bigr\|^2,
$$

in which case the state of $(\mathsf{Y}, \mathsf{X})$ becomes

$$
\vert k \rangle \otimes \frac{\Pi_k \vert \phi \rangle}{\bigl\| \Pi_k \vert \phi \rangle \bigr\|}.
$$

Thus, $\mathsf{Y}$ stores a copy of the measurement outcome and $\mathsf{X}$ changes precisely as it would had the projective measurement described by $\{ \Pi_0, \ldots, \Pi_{m-1} \}$ been performed directly on $\mathsf{X}$.
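The construction can also be checked numerically end to end. The following sketch is entirely our own illustration: it uses the singlet/triplet measurement from earlier as the example (so $n = 4$ and $m = 2$), and a QR factorization rather than an explicit Gram–Schmidt routine to fill in the unknown blocks of $U$.

```python
import numpy as np

# Pi_0 and Pi_1 for the singlet/triplet measurement (n = 4, m = 2).
psi_minus = np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2)
Pi = [np.eye(4) - np.outer(psi_minus, psi_minus.conj()),
      np.outer(psi_minus, psi_minus.conj())]
n, m = 4, 2

# M = sum_k |k><0| (x) Pi_k: its first n columns stack the blocks Pi_0, ..., Pi_{m-1}.
M = sum(np.kron(np.outer(np.eye(m)[k], np.eye(m)[0]), Pi[k]) for k in range(m))

# Complete the first n (orthonormal) columns of M to a unitary U: append random
# columns, orthonormalize with QR, then fix the column phases so that U and M
# agree exactly on their first n columns.
rng = np.random.default_rng(5)
A = np.concatenate([M[:, :n], rng.normal(size=(n * m, n * (m - 1))) + 0j], axis=1)
U, R = np.linalg.qr(A)
U = U * (np.diag(R) / np.abs(np.diag(R)))

assert np.allclose(U.conj().T @ U, np.eye(n * m))  # U is unitary
assert np.allclose(U[:, :n], M[:, :n])             # U agrees with M on the first n columns

# Apply U to |0>|phi> and read the outcome probabilities off the Y register blocks.
phi = np.array([0, 1, 0, 0], dtype=complex)        # |phi> = |01>
out = U @ np.kron(np.eye(m)[0], phi)               # equals sum_k |k> (x) Pi_k|phi>
for k in range(m):
    block = out[k * n:(k + 1) * n]                 # the |k> (x) Pi_k|phi> component
    print(k, np.round(np.linalg.norm(block) ** 2, 3))   # 0.5 and 0.5
```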