
Inner products and projections

To better prepare ourselves to explore the capabilities and limitations of quantum circuits, we now introduce some additional mathematical concepts — namely the inner product between vectors (and its connection to the Euclidean norm), the notions of orthogonality and orthonormality for sets of vectors, and projection matrices, which will allow us to introduce a handy generalization of standard basis measurements.


Inner products

Recall that when we use the Dirac notation to refer to an arbitrary column vector as a ket, such as

$$\vert \psi \rangle = \begin{pmatrix} \alpha_1\\ \alpha_2\\ \vdots\\ \alpha_n \end{pmatrix},$$

the corresponding bra vector is the conjugate transpose of this vector:

$$\langle \psi \vert = \bigl(\vert \psi \rangle \bigr)^{\dagger} = \begin{pmatrix} \overline{\alpha_1} & \overline{\alpha_2} & \cdots & \overline{\alpha_n} \end{pmatrix}. \tag{1}$$

Alternatively, if we have some classical state set $\Sigma$ in mind, and we express a column vector as a ket, such as

$$\vert \psi \rangle = \sum_{a\in\Sigma} \alpha_a \vert a \rangle,$$

then the corresponding row (or bra) vector is the conjugate transpose

$$\langle \psi \vert = \sum_{a\in\Sigma} \overline{\alpha_a} \langle a \vert. \tag{2}$$

We also have that the product of a bra vector and a ket vector, viewed as matrices either having a single row or a single column, results in a scalar. Specifically, if we have two column vectors

$$\vert \psi \rangle = \begin{pmatrix} \alpha_1\\ \alpha_2\\ \vdots\\ \alpha_n \end{pmatrix} \quad\text{and}\quad \vert \phi \rangle = \begin{pmatrix} \beta_1\\ \beta_2\\ \vdots\\ \beta_n \end{pmatrix},$$

so that the row vector $\langle \psi \vert$ is as in equation $(1)$, then

$$\langle \psi \vert \phi \rangle = \langle \psi \vert \vert \phi \rangle = \begin{pmatrix} \overline{\alpha_1} & \overline{\alpha_2} & \cdots & \overline{\alpha_n} \end{pmatrix} \begin{pmatrix} \beta_1\\ \beta_2\\ \vdots\\ \beta_n \end{pmatrix} = \overline{\alpha_1} \beta_1 + \cdots + \overline{\alpha_n}\beta_n.$$

Alternatively, if we have two column vectors that we have written as

$$\vert \psi \rangle = \sum_{a\in\Sigma} \alpha_a \vert a \rangle \quad\text{and}\quad \vert \phi \rangle = \sum_{b\in\Sigma} \beta_b \vert b \rangle,$$

so that $\langle \psi \vert$ is the row vector $(2)$, we find that

$$\begin{aligned} \langle \psi \vert \phi \rangle & = \langle \psi \vert \vert \phi \rangle\\ & = \Biggl(\sum_{a\in\Sigma} \overline{\alpha_a} \langle a \vert\Biggr) \Biggl(\sum_{b\in\Sigma} \beta_b \vert b\rangle\Biggr)\\ & = \sum_{a\in\Sigma}\sum_{b\in\Sigma} \overline{\alpha_a} \beta_b \langle a \vert b \rangle\\ & = \sum_{a\in\Sigma} \overline{\alpha_a} \beta_a, \end{aligned}$$

where the last equality follows from the observation that $\langle a \vert a \rangle = 1$ and $\langle a \vert b \rangle = 0$ for classical states $a$ and $b$ satisfying $a\neq b$.
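As a quick illustration, here is a minimal NumPy sketch of this computation; the particular vectors are arbitrary illustrative choices, not taken from anything above.

```python
import numpy as np

# A minimal check of the formula <psi|phi> = conj(alpha_1) beta_1 + ... + conj(alpha_n) beta_n.
# The particular vectors are arbitrary illustrative choices.
psi = np.array([1 + 1j, 2j, 3])
phi = np.array([4, 5 - 1j, 6j])

# np.vdot conjugates its first argument, so it computes exactly <psi|phi>.
inner = np.vdot(psi, phi)

# The same value, written out entrywise as in the sum above.
inner_explicit = sum(np.conj(a) * b for a, b in zip(psi, phi))

print(inner, inner_explicit)  # the two values agree
```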

The value $\langle \psi \vert \phi \rangle$ is called the inner product between the vectors $\vert \psi\rangle$ and $\vert \phi \rangle$. Inner products are critically important in quantum information and computation; we would not get far in understanding quantum information at a mathematical level without them.

Let us now collect together some basic facts about inner products of vectors.

  1. Relationship to the Euclidean norm. The inner product of any vector

    $$\vert \psi \rangle = \sum_{a\in\Sigma} \alpha_a \vert a \rangle$$

    with itself is

    $$\langle \psi \vert \psi \rangle = \sum_{a\in\Sigma} \overline{\alpha_a} \alpha_a = \sum_{a\in\Sigma} \vert\alpha_a\vert^2 = \bigl\| \vert \psi \rangle \bigr\|^2.$$

    Thus, the Euclidean norm of a vector may alternatively be expressed as

    $$\bigl\| \vert \psi \rangle \bigr\| = \sqrt{ \langle \psi \vert \psi \rangle }.$$

    Notice that the Euclidean norm of a vector must always be a nonnegative real number. Moreover, the only way the Euclidean norm of a vector can be equal to zero is if every one of the entries is equal to zero, which is to say that the vector is the zero vector.

    We can summarize these observations like this: for every vector $\vert \psi \rangle$ we have

    $$\langle \psi \vert \psi \rangle \geq 0,$$

    with $\langle \psi \vert \psi \rangle = 0$ if and only if $\vert \psi \rangle = 0$. This property of the inner product is sometimes referred to as positive definiteness.

  2. Conjugate symmetry. For any two vectors

    $$\vert \psi \rangle = \sum_{a\in\Sigma} \alpha_a \vert a \rangle \quad\text{and}\quad \vert \phi \rangle = \sum_{b\in\Sigma} \beta_b \vert b \rangle,$$

    we have

    $$\langle \psi \vert \phi \rangle = \sum_{a\in\Sigma} \overline{\alpha_a} \beta_a \quad\text{and}\quad \langle \phi \vert \psi \rangle = \sum_{a\in\Sigma} \overline{\beta_a} \alpha_a,$$

    and therefore

    $$\overline{\langle \psi \vert \phi \rangle} = \langle \phi \vert \psi \rangle.$$
  3. Linearity in the second argument (and conjugate linearity in the first). Let us suppose that $\vert \psi \rangle$, $\vert \phi_1 \rangle$, and $\vert \phi_2 \rangle$ are vectors and $\alpha_1$ and $\alpha_2$ are complex numbers. If we define a new vector

    $$\vert \phi\rangle = \alpha_1 \vert \phi_1\rangle + \alpha_2 \vert \phi_2\rangle,$$

    then

    $$\langle \psi \vert \phi \rangle = \langle \psi \vert \bigl( \alpha_1\vert \phi_1 \rangle + \alpha_2\vert \phi_2 \rangle\bigr) = \alpha_1 \langle \psi \vert \phi_1 \rangle + \alpha_2 \langle \psi \vert \phi_2 \rangle.$$

    That is to say, the inner product is linear in the second argument. This can be verified either through the formulas above or simply by noting that matrix multiplication is linear in each argument (and specifically in the second argument).

    Combining this fact with conjugate symmetry reveals that the inner product is conjugate linear in the first argument. That is, if $\vert \psi_1 \rangle$, $\vert \psi_2 \rangle$, and $\vert \phi \rangle$ are vectors and $\alpha_1$ and $\alpha_2$ are complex numbers, and we define

    $$\vert \psi \rangle = \alpha_1 \vert \psi_1\rangle + \alpha_2 \vert \psi_2 \rangle,$$

    then

    $$\langle \psi \vert \phi \rangle = \bigl( \overline{\alpha_1} \langle \psi_1 \vert + \overline{\alpha_2} \langle \psi_2 \vert \bigr) \vert\phi\rangle = \overline{\alpha_1} \langle \psi_1 \vert \phi \rangle + \overline{\alpha_2} \langle \psi_2 \vert \phi \rangle.$$
  4. The Cauchy–Schwarz inequality. For every choice of vectors $\vert \phi \rangle$ and $\vert \psi \rangle$ having the same number of entries, we have

    $$\bigl\vert \langle \psi \vert \phi \rangle\bigr\vert \leq \bigl\| \vert\psi \rangle \bigr\| \, \bigl\| \vert \phi \rangle \bigr\|.$$

    This is an incredibly handy inequality that gets used quite extensively in quantum information (and in many other topics of study). All four of these properties are checked numerically in the short sketch that follows.
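Here is the promised sketch: a minimal NumPy check of the four properties on randomly chosen vectors. The dimension, seed, and coefficients are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(7)

def random_vector(n):
    # A random complex vector; the distribution is an arbitrary choice.
    return rng.normal(size=n) + 1j * rng.normal(size=n)

psi, phi, chi = random_vector(4), random_vector(4), random_vector(4)
a1, a2 = 2 - 1j, 0.5j

# 1. Positive definiteness: <psi|psi> = || |psi> ||^2 >= 0.
assert np.isclose(np.vdot(psi, psi).real, np.linalg.norm(psi) ** 2)

# 2. Conjugate symmetry: conj(<psi|phi>) = <phi|psi>.
assert np.isclose(np.conj(np.vdot(psi, phi)), np.vdot(phi, psi))

# 3. Linearity in the second argument.
assert np.isclose(np.vdot(psi, a1 * phi + a2 * chi),
                  a1 * np.vdot(psi, phi) + a2 * np.vdot(psi, chi))

# 4. The Cauchy-Schwarz inequality.
assert abs(np.vdot(psi, phi)) <= np.linalg.norm(psi) * np.linalg.norm(phi)

print("all four properties hold for this sample")
```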


Orthogonal and orthonormal sets

Two vectors $\vert \phi \rangle$ and $\vert \psi \rangle$ are said to be orthogonal if their inner product is zero:

$$\langle \psi \vert \phi \rangle = 0.$$

Geometrically, we can think about orthogonal vectors as vectors at right angles to each other.

A set of vectors $\{ \vert \psi_1\rangle,\ldots,\vert\psi_m\rangle\}$ is called an orthogonal set if every vector in the set is orthogonal to every other vector in the set. That is, this set is orthogonal if

$$\langle \psi_j \vert \psi_k\rangle = 0$$

for all choices of $j,k\in\{1,\ldots,m\}$ for which $j\neq k$.

A set of vectors $\{ \vert \psi_1\rangle,\ldots,\vert\psi_m\rangle\}$ is called an orthonormal set if it is an orthogonal set and, in addition, every vector in the set is a unit vector. Alternatively, this set is an orthonormal set if we have

$$\langle \psi_j \vert \psi_k\rangle = \begin{cases} 1 & j = k\\[1mm] 0 & j\neq k \end{cases} \tag{3}$$

for all choices of $j,k\in\{1,\ldots,m\}$.

Finally, a set $\{ \vert \psi_1\rangle,\ldots,\vert\psi_m\rangle\}$ is an orthonormal basis if, in addition to being an orthonormal set, it forms a basis. This is equivalent to $\{ \vert \psi_1\rangle,\ldots,\vert\psi_m\rangle\}$ being an orthonormal set and $m$ being equal to the dimension of the space from which $\vert \psi_1\rangle,\ldots,\vert\psi_m\rangle$ are drawn.

For example, for any classical state set $\Sigma$, the set of all standard basis vectors

$$\bigl\{ \vert a \rangle \,:\, a\in\Sigma\bigr\}$$

is an orthonormal basis. The set $\{\vert+\rangle,\vert-\rangle\}$ is an orthonormal basis for the $2$-dimensional space corresponding to a single qubit, and the Bell basis $\{\vert\phi^+\rangle, \vert\phi^-\rangle, \vert\psi^+\rangle, \vert\psi^-\rangle\}$ is an orthonormal basis for the $4$-dimensional space corresponding to two qubits.
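As a check, here is a minimal NumPy sketch verifying condition $(3)$ for the Bell basis, assuming the coordinate ordering $\vert 00\rangle, \vert 01\rangle, \vert 10\rangle, \vert 11\rangle$.

```python
import numpy as np

# The four Bell states as coordinate vectors, assuming the ordering |00>, |01>, |10>, |11>.
s = 1 / np.sqrt(2)
bell = [
    s * np.array([1, 0, 0, 1]),    # |phi+>
    s * np.array([1, 0, 0, -1]),   # |phi->
    s * np.array([0, 1, 1, 0]),    # |psi+>
    s * np.array([0, 1, -1, 0]),   # |psi->
]

# Condition (3): the matrix of inner products <psi_j|psi_k> should be the identity.
gram = np.array([[np.vdot(u, v) for v in bell] for u in bell])
print(np.allclose(gram, np.eye(4)))  # True: the Bell basis is orthonormal
```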

Extending orthonormal sets to orthonormal bases

Suppose that $\vert\psi_1\rangle,\ldots,\vert\psi_m\rangle$ are vectors that live in an $n$-dimensional space, and assume moreover that $\{\vert\psi_1\rangle,\ldots,\vert\psi_m\rangle\}$ is an orthonormal set. Orthonormal sets are always linearly independent sets, so these vectors necessarily span a subspace of dimension $m$. From this we conclude that $m\leq n$, because the dimension of the subspace spanned by these vectors cannot be larger than the dimension of the entire space from which they're drawn.

If it is the case that $m<n$, then it is always possible to choose an additional $n-m$ vectors $\vert \psi_{m+1}\rangle,\ldots,\vert\psi_n\rangle$ so that $\{\vert\psi_1\rangle,\ldots,\vert\psi_n\rangle\}$ forms an orthonormal basis. A procedure known as the Gram–Schmidt orthogonalization process can be used to construct these vectors.
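The following is a minimal sketch of this extension process in NumPy. The strategy of sweeping Gram–Schmidt over the standard basis vectors, and the function name, are our own choices rather than a prescribed algorithm.

```python
import numpy as np

def extend_to_orthonormal_basis(vectors, n):
    """Extend an orthonormal set of m <= n vectors in an n-dimensional space
    to an orthonormal basis, by Gram-Schmidt over the standard basis vectors."""
    basis = [np.asarray(v, dtype=complex) for v in vectors]
    for k in range(n):
        candidate = np.zeros(n, dtype=complex)
        candidate[k] = 1                     # try the standard basis vector |k>
        for b in basis:                      # remove its components along the set so far
            candidate = candidate - np.vdot(b, candidate) * b
        norm = np.linalg.norm(candidate)
        if norm > 1e-10:                     # keep it only if something nonzero remains
            basis.append(candidate / norm)
    return basis[:n]

# Example: extend {|+>} to an orthonormal basis of the 2-dimensional space;
# the procedure produces |-> as the second basis vector.
plus = np.array([1, 1]) / np.sqrt(2)
basis = extend_to_orthonormal_basis([plus], 2)
gram = np.array([[np.vdot(u, v) for v in basis] for u in basis])
print(np.allclose(gram, np.eye(2)))  # True
```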

Orthonormal sets and unitary matrices

Orthonormal sets of vectors are closely connected with unitary matrices. One way to express this connection is to say that the following three statements are logically equivalent (meaning that they are all true or all false) for any choice of a square matrix $U$:

  1. The matrix $U$ is unitary (i.e., $U^{\dagger} U = \mathbb{I} = U U^{\dagger}$).
  2. The rows of $U$ form an orthonormal set.
  3. The columns of $U$ form an orthonormal set.

This equivalence becomes clear once we think about how matrix multiplication and the conjugate transpose work. Suppose, for instance, that we have a $3\times 3$ matrix like this:

$$U = \begin{pmatrix} \alpha_{1,1} & \alpha_{1,2} & \alpha_{1,3} \\[1mm] \alpha_{2,1} & \alpha_{2,2} & \alpha_{2,3} \\[1mm] \alpha_{3,1} & \alpha_{3,2} & \alpha_{3,3} \end{pmatrix}$$

The conjugate transpose of $U$ looks like this:

$$U^{\dagger} = \begin{pmatrix} \overline{\alpha_{1,1}} & \overline{\alpha_{2,1}} & \overline{\alpha_{3,1}} \\[1mm] \overline{\alpha_{1,2}} & \overline{\alpha_{2,2}} & \overline{\alpha_{3,2}} \\[1mm] \overline{\alpha_{1,3}} & \overline{\alpha_{2,3}} & \overline{\alpha_{3,3}} \end{pmatrix}$$

Multiplying the two matrices, with the conjugate transpose on the left-hand side, gives us this matrix:

$$\begin{aligned} &\begin{pmatrix} \overline{\alpha_{1,1}} & \overline{\alpha_{2,1}} & \overline{\alpha_{3,1}} \\[1mm] \overline{\alpha_{1,2}} & \overline{\alpha_{2,2}} & \overline{\alpha_{3,2}} \\[1mm] \overline{\alpha_{1,3}} & \overline{\alpha_{2,3}} & \overline{\alpha_{3,3}} \end{pmatrix} \begin{pmatrix} \alpha_{1,1} & \alpha_{1,2} & \alpha_{1,3} \\[1mm] \alpha_{2,1} & \alpha_{2,2} & \alpha_{2,3} \\[1mm] \alpha_{3,1} & \alpha_{3,2} & \alpha_{3,3} \end{pmatrix}\\[4mm] &= \begin{pmatrix} \overline{\alpha_{1,1}}\alpha_{1,1} + \overline{\alpha_{2,1}}\alpha_{2,1} + \overline{\alpha_{3,1}}\alpha_{3,1} & \overline{\alpha_{1,1}}\alpha_{1,2} + \overline{\alpha_{2,1}}\alpha_{2,2} + \overline{\alpha_{3,1}}\alpha_{3,2} & \overline{\alpha_{1,1}}\alpha_{1,3} + \overline{\alpha_{2,1}}\alpha_{2,3} + \overline{\alpha_{3,1}}\alpha_{3,3} \\[2mm] \overline{\alpha_{1,2}}\alpha_{1,1} + \overline{\alpha_{2,2}}\alpha_{2,1} + \overline{\alpha_{3,2}}\alpha_{3,1} & \overline{\alpha_{1,2}}\alpha_{1,2} + \overline{\alpha_{2,2}}\alpha_{2,2} + \overline{\alpha_{3,2}}\alpha_{3,2} & \overline{\alpha_{1,2}}\alpha_{1,3} + \overline{\alpha_{2,2}}\alpha_{2,3} + \overline{\alpha_{3,2}}\alpha_{3,3} \\[2mm] \overline{\alpha_{1,3}}\alpha_{1,1} + \overline{\alpha_{2,3}}\alpha_{2,1} + \overline{\alpha_{3,3}}\alpha_{3,1} & \overline{\alpha_{1,3}}\alpha_{1,2} + \overline{\alpha_{2,3}}\alpha_{2,2} + \overline{\alpha_{3,3}}\alpha_{3,2} & \overline{\alpha_{1,3}}\alpha_{1,3} + \overline{\alpha_{2,3}}\alpha_{2,3} + \overline{\alpha_{3,3}}\alpha_{3,3} \end{pmatrix} \end{aligned}$$

If we form three vectors from the columns of $U$,

$$\vert \psi_1\rangle = \begin{pmatrix} \alpha_{1,1}\\ \alpha_{2,1}\\ \alpha_{3,1} \end{pmatrix}, \quad \vert \psi_2\rangle = \begin{pmatrix} \alpha_{1,2}\\ \alpha_{2,2}\\ \alpha_{3,2} \end{pmatrix}, \quad \vert \psi_3\rangle = \begin{pmatrix} \alpha_{1,3}\\ \alpha_{2,3}\\ \alpha_{3,3} \end{pmatrix},$$

then we can alternatively express the product above as follows:

$$U^{\dagger} U = \begin{pmatrix} \langle \psi_1\vert \psi_1 \rangle & \langle \psi_1\vert \psi_2 \rangle & \langle \psi_1\vert \psi_3 \rangle \\ \langle \psi_2\vert \psi_1 \rangle & \langle \psi_2\vert \psi_2 \rangle & \langle \psi_2\vert \psi_3 \rangle \\ \langle \psi_3\vert \psi_1 \rangle & \langle \psi_3\vert \psi_2 \rangle & \langle \psi_3\vert \psi_3 \rangle \end{pmatrix}$$

Referring to equation $(3)$, we now see that the condition that this matrix is equal to the identity matrix is equivalent to the orthonormality of the set $\{\vert\psi_1\rangle,\vert\psi_2\rangle,\vert\psi_3\rangle\}$.

This argument generalizes to unitary matrices of any size. The fact that the rows of a matrix form an orthonormal set if and only if the matrix is unitary then follows from the fact that a matrix is unitary if and only if its transpose is unitary.
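A quick numerical sanity check of this equivalence: construct a unitary matrix and confirm that $U^{\dagger}U$ and $UU^{\dagger}$ are both the identity. The QR-based construction of a random unitary is a standard trick and our own choice here.

```python
import numpy as np

rng = np.random.default_rng(1)

# A random 3x3 unitary, obtained from the QR factorization of a random complex matrix.
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
U, _ = np.linalg.qr(A)

# Columns orthonormal  <=>  U†U = I, matching the computation worked out above.
print(np.allclose(U.conj().T @ U, np.eye(3)))  # True
# Rows orthonormal  <=>  UU† = I.
print(np.allclose(U @ U.conj().T, np.eye(3)))  # True
```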

Given the equivalence described above, together with the fact that every orthonormal set can be extended to form an orthonormal basis, we conclude the following useful fact: Given any orthonormal set of vectors $\{\vert\psi_1\rangle,\ldots,\vert\psi_m\rangle\}$ drawn from an $n$-dimensional space, there exists a unitary matrix $U$ whose first $m$ columns are the vectors $\vert\psi_1\rangle,\ldots,\vert\psi_m\rangle$. Pictorially, we can always find a unitary matrix having this form:

$$U = \left( \begin{array}{ccccccc} \rule{0.4pt}{10pt} & \rule{0.4pt}{10pt} & & \rule{0.4pt}{10pt} & \rule{0.4pt}{10pt} & & \rule{0.4pt}{10pt}\\ \vert\psi_1\rangle & \vert\psi_2\rangle & \cdots & \vert\psi_m\rangle & \vert\psi_{m+1}\rangle & \cdots & \vert\psi_n\rangle\\[2mm] \rule{0.4pt}{10pt} & \rule{0.4pt}{10pt} & & \rule{0.4pt}{10pt} & \rule{0.4pt}{10pt} & & \rule{0.4pt}{10pt} \end{array} \right).$$

Here, the last $n-m$ columns are filled in with any choice of vectors $\vert\psi_{m+1}\rangle,\ldots,\vert\psi_n\rangle$ that make $\{\vert\psi_1\rangle,\ldots,\vert\psi_n\rangle\}$ an orthonormal basis.


Projections and projective measurements

Projection matrices

A square matrix $\Pi$ is called a projection if it satisfies two properties:

  1. $\Pi = \Pi^{\dagger}$.
  2. $\Pi^2 = \Pi$.

Matrices that satisfy the first condition — that they are equal to their own conjugate transpose — are called Hermitian matrices, and matrices that satisfy the second condition — that squaring them leaves them unchanged — are called idempotent matrices.

As a word of caution, the word projection is sometimes used to refer to any matrix that satisfies just the second condition but not necessarily the first, and when this is done the term orthogonal projection is typically used to refer to matrices satisfying both properties. In the context of quantum information and computation, however, the terms projection and projection matrix more typically refer to matrices satisfying both conditions.

An example of a projection is the matrix

$$\Pi = \vert \psi \rangle \langle \psi \vert \tag{4}$$

for any unit vector $\vert \psi\rangle$. We can see that this matrix is Hermitian as follows:

$$\Pi^{\dagger} = \bigl( \vert \psi \rangle \langle \psi \vert \bigr)^{\dagger} = \bigl( \langle \psi \vert \bigr)^{\dagger}\bigl( \vert \psi \rangle \bigr)^{\dagger} = \vert \psi \rangle \langle \psi \vert = \Pi.$$

Here, to obtain the second equality, we have used the formula

$$(A B)^{\dagger} = B^{\dagger} A^{\dagger},$$

which is always true, for any two matrices $A$ and $B$ for which the product $AB$ makes sense.

To see that the matrix $\Pi$ in $(4)$ is idempotent, we can use the assumption that $\vert\psi\rangle$ is a unit vector, so that it satisfies $\langle \psi \vert \psi\rangle = 1$. Thus, we have

$$\Pi^2 = \bigl( \vert\psi\rangle\langle \psi\vert \bigr)^2 = \vert\psi\rangle\langle \psi\vert\psi\rangle\langle\psi\vert = \vert\psi\rangle\langle\psi\vert = \Pi.$$
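Here is a minimal NumPy check of these two properties, taking $\vert\psi\rangle = \vert{+}\rangle$ as an illustrative unit vector.

```python
import numpy as np

# Pi = |psi><psi| for the unit vector |psi> = |+>.
psi = np.array([1, 1]) / np.sqrt(2)
Pi = np.outer(psi, psi.conj())

print(np.allclose(Pi, Pi.conj().T))  # Hermitian: Pi† = Pi
print(np.allclose(Pi @ Pi, Pi))      # idempotent: Pi^2 = Pi
```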

More generally, if $\{\vert \psi_1\rangle,\ldots,\vert \psi_m\rangle\}$ is any orthonormal set of vectors, then the matrix

$$\Pi = \sum_{k = 1}^m \vert \psi_k\rangle \langle \psi_k \vert \tag{5}$$

is a projection. Specifically, we have

$$\begin{aligned} \Pi^{\dagger} &= \biggl(\sum_{k = 1}^m \vert \psi_k\rangle \langle \psi_k \vert\biggr)^{\dagger} \\ &= \sum_{k = 1}^m \bigl(\vert\psi_k\rangle\langle\psi_k\vert\bigr)^{\dagger} \\ &= \sum_{k = 1}^m \vert \psi_k\rangle \langle \psi_k \vert\\ &= \Pi, \end{aligned}$$

and

$$\begin{aligned} \Pi^2 & = \biggl( \sum_{j = 1}^m \vert \psi_j\rangle \langle \psi_j \vert\biggr)\biggl(\sum_{k = 1}^m \vert \psi_k\rangle \langle \psi_k \vert\biggr) \\ & = \sum_{j = 1}^m\sum_{k = 1}^m \vert \psi_j\rangle \langle \psi_j \vert \psi_k\rangle \langle \psi_k \vert \\ & = \sum_{k = 1}^m \vert \psi_k\rangle \langle \psi_k \vert\\ & = \Pi, \end{aligned}$$

where the orthonormality of $\{\vert \psi_1\rangle,\ldots,\vert \psi_m\rangle\}$ implies the second-to-last equality.

In fact, this exhausts all of the possibilities: every projection $\Pi$ can be written in the form $(5)$ for some choice of an orthonormal set $\{\vert \psi_1\rangle,\ldots,\vert \psi_m\rangle\}$. (Technically speaking, the zero matrix $\Pi=0$, which is a projection, is a special case. To fit it into the general form $(5)$ we must allow the possibility that the sum is empty, resulting in the zero matrix.)
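Similarly, here is a minimal sketch of the form $(5)$: build a projection from a small orthonormal set and confirm that it is Hermitian and idempotent. The particular set is an arbitrary choice.

```python
import numpy as np

# A projection of the form (5), built from the orthonormal set {|0>, |1>}
# inside a 3-dimensional space.
e0 = np.array([1, 0, 0], dtype=complex)
e1 = np.array([0, 1, 0], dtype=complex)
Pi = sum(np.outer(v, v.conj()) for v in (e0, e1))

print(np.allclose(Pi, Pi.conj().T))  # Hermitian
print(np.allclose(Pi @ Pi, Pi))      # idempotent
```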

Projective measurements

The notion of a measurement of a quantum system is more general than just standard basis measurements. Projective measurements are measurements that are described by a collection of projections whose sum is equal to the identity matrix. In symbols, a collection $\{\Pi_0,\ldots,\Pi_{m-1}\}$ of projection matrices describes a projective measurement if

$$\Pi_0 + \cdots + \Pi_{m-1} = \mathbb{I}.$$

When such a measurement is performed on a system $\mathsf{X}$ while it is in some state $\vert\psi\rangle$, two things happen (a numerical sketch of both rules follows the list):

  1. For each $k\in\{0,\ldots,m-1\}$, the outcome of the measurement is $k$ with probability equal to

    $$\operatorname{Pr}\bigl(\text{outcome is $k$}\bigr) = \bigl\| \Pi_k \vert \psi \rangle \bigr\|^2.$$
  2. For whichever outcome $k$ the measurement produces, the state of $\mathsf{X}$ becomes

    $$\frac{\Pi_k \vert\psi\rangle}{\bigl\|\Pi_k \vert\psi\rangle\bigr\|}.$$
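As a concrete illustration of these two rules, here is a minimal NumPy sketch that samples an outcome and computes the post-measurement state. The helper name and the example measurement $\{\vert{+}\rangle\langle{+}\vert, \vert{-}\rangle\langle{-}\vert\}$ are our own choices.

```python
import numpy as np

def projective_measurement(projections, psi, rng=np.random.default_rng()):
    """Sample an outcome k with probability ||Pi_k|psi>||^2 (rule 1) and
    return it along with the renormalized post-measurement state (rule 2)."""
    probs = [np.linalg.norm(Pi @ psi) ** 2 for Pi in projections]
    k = rng.choice(len(projections), p=probs)
    post = projections[k] @ psi
    return k, post / np.linalg.norm(post)

# Example: the measurement {|+><+|, |-><-|} performed on the state |0>.
plus  = np.array([1, 1]) / np.sqrt(2)
minus = np.array([1, -1]) / np.sqrt(2)
measurement = [np.outer(plus, plus), np.outer(minus, minus)]
outcome, state = projective_measurement(measurement, np.array([1.0, 0.0]))
print(outcome, state)  # outcome 0 or 1, each with probability 1/2
```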

We can also choose outcomes other than $\{0,\ldots,m-1\}$ for projective measurements if we wish. More generally, for any finite and nonempty set $\Sigma$, if we have a collection of projection matrices

$$\{\Pi_a:a\in\Sigma\}$$

that satisfies the condition

$$\sum_{a\in\Sigma} \Pi_a = \mathbb{I},$$

then this collection describes a projective measurement whose possible outcomes coincide with the set $\Sigma$, where the rules are the same as before:

  1. For each $a\in\Sigma$, the outcome of the measurement is $a$ with probability equal to

    $$\operatorname{Pr}\bigl(\text{outcome is $a$}\bigr) = \bigl\| \Pi_a \vert \psi \rangle \bigr\|^2.$$
  2. For whichever outcome $a$ the measurement produces, the state of $\mathsf{X}$ becomes

    $$\frac{\Pi_a \vert\psi\rangle}{\bigl\|\Pi_a \vert\psi\rangle\bigr\|}.$$

For example, standard basis measurements are equivalent to projective measurements, where $\Sigma$ is the set of classical states of whatever system $\mathsf{X}$ we're talking about and our set of projection matrices is $\{\vert a\rangle\langle a\vert:a\in\Sigma\}$.

Another example of a projective measurement, this time on two qubits $(\mathsf{X},\mathsf{Y})$, is given by the set $\{\Pi_0,\Pi_1\}$, where

$$\Pi_0 = \vert \phi^+\rangle\langle \phi^+ \vert + \vert \phi^-\rangle\langle \phi^- \vert + \vert \psi^+\rangle\langle \psi^+ \vert \quad\text{and}\quad \Pi_1 = \vert\psi^-\rangle\langle\psi^-\vert.$$
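Here is a minimal NumPy sketch of this example, confirming that $\Pi_0 + \Pi_1 = \mathbb{I}$ and computing the outcome probabilities for a test state; the choice of $\vert 01\rangle$ as the test state is arbitrary.

```python
import numpy as np

# Build Pi_0 and Pi_1 from the Bell states (ordering |00>, |01>, |10>, |11> assumed).
s = 1 / np.sqrt(2)
phi_plus  = s * np.array([1, 0, 0, 1])
phi_minus = s * np.array([1, 0, 0, -1])
psi_plus  = s * np.array([0, 1, 1, 0])
psi_minus = s * np.array([0, 1, -1, 0])

Pi0 = sum(np.outer(v, v) for v in (phi_plus, phi_minus, psi_plus))
Pi1 = np.outer(psi_minus, psi_minus)

print(np.allclose(Pi0 + Pi1, np.eye(4)))  # True: a valid projective measurement

# Outcome probabilities for the test state |01>: each is 1/2.
ket01 = np.array([0, 1, 0, 0])
print(np.linalg.norm(Pi0 @ ket01) ** 2, np.linalg.norm(Pi1 @ ket01) ** 2)
```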

If we have multiple systems that are jointly in some quantum state and a projective measurement is performed on just one of the systems, the action is similar to what we had for standard basis measurements — and in fact we can now describe this action in much simpler terms than we could before.

To be precise, let us suppose that we have two systems $(\mathsf{X},\mathsf{Y})$ in a quantum state $\vert\psi\rangle$, and a projective measurement described by a collection $\{\Pi_a : a\in\Sigma\}$ is performed on the system $\mathsf{X}$, while nothing is done to $\mathsf{Y}$. Doing this is then equivalent to performing the projective measurement described by the collection

$$\bigl\{ \Pi_a \otimes \mathbb{I} \,:\, a\in\Sigma\bigr\}$$

on the joint system $(\mathsf{X},\mathsf{Y})$. Each measurement outcome $a$ results with probability

$$\bigl\| (\Pi_a \otimes \mathbb{I})\vert \psi\rangle \bigr\|^2,$$

and conditioned on the result $a$ appearing, the state of the joint system $(\mathsf{X},\mathsf{Y})$ becomes

$$\frac{(\Pi_a \otimes \mathbb{I})\vert \psi\rangle}{\bigl\| (\Pi_a \otimes \mathbb{I})\vert \psi\rangle \bigr\|}.$$
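A minimal sketch of this rule, assuming the pair $(\mathsf{X},\mathsf{Y})$ is in the state $\vert\phi^+\rangle$ and a standard basis measurement is performed on $\mathsf{X}$ alone:

```python
import numpy as np

# The pair (X, Y) in the state |phi+>, with the standard basis measurement
# {|0><0|, |1><1|} performed on X alone, lifted by tensoring with the identity.
phi_plus = np.array([1, 0, 0, 1]) / np.sqrt(2)
I2 = np.eye(2)

for a in (0, 1):
    ket_a = np.eye(2)[:, a]
    Pi_a = np.outer(ket_a, ket_a)          # |a><a| acting on X
    lifted = np.kron(Pi_a, I2)             # Pi_a (x) I acting on (X, Y)
    prob = np.linalg.norm(lifted @ phi_plus) ** 2
    print(a, prob)                         # each outcome appears with probability 1/2
```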

Implementing projective measurements

Arbitrary projective measurements can be implemented using unitary operations, standard basis measurements, and an extra workspace system, as will now be explained.

Let us suppose that $\mathsf{X}$ is a system and $\{\Pi_0,\ldots,\Pi_{m-1}\}$ is a projective measurement on $\mathsf{X}$. We can easily generalize this discussion to projective measurements having different sets of outcomes, but in the interest of convenience and simplicity we will assume the set of possible outcomes for our measurement is $\{0,\ldots,m-1\}$.

Let us note explicitly that $m$ is not necessarily equal to the number of classical states of $\mathsf{X}$ — we'll let $n$ be the number of classical states of $\mathsf{X}$, which means that each matrix $\Pi_k$ is an $n\times n$ projection matrix.

Because we assume that $\{\Pi_0,\ldots,\Pi_{m-1}\}$ represents a projective measurement, it is necessarily the case that

$$\sum_{k = 0}^{m-1} \Pi_k = \mathbb{I}_n.$$

Our goal is to perform a process that has the same effect as performing this projective measurement on X,\mathsf{X}, but to do this using only unitary operations and standard basis measurements.

We will make use of an extra workspace system $\mathsf{Y}$ to do this, and specifically we'll take the classical state set of $\mathsf{Y}$ to be $\{0,\ldots,m-1\}$, which is the same as the set of outcomes of the projective measurement. The idea is that we will perform a standard basis measurement on $\mathsf{Y}$, and interpret the outcome of this measurement as being equivalent to the outcome of the projective measurement on $\mathsf{X}$. We'll need to assume that $\mathsf{Y}$ is initialized to some fixed state, which we'll choose to be $\vert 0\rangle$. (Any other choice of fixed quantum state vector could be made to work, but choosing $\vert 0\rangle$ makes the explanation to follow much simpler.)

Of course, in order for a standard basis measurement of $\mathsf{Y}$ to tell us anything about $\mathsf{X}$, we will need to allow $\mathsf{X}$ and $\mathsf{Y}$ to interact somehow before measuring $\mathsf{Y}$, by performing a unitary operation on the system $(\mathsf{Y},\mathsf{X})$. First consider this matrix:

$$M = \sum_{k = 0}^{m-1} \vert k \rangle \langle 0 \vert \otimes \Pi_k.$$

Expressed explicitly as a so-called block matrix, which is essentially a matrix of matrices that we interpret as a single, larger matrix, $M$ looks like this:

$$M = \begin{pmatrix} \Pi_0 & 0 & \cdots & 0\\[1mm] \Pi_1 & 0 & \cdots & 0\\[1mm] \vdots & \vdots & \ddots & \vdots\\[1mm] \Pi_{m-1} & 0 & \cdots & 0 \end{pmatrix}.$$

Here, each $0$ represents an $n\times n$ matrix filled entirely with zeros, so that the entire matrix $M$ is an $nm\times nm$ matrix.
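Here is a minimal NumPy sketch that builds $M$ block by block; the helper name is our own.

```python
import numpy as np

def block_matrix_M(projections):
    """Build M = sum_k |k><0| (x) Pi_k from a list of m projections, each n x n."""
    m = len(projections)
    n = projections[0].shape[0]
    M = np.zeros((n * m, n * m), dtype=complex)
    for k, Pi in enumerate(projections):
        ket_k_bra_0 = np.zeros((m, m))
        ket_k_bra_0[k, 0] = 1              # the matrix |k><0|
        M += np.kron(ket_k_bra_0, Pi)
    return M

# Example: the measurement {|+><+|, |-><-|} yields a 4x4 block matrix whose
# first two columns stack Pi_0 on top of Pi_1, with zero columns elsewhere.
plus  = np.array([1, 1]) / np.sqrt(2)
minus = np.array([1, -1]) / np.sqrt(2)
M = block_matrix_M([np.outer(plus, plus), np.outer(minus, minus)])
print(np.round(M.real, 2))
```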

Now, $M$ is certainly not a unitary matrix (unless $m=1$, in which case $\Pi_0 = \mathbb{I}$, giving $M = \mathbb{I}$ in this trivial case) because unitary matrices cannot have any columns (or rows) that are entirely $0$; unitary matrices have columns that form orthonormal bases, and the all-zero vector is not a unit vector.

However, it is the case that the first $n$ columns of $M$ are orthonormal, and we get this from the assumption that $\{\Pi_0,\ldots,\Pi_{m-1}\}$ is a measurement. To verify this claim, notice that for each $j\in\{0,\ldots,n-1\}$, the vector formed by column number $j$ of $M$ is as follows:

$$\vert \psi_j\rangle = M \vert 0, j\rangle = \sum_{k = 0}^{m-1} \vert k \rangle \otimes \Pi_k \vert j\rangle.$$

Note that here we're numbering the columns starting from column $0$. Taking the inner product of column $i$ with column $j$ when $i,j\in\{0,\ldots,n-1\}$ gives

$$\begin{aligned} \langle \psi_i \vert \psi_j \rangle & = \biggl(\sum_{k = 0}^{m-1} \vert k \rangle \otimes \Pi_k \vert i\rangle\biggr)^{\dagger} \biggl(\sum_{l = 0}^{m-1} \vert l \rangle \otimes \Pi_l \vert j\rangle\biggr) \\ & = \sum_{k = 0}^{m-1} \sum_{l = 0}^{m-1} \langle k \vert l \rangle \langle i \vert \Pi_k \Pi_l \vert j\rangle\\ & = \sum_{k = 0}^{m-1} \langle i \vert \Pi_k \Pi_k \vert j\rangle\\ & = \sum_{k = 0}^{m-1} \langle i \vert \Pi_k \vert j\rangle\\ & = \langle i \vert \mathbb{I} \vert j \rangle\\ & = \begin{cases} 1 & i = j\\ 0 & i\neq j, \end{cases} \end{aligned}$$

which is what we needed to show.

Thus, because the first $n$ columns of the matrix $M$ are orthonormal, we can replace all of the remaining zero entries by some different choice of complex number entries so that the entire matrix is unitary:

$$U = \begin{pmatrix} \Pi_0 & \fbox{?} & \cdots & \fbox{?}\\[1mm] \Pi_1 & \fbox{?} & \cdots & \fbox{?}\\[1mm] \vdots & \vdots & \ddots & \vdots\\[1mm] \Pi_{m-1} & \fbox{?} & \cdots & \fbox{?} \end{pmatrix}$$

If we're given the matrices $\Pi_0,\ldots,\Pi_{m-1}$, we can compute suitable matrices to fill in for the blocks marked $\fbox{?}$ in the equation — using the Gram–Schmidt process — but it does not matter specifically what these matrices are for the sake of this discussion.

Finally we can describe the measurement process: we first perform $U$ on the joint system $(\mathsf{Y},\mathsf{X})$ and then measure $\mathsf{Y}$ with respect to a standard basis measurement. For an arbitrary state $\vert \phi \rangle$ of $\mathsf{X}$, we obtain the state

$$U \bigl( \vert 0\rangle \vert \phi\rangle\bigr) = M \bigl( \vert 0\rangle \vert \phi\rangle\bigr) = \sum_{k = 0}^{m-1} \vert k\rangle \otimes \Pi_k \vert\phi\rangle,$$

where the first equality follows from the fact that $U$ and $M$ agree on their first $n$ columns. When we perform a standard basis measurement on $\mathsf{Y}$, we obtain each outcome $k$ with probability

$$\bigl\| \Pi_k \vert \phi\rangle \bigr\|^2,$$

in which case the state of $(\mathsf{Y},\mathsf{X})$ becomes

$$\vert k\rangle \otimes \frac{\Pi_k \vert \phi\rangle}{\bigl\| \Pi_k \vert \phi\rangle \bigr\|}.$$

Thus, $\mathsf{Y}$ stores a copy of the measurement outcome and $\mathsf{X}$ changes precisely as it would had the projective measurement described by $\{\Pi_0,\ldots,\Pi_{m-1}\}$ been performed directly on $\mathsf{X}$.
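Putting everything together, here is a minimal end-to-end NumPy sketch of the construction: it forms the first $n$ columns from $M$, completes them to a unitary $U$ by Gram–Schmidt (one of several valid ways to fill in the $\fbox{?}$ blocks), applies $U$ to $\vert 0\rangle\vert\phi\rangle$, and simulates the standard basis measurement of $\mathsf{Y}$. All names are our own.

```python
import numpy as np

def implement_projective_measurement(projections, phi, rng=np.random.default_rng()):
    """Simulate the projective measurement {Pi_0, ..., Pi_{m-1}} on |phi> using
    only a unitary U and a standard basis measurement of the workspace Y."""
    m = len(projections)
    n = projections[0].shape[0]

    # The first n columns of U are the columns of M: column j is sum_k |k> (x) Pi_k |j>.
    cols = [np.concatenate([Pi @ np.eye(n)[:, j] for Pi in projections])
            for j in range(n)]

    # Fill in the remaining columns by Gram-Schmidt over standard basis vectors.
    for k in range(n * m):
        v = np.zeros(n * m, dtype=complex)
        v[k] = 1
        for c in cols:
            v = v - np.vdot(c, v) * c
        if np.linalg.norm(v) > 1e-10:
            cols.append(v / np.linalg.norm(v))
    U = np.column_stack(cols[: n * m])     # unitary agreeing with M on its first n columns

    # Apply U to |0>|phi>, then measure Y (the first tensor factor) in the standard basis.
    state = U @ np.kron(np.eye(m)[:, 0], phi)
    blocks = state.reshape(m, n)           # block k of the result is Pi_k |phi>
    probs = np.sum(np.abs(blocks) ** 2, axis=1)
    k = rng.choice(m, p=probs)
    return k, blocks[k] / np.linalg.norm(blocks[k])

# Example: implementing the measurement {|+><+|, |-><-|} on the state |0>.
plus  = np.array([1, 1]) / np.sqrt(2)
minus = np.array([1, -1]) / np.sqrt(2)
print(implement_projective_measurement(
    [np.outer(plus, plus), np.outer(minus, minus)], np.array([1.0, 0.0])))
```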
