{
  "cells": [
    {
      "cell_type": "markdown",
      "id": "348bf846",
      "metadata": {},
      "source": [
        "---\n",
        "title: Naimark's theorem\n",
        "description: A free IBM course on quantum information and computation\n",
        "---\n",
        "\n",
        "{/* cspell:ignore orthogonalization */}\n",
        "\n",
        "# Naimark's theorem\n",
        "\n",
        "*Naimark's theorem* is a fundamental fact concerning measurements.\n",
        "It states that every general measurement can be implemented in a simple way that's reminiscent of Stinespring representations of channels:\n",
        "\n",
        "1. The system to be measured is first combined with an initialized workspace system, forming a compound system.\n",
        "2. A unitary operation is then performed on the compound system.\n",
        "3. Finally, the workspace system is *measured* with respect to a *standard basis measurement*, yielding the outcome of the original general measurement.\n",
        "\n",
        "## Theorem statement and proof\n",
        "\n",
        "Let $\\mathsf{X}$ be a system and let $\\{P_0,\\ldots,P_{m-1}\\}$ be a collection of positive semidefinite matrices satisfying\n",
        "\n",
        "$$\n",
        "P_0 + \\cdots + P_{m-1} = \\mathbb{I}_{\\mathsf{X}},\n",
        "$$\n",
        "\n",
        "which is to say that they describe a measurement of $\\mathsf{X}.$\n",
        "Also let $\\mathsf{Y}$ be a system whose classical state set is $\\{0,\\ldots,m-1\\},$ which is the set of possible outcomes of this measurement.\n",
        "\n",
        "Naimark's theorem states that there exists a unitary operation $U$ on the compound system $(\\mathsf{Y},\\mathsf{X})$ so that the implementation suggested by the following figure yields measurement outcomes that agree with the given measurement $\\{P_0,\\ldots,P_{m-1}\\},$ meaning that the probabilities for the different possible measurement outcomes are precisely in agreement.\n",
        "\n",
        "![An implementation of a general measurement as in Naimark's theorem](https://quantum.cloud.ibm.com/learning/images/courses/general-formulation-of-quantum-information/general-measurements/Naimark.svg)\n",
        "\n",
        "To be clear, the system $\\mathsf{X}$ starts out in some arbitrary state $\\rho$ while $\\mathsf{Y}$ is initialized to the $\\vert 0\\rangle$ state.\n",
        "The unitary operation $U$ is applied to $(\\mathsf{Y},\\mathsf{X})$ and then the system $\\mathsf{Y}$ is measured with a standard basis measurement, yielding some outcome $a\\in\\{0,\\ldots,m-1\\}.$\n",
        "\n",
        "The system $\\mathsf{X}$ is pictured as part of the output of the circuit, but for now we won't concern ourselves with the state of $\\mathsf{X}$ after $U$ is performed, and can imagine that it is traced out.\n",
        "We'll be interested in the state of $\\mathsf{X}$ after $U$ is performed later in the lesson, though.\n",
        "\n",
        "An implementation of a measurement in this way is clearly reminiscent of a Stinespring representation of a channel, and the mathematical underpinnings are similar as well.\n",
        "The difference here is that the workspace system is measured rather than being traced out like in the case of a Stinespring representation.\n",
        "\n",
        "The fact that every measurement can be implemented in this way is pretty simple to prove, but we're going to need a fact concerning positive semidefinite matrices first.\n",
        "\n",
        "<Figure title=\"Fact\">\n",
        "  Suppose $P$ is an $n \\times n$ positive semidefinite matrix. There exists a unique $n\\times n$ positive semidefinite matrix $Q$ for which $Q^2 = P.$ This unique positive semidefinite matrix is called the *square root* of $P$ and is denoted $\\sqrt{P}.$\n",
        "</Figure>\n",
        "\n",
        "One way to find the square root of a positive semidefinite matrix is to first compute a spectral decomposition.\n",
        "\n",
        "$$\n",
        "P = \\sum_{k=0}^{n-1} \\lambda_k \\vert \\psi_k \\rangle \\langle \\psi_k \\vert\n",
        "$$\n",
        "\n",
        "Because $P$ is positive semidefinite, its eigenvalues must be nonnegative real numbers, and by replacing them with their square roots we obtain an expression for the square root of $P.$\n",
        "\n",
        "$$\n",
        "\\sqrt{P} = \\sum_{k=0}^{n-1} \\sqrt{\\lambda_k} \\vert \\psi_k \\rangle \\langle \\psi_k \\vert\n",
        "$$\n",
        "\n",
        "With this concept in hand, we're ready to prove Naimark's theorem.\n",
        "Under the assumption that $\\mathsf{X}$ has $n$ classical states, a unitary operation $U$ on the pair $(\\mathsf{Y},\\mathsf{X})$ can be represented by an $nm\\times nm$ matrix, which we can view as an $m\\times m$ block matrix whose blocks are $n\\times n.$\n",
        "The key to the proof is to take $U$ to be any unitary matrix that matches the following pattern.\n",
        "\n",
        "$$\n",
        "U =\n",
        "\\begin{pmatrix}\n",
        "\\sqrt{P_0} & \\fbox{?} & \\cdots & \\fbox{?} \\\\[1mm]\n",
        "\\sqrt{P_1} & \\fbox{?} & \\cdots & \\fbox{?} \\\\[1mm]\n",
        "\\vdots & \\vdots & \\ddots & \\vdots\\\\[1mm]\n",
        "\\sqrt{P_{m-1}} & \\fbox{?} & \\cdots & \\fbox{?}\n",
        "\\end{pmatrix}\n",
        "$$\n",
        "\n",
        "For it to be possible to fill in the blocks marked with a question mark so that $U$ is unitary, it's both necessary and sufficient that the first $n$ columns, which are formed by the blocks $\\sqrt{P_0},\\ldots,\\sqrt{P_{m-1}},$ are orthonormal.\n",
        "We can then use the Gram-Schmidt orthogonalization process to fill in the remaining columns, just like we encountered in the previous lesson.\n",
        "\n",
        "The first $n$ columns of $U$ can be expressed as vectors in the following way, where $c = 0,\\ldots,n-1$ refers to the column number starting from $0.$\n",
        "\n",
        "$$\n",
        "\\vert\\gamma_c\\rangle = \\sum_{a = 0}^{m-1} \\vert a \\rangle \\otimes \\sqrt{P_a} \\vert c\\rangle\n",
        "$$\n",
        "\n",
        "We can compute the inner product between any two of them as follows.\n",
        "\n",
        "$$\n",
        "\\langle \\gamma_c \\vert \\gamma_d \\rangle =\n",
        "\\sum_{a,b = 0}^{m-1} \\langle a \\vert b \\rangle \\cdot \\langle c \\vert \\sqrt{P_a}\\sqrt{P_b}\\, \\vert d\\rangle\n",
        "= \\langle c \\vert \\Biggl(\\sum_{a = 0}^{m-1}  P_a \\Biggr) \\vert d\\rangle\n",
        "= \\langle c \\vert d\\rangle\n",
        "$$\n",
        "\n",
        "This shows that these columns are in fact orthonormal, so we can fill in the remaining columns of $U$ in a way that guarantees the entire matrix is unitary.\n",
        "\n",
        "It remains to check that the measurement outcome probabilities for the simulation are consistent with the original measurement.\n",
        "For a given initial state $\\rho$ of $\\mathsf{X},$ the measurement described by the collection $\\{P_0,\\ldots,P_{m-1}\\}$ results in each outcome $a\\in\\{0,\\ldots,m-1\\}$ with probability $\\operatorname{Tr}(P_a \\rho).$\n",
        "\n",
        "To obtain the outcome probabilities for the simulation, let's first give the name $\\sigma$ to the state of $(\\mathsf{Y},\\mathsf{X})$ after $U$ has been performed.\n",
        "This state can be expressed as follows.\n",
        "\n",
        "$$\n",
        "\\sigma =\n",
        "U \\bigl(\\vert 0\\rangle \\langle 0 \\vert \\otimes \\rho\\bigr) U^{\\dagger}\n",
        "= \\sum_{a,b=0}^{m-1} \\vert a\\rangle \\langle b \\vert \\otimes \\sqrt{P_a} \\rho \\sqrt{P_b}\n",
        "$$\n",
        "\n",
        "Equivalently, in a block matrix form, we have the following equation.\n",
        "\n",
        "$$\n",
        "\\begin{aligned}\n",
        "\\sigma & =\n",
        "\\begin{pmatrix}\n",
        "\\sqrt{P_0} & \\fbox{?} & \\cdots & \\fbox{?} \\\\[1mm]\n",
        "\\sqrt{P_1} & \\fbox{?} & \\cdots & \\fbox{?} \\\\[1mm]\n",
        "\\vdots & \\vdots & \\ddots & \\vdots\\\\[1mm]\n",
        "\\sqrt{P_{m-1}} & \\fbox{?} & \\cdots & \\fbox{?}\n",
        "\\end{pmatrix}\n",
        "\\begin{pmatrix}\n",
        "\\rho & 0 & \\cdots & 0 \\\\[1mm]\n",
        "0 & 0 & \\cdots & 0 \\\\[1mm]\n",
        "\\vdots & \\vdots & \\ddots & \\vdots\\\\[1mm]\n",
        "0 & 0 & \\cdots & 0\n",
        "\\end{pmatrix}\n",
        "\\begin{pmatrix}\n",
        "\\sqrt{P_0} & \\sqrt{P_1} & \\cdots & \\sqrt{P_{m-1}} \\\\[1mm]\n",
        "\\fbox{?} & \\fbox{?} & \\cdots & \\fbox{?} \\\\[1mm]\n",
        "\\vdots & \\vdots & \\ddots & \\vdots\\\\[1mm]\n",
        "\\fbox{?} & \\fbox{?} & \\cdots & \\fbox{?}\n",
        "\\end{pmatrix}\\\\[5mm]\n",
        "& = \\begin{pmatrix}\n",
        "\\sqrt{P_0}\\rho\\sqrt{P_0} & \\cdots & \\sqrt{P_0}\\rho\\sqrt{P_{m-1}} \\\\[1mm]\n",
        "\\vdots & \\ddots & \\vdots\\\\[1mm]\n",
        "\\sqrt{P_{m-1}}\\rho\\sqrt{P_0} & \\cdots & \\sqrt{P_{m-1}}\\rho\\sqrt{P_{m-1}}\n",
        "\\end{pmatrix}\n",
        "\\end{aligned}\n",
        "$$\n",
        "\n",
        "Notice that the entries of $U$ falling into the blocks marked with a question mark have no influence on the outcome by virtue of the fact that we're conjugating a matrix of the form $\\vert 0 \\rangle \\langle 0 \\vert \\otimes \\rho$\n",
        "— so the question mark entries are always multiplied by zero entries of $\\vert 0 \\rangle \\langle 0 \\vert \\otimes \\rho$ when the matrix product is computed.\n",
        "\n",
        "Now we can analyze what happens when a standard basis measurement is performed on $\\mathsf{Y}.$\n",
        "The probabilities of the possible outcomes are given by the diagonal entries of the reduced state $\\sigma_{\\mathsf{Y}}$ of $\\mathsf{Y}.$\n",
        "\n",
        "$$\n",
        "\\sigma_{\\mathsf{Y}} = \\sum_{a,b=0}^{m-1} \\operatorname{Tr}\\Bigl(\\sqrt{P_a} \\rho \\sqrt{P_b}\\Bigr) \\vert a\\rangle \\langle b \\vert\n",
        "$$\n",
        "\n",
        "In particular, using the cyclic property of the trace, we see that the probability to obtain a given outcome $a\\in\\{0,\\ldots,m-1\\}$ is as follows.\n",
        "\n",
        "$$\n",
        "\\langle a \\vert \\sigma_{\\mathsf{Y}} \\vert a \\rangle\n",
        "= \\operatorname{Tr}\\Bigl(\\sqrt{P_a} \\rho \\sqrt{P_a}\\Bigr)\n",
        "= \\operatorname{Tr}(P_a \\rho)\n",
        "$$\n",
        "\n",
        "This matches with the original measurement, establishing the correctness of the simulation.\n",
        "\n",
        "## Non-destructive measurements\n",
        "\n",
        "So far in the lesson, we've concerned ourselves with *destructive* measurements, where the output consists of the classical measurement result alone and there is no specification of the post-measurement quantum state of the system that was measured.\n",
        "\n",
        "*Non-destructive* measurements, on the other hand, do precisely this.\n",
        "Specifically, non-destructive measurements describe not only the classical measurement outcome probabilities, but also the state of the system that was measured conditioned on each possible measurement outcome.\n",
        "Note that the term *non-destructive* refers to the *system* being measured but not necessarily its state, which could change significantly as a result of the measurement.\n",
        "\n",
        "In general, for a given destructive measurement, there will be multiple (in fact infinitely many) non-destructive measurements that are *compatible* with the given destructive measurement, meaning that the classical measurement outcome probabilities match precisely with the destructive measurement.\n",
        "So, there isn't a unique way to define the post-measurement quantum state of a system for a given measurement.\n",
        "\n",
        "It is, in fact, possible to generalize non-destructive measurements even further, so that they produce a classical measurement outcome along with a quantum state output of a system that isn't necessarily the same as the input system.\n",
        "\n",
        "The notion of a non-destructive measurement is an interesting and useful abstraction.\n",
        "It should, however, be recognized that non-destructive measurements can always be described as compositions of channels and destructive measurements — so there is a sense in which the notion of a destructive measurement is the more fundamental one.\n",
        "\n",
        "### From Naimark's theorem\n",
        "\n",
        "Consider the simulation of a general measurement like we have in Naimark's theorem.\n",
        "A simple way to obtain a non-destructive measurement from this simulation is revealed by the figure from before, where the system $\\mathsf{X}$ is not traced out, but is part of the output.\n",
        "This yields both a classical measurement outcome $a\\in\\{0,\\ldots,m-1\\}$ as well as a post-measurement quantum state of $\\mathsf{X}.$\n",
        "\n",
        "Let's describe these states in mathematical terms.\n",
        "We're assuming that the initial state of $\\mathsf{X}$ is $\\rho,$ so that after the initialized system $\\mathsf{Y}$ is introduced and $U$ is performed, we have that $(\\mathsf{Y},\\mathsf{X})$ is in the state\n",
        "\n",
        "$$\n",
        "\\sigma =\n",
        "U \\bigl(\\vert 0\\rangle \\langle 0 \\vert \\otimes \\rho\\bigr) U^{\\dagger}\n",
        "= \\sum_{a,b=0}^{m-1} \\vert a\\rangle \\langle b \\vert \\otimes \\sqrt{P_a} \\rho \\sqrt{P_b}.\n",
        "$$\n",
        "\n",
        "The probabilities for the different classical outcomes to appear are the same as before — they can't change as a result of us deciding to ignore or not ignore $\\mathsf{X}.$\n",
        "That is, we obtain each $a\\in\\{0,\\ldots,m-1\\}$ with probability $\\operatorname{Tr}(P_a \\rho).$\n",
        "\n",
        "Conditioned upon having obtained a particular measurement outcome $a,$ the resulting state of $\\mathsf{X}$ is given by this expression.\n",
        "\n",
        "$$\n",
        "\\frac{\\sqrt{P_a} \\rho \\sqrt{P_a}}{\\operatorname{Tr}(P_a \\rho)}\n",
        "$$\n",
        "\n",
        "One way to see this is to represent a standard basis measurement of $\\mathsf{Y}$ by the completely dephasing channel $\\Delta_m,$ where the channel output describes classical measurement outcomes as (diagonal) density matrices.\n",
        "An expression of the state we obtain is as follows.\n",
        "\n",
        "$$\n",
        "\\sum_{a,b=0}^{m-1} \\Delta_m(\\vert a\\rangle \\langle b \\vert) \\otimes \\sqrt{P_a} \\rho \\sqrt{P_b}\n",
        "= \\sum_{a=0}^{m-1} \\vert a\\rangle \\langle a \\vert \\otimes \\sqrt{P_a} \\rho \\sqrt{P_a}.\n",
        "$$\n",
        "\n",
        "We can then write this state as a convex combination of product states,\n",
        "\n",
        "$$\n",
        "\\sum_{a=0}^{m-1} \\operatorname{Tr}(P_a \\rho)\\, \\vert a\\rangle \\langle a \\vert \\otimes \\frac{\\sqrt{P_a} \\rho \\sqrt{P_a}}{\\operatorname{Tr}(P_a \\rho)},\n",
        "$$\n",
        "\n",
        "which is consistent with the expression we've obtained for the state of $\\mathsf{X}$ conditioned on each possible measurement outcome.\n",
        "\n",
        "### From a Kraus representation\n",
        "\n",
        "There are alternative selections for $U$ in the context of Naimark's theorem that produce the same measurement outcome probabilities but give entirely different output states of $\\mathsf{X}.$\n",
        "\n",
        "For instance, one option is to substitute $(\\mathbb{I}_{\\mathsf{Y}} \\otimes V) U$ for $U,$ where $V$ is any unitary operation on $\\mathsf{X}.$\n",
        "The application of $V$ to $\\mathsf{X}$ commutes with the measurement of $\\mathsf{Y}$ so the classical outcome probabilities do not change, but now the state of $\\mathsf{X}$ conditioned on the outcome $a$ becomes\n",
        "\n",
        "$$\n",
        "\\frac{V \\sqrt{P_a} \\rho \\sqrt{P_a}V^{\\dagger}}{\\operatorname{Tr}(P_a \\rho)}.\n",
        "$$\n",
        "\n",
        "More generally, we could replace $U$ by the unitary matrix\n",
        "\n",
        "$$\n",
        "\\Biggl(\\sum_{a=0}^{m-1} \\vert a\\rangle\\langle a \\vert \\otimes V_a\\Biggr) U\n",
        "$$\n",
        "\n",
        "for any choice of unitary operations $V_0,\\ldots,V_{m-1}$ on $\\mathsf{X}.$\n",
        "Again, the classical outcome probabilities are unchanged, but now the state of $\\mathsf{X}$ conditioned on the outcome $a$ becomes\n",
        "\n",
        "$$\n",
        "\\frac{V_a \\sqrt{P_a} \\rho \\sqrt{P_a}V_a^{\\dagger}}{\\operatorname{Tr}(P_a \\rho)}.\n",
        "$$\n",
        "\n",
        "An equivalent way to express this freedom is connected with Kraus representations.\n",
        "That is, we can describe an $m$-outcome non-destructive measurement of a system having $n$ classical states by a selection of $n\\times n$ Kraus matrices $A_0,\\ldots,A_{m-1}$ satisfying the typical condition for Kraus matrices.\n",
        "\n",
        "$$\n",
        "\\sum_{a = 0}^{m-1} A_a^{\\dagger} A_a = \\mathbb{I}_{\\mathsf{X}}\n",
        "\\tag{1}\n",
        "$$\n",
        "\n",
        "Assuming that the initial state of $\\mathsf{X}$ is $\\rho,$ the classical measurement outcome is $a$ with probability\n",
        "\n",
        "$$\n",
        "\\operatorname{Tr}\\bigl(A_a \\rho A_a^{\\dagger}\\bigr)\n",
        "= \\operatorname{Tr}\\bigl(A_a^{\\dagger} A_a \\rho \\bigr)\n",
        "$$\n",
        "\n",
        "and conditioned upon the outcome being $a$ the state of $\\mathsf{X}$ becomes\n",
        "\n",
        "$$\n",
        "\\frac{A_a \\rho A_a^{\\dagger}}{\\operatorname{Tr}(A_a^{\\dagger}A_a \\rho)}.\n",
        "$$\n",
        "\n",
        "Note that this is equivalent to choosing the unitary operation $U$ in Naimark's theorem as follows.\n",
        "\n",
        "$$\n",
        "U =\n",
        "\\begin{pmatrix}\n",
        "A_{0} & \\fbox{?} & \\cdots & \\fbox{?} \\\\[1mm]\n",
        "A_{1} & \\fbox{?} & \\cdots & \\fbox{?} \\\\[1mm]\n",
        "\\vdots & \\vdots & \\ddots & \\vdots\\\\[1mm]\n",
        "A_{m-1} & \\fbox{?} & \\cdots & \\fbox{?}\n",
        "\\end{pmatrix}\n",
        "$$\n",
        "\n",
        "In the previous lesson we observed that the columns formed by the blocks $A_0,\\ldots,A_{m-1}$ are necessarily orthogonal, by virtue of the\n",
        "condition $(1).$\n",
        "\n",
        "### Generalizations\n",
        "\n",
        "There are even more general ways to formulate non-destructive measurements than the ways we've discussed.\n",
        "The notion of a *quantum instrument* (which won't be described here) represents one way to do this.\n",
        "\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "id": "a1b8767d",
      "source": "© IBM Corp., 2017-2026"
    }
  ],
  "metadata": {
    "kernelspec": {
      "display_name": "Python 3",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 5
}