{
  "cells": [
    {
      "cell_type": "markdown",
      "id": "224d0a06",
      "metadata": {},
      "source": [
        "---\n",
        "title: Quantum state discrimination and tomography\n",
        "description: A free IBM course on quantum information and computation\n",
        "---\n",
        "\n",
        "{/* cspell:ignore operatorname */}\n",
        "\n",
        "# Quantum state discrimination and tomography\n",
        "\n",
        "In the last part of the lesson, we'll briefly consider two tasks associated with measurements: *quantum state discrimination* and *quantum state tomography*.\n",
        "\n",
        "1. *Quantum state discrimination*\n",
        "\n",
        "   For quantum state discrimination, we have a known collection of quantum states $\\rho_0,\\ldots,\\rho_{m-1},$ along with\n",
        "   probabilities $p_0,\\ldots,p_{m-1}$ associated with these states.\n",
        "   A succinct way of expressing this is to say that we have an *ensemble*\n",
        "\n",
        "   $$\n",
        "   \\{(p_0,\\rho_0),\\ldots,(p_{m-1},\\rho_{m-1})\\}\n",
        "   $$\n",
        "\n",
        "   of quantum states.\n",
        "\n",
        "   A number $a\\in\\{0,\\ldots,m-1\\}$ is chosen randomly according to the probabilities $(p_0,\\ldots,p_{m-1})$ and the system $\\mathsf{X}$\n",
        "   is prepared in the state $\\rho_a.$\n",
        "   The goal is to determine, by means of a measurement of $\\mathsf{X}$ alone, which value of $a$ was chosen.\n",
        "\n",
        "   Thus, we have a finite number of alternatives, along with a *prior* — which is our knowledge of the probability for each $a$ to\n",
        "   be selected — and the goal is to determine which alternative actually happened.\n",
        "   This may be easy for some choices of states and probabilities, and for others it may not be possible without some chance of making an error.\n",
        "\n",
        "2. *Quantum state tomography*\n",
        "\n",
        "   For quantum state tomography, we have an *unknown* quantum state of a system —\n",
        "   so unlike in quantum state discrimination there's typically no prior or any\n",
        "   information about possible alternatives.\n",
        "\n",
        "   This time, however, it's not a single copy of the state that's made available,\n",
        "   but rather many *independent* copies are made available.\n",
        "   That is, $N$ identical systems $\\mathsf{X}_1,\\ldots,\\mathsf{X}_N$ are each\n",
        "   independently prepared in the state $\\rho$ for some (possibly large) number $N.$\n",
        "   The goal is to find an approximation of the unknown state, as a density matrix,\n",
        "   by measuring the systems.\n",
        "\n",
        "## Discriminating between two states\n",
        "\n",
        "The simplest case for quantum state discrimination is that there are two states,\n",
        "$\\rho_0$ and $\\rho_1,$ that are to be discriminated.\n",
        "\n",
        "Imagine a situation in which a bit $a$ is chosen randomly: $a = 0$ with probability $p$ and $a = 1$ with probability $1 - p.$\n",
        "A system $\\mathsf{X}$ is prepared in the state $\\rho_a,$ meaning $\\rho_0$ or $\\rho_1$ depending on the value of $a,$ and given to us.\n",
        "Our goal is to correctly guess the value of $a$ by means of a measurement on $\\mathsf{X}.$\n",
        "To be precise, we shall aim to maximize the probability that our guess is correct.\n",
        "\n",
        "### An optimal measurement\n",
        "\n",
        "An optimal way to solve this problem begins with a spectral decomposition of a weighted difference between $\\rho_0$ and $\\rho_1,$ where the weights are the corresponding probabilities.\n",
        "\n",
        "$$\n",
        "p \\rho_0 - (1-p) \\rho_1 = \\sum_{k = 0}^{n-1} \\lambda_k \\vert \\psi_k \\rangle \\langle \\psi_k \\vert\n",
        "$$\n",
        "\n",
        "Notice that we have a minus sign rather than a plus sign in this expression: this is a weighted *difference* not a weighted sum.\n",
        "\n",
        "We can maximize the probability of a correct guess by selecting a projective measurement $\\{\\Pi_0,\\Pi_1\\}$ as follows.\n",
        "First let's partition the elements of $\\{0,\\ldots,n-1\\}$ into two disjoint sets $S_0$ and $S_1$ depending upon whether the corresponding eigenvalue of the weighted difference is nonnegative or negative.\n",
        "\n",
        "$$\n",
        "\\begin{gathered}\n",
        "S_0 = \\{k\\in\\{0,\\ldots,n-1\\} : \\lambda_k \\geq 0 \\}\\\\[2mm]\n",
        "S_1 = \\{k\\in\\{0,\\ldots,n-1\\} : \\lambda_k < 0 \\}\n",
        "\\end{gathered}\n",
        "$$\n",
        "\n",
        "We can then choose a *projective* measurement as follows.\n",
        "\n",
        "$$\n",
        "\\Pi_0 = \\sum_{k \\in S_0} \\vert \\psi_k \\rangle \\langle \\psi_k \\vert\n",
        "\\quad\\text{and}\\quad\n",
        "\\Pi_1 = \\sum_{k \\in S_1} \\vert \\psi_k \\rangle \\langle \\psi_k \\vert\n",
        "$$\n",
        "\n",
        "(It doesn't actually matter in which set $S_0$ or $S_1$ we include the values of $k$ for which $\\lambda_k = 0.$\n",
        "Here we're choosing arbitrarily to include these values in $S_0.$)\n",
        "\n",
        "This is an optimal measurement in the situation at hand that minimizes the probability of an incorrect determination of the selected state.\n",
        "\n",
        "### Correctness probability\n",
        "\n",
        "Now we will determine the probability of correctness for the measurement $\\{\\Pi_0,\\Pi_1\\}.$\n",
        "\n",
        "To begin we don't really need to be concerned with the specific choice we've made for $\\Pi_0$ and $\\Pi_1,$ though it may be helpful to keep it in mind.\n",
        "For *any* measurement $\\{P_0,P_1\\}$ (not necessarily projective) we can write the correctness probability as follows.\n",
        "\n",
        "$$\n",
        "p \\operatorname{Tr}(P_0 \\rho_0) + (1 - p) \\operatorname{Tr}(P_1 \\rho_1)\n",
        "$$\n",
        "\n",
        "Using the fact that $\\{P_0,P_1\\}$ is a measurement, so $P_1 = \\mathbb{I} - P_0,$ we can rewrite this expression as follows.\n",
        "\n",
        "$$\n",
        "p \\operatorname{Tr}(P_0 \\rho_0) + (1 - p) \\operatorname{Tr}((\\mathbb{I} - P_0) \\rho_1)\\hspace*{3cm}\\\\[1mm]\n",
        "\\begin{aligned}\n",
        "& = p \\operatorname{Tr}(P_0 \\rho_0) - (1 - p) \\operatorname{Tr}(P_0 \\rho_1) + (1-p) \\operatorname{Tr}(\\rho_1)\\\\[1mm]\n",
        "& = \\operatorname{Tr}\\bigl( P_0 (p \\rho_0 - (1-p)\\rho_1) \\bigr) + 1 - p\n",
        "\\end{aligned}\n",
        "$$\n",
        "\n",
        "On the other hand, we could have made the substitution $P_0 = \\mathbb{I} - P_1$ instead.\n",
        "That wouldn't change the value but it does give us an alternative expression.\n",
        "\n",
        "$$\n",
        "p \\operatorname{Tr}((\\mathbb{I} - P_1) \\rho_0) + (1 - p) \\operatorname{Tr}(P_1 \\rho_1)\\hspace*{3cm}\\\\[1mm]\n",
        "\\begin{aligned}\n",
        "& = p \\operatorname{Tr}(\\rho_0) - p \\operatorname{Tr}(P_1 \\rho_0) + (1 - p) \\operatorname{Tr}(P_1 \\rho_1)\\\\[1mm]\n",
        "& = p - \\operatorname{Tr}\\bigl( P_1 (p \\rho_0 - (1-p)\\rho_1) \\bigr)\n",
        "\\end{aligned}\n",
        "$$\n",
        "\n",
        "The two expressions have the same value, so we can average them to give yet another expression for this value.\n",
        "(Averaging the two expressions is just a trick to simplify the resulting expression.)\n",
        "\n",
        "$$\n",
        "\\frac{1}{2} \\bigl(\\operatorname{Tr}\\bigl( P_0 (p \\rho_0 - (1-p)\\rho_1) \\bigr) + 1-p\\bigr)\n",
        "+ \\frac{1}{2} \\bigl(p - \\operatorname{Tr}\\bigl( P_1 (p \\rho_0 - (1-p)\\rho_1) \\bigr)\\bigr)\\\\\n",
        "= \\frac{1}{2} \\operatorname{Tr}\\bigl( (P_0-P_1) (p \\rho_0 - (1-p)\\rho_1)\\bigr)  + \\frac{1}{2}\n",
        "$$\n",
        "\n",
        "Now we can see why it makes sense to choose the projections $\\Pi_0$ and $\\Pi_1$ (as specified above) for $P_0$ and $P_1,$ respectively — because that's how we can make the trace in the final expression as large as possible.\n",
        "In particular,\n",
        "\n",
        "$$\n",
        "(\\Pi_0-\\Pi_1) (p \\rho_0 - (1-p)\\rho_1) = \\sum_{k = 0}^{n-1} \\vert\\lambda_k\\vert \\cdot \\vert \\psi_k \\rangle \\langle \\psi_k \\vert.\n",
        "$$\n",
        "\n",
        "So, when we take the trace, we obtain the sum of the *absolute values* of the eigenvalues — which is equal to what's known as the *trace norm* of the weighted difference.\n",
        "\n",
        "$$\n",
        "\\operatorname{Tr}\\bigl( (\\Pi_0-\\Pi_1) (p \\rho_0 - (1-p)\\rho_1)\\bigr)\n",
        "= \\sum_{k = 0}^{n-1} \\vert\\lambda_k\\vert = \\bigl\\| p \\rho_0 - (1-p)\\rho_1 \\bigr\\|_1\n",
        "$$\n",
        "\n",
        "Thus, the probability that the measurement $\\{\\Pi_0,\\Pi_1\\}$ leads to a correct discrimination of $\\rho_0$ and $\\rho_1,$ given with probabilities $p$ and $1-p,$ respectively, is as follows.\n",
        "\n",
        "$$\n",
        "\\frac{1}{2} + \\frac{1}{2} \\bigl\\| p \\rho_0 - (1-p)\\rho_1 \\bigr\\|_1\n",
        "$$\n",
        "\n",
        "The fact that this is the optimal probability for a correct discrimination of $\\rho_0$ and $\\rho_1,$ given with probabilities $p$ and $1-p,$ is commonly referred to as the *Helstrom–Holevo theorem* (or sometimes just *Helstrom's theorem*).\n",
        "\n",
        "## Discriminating three or more states\n",
        "\n",
        "For quantum state discrimination when there are three or more states, there is no known closed-form solution for an optimal measurement, although it is possible to formulate the problem as a semidefinite program — which allows for efficient numerical approximations of optimal measurements with the help of a computer.\n",
        "\n",
        "It is also possible to *verify* (or *falsify*) optimality of a given measurement in a state discrimination task through a condition known as the *Holevo-Yuen-Kennedy-Lax* condition.\n",
        "In particular, for the state discrimination task defined by the ensemble\n",
        "\n",
        "$$\n",
        "\\{(p_0,\\rho_0),\\ldots,(p_{m-1},\\rho_{m-1})\\},\n",
        "$$\n",
        "\n",
        "the measurement $\\{P_0,\\ldots,P_{m-1}\\}$ is optimal if and only if the matrix\n",
        "\n",
        "$$\n",
        "Q_a = \\sum_{b = 0}^{m-1} p_b \\rho_b P_b - p_a \\rho_a\n",
        "$$\n",
        "\n",
        "is positive semidefinite for every $a\\in\\{0,\\ldots,m-1\\}.$\n",
        "\n",
        "For example, consider the quantum state discrimination task in which one of the four tetrahedral states $\\vert\\phi_0\\rangle,\\ldots,\\vert\\phi_3\\rangle$ is selected uniformly at random.\n",
        "The tetrahedral measurement $\\{P_0,P_1,P_2,P_3\\}$ succeeds with probability\n",
        "\n",
        "$$\n",
        "\\frac{1}{4} \\operatorname{Tr}(P_0 \\vert\\phi_0\\rangle\\langle \\phi_0 \\vert) +\n",
        "\\frac{1}{4} \\operatorname{Tr}(P_1 \\vert\\phi_1\\rangle\\langle \\phi_1 \\vert) +\n",
        "\\frac{1}{4} \\operatorname{Tr}(P_2 \\vert\\phi_2\\rangle\\langle \\phi_2 \\vert) +\n",
        "\\frac{1}{4} \\operatorname{Tr}(P_3 \\vert\\phi_3\\rangle\\langle \\phi_3 \\vert)\n",
        "= \\frac{1}{2}.\n",
        "$$\n",
        "\n",
        "This is optimal by the Holevo-Yuen-Kennedy-Lax condition, as a calculation reveals that\n",
        "\n",
        "$$\n",
        "Q_a = \\frac{1}{4}(\\mathbb{I} - \\vert\\phi_a\\rangle\\langle\\phi_a\\vert) \\geq 0\n",
        "$$\n",
        "\n",
        "for $a = 0,1,2,3.$\n",
        "\n",
        "## Quantum state tomography\n",
        "\n",
        "Finally, we'll briefly discuss the problem of *quantum state tomography*.\n",
        "For this problem, we're given a large number $N$ of independent copies of an unknown quantum state $\\rho,$ and the goal is to reconstruct an approximation $\\tilde{\\rho}$ of $\\rho.$\n",
        "To be clear, this means that we wish to find a classical description of a density matrix $\\tilde{\\rho}$ that is as close as possible to $\\rho.$\n",
        "\n",
        "We can alternatively describe the set-up in the following way.\n",
        "An unknown density matrix $\\rho$ is selected, and we're given access to $N$ quantum systems $\\mathsf{X}_1,\\ldots,\\mathsf{X}_N,$ each of which has been *independently* prepared in the state $\\rho.$\n",
        "Thus, the state of the compound system $(\\mathsf{X}_1,\\ldots,\\mathsf{X}_N)$ is\n",
        "\n",
        "$$\n",
        "\\rho^{\\otimes N} = \\rho \\otimes \\rho \\otimes \\cdots \\otimes \\rho \\quad \\text{($N$ times)}\n",
        "$$\n",
        "\n",
        "The goal is to perform measurements on the systems $\\mathsf{X}_1,\\ldots,\\mathsf{X}_N$ and, based on the outcomes of those measurements, to compute a density matrix $\\tilde{\\rho}$ that closely approximates $\\rho.$\n",
        "This turns out to be a fascinating problem and there is ongoing research on it.\n",
        "\n",
        "Different types of strategies for approaching the problem may be considered.\n",
        "For example, we can imagine a strategy where each of the systems $\\mathsf{X}_1,\\ldots,\\mathsf{X}_N$ is measured separately, in turn, producing a sequence of measurement outcomes.\n",
        "Different specific choices for which measurements are performed can be made, including *adaptive* and *non-adaptive* selections.\n",
        "In other words, the choice of what measurement is performed on a particular system might or might not depend on the outcomes of prior measurements.\n",
        "Based on the sequence of measurement outcomes, a guess $\\tilde{\\rho}$ for the state $\\rho$ is derived — and again there are different methodologies for doing this.\n",
        "\n",
        "An alternative approach is to perform a single *joint measurement* of the entire collection, where we think about $(\\mathsf{X}_1,\\ldots,\\mathsf{X}_N)$ as a single system and select a single measurement whose output is a guess $\\tilde{\\rho}$ for the state $\\rho.$\n",
        "This can lead to an improved estimate over what is possible for separate measurements of the individual systems, although a joint measurement on all of the systems together is likely to be much more difficult to implement.\n",
        "\n",
        "### Qubit tomography using Pauli measurements\n",
        "\n",
        "We'll now consider quantum state tomography in the simple case where $\\rho$ is a qubit density matrix.\n",
        "We assume that we're given qubits $\\mathsf{X}_1,\\ldots,\\mathsf{X}_N$ that are each independently in the state $\\rho,$ and our goal is to compute an approximation $\\tilde{\\rho}$ that is close to $\\rho.$\n",
        "\n",
        "Our strategy will be to divide the $N$ qubits $\\mathsf{X}_1,\\ldots,\\mathsf{X}_N$ into three roughly equal-size collections, one for each of the three Pauli matrices $\\sigma_x,$ $\\sigma_y,$ and $\\sigma_z.$\n",
        "Each qubit is then measured independently as follows.\n",
        "\n",
        "1. For each of the qubits in the collection associated with $\\sigma_x$ we perform a $\\sigma_x$ measurement. This means that the qubit is measured with respect to the basis $\\{\\vert + \\rangle, \\vert -\\rangle\\},$ which is an orthonormal basis of eigenvectors of $\\sigma_x,$ and the corresponding measurement outcomes are the eigenvalues associated with the two eigenvectors: $+1$ for the state $\\vert + \\rangle$ and $-1$ for the state $\\vert -\\rangle.$ By averaging together the outcomes over all of the states in the collection associated with $\\sigma_x,$ we obtain an approximation of the expectation value\n",
        "\n",
        "   $$\n",
        "   \\langle + \\vert \\rho \\vert + \\rangle - \\langle - \\vert \\rho \\vert - \\rangle = \\operatorname{Tr}(\\sigma_x \\rho).\n",
        "   $$\n",
        "\n",
        "2. For each of the qubits in the collection associated with $\\sigma_y$ we perform a $\\sigma_y$ measurement. Such a measurement is similar to a $\\sigma_x$ measurement, except that the measurement basis is $\\{\\vert\\! +\\!i \\rangle, \\vert\\! -\\!i \\rangle\\},$ the eigenvectors of $\\sigma_y.$ Averaging the outcomes over all of the states in the collection associated with $\\sigma_y,$ we obtain an approximation of the expectation value\n",
        "\n",
        "   $$\n",
        "   \\langle +i \\vert \\rho \\vert \\!+\\!i \\rangle - \\langle -i \\vert \\rho \\vert \\!-\\!i \\rangle = \\operatorname{Tr}(\\sigma_y \\rho).\n",
        "   $$\n",
        "\n",
        "3. For each of the qubits in the collection associated with $\\sigma_z$ we perform a $\\sigma_z$ measurement. This time the measurement basis is the standard basis $\\{\\vert 0\\rangle, \\vert 1 \\rangle\\},$ the eigenvectors of $\\sigma_z.$ Averaging the outcomes over all of the states in the collection associated with $\\sigma_z,$ we obtain an approximation of the expectation value\n",
        "\n",
        "   $$\n",
        "   \\langle 0 \\vert \\rho \\vert 0 \\rangle - \\langle 1 \\vert \\rho \\vert 1 \\rangle = \\operatorname{Tr}(\\sigma_z \\rho).\n",
        "   $$\n",
        "\n",
        "Once we have obtained approximations\n",
        "\n",
        "$$\n",
        "\\alpha_x \\approx \\operatorname{Tr}(\\sigma_x \\rho),\\;\n",
        "\\alpha_y \\approx \\operatorname{Tr}(\\sigma_y \\rho),\\;\n",
        "\\alpha_z \\approx \\operatorname{Tr}(\\sigma_z \\rho)\n",
        "$$\n",
        "\n",
        "by averaging the measurement outcomes for each collection, we can approximate $\\rho$ as\n",
        "\n",
        "$$\n",
        "\\tilde{\\rho} = \\frac{\\mathbb{I} + \\alpha_x \\sigma_x + \\alpha_y \\sigma_y + \\alpha_z \\sigma_z}{2} \\approx\n",
        "\\frac{\\mathbb{I} + \\operatorname{Tr}(\\sigma_x \\rho) \\sigma_x + \\operatorname{Tr}(\\sigma_y \\rho) \\sigma_y + \\operatorname{Tr}(\\sigma_z \\rho) \\sigma_z}{2}\n",
        "= \\rho.\n",
        "$$\n",
        "\n",
        "In the limit as $N$ approaches infinity, this approximation converges in probability to the true density matrix $\\rho$ by the *law of large numbers,* and well-known statistical bounds (such as *Hoeffding's inequality*) can be used to bound the probability that the approximation $\\tilde{\\rho}$ deviates from $\\rho$ by varying amounts.\n",
        "\n",
        "An important thing to recognize, however, is that the matrix $\\tilde{\\rho}$ obtained in this way may fail to be a density matrix.\n",
        "In particular, although it will always have trace equal to $1,$ it may fail to be positive semidefinite.\n",
        "There are different known strategies for \"rounding\" such an approximation $\\tilde{\\rho}$ to a density matrix,\n",
        "one of them being to compute a spectral decomposition, replace any negative eigenvalues with $0,$ and then renormalize (by dividing the matrix we obtain by its trace).\n",
        "\n",
        "### Qubit tomography using the tetrahedral measurement\n",
        "\n",
        "Another option for performing qubit tomography is to measure every qubit $\\mathsf{X}_1,\\ldots,\\mathsf{X}_N$ using the tetrahedral measurement\n",
        "$\\{P_0,P_1,P_2,P_3\\}$ described earlier.\n",
        "That is,\n",
        "\n",
        "$$\n",
        "P_0 = \\frac{\\vert \\phi_0 \\rangle \\langle \\phi_0 \\vert}{2}, \\quad\n",
        "P_1 = \\frac{\\vert \\phi_1 \\rangle \\langle \\phi_1 \\vert}{2}, \\quad\n",
        "P_2 = \\frac{\\vert \\phi_2 \\rangle \\langle \\phi_2 \\vert}{2}, \\quad\n",
        "P_3 = \\frac{\\vert \\phi_3 \\rangle \\langle \\phi_3 \\vert}{2}\n",
        "$$\n",
        "\n",
        "for\n",
        "\n",
        "$$\n",
        "\\begin{aligned}\n",
        "\\vert \\phi_0 \\rangle & = \\vert 0 \\rangle\\\\\n",
        "\\vert \\phi_1 \\rangle & = \\frac{1}{\\sqrt{3}} \\vert 0 \\rangle + \\sqrt{\\frac{2}{3}} \\vert 1 \\rangle\\\\\n",
        "\\vert \\phi_2 \\rangle & = \\frac{1}{\\sqrt{3}} \\vert 0 \\rangle + \\sqrt{\\frac{2}{3}} e^{2\\pi i/3} \\vert 1 \\rangle\\\\\n",
        "\\vert \\phi_3 \\rangle & = \\frac{1}{\\sqrt{3}} \\vert 0 \\rangle + \\sqrt{\\frac{2}{3}} e^{-2\\pi i/3} \\vert 1 \\rangle.\n",
        "\\end{aligned}\n",
        "$$\n",
        "\n",
        "Each outcome is obtained some number of times, which we will denote as $n_a$ for each $a\\in\\{0,1,2,3\\},$ so that $n_0 + n_1 + n_2 + n_3 = N.$\n",
        "The ratio of these numbers with $N$ provides an estimate of the probability associated with each possible outcome:\n",
        "\n",
        "$$\n",
        "\\frac{n_a}{N} \\approx \\operatorname{Tr}(P_a \\rho).\n",
        "$$\n",
        "\n",
        "Finally, we shall make use of the following remarkable formula:\n",
        "\n",
        "$$\n",
        "\\rho = \\sum_{a=0}^3 \\Bigl( 3 \\operatorname{Tr}(P_a \\rho) - \\frac{1}{2}\\Bigr) \\vert \\phi_a \\rangle \\langle \\phi_a \\vert.\n",
        "$$\n",
        "\n",
        "To establish this formula, we can use the following equation for the absolute values squared of inner products of tetrahedral states, which can be checked through direct calculations.\n",
        "\n",
        "$$\n",
        "\\bigl\\vert \\langle \\phi_a \\vert \\phi_b \\rangle \\bigr\\vert^2 =\n",
        "\\begin{cases}\n",
        "1 & a=b\\\\\n",
        "\\frac{1}{3} & a\\neq b.\n",
        "\\end{cases}\n",
        "$$\n",
        "\n",
        "The four matrices\n",
        "\n",
        "$$\n",
        "\\begin{aligned}\n",
        "\\vert\\phi_0\\rangle \\langle \\phi_0 \\vert & = \\begin{pmatrix} 1 & 0\\\\[2mm] 0 & 0\\end{pmatrix}\\\\[3mm]\n",
        "\\vert\\phi_1\\rangle \\langle \\phi_1 \\vert & = \\begin{pmatrix} \\frac{1}{3} & \\frac{\\sqrt{2}}{3}\\\\[2mm]\n",
        "\\frac{\\sqrt{2}}{3} & \\frac{2}{3}\\end{pmatrix}\\\\[3mm]\n",
        "\\vert\\phi_2\\rangle \\langle \\phi_2 \\vert & = \\begin{pmatrix} \\frac{1}{3} & \\frac{\\sqrt{2}}{3}e^{-2\\pi i/3}\\\\[2mm]\n",
        "\\frac{\\sqrt{2}}{3}e^{2\\pi i/3} & \\frac{2}{3}\\end{pmatrix}\\\\[3mm]\n",
        "\\vert\\phi_3\\rangle \\langle \\phi_3 \\vert & = \\begin{pmatrix} \\frac{1}{3} & \\frac{\\sqrt{2}}{3}e^{2\\pi i/3}\\\\[2mm]\n",
        "\\frac{\\sqrt{2}}{3}e^{-2\\pi i/3} & \\frac{2}{3}\\end{pmatrix}\n",
        "\\end{aligned}\n",
        "$$\n",
        "\n",
        "are linearly independent, so it suffices to prove that the formula is true when $\\rho = \\vert\\phi_b\\rangle\\langle\\phi_b\\vert$\n",
        "for $b = 0,1,2,3.$\n",
        "In particular,\n",
        "\n",
        "$$\n",
        "3 \\operatorname{Tr}(P_a \\vert\\phi_b\\rangle\\langle\\phi_b\\vert) - \\frac{1}{2}\n",
        "= \\frac{3}{2} \\vert \\langle \\phi_a \\vert \\phi_b \\rangle \\vert^2 - \\frac{1}{2}\n",
        "= \\begin{cases}\n",
        "1 & a=b\\\\\n",
        "0 & a\\neq b\n",
        "\\end{cases}\n",
        "$$\n",
        "\n",
        "and therefore\n",
        "\n",
        "$$\n",
        "\\sum_{a=0}^3 \\biggl( 3 \\operatorname{Tr}(P_a \\vert\\phi_b\\rangle\\langle\\phi_b\\vert) - \\frac{\\operatorname{Tr}(\\vert\\phi_b\\rangle\\langle\\phi_b\\vert)}{2}\\biggr) \\vert \\phi_a \\rangle \\langle \\phi_a \\vert = \\vert \\phi_b\\rangle\\langle \\phi_b \\vert.\n",
        "$$\n",
        "\n",
        "We arrive at an approximation of $\\rho:$\n",
        "\n",
        "$$\n",
        "\\tilde{\\rho} = \\sum_{a=0}^3 \\Bigl( \\frac{3 n_a}{N} - \\frac{1}{2}\\Bigr) \\vert \\phi_a \\rangle \\langle \\phi_a \\vert.\n",
        "$$\n",
        "\n",
        "This approximation will always be a Hermitian matrix having trace equal to one, but it may fail to be positive semidefinite.\n",
        "In this case, the approximation must be \"rounded\" to a density matrix, similar to the strategy involving Pauli measurements.\n",
        "\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "id": "a1b8767d",
      "source": "© IBM Corp., 2017-2026"
    }
  ],
  "metadata": {
    "kernelspec": {
      "display_name": "Python 3",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 5
}