Data encoding

Introduction and notation

To use a quantum algorithm, classical data must somehow be brought into a quantum circuit. This is usually referred to as data encoding, but is also called data loading. Recall from previous lessons the notion of a feature mapping, a mapping of data features from one space to another. Just transferring classical data to a quantum computer is a sort of mapping, and could be called a feature mapping. In practice, the built-in feature mappings in Qiskit (like ZFeatureMap and ZZFeatureMap) will typically include rotation layers and entangling layers that extend the state to many dimensions in the Hilbert space. This encoding process is a critical part of quantum machine learning algorithms and directly affects their computational capabilities.

Some of the encoding techniques below can be efficiently classically simulated; this is particularly easy to see in encoding methods that yield product states (i.e. which do not entangle qubits). And remember that quantum utility is most likely to lie where the quantum-like complexity of the dataset is well-matched by the encoding method. So it is very likely that you will end up writing your own encoding circuits. Here, we show a wide variety of possible encoding strategies simply so that you can compare and contrast them, and see what is possible. There are some very general statements that can be made about the usefulness of encoding techniques. For example, EfficientSU2 (see below) with a full entangling scheme is much more likely to capture quantum features of data than methods that yield product states (like ZFeatureMap). But this does not mean EfficientSU2 is sufficient, or sufficiently well-matched to your dataset, to yield a quantum speed-up. That requires careful consideration of the structure of the data being modeled or classified. There is also a balancing act with circuit depth, since many feature maps which fully entangle the qubits in a circuit yield very deep circuits, too deep to get usable results on today's quantum computers.

Notation

M

\Phi(\vec{x})

$\Phi(\vec{x})$ when discussing feature mappings in machine learning, generally, and
$U(\vec{x})$ when discussing circuit implementations of feature mappings.

Normalization and information loss

\text{X}

x^{'(i)}_k = \frac{x^{(i)}_k - \text{min}\{x^{(j)}_k\,|\,\vec{x}^{(j)}\in [\text{X}]\}}{\text{max}\{x^{(j)}_k\,|\,\vec{x}^{(j)}\in [\text{X}]\}-\text{min}\{x^{(j)}_k\,|\,\vec{x}^{(j)}\in [\text{X}]\}}

k

|\psi\rangle

|\psi\rangle\rightarrow\|\psi\|^{-1}|\psi\rangle

\vert\vec{x}^{(j)}\vert = 1
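As a rough illustration of the two normalizations above, the following plain-Python sketch rescales each feature to the interval [0, 1] with the min/max formula and, separately, rescales a single vector to unit length. The function names and the small dataset are hypothetical, chosen only for this illustration, and a constant feature (maximum equal to minimum) is not handled.

```python
import math


def min_max_normalize(dataset):
    """Rescale each feature (column) of the dataset to the interval [0, 1]."""
    columns = list(zip(*dataset))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(value - lo) / (hi - lo) for value, lo, hi in zip(vector, lows, highs)]
        for vector in dataset
    ]


def unit_normalize(vector):
    """Rescale one vector so that its Euclidean length is 1."""
    length = math.sqrt(sum(value**2 for value in vector))
    return [value / length for value in vector]


# Hypothetical dataset: three data vectors with two features each
sample = [(1, 10), (3, 20), (5, 40)]
scaled = min_max_normalize(sample)
unit = unit_normalize([3, 4])
print(scaled)
print(unit)
```

Note that the two operations generally do not commute: a vector whose features were min/max rescaled will usually not have unit length, and vice versa.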

Methods of encoding

\text{X}_\text{ex}

\text{X}_{\text{ex}}=\{(4,8,5),(9,8,6),(2,9,2),(5,7,0),(3,7,5)\}

1^\text{st}

Basis encoding

P

|x^{(j)}_k\rangle

|x^{(j)} \rangle = \frac{1}{\sqrt{N}}\sum_{k=1}^{N}|x^{(j)}_k \rangle

\vec{x}^{(4)} = (5,7,0)

2^3

\vert 000\rangle, \vert 001\rangle, \vert 010\rangle, \vert 011\rangle, \vert 100\rangle, \vert 101\rangle, \vert 110\rangle, \vert 111\rangle

2^3

import math
from qiskit import QuantumCircuit
 
desired_state = [1 / math.sqrt(3), 0, 0, 0, 0, 1 / math.sqrt(3), 0, 1 / math.sqrt(3)]
 
qc = QuantumCircuit(3)
qc.initialize(desired_state, [0, 1, 2])
qc.decompose(reps=8).draw(output="mpl")

Output:

This example illustrates a couple of disadvantages of basis encoding. While it is simple to understand, the state vectors can become quite sparse, and schemes to implement it are usually not efficient.
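Writing out the list of amplitudes by hand becomes tedious quickly. The sketch below shows one possible way to generate it, assuming the features are non-negative integers that each fit into the available qubits. The helper name `basis_encoding_amplitudes` is hypothetical, and repeated feature values are collapsed into a single basis state, which is a simplification made for this sketch.

```python
import math


def basis_encoding_amplitudes(features, num_qubits):
    """Return the 2**num_qubits amplitudes of an equal superposition of the
    basis states whose indices are the distinct feature values."""
    dimension = 2**num_qubits
    distinct = sorted(set(features))
    if any(value < 0 or value >= dimension for value in distinct):
        raise ValueError("each feature must fit in num_qubits bits")
    amplitude = 1 / math.sqrt(len(distinct))
    return [amplitude if index in distinct else 0 for index in range(dimension)]


state = basis_encoding_amplitudes([5, 7, 0], num_qubits=3)
print(state)
```

The resulting list has non-zero entries only at positions 0, 5, and 7, and could be passed to `initialize` in the same way as the hand-written list above.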

Example

\text{X}_{\text{ex}}

\vec{x}^{(1)}=(4,8,5)

using basis encoding.

Solution:

import math
from qiskit import QuantumCircuit
 
desired_state = [
    0,
    0,
    0,
    0,
    1 / math.sqrt(3),
    1 / math.sqrt(3),
    0,
    0,
    1 / math.sqrt(3),
    0,
    0,
    0,
    0,
    0,
    0,
    0,
]
 
print(desired_state)
 
qc = QuantumCircuit(4)
qc.initialize(desired_state, [0, 1, 2, 3])
qc.decompose(reps=7).draw(output="mpl")

Output:

[0, 0, 0, 0, 0.5773502691896258, 0.5773502691896258, 0, 0, 0.5773502691896258, 0, 0, 0, 0, 0, 0, 0]

Amplitude encoding

N

|\psi^{(j)}_x\rangle = \frac{1}{\alpha}\sum_{i=1}^N x^{(j)}_i |i\rangle

N

\sum_{i=1}^N \left|x^{(j)}_i\right|^2 = \left|\alpha\right|^2.

In general, this is a different condition than the min/max normalization used for each feature across all data vectors. Precisely how this is navigated will depend on your problem. But there is no way around the quantum mechanical normalization condition above.

n

\text{X}_\text{ex}

\sum_{i=1}^N \left|x^{(1)}_i\right|^2 = 4^2+8^2+5^2 = 105 = \left|\alpha\right|^2 \rightarrow \alpha = \sqrt{105}

and the resulting 2-qubit quantum state would be:

|\psi(\vec{x}^{(1)})\rangle = \frac{1}{\sqrt{105}}(4|00\rangle+8|01\rangle+5|10\rangle+0|11\rangle)

N=3

Like in basis encoding, once we calculate what state will encode our dataset, in Qiskit we can use the initialize function to prepare it:

desired_state = [
    1 / math.sqrt(105) * 4,
    1 / math.sqrt(105) * 8,
    1 / math.sqrt(105) * 5,
    1 / math.sqrt(105) * 0,
]
 
qc = QuantumCircuit(2)
qc.initialize(desired_state, [0, 1])
 
qc.decompose(reps=5).draw(output="mpl")

Output:

\mathrm{log}_2(N)

Example

\vec{x}=(9,8,6,2,9,2)

Solution:

\alpha

|\psi\rangle = \alpha(9|000\rangle+8|001\rangle+6|010\rangle+2|011\rangle+9|100\rangle+2|101\rangle+0|110\rangle+0|111\rangle)

Note that

\langle \psi|\psi\rangle = |\alpha|^2\times(9^2+8^2+6^2+2^2+9^2+2^2+0^2+0^2) = |\alpha|^2\times(270)=1 \rightarrow \alpha = \frac{1}{\sqrt{270}}

So finally,

|\psi\rangle = \frac{1}{\sqrt{270}}(9|000\rangle+8|001\rangle+6|010\rangle+2|011\rangle+9|100\rangle+2|101\rangle+0|110\rangle+0|111\rangle)

Example

\vec{x}=(9,8,6,2,9,2),

Solution:

desired_state = [
    9 / math.sqrt(270),
    8 / math.sqrt(270),
    6 / math.sqrt(270),
    2 / math.sqrt(270),
    9 / math.sqrt(270),
    2 / math.sqrt(270),
    0,
    0,
]
 
print(desired_state)
 
qc = QuantumCircuit(3)
qc.initialize(desired_state, [0, 1, 2])
qc.decompose(reps=8).draw(output="mpl")

Output:

[0.5477225575051662, 0.48686449556014766, 0.36514837167011077, 0.12171612389003691, 0.5477225575051662, 0.12171612389003691, 0, 0]

Example

You may need to deal with very large data vectors. Consider the vector

\vec{x}=(4,8,5,9,8,6,2,9,2,5,7,0,3,7,5).

Write code to automate the normalization, and generate a quantum circuit for amplitude encoding.

Solution:

There are many possible answers. Here is code that prints a few steps along the way:

import numpy as np
from math import sqrt
 
init_list = [4, 8, 5, 9, 8, 6, 2, 9, 2, 5, 7, 0, 3, 7, 5]
qubits = int(np.ceil(np.log2(len(init_list))))  # smallest n with 2**n >= len(init_list)
need_length = 2**qubits
pad = need_length - len(init_list)
for i in range(0, pad):
    init_list.append(0)
 
init_array = np.array(init_list)  # Unnormalized data vector
length = sqrt(
    sum(init_array[i] ** 2 for i in range(0, len(init_array)))
)  # Vector length
norm_array = init_array / length  # Normalized array
print("Normalized array:")
print(norm_array)
print()
 
qubit_numbers = []
for i in range(0, qubits):
    qubit_numbers.append(i)
print(qubit_numbers)
 
qc = QuantumCircuit(qubits)
qc.initialize(norm_array, qubit_numbers)
qc.decompose(reps=7).draw(output="mpl")

Output:

Normalized array:
[0.17342199 0.34684399 0.21677749 0.39019949 0.34684399 0.26013299
 0.086711   0.39019949 0.086711   0.21677749 0.30348849 0.
 0.1300665  0.30348849 0.21677749 0.        ]

[0, 1, 2, 3]

Check-in question

Do you see advantages to amplitude encoding over basis encoding? If so, explain.

Answer:

There may be several answers. One answer is that, given the fixed ordering of the basis states, this amplitude encoding preserves the order of the numbers encoded. It will often also be encoded more densely.

\log_2(N)

Angle encoding

\theta

k^\text{th}

|\vec{x}^{(j)}_k\rangle = R_Y(\theta=\vec{x}^{(j)}_k)|0\rangle = \textstyle\cos\left(\frac{\vec{x}^{(j)}_k}{2}\right)|0\rangle + \sin\left(\frac{\vec{x}^{(j)}_k}{2}\right)|1\rangle.

R_X(\theta)

Angle encoding is different from the previous two methods discussed in several ways. In angle encoding:

Each feature value is mapped to a corresponding qubit, $\vec{x}^{(j)}_k \rightarrow Q_k$, leaving the qubits in a product state.
One numerical value is encoded at a time, rather than a whole set of features from a data point.
$n$ qubits are required for $N$ data features, where $n\leq N$. Often equality holds here. We'll see how $n<N$ is possible in the next few sections.
The resulting circuit has constant depth (typically the depth is 1 prior to transpilation).

\theta

|0\rangle

from qiskit.quantum_info import Statevector
from math import pi
 
qc = QuantumCircuit(1)
state1 = Statevector.from_instruction(qc)
qc.ry(pi / 2, 0)  # RY gate rotates about the y-axis by an angle pi/2
state2 = Statevector.from_instruction(qc)
states = state1, state2

We will define a function to visualize the action on the state vector. The details of the function definition are not important, but the ability to visualize the state vectors and their changes is important.

from qiskit.visualization.bloch import Bloch
from qiskit.visualization.state_visualization import _bloch_multivector_data
 
 
def plot_Nstates(states, axis, plot_trace_points=True):
    """This function plots N states to 1 Bloch sphere"""
    bloch_vecs = [_bloch_multivector_data(s)[0] for s in states]
 
    if axis is None:
        bloch_plot = Bloch()
    else:
        bloch_plot = Bloch(axes=axis)
 
    bloch_plot.add_vectors(bloch_vecs)
 
    if len(states) > 1:
 
        def rgba_map(x, num):
            g = (0.95 - 0.05) / (num - 1)
            i = 0.95 - g * num
            y = g * x + i
            return (0.0, y, 0.0, 0.7)
 
        num = len(states)
        bloch_plot.vector_color = [rgba_map(x, num) for x in range(1, num + 1)]
 
    bloch_plot.vector_width = 3
    bloch_plot.vector_style = "simple"
 
    if plot_trace_points:
 
        def trace_points(bloch_vec1, bloch_vec2):
            # bloch_vec = (x,y,z)
            n_points = 15
            thetas = np.arccos([bloch_vec1[2], bloch_vec2[2]])
            phis = np.arctan2(
                [bloch_vec1[1], bloch_vec2[1]], [bloch_vec1[0], bloch_vec2[0]]
            )
            if phis[1] < 0:
                phis[1] = phis[1] + 2 * pi
            angles0 = np.linspace(phis[0], phis[1], n_points)
            angles1 = np.linspace(thetas[0], thetas[1], n_points)
 
            xp = np.cos(angles0) * np.sin(angles1)
            yp = np.sin(angles0) * np.sin(angles1)
            zp = np.cos(angles1)
            pnts = [xp, yp, zp]
            bloch_plot.add_points(pnts)
            bloch_plot.point_color = "k"
            bloch_plot.point_size = [4] * len(bloch_plot.points)
            bloch_plot.point_marker = ["o"]
 
        for i in range(len(bloch_vecs) - 1):
            trace_points(bloch_vecs[i], bloch_vecs[i + 1])
 
    bloch_plot.sphere_alpha = 0.05
    bloch_plot.frame_alpha = 0.15
    bloch_plot.figsize = [4, 4]
 
    bloch_plot.render()
 
 
plot_Nstates(states, axis=None, plot_trace_points=True)

Output:

N

|\vec{x}^{(j)}\rangle = \bigotimes^N_{k=1} \left(\cos(\vec{x}^{(j)}_k)|0\rangle + \sin(\vec{x}^{(j)}_k)|1\rangle\right)

We note that this is equivalent to

|\vec{x}^{(j)}\rangle = \bigotimes^N_{k=1} R_Y(2\vec{x}^{(j)}_k)|0\rangle.
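To make the product-state structure concrete, here is a minimal plain-Python sketch that builds the amplitude list by taking the tensor product of one single-qubit state per feature. The helper names and the two-feature vector are hypothetical. The first feature is placed as the leftmost tensor factor, which is a choice made for this sketch and differs from Qiskit's qubit ordering.

```python
import math


def kron(left, right):
    """Kronecker (tensor) product of two amplitude lists."""
    return [a * b for a in left for b in right]


def angle_encode(features):
    """Product state with one factor cos(x)|0> + sin(x)|1> per feature."""
    state = [1.0]
    for x in features:
        state = kron(state, [math.cos(x), math.sin(x)])
    return state


state = angle_encode([0, math.pi / 2])  # hypothetical two-feature vector
print([round(a, 6) for a in state])
```

With the features (0, π/2), the first qubit stays in |0⟩ and the second is rotated fully to |1⟩, so essentially all of the amplitude sits on |01⟩. The list length doubles with every feature, which is why the state is prepared on hardware rather than written out for large N.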

Example

\vec{x}^{(j)} = (0, \pi/4, \pi/2)

Solution:

qc = QuantumCircuit(3)
qc.ry(0, 0)
qc.ry(2 * math.pi / 4, 1)
qc.ry(2 * math.pi / 2, 2)
qc.draw(output="mpl")

Output:

Check-in questions

Using angle encoding as described above, how many qubits are required to encode 5 features?

Answer: 5

Phase encoding

\phi

|0\rangle

|\vec{x}^{(j)}_k\rangle = P(\phi=\vec{x}^{(j)}_k)|+\rangle = \textstyle\frac{1}{\sqrt{2}}\big(|0\rangle + e^{i\vec{x}^{(j)}_k}|1\rangle\big).

\vec{x}^{(j)}_k \rightarrow Q_k

|\vec{x}^{(j)}\rangle = \bigotimes_{k=1}^{N} P_k(\phi = \vec{x}^{(j)}_k)|+\rangle^{\otimes N} = {\textstyle\frac{1}{\sqrt{2^N}}} \bigotimes_{k=1}^{N}\big(|0\rangle + e^{i\vec{x}^{(j)}_k}|1\rangle\big).
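The single-qubit formula can be checked with a few lines of plain Python. This is only a sketch, and the helper name `phase_encode` is hypothetical. It also illustrates that the feature lives entirely in the relative phase: the probabilities of measuring 0 or 1 in the computational basis are 1/2 regardless of the encoded value.

```python
import cmath
import math


def phase_encode(x):
    """Amplitudes of P(x)|+> = (|0> + e^{ix}|1>) / sqrt(2)."""
    return [1 / math.sqrt(2), cmath.exp(1j * x) / math.sqrt(2)]


state = phase_encode(math.pi / 2)
probabilities = [abs(amplitude) ** 2 for amplitude in state]
relative_phase = cmath.phase(state[1] / state[0])
print(probabilities, relative_phase)
```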

\vec{x}^{(j)}_k=\frac{1}{2}\pi

qc = QuantumCircuit(1)
qc.h(0)  # Hadamard gate rotates state down to Bloch equator
state1 = Statevector.from_instruction(qc)
 
qc.p(pi / 2, 0)  # Phase gate rotates by an angle pi/2
state2 = Statevector.from_instruction(qc)
 
states = state1, state2
 
qc.draw("mpl", scale=1)

Output:

\phi

plot_Nstates(states, axis=None, plot_trace_points=True)

Output:

|+\rangle \rightarrow P(\frac{1}{2}\pi)|+\rangle

Z

Check-in questions

How many qubits are required in order to use phase encoding as described above to store 8 features?

Answer: 8

Example

Write code to encode the vector (4,8,5,9,8,6,2,9,2,5,7,0) using phase encoding.

Solution:

There may be many answers. Here is one example:

phase_data = [4, 8, 5, 9, 8, 6, 2, 9, 2, 5, 7, 0]
qc = QuantumCircuit(len(phase_data))
for i in range(0, len(phase_data)):
    qc.h(i)
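    # RZ differs from the phase gate P only by a global phase.
    # The data are rescaled so that the largest value maps to 2*pi.
    qc.rz(phase_data[i] * 2 * math.pi / float(max(phase_data)), i)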
qc.draw(output="mpl")

Output:
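One thing to keep in mind with the rescaling used in this solution: because phases are periodic, the largest value (mapped to 2π) and the value 0 produce the same state, so those two values can no longer be told apart. The sketch below checks this numerically and compares it with rescaling onto [0, π] instead. That alternative range and the helper name are assumptions of this sketch, not a prescription from the lesson.

```python
import cmath
import math

phase_data = [4, 8, 5, 9, 8, 6, 2, 9, 2, 5, 7, 0]
largest = max(phase_data)


def relative_phase_factor(value, full_range):
    """e^{i*phi} for a value rescaled linearly onto [0, full_range]."""
    return cmath.exp(1j * value * full_range / largest)


# Rescaling onto [0, 2*pi]: the values 9 and 0 give the same phase factor
same_as_zero = abs(
    relative_phase_factor(9, 2 * math.pi) - relative_phase_factor(0, 2 * math.pi)
)

# Rescaling onto [0, pi]: the values 9 and 0 stay distinct
distinct = abs(relative_phase_factor(9, math.pi) - relative_phase_factor(0, math.pi))

print(same_as_zero, distinct)
```

Which range is appropriate depends on the data; this is one example of the information loss that normalization choices can introduce.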

Dense angle encoding

z

|\vec{x}^{(j)}_k,\vec{x}^{(j)}_\ell\rangle = R_Z(\phi=\vec{x}^{(j)}_\ell) R_Y(\theta=\vec{x}^{(j)}_k)|0\rangle = \cos\left(\frac{\vec{x}^{(j)}_k}{2}\right)|0\rangle + e^{i\vec{x}^{(j)}_\ell} \sin\left(\frac{\vec{x}^{(j)}_k}{2}\right)|1\rangle.

2\times

|\vec{x}\rangle = \bigotimes_{k=1}^{N/2} \left(\cos(x_{2k-1})|0\rangle + e^{i x_{2k}}\sin(x_{2k-1})|1\rangle\right)

DAE can be generalized to arbitrary functions of the two features instead of the sinusoidal functions used here. This is called general qubit encoding [7].
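A minimal plain-Python sketch of dense angle encoding, following the product formula above, is shown here. The helper names and the six-feature vector are hypothetical, and padding an odd-length vector with a zero is one possible convention assumed for this sketch.

```python
import cmath
import math


def dense_angle_qubit(x_odd, x_even):
    """Single-qubit state cos(x_odd)|0> + e^{i x_even} sin(x_odd)|1>."""
    return [math.cos(x_odd), cmath.exp(1j * x_even) * math.sin(x_odd)]


def dense_angle_encode(features):
    """One single-qubit state per pair of features (zero-padded if odd)."""
    padded = list(features) + [0] * (len(features) % 2)
    return [
        dense_angle_qubit(padded[i], padded[i + 1]) for i in range(0, len(padded), 2)
    ]


# Hypothetical six-feature vector, encoded on three qubits
qubit_states = dense_angle_encode([0.1, 0.2, 0.3, 0.4, 0.5, 0.6])
norms = [sum(abs(a) ** 2 for a in state) for state in qubit_states]
print(len(qubit_states), norms)
```

Each pair of features occupies one qubit, so the qubit count is half the number of features (rounded up), and every single-qubit state is normalized by construction.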

x_1=\theta = 3\pi/8

qc = QuantumCircuit(1)
state1 = Statevector.from_instruction(qc)
qc.ry(3 * pi / 8, 0)
state2 = Statevector.from_instruction(qc)
qc.rz(7 * pi / 4, 0)
state3 = Statevector.from_instruction(qc)
states = state1, state2, state3
 
plot_Nstates(states, axis=None, plot_trace_points=True)

Output:

Check-in questions

Given the treatment above, how many qubits are needed to encode 6 features using dense encoding?

Answer: 3

Example

Write code to load the vector (4,8,5,9,8,6,2,9,2,5,7,0,3,7,5) using dense angle encoding.

Solution:

Note that we have padded the list with a "0" to avoid the problem of there being a single unused parameter in our encoding scheme.

dense_data = [4, 8, 5, 9, 8, 6, 2, 9, 2, 5, 7, 0, 3, 7, 5, 0]
qc = QuantumCircuit(int(len(dense_data) / 2))
entry = 0
for i in range(0, int(len(dense_data) / 2)):
    qc.ry(dense_data[entry] * 2 * math.pi / float(max(dense_data)), i)
    entry = entry + 1
    qc.rz(dense_data[entry] * 2 * math.pi / float(max(dense_data)), i)
    entry = entry + 1
qc.draw(output="mpl")

Output:

Encoding with built-in feature maps

Encoding at arbitrary points

|01\rangle

But encoding need not be entirely in product states or entirely in entangled states as in amplitude encoding. Indeed, many encoding schemes built into Qiskit allow encoding both before and after an entanglement layer, as opposed to just at the beginning. This is known as "data reuploading". For related work, see references [5] and [6].

N

EfficientSU2

A common and useful example of encoding with entanglement is Qiskit's EfficientSU2 circuit. Impressively, this circuit can, for example, encode 8 features on only 2 qubits. Let's see this, and then try to understand how it is possible.

from qiskit.circuit.library import EfficientSU2
 
circuit = EfficientSU2(num_qubits=2, reps=1, insert_barriers=True)
circuit.decompose().draw(output="mpl")

Output:

b1

|\psi\rangle_{b1} = \left(\cos(\theta_0)|0\rangle+\sin(\theta_0)e^{i\theta_2}|1\rangle\right)\otimes\left(\cos(\theta_1)|0\rangle+\sin(\theta_1)e^{i\theta_3}|1\rangle\right)

b2

|\psi\rangle_{b2} = \left(\cos(\theta_0)|0\rangle+\sin(\theta_0)e^{i\theta_2}|1\rangle\right)\otimes\cos(\theta_1)|0\rangle+ \left(\sin(\theta_0)e^{i\theta_2}|0\rangle+\cos(\theta_0)|1\rangle\right)\otimes\sin(\theta_1)e^{i\theta_3}|1\rangle

We now apply the last set of rotations to obtain:

\begin{align} \nonumber |\psi\rangle_{\text{final}} &= \left(\cos(\theta_0)|0\rangle+\sin(\theta_0)e^{i\theta_2}|1\rangle\right)\otimes\cos(\theta_1)\left(\cos(\theta_4)|0\rangle+\sin(\theta_4)e^{i\theta_6}|1\rangle\right)\\\nonumber &+\left(\sin(\theta_0)e^{i\theta_2}|0\rangle+\cos(\theta_0)|1\rangle\right)\otimes\sin(\theta_1)e^{i\theta_3}\left(\cos(\theta_5)|1\rangle+\sin(\theta_5)e^{i\theta_7}|0\rangle\right)\nonumber \end{align}

\psi_\text{final} = c_0|00\rangle+c_1|01\rangle+c_2|10\rangle+c_3|11\rangle

\psi_\text{final} = (a_0+ib_0)|00\rangle+(a_1+ib_1)|01\rangle+(a_2+ib_2)|10\rangle+(a_3+ib_3)|11\rangle

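One can see that we do, indeed, have 8 real parameters in the state on which to encode our 8 features. Strictly speaking, normalization and the unobservable global phase remove two of these, so a two-qubit pure state carries at most 6 independent real parameters; the 8 rotation angles are therefore not all independently recoverable from the state.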

By increasing the number of qubits and increasing the number of repetitions of entangling and rotation layers, one can encode much more data. Writing out the wave functions quickly becomes intractable. But we can still see the encoding in action.

\vec{x} = [0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0,1.1,1.2]

x = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2]
circuit = EfficientSU2(num_qubits=3, reps=1, insert_barriers=True)
encode = circuit.assign_parameters(x)
encode.decompose().draw(output="mpl")

Output:

Instead of increasing the number of qubits, you might choose to increase the number of repetitions of entangling and rotation layers. But there are limits to how many repetitions are useful.

As previously stated, there is a tradeoff: circuits with more qubits or more repetitions of entangling and rotation layers may store more parameters, but do so with greater circuit depth. We will return to the depths of some built-in feature maps, below.
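The growth in the number of encodable parameters can be estimated with a one-line formula. The sketch below assumes the default rotation gates (one RY and one RZ per qubit in each rotation layer) and reps + 1 rotation layers; the function name is hypothetical. It reproduces the 8 parameters on 2 qubits and the 12 parameters on 3 qubits used above. When in doubt, compare against the circuit's `num_parameters` attribute.

```python
def efficient_su2_parameter_count(num_qubits, reps):
    """Parameters in (reps + 1) rotation layers of RY and RZ on every qubit."""
    gates_per_qubit_per_layer = 2  # one RY and one RZ (assumed default)
    return gates_per_qubit_per_layer * num_qubits * (reps + 1)


print(efficient_su2_parameter_count(2, 1))
print(efficient_su2_parameter_count(3, 1))
print(efficient_su2_parameter_count(3, 3))
```

Under these assumptions, the parameter count grows linearly in both the number of qubits and the number of repetitions, while each extra repetition also adds another entangling layer to the depth.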

The next few encoding methods that are built into Qiskit have "feature map" as part of their names. Let us reiterate that encoding data into a quantum circuit is a feature mapping, in the sense that it takes data into a new space: the Hilbert space of the qubits involved. The relationship between the dimensionality of the original feature space and that of the Hilbert space will depend on the circuit you use for encoding.

$Z$ feature map

Z

\mathscr{U}_{\text{ZFM}}(\vec{x})|0\rangle^{\otimes N}=|\phi(\vec{x})\rangle

|0\rangle^{\otimes N}

\mathscr{U}_{\text{ZFM}}=\big(P(\vec{x}_1)\otimes\ldots P(\vec{x}_k)\otimes\ldots P(\vec{x}_N)H^{\otimes N}\big)=\left(\bigotimes_{k = 1}^N P(\vec{x}_k)\right)H^{\otimes N}

r

\mathscr{U}^{(r)}_{\text{ZFM}}\left(\vec{x}\right)=\prod_{s=1}^{r}\left[\left(\bigotimes_{k = 1}^N P(\vec{x}_k)\right)H^{\otimes N}\right]

x_k

r=1

\mathscr{U}_{\text{ZFM}}(\bar{x})|00\rangle = P(\bar{x})^{\otimes 2} H^{\otimes 2}|00\rangle = \left( P\left(\textstyle\frac{1}{2}\pi\right)H|0\rangle \right) \otimes \left(P\left(\textstyle\frac{1}{3}\pi\right)H|0\rangle\right).

The formula has been rearranged around the tensor product to emphasize the operations on each qubit. The following Qiskit code uses Hadamard and phase gates explicitly to show the structure of the ZFM:

qc0 = QuantumCircuit(1)
qc1 = QuantumCircuit(1)
 
qc0.h(0)
qc0.p(pi / 2, 0)
 
qc1.h(0)
qc1.p(pi / 3, 0)
 
# Combine circuits qc0 and qc1 into 1 circuit
qc = QuantumCircuit(2)
qc.compose(qc0, [0], inplace=True)
qc.compose(qc1, [1], inplace=True)
 
qc.draw("mpl", scale=1)

Output:

\vec{x} = \left(\textstyle\frac{1}{2}\pi, \textstyle\frac{1}{3}\pi\right)

from qiskit.circuit.library import ZFeatureMap
 
zfeature_map = ZFeatureMap(feature_dimension=2, reps=3)
# ZFeatureMap applies P(2 * x) to each qubit, so pass x / 2 to obtain P(pi/2) and P(pi/3)
zfeature_map = zfeature_map.assign_parameters([(1 / 2) * pi / 2, (1 / 2) * pi / 3])
zfeature_map.decompose().draw("mpl")

Output:

You may use the ZFM via Qiskit's ZFeatureMap class; you can also use this structure as inspiration to construct your own feature mapping.
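To see what repetitions do in such a map, the following plain-Python sketch follows a single qubit through repeated layers of a Hadamard gate followed by P(x). It follows the formula given for the ZFM rather than the exact gate angles used by the library class, and the helper names are hypothetical.

```python
import cmath
import math

H = [[1 / math.sqrt(2), 1 / math.sqrt(2)], [1 / math.sqrt(2), -1 / math.sqrt(2)]]


def apply(matrix, state):
    """Multiply a 2x2 matrix by a length-2 state vector."""
    return [
        matrix[0][0] * state[0] + matrix[0][1] * state[1],
        matrix[1][0] * state[0] + matrix[1][1] * state[1],
    ]


def z_feature_map_qubit(x, reps):
    """State of one qubit after `reps` repetitions of H followed by P(x)."""
    phase_gate = [[1, 0], [0, cmath.exp(1j * x)]]
    state = [1, 0]  # |0>
    for _ in range(reps):
        state = apply(phase_gate, apply(H, state))
    return state


for reps in (1, 2):
    state = z_feature_map_qubit(math.pi / 3, reps)
    print(reps, [round(abs(a) ** 2, 6) for a in state])
```

With one repetition, both measurement probabilities are 1/2 and the feature appears only in the relative phase. With two repetitions, the second Hadamard converts that phase into amplitude, and the probability of measuring 0 becomes cos²(x/2).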

$ZZ$ feature map

ZZ

R_{ZZ}(\theta)

qc = QuantumCircuit(2)
qc.rzz(pi, 0, 1)
qc.draw("mpl", scale=1)

Output:

As is often the case, we see this represented as a single gate-like unit, until we use .decompose() to see all constituent gates.

qc.decompose().draw("mpl", scale=1)

Output:

P(\theta) = e^{i\theta/2}R_Z(\theta)
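This relation between the phase gate and the Z rotation can be verified numerically. Both gates are diagonal, so the sketch below compares only their diagonal entries; the helper names and the test angle are arbitrary choices made for this illustration.

```python
import cmath


def rz_diagonal(theta):
    """Diagonal entries of R_Z(theta) = diag(e^{-i theta/2}, e^{+i theta/2})."""
    return [cmath.exp(-1j * theta / 2), cmath.exp(1j * theta / 2)]


def p_diagonal(theta):
    """Diagonal entries of the phase gate P(theta) = diag(1, e^{i theta})."""
    return [1, cmath.exp(1j * theta)]


theta = 0.7  # arbitrary test angle
global_phase = cmath.exp(1j * theta / 2)
difference = max(
    abs(p - global_phase * r) for p, r in zip(p_diagonal(theta), rz_diagonal(theta))
)
print(difference)
```

The printed difference is zero up to floating-point rounding, confirming that the two gates differ only by a global phase.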

The full ZZFM circuit consists of a Hadamard gate and phase gate, as in the ZFM, followed by the entanglement described above. A single repetition of the ZZFM circuit is:

\mathscr{U}_{\text{ZZFM}}(\vec{x}) = U_{ZZ}(\vec{x})\big(P(\vec{x}_1)\otimes\ldots P(\vec{x}_k)\otimes\ldots P(\vec{x}_N)H^{\otimes N}\big)=U_{ZZ}(\vec{x})\left(\bigotimes_{k = 1}^N P(\vec{x}_k)\right)H^{\otimes N},

U_{ZZ}(\vec{x})

\theta_{q,p} \rightarrow \phi(\vec{x}_q, \vec{x}_p) = 2(\pi-\vec{x}_q)(\pi-\vec{x}_p).

We will see this in several examples below. The extension to multiple repetitions is the same as in the ZFeatureMap case:

\mathscr{U}^{(r)}_{\text{ZZFM}}\left(\vec{x}\right)=\prod_{s=1}^{r}\left[U_{ZZ}(\vec{x})\left(\bigotimes_{k = 1}^N P(\vec{x}_k)\right)H^{\otimes N}\right].

\vec{x} = (x_0, x_1)

from qiskit.circuit.library import ZZFeatureMap
 
feature_dim = 2
zzfeature_map = ZZFeatureMap(
    feature_dimension=feature_dim, entanglement="linear", reps=1
)
zzfeature_map.decompose(reps=1).draw("mpl", scale=1)

Output:

(\vec{x}_1, \vec{x}_2)

\vec{x} = (\vec{x}_1, \vec{x}_2, \vec{x}_3, \vec{x}_4)

feature_dim = 4
zzfeature_map = ZZFeatureMap(
    feature_dimension=feature_dim, entanglement="linear", reps=1
)
zzfeature_map.decompose().draw("mpl", scale=1)

Output:

In the linear entanglement scheme, nearest-neighbor (numbered) pairs of qubits in this circuit are entangled. There are other built-in entanglement schemes in Qiskit, including circular and full.
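The difference between these schemes is easiest to see by listing which qubit pairs are coupled. The sketch below is a plain-Python illustration of the pairing conventions as commonly described (linear: neighbors; circular: neighbors plus one pair closing the loop; full: all pairs), together with the ZZ angle φ defined earlier. The function names and feature vector are hypothetical, and the order in which a library applies the pairs may differ.

```python
import math
from itertools import combinations


def entangling_pairs(num_qubits, scheme):
    """Qubit pairs coupled by each scheme (ordering within a layer ignored)."""
    linear = [(q, q + 1) for q in range(num_qubits - 1)]
    if scheme == "linear":
        return linear
    if scheme == "circular":
        # Linear pairs plus one pair closing the loop (needs more than 2 qubits)
        return linear + ([(num_qubits - 1, 0)] if num_qubits > 2 else [])
    if scheme == "full":
        return list(combinations(range(num_qubits), 2))
    raise ValueError(f"unknown scheme: {scheme}")


def zz_angle(x_q, x_p):
    """Entangling angle phi(x_q, x_p) = 2 (pi - x_q)(pi - x_p)."""
    return 2 * (math.pi - x_q) * (math.pi - x_p)


x = [0.1, 0.2, 0.3, 0.4]  # hypothetical feature vector
for scheme in ("linear", "circular", "full"):
    pairs = entangling_pairs(len(x), scheme)
    print(scheme, [(q, p, round(zz_angle(x[q], x[p]), 3)) for q, p in pairs])
```

For N qubits, the number of pairs grows from N - 1 (linear) to N(N - 1)/2 (full), which is the main driver of the two-qubit gate count.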

Pauli feature map

r

\mathscr{U}_{\text{PFM}}(\vec{x}) = \prod_{s=1}^{r} U(\vec{x}) H^{\otimes n}.

U(\vec{x})

U(\vec{x}) = \exp\left(i \sum_{S \in\mathcal{I}} \phi_S(\vec{x}) \prod_{i \in S} \sigma_i \right),

\sigma_i

\phi_S(\vec{x})= \begin{cases} x_i & \text{if } S= \{i\} \text{ (single-qubit)}\\ \prod_{j\in{S}}(\pi-x_j) & \text{if } |S|\ge2 \text{ (multi-qubit)}\\ \end{cases}
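The piecewise definition of the data-mapping function translates directly into code. The following is a minimal sketch in plain Python; the function name `phi_S`, the use of zero-based indices, and the feature vector are assumptions made for this illustration.

```python
import math


def phi_S(x, subset):
    """Data-mapping function: x_i for a single index, else prod (pi - x_j)."""
    if len(subset) == 1:
        return x[subset[0]]
    return math.prod(math.pi - x[j] for j in subset)


x = [0.1, 0.2, 0.3]  # hypothetical feature vector
print(phi_S(x, [0]))
print(phi_S(x, [0, 1]))
print(phi_S(x, [0, 1, 2]))
```

Single-qubit terms use the feature value itself, while multi-qubit terms depend on a product over all features in the subset.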

\sigma_i

\exp(it\mathcal{H})

Y

from qiskit.circuit.library import PauliFeatureMap
 
feature_dim = 3
pauli_feature_map = PauliFeatureMap(
    feature_dimension=feature_dim, entanglement="linear", reps=1, paulis=["Y", "XX"]
)
 
pauli_feature_map.decompose().draw("mpl", scale=1.5)

Output:

\alpha

U(\bar{x}) = \exp\left(i \alpha \sum_{S\subseteq[n]} \phi_S(\bar{x}) \prod_{i \in S} \sigma_i \right)

\alpha

Gallery of Pauli feature maps

Here we visualize various Pauli feature maps for two-qubit circuits to get a better picture of the range of possibilities.

from qiskit.visualization import circuit_drawer
import matplotlib.pyplot as plt
 
feature_dim = 2
fig, axs = plt.subplots(9, 2)
i_plot = 0
for paulis in [
    ["I"],
    ["X"],
    ["Y"],
    ["Z"],
    ["XX"],
    ["XY"],
    ["XZ"],
    ["YY"],
    ["YZ"],
    ["ZZ"],
    ["X", "ZZ"],
    ["Y", "ZZ"],
    ["Z", "ZZ"],
    ["X", "YZ"],
    ["Y", "YZ"],
    ["Z", "YZ"],
    ["YY", "ZZ"],
    ["XY", "ZZ"],
]:
    pfm = PauliFeatureMap(feature_dimension=feature_dim, paulis=paulis, reps=1)
    circuit_drawer(
        pfm.decompose(),
        output="mpl",
        style={"backgroundcolor": "#EEEEEE"},
        ax=axs[int((i_plot - i_plot % 2) / 2), i_plot % 2],
    )
    axs[int((i_plot - i_plot % 2) / 2), i_plot % 2].title.set_text(paulis)
    i_plot += 1
 
fig.set_figheight(16)
fig.set_figwidth(16)

Output:

The above can, of course, be extended to include other permutations and repetitions of Pauli matrices. Learners are encouraged to experiment with those options.

Review of built-in feature maps

You have seen several schemes for encoding data into a quantum circuit:

Basis encoding
Amplitude encoding
Angle encoding
Phase encoding
Dense encoding

You have seen how to construct your own feature maps using these encoding schemes, and you have seen four built-in feature maps which take advantage of angle and phase encoding:

EfficientSU2
ZFeatureMap
ZZFeatureMap
PauliFeatureMap

These built-in feature maps differed from each other in several ways:

The depth for a given number of encoded features
The number of qubits required for a given number of features
The degree of entanglement (obviously related to the other differences)

The code below applies these four built-in feature maps to the encoding of a feature set, and plots the number of two-qubit gates in the resulting circuit. Since two-qubit gate error rates are much higher than single-qubit gate error rates, one might reasonably be most interested in the two-qubit gates. We obtain counts of all gates in a circuit by first decomposing the circuit and then using count_ops(). Here the two-qubit gates we are interested in are 'cx' gates:

# Initializing parameters and empty lists for gate counts
x = [0.1, 0.2]
n_data = []
zz2gates = []
su22gates = []
z2gates = []
p2gates = []
 
# Generating feature maps
for n in range(3, 10):
    x.append(n / 10)
    zzcircuit = ZZFeatureMap(n, reps=1, insert_barriers=True)
    zcircuit = ZFeatureMap(n, reps=1, insert_barriers=True)
    su2circuit = EfficientSU2(n, reps=1, insert_barriers=True)
    pcircuit = PauliFeatureMap(n, reps=1, paulis=["XX"], insert_barriers=True)
    # Getting the cx gate counts
    zzcx = zzcircuit.decompose().count_ops().get("cx")
    zcx = zcircuit.decompose().count_ops().get("cx")
    su2cx = su2circuit.decompose().count_ops().get("cx")
    pcx = pcircuit.decompose().count_ops().get("cx")
 
    # Appending the cx gate counts to the lists. We shift the zz & pauli data points, because they overlap.
    n_data.append(n)
    zz2gates.append(zzcx - 0.5)
    z2gates.append(0)
    su22gates.append(su2cx)
    p2gates.append(pcx + 0.5)
 
# Plot the output
plt.plot(n_data, p2gates, "bo")
plt.plot(n_data, zz2gates, "ro")
plt.plot(n_data, su22gates, "yo")
plt.plot(n_data, z2gates, "go")
plt.ylabel("CX Gates")
plt.xlabel("Data elements")
plt.legend(["Pauli", "ZZ", "SU2", "Z"])
# plt.suptitle('ZZFeatureMap(n)')
plt.show()

Output:

Generally, the Pauli and ZZ feature maps result in greater circuit depth and a higher number of two-qubit gates than the EfficientSU2 and Z feature maps.
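The scaling behind this comparison can be sketched without building any circuits. The estimate below rests on assumptions that should be checked against your Qiskit version: all-to-all pairing with two CX gates per pair for the ZZ and Pauli ("XX") maps, a single chain of CX gates for EfficientSU2, and no two-qubit gates for the Z feature map. The function name is hypothetical.

```python
def estimated_cx_counts(num_features):
    """Rough CX counts for one repetition of each feature map.

    Assumes all-to-all pairing with two CX gates per pair for the ZZ and
    Pauli ("XX") maps, and a single chain of CX gates for EfficientSU2.
    """
    pairs = num_features * (num_features - 1) // 2
    return {
        "Pauli": 2 * pairs,
        "ZZ": 2 * pairs,
        "SU2": num_features - 1,
        "Z": 0,
    }


for n in (3, 6, 9):
    print(n, estimated_cx_counts(n))
```

Under these assumptions, the ZZ and Pauli counts grow quadratically with the number of features, while the EfficientSU2 count grows only linearly.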

Because the feature maps built into Qiskit are widely applicable, we will often not need to design our own, especially in the learning phase. However, experts in quantum machine learning will likely return to the subject of designing their own feature mapping, as they tackle two complicated challenges:

Modern hardware: the presence of noise and the large overhead of error-correcting code mean that present-day applications will need to consider things like hardware efficiency and minimizing two-qubit gate depth.
Mappings that fit the problem at hand: It is one thing to say that the ZZFeatureMap, for example, is difficult to simulate classically, and therefore interesting. It is quite another thing for the ZZFeatureMap to be ideally suited to your machine learning task or data set. The performance of different parameterized quantum circuits on different types of data is an active area of investigation.

We close with a note on hardware efficiency.

Hardware-efficient feature mapping

A hardware-efficient feature mapping is one that takes into account constraints of real quantum computers, in the interest of reducing noise and errors in the computation. When running quantum circuits on near-term quantum computers, there are many strategies to mitigate noise inherent to the hardware. One main strategy for hardware efficiency is the minimization of the depth of the quantum circuit so that noise and decoherence have less time to corrupt the computation. The depth of a quantum circuit is the number of time-aligned gate steps required to complete the entire computation (after circuit optimization) [5]. Recall that the depth of the abstract, logical circuit may be much lower than the depth once the circuit is transpiled for a real quantum computer.
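The notion of depth as a count of time-aligned gate steps can be illustrated with a small plain-Python sketch. Here a circuit is represented, purely for illustration, as a list of tuples naming the qubits each gate acts on; the function name and the example circuit are hypothetical. Qiskit circuits provide their own `depth()` method, which should be preferred in practice.

```python
def circuit_depth(num_qubits, gates):
    """Depth of a circuit given as a list of tuples of the qubits each gate acts on."""
    layer_of_qubit = [0] * num_qubits
    for qubits in gates:
        # A gate is placed one step after the latest step used by any of its qubits
        layer = max(layer_of_qubit[q] for q in qubits) + 1
        for q in qubits:
            layer_of_qubit[q] = layer
    return max(layer_of_qubit, default=0)


# Hypothetical 3-qubit circuit: H on qubits 0 and 1, then CX(0, 1), then CX(1, 2)
gates = [(0,), (1,), (0, 1), (1, 2)]
depth = circuit_depth(3, gates)
print(depth)
```

The two single-qubit gates run in parallel in the first step, while the two CX gates share qubit 1 and must run one after the other, so chains of two-qubit gates on shared qubits are what drive the depth up.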

Z

ZZ

The above graphic shows a network of nodes and edges that represent physical qubits and hardware couplings, respectively. The coupling map and performance of ibm_torino are shown with all possible two-qubit CZ coupling gates. Qubits are color-coded on a scale based on the T1 relaxation time in microseconds (μs), where longer T1 times are better and shown in a lighter shade. The coupling edges are color-coded by CZ error, where darker shades are better. Information on the hardware specification can be accessed in the hardware backend configuration schema IBMQBackend.configuration().

References

[1] Maria Schuld and Francesco Petruccione, Supervised Learning with Quantum Computers, Springer (2018), doi:10.1007/978-3-319-96424-9.
[2] Vojtech Havlicek et al., "Supervised Learning with Quantum Enhanced Feature Spaces", Nature 567, 209–212 (2019), https://arxiv.org/abs/1804.11326.
[3] Ryan LaRose and Brian Coyle, "Robust data encodings for quantum classifiers", Physical Review A 102, 032420 (2020), doi:10.1103/PhysRevA.102.032420, arXiv:2003.01695.
[4] Lov Grover and Terry Rudolph, "Creating Superpositions That Correspond to Efficiently Integrable Probability Distributions", arXiv:quant-ph/0208112 (2002), https://arxiv.org/abs/quant-ph/0208112.
[5] Adrián Pérez-Salinas, Alba Cervera-Lierta, Elies Gil-Fuster, José I. Latorre, "Data re-uploading for a universal quantum classifier", Quantum 4, 226 (2020), arXiv:1907.02085.
[6] Maria Schuld, Ryan Sweke, Johannes Jakob Meyer, "The effect of data encoding on the expressive power of variational quantum machine learning models", Physical Review A 103, 032430 (2021), arXiv:2008.08605.

import qiskit
 
qiskit.version.get_version_info()

Output:

'2.0.2'
