Workload usage

Usage measures how long the QPU is locked for your workload. How it is calculated depends on which execution mode you use.

Session usage is the time from when the first job starts until the session goes inactive, is closed, or when its last job completes, whichever happens last. It includes both classical and quantum time (time spent by the QPU complex to process your job).
Batch usage is the sum of quantum time of all jobs in the batch.
Single job usage is the quantum time the job uses in processing.
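
The three definitions above can be sketched in a few lines of Python. This is only an illustration: JobRecord and the three functions are hypothetical names, not part of qiskit-ibm-runtime, and the timestamps are assumed to be in seconds.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class JobRecord:
    """Hypothetical record of one job; not a qiskit-ibm-runtime type."""
    start: float            # wall-clock time the job started executing (s)
    end: float              # wall-clock time the job completed (s)
    quantum_seconds: float  # time the QPU complex spent processing the job (s)


def job_usage(job: JobRecord) -> float:
    # Single job: the quantum time the job uses in processing.
    return job.quantum_seconds


def batch_usage(jobs: Sequence[JobRecord]) -> float:
    # Batch: the sum of the quantum time of all jobs in the batch.
    return sum(job.quantum_seconds for job in jobs)


def session_usage(
    jobs: Sequence[JobRecord],
    inactive_at: Optional[float] = None,
    closed_at: Optional[float] = None,
) -> float:
    # Session: wall-clock time from the first job start until the latest of
    # "session inactive", "session closed", or "last job completes".
    first_start = min(job.start for job in jobs)
    end_candidates = [max(job.end for job in jobs)]
    end_candidates += [t for t in (inactive_at, closed_at) if t is not None]
    return max(end_candidates) - first_start
```

Note that the session figure includes the classical time between jobs, which is why the same set of jobs can report more usage in a session than in a batch.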

The reported usage is the time a QPU is locked for your workload. Failed or canceled jobs count toward your usage in certain circumstances; see the Usage for failed and canceled jobs section for details.

For paid plan users, usage determines how much the workload costs. See Manage cost for details.

Usage for failed and canceled jobs

When a job is failed or canceled, the reported usage is as follows:

Job or batch mode: The reported usage is the time the QPU was locked for your workload, up to the point at which it failed or was canceled. If the failure or cancellation occurred before the QPU was locked, the reported usage is zero. As a result, some failed jobs appear in your reported usage and others do not.
Session mode: The reported usage is the wall-clock time from when the first job started executing in the session until the session terminates, regardless of the number of jobs that fail or are canceled.
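
As a rough illustration of the job and batch rule, the following sketch returns zero when the workload stopped before the QPU was locked and the locked time otherwise. The function name and its arguments are hypothetical, and the timestamps are assumed to be in seconds.

```python
from typing import Optional


def failed_job_usage(lock_time: Optional[float], stop_time: float) -> float:
    """Usage for a failed or canceled job in job or batch mode (sketch).

    lock_time: when the QPU was locked for the job, or None if it never was.
    stop_time: when the job failed or was canceled.
    """
    if lock_time is None or stop_time <= lock_time:
        # Failure or cancellation happened before the lock: nothing is reported.
        return 0.0
    # Otherwise, usage is the time the QPU was locked before the job stopped.
    return stop_time - lock_time
```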

Determine a workload's actual usage

After a workload has completed, there are several ways to view its actual usage:

Run batch.usage() or session.usage() in qiskit-ibm-runtime 0.30 or later. In older versions of qiskit-ibm-runtime (>= 0.23 and < 0.30), the usage is still available in session.details()["usage_time"] and batch.details()["usage_time"].
Use GET /sessions/{id} to see usage for a specific batch or session.
Use GET /jobs/{id} to see usage for a single job.
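
If your code has to run against both newer and older versions of qiskit-ibm-runtime, a small helper can try usage() first and fall back to details()["usage_time"]. The helper name reported_usage is hypothetical; only the two access paths it wraps come from the list above.

```python
def reported_usage(workload):
    """Return the reported usage of a Session- or Batch-like object (sketch).

    Tries workload.usage() (qiskit-ibm-runtime 0.30 or later) and falls back
    to workload.details()["usage_time"] (>= 0.23 and < 0.30).
    """
    usage = getattr(workload, "usage", None)
    if callable(usage):
        return usage()
    # Older versions expose the value through the details dictionary.
    return workload.details()["usage_time"]
```

With a real session or batch object, you would call reported_usage(session) or reported_usage(batch) after the workload has completed.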

View instance usage

You can view an instance's usage on the Instances page, or, for those with the proper authority, the Analytics page. Note that the pages might show different usage numbers because they calculate usage differently.

The Instances page shows real-time usage for the last 28 days (rolling), up to the current time on the current day. The Analytics page usage is re-calculated hourly and includes the last 28 full days; that is, it shows usage from 00:00 28 days ago to today, at the top of the hour.

Estimate usage before submitting a job

An accurate local estimate is complicated by the extra operations done for error suppression and mitigation, but you can use this baseline formula to approximate usage:

<per sub-job overhead> + (rep_delay + <circuit length>) * <num executions>

<per sub-job overhead> is approximately 2 seconds per sub-job. This includes operations such as loading the payload into control electronics. Your primitive job might be divided into multiple sub-jobs if it is too large for the execution engine to process all at once.
rep_delay is a user-customizable option, and the default is given by backend.default_rep_delay, which is 250 microseconds on most IBM Quantum backends. Note that lowering rep_delay decreases the total QPU execution time, but at the expense of increased state preparation error rate; see the Dynamic repetition rate execution guide for more information.
<circuit length> is the total instruction length. Each instruction takes a different amount of time on the QPU, so the total length varies from circuit to circuit. A measurement, for example, can take 56 times longer than an x gate. You can use backend.target[<instruction>][<qubit>].duration to find the exact duration of each instruction. A typical circuit length is likely between 50 and 100 microseconds. If you are using error suppression or mitigation techniques with the primitives, extra instructions might be inserted into your circuit, which would increase the total circuit length.
<num executions> is the total number of circuits times the number of shots, where the circuits are those generated after PUB elements are broadcast. If you are using error-mitigation techniques with the primitives, extra circuits can be run as part of the mitigation process, which would increase the total number of executions. Advanced error-mitigation techniques such as PEA and PEC come with much higher overhead because they require running circuits for noise learning.

If you aren't using any advanced error-mitigation techniques or a custom rep_delay, you can use 2 + 0.00035 * <num executions> seconds as a quick estimate.
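
The baseline formula can be written as a short Python function. This is a sketch under stated assumptions: the function and parameter names are hypothetical, the defaults (2 seconds of overhead, 250 microseconds rep_delay, 100 microseconds circuit length) are taken from the approximate figures above, and charging the overhead once per sub-job is an assumption for jobs that are split.

```python
def estimate_usage_seconds(
    num_circuits: int,
    shots: int,
    circuit_length: float = 100e-6,     # seconds; typical range is 50-100 us
    rep_delay: float = 250e-6,          # seconds; default on most backends
    num_sub_jobs: int = 1,              # assumption: overhead applies per sub-job
    overhead_per_sub_job: float = 2.0,  # seconds, approximate
) -> float:
    """Approximate usage: overhead + (rep_delay + circuit length) * executions."""
    # Number of executions is the number of circuits times the number of shots.
    num_executions = num_circuits * shots
    overhead = num_sub_jobs * overhead_per_sub_job
    return overhead + (rep_delay + circuit_length) * num_executions


if __name__ == "__main__":
    # 10 circuits at 1,000 shots each is 10,000 executions.
    print(f"{estimate_usage_seconds(10, 1000):.2f} s")
```

With the defaults, the function reduces to the quick formula: 10,000 executions give about 2 + 0.00035 * 10000 = 5.5 seconds. Error mitigation is not modeled here, so treat the result as a lower bound when mitigation is enabled.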

Next steps

Recommendations

Review these tips: Minimize job run time.
Set the Maximum execution time.
Learn how to transpile locally in the Transpile section.
Try the Compare transpiler settings tutorial.
