IBM Quantum Platform

Workload usage

Usage represents consumption of the Qiskit Runtime service and is determined by the amount of time a QPU is locked to execute workloads.

  • Session usage is measured as the elapsed time while the session remains active, because QPU capacity is reserved for the duration of the session, regardless of whether workloads are actively running. See Session length for more information about session status transitions.
  • Batch usage is measured as the cumulative time the QPU is locked to execute all jobs in the batch.
  • Single job usage is measured as the time the QPU is locked to execute the job.
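
As a rough illustration of how these measurements differ, the following sketch computes session and batch usage from made-up timestamps and durations. The helper names and input values are hypothetical and are not part of any Qiskit API; the service itself reports the authoritative figures.

```python
from datetime import datetime


def session_usage_seconds(activated_at: datetime, closed_at: datetime) -> float:
    """Session usage: elapsed time while the session is active (hypothetical helper)."""
    return (closed_at - activated_at).total_seconds()


def batch_usage_seconds(job_qpu_seconds: list[float]) -> float:
    """Batch usage: cumulative QPU-locked time over all jobs (hypothetical helper)."""
    return float(sum(job_qpu_seconds))


# Made-up example: two jobs lock the QPU for 20 s and 30 s,
# and the surrounding session stays active for 5 minutes.
activated = datetime(2025, 1, 1, 12, 0, 0)
closed = datetime(2025, 1, 1, 12, 5, 0)
job_times = [20.0, 30.0]

session_usage = session_usage_seconds(activated, closed)
batch_usage = batch_usage_seconds(job_times)
print(f"session: {session_usage} s, batch: {batch_usage} s")
```

With these numbers, the session's usage covers the full five minutes it was active, while the same two jobs run as a batch would accrue only the time the QPU was locked for them.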

Note that failed or canceled jobs count toward your usage in certain circumstances; see the Usage for failed and canceled jobs section for details.

For Pay-As-You-Go Plan users, see Manage cost for details on setting a cost limit.


Usage for failed and canceled jobs

When a job fails or is canceled, the reported usage is as follows:

  • Job or batch mode: If the failure or cancellation occurred because of a system error, the reported usage is zero. If the job failed because of a user error or was canceled by a user, the reported usage is the consumption that occurred up to that point.

  • Session mode: The reported usage is the wall-clock time the session is active, regardless of the number of jobs that fail or are canceled.
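
The rules above can be summarized as a small decision function. This is only a sketch of the logic as described in this section; the enum and function names are hypothetical, and the inputs (time consumed so far, time the session was active) are assumed to be known.

```python
from enum import Enum


class Mode(Enum):
    JOB = "job"
    BATCH = "batch"
    SESSION = "session"


class Cause(Enum):
    SYSTEM_ERROR = "system_error"
    USER_ERROR = "user_error"
    USER_CANCELED = "user_canceled"


def reported_usage(
    mode: Mode,
    cause: Cause,
    consumed_seconds: float,
    session_active_seconds: float = 0.0,
) -> float:
    """Return the usage reported for a failed or canceled job (hypothetical helper)."""
    if mode is Mode.SESSION:
        # Session mode: wall-clock time the session is active, regardless of failures.
        return session_active_seconds
    if cause is Cause.SYSTEM_ERROR:
        # Job or batch mode: system errors report zero usage.
        return 0.0
    # User error or user cancellation: whatever was consumed up to that point.
    return consumed_seconds


print(reported_usage(Mode.JOB, Cause.SYSTEM_ERROR, consumed_seconds=12.0))
print(reported_usage(Mode.BATCH, Cause.USER_CANCELED, consumed_seconds=12.0))
print(reported_usage(Mode.SESSION, Cause.USER_ERROR, 12.0, session_active_seconds=300.0))
```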


Query a workload's actual usage

After a workload has completed, there are several ways to view its actual usage:

  • Run batch.usage() or session.usage() in qiskit-ibm-runtime 0.30 or later. If you are using an older version of qiskit-ibm-runtime (>= 0.23 and < 0.30), the usage can still be found in session.details()["usage_time"] and batch.details()["usage_time"].
  • Use GET /sessions/{id} to see usage for a specific batch or session.
  • Use GET /jobs/{id} to see usage for a single job.
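
If your code has to work across both version ranges mentioned above, one possible approach is a small wrapper that prefers usage() and falls back to details()["usage_time"]. The wrapper name workload_usage_seconds and the stand-in class are hypothetical; only usage() and details() come from the text above, and the None handling is a defensive assumption.

```python
from typing import Any, Optional


def workload_usage_seconds(workload: Any) -> Optional[float]:
    """Return usage for a batch or session object, whichever accessor it exposes.

    Hypothetical helper: prefers usage() (qiskit-ibm-runtime 0.30 or later) and
    falls back to details()["usage_time"] (versions >= 0.23 and < 0.30).
    """
    usage = getattr(workload, "usage", None)
    if callable(usage):
        return usage()
    details = workload.details()
    if not details:
        # Defensive assumption: no details are available for this workload.
        return None
    return details.get("usage_time")


class _LegacyWorkload:
    """Stand-in for an older batch/session object; not a real library class."""

    def details(self) -> dict:
        return {"usage_time": 12.5}


legacy_usage = workload_usage_seconds(_LegacyWorkload())
print(legacy_usage)
```

In real code you would pass the batch or session object you created with qiskit-ibm-runtime instead of the stand-in.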

View instance usage

You can view an instance's usage on the Instances page, or, for those with the proper authority, the Analytics page. Note that the pages might show different usage numbers because they calculate usage differently.

The Instances page shows real-time usage for the last 28 days (rolling), up to the current time on the current day. The Analytics page usage is re-calculated hourly and includes the last 28 full days; that is, it shows usage from 00:00 28 days ago to today, at the top of the hour.
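
To see why the two pages can disagree, the sketch below computes one reading of the two reporting windows for the same moment. It is an interpretation of the description above, not the platform's actual calculation: the function names are hypothetical, the rolling window is assumed to start exactly 28 days before the current time, and UTC is assumed because the time zone is not stated here.

```python
from datetime import datetime, timedelta, timezone


def instances_window(now: datetime) -> tuple[datetime, datetime]:
    """Rolling 28 days up to the current time (assumed interpretation)."""
    return now - timedelta(days=28), now


def analytics_window(now: datetime) -> tuple[datetime, datetime]:
    """From 00:00 28 days ago up to the top of the current hour (assumed interpretation)."""
    start = (now - timedelta(days=28)).replace(hour=0, minute=0, second=0, microsecond=0)
    end = now.replace(minute=0, second=0, microsecond=0)
    return start, end


now = datetime(2025, 3, 29, 14, 37, 12, tzinfo=timezone.utc)
print("Instances:", instances_window(now))
print("Analytics:", analytics_window(now))
```

Under these assumptions the Analytics window starts earlier (at midnight) and ends earlier (at the top of the hour) than the Instances window, so jobs near either boundary can appear on one page but not the other.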


Estimate usage before submitting a job

The extra operations performed for error suppression and mitigation make an accurate local estimate difficult, but you can use this baseline formula to approximate usage:

<per sub-job overhead> + (rep_delay + <circuit length>) * <num executions>

  • <per sub-job overhead> is an overhead of approximately 2s per sub-job. This includes operations such as loading the payload into control electronics. Your primitive job may be divided into multiple sub-jobs if it is too large for the execution engine to process all at once.
  • rep_delay is a user-customizable option, and the default is given by backend.default_rep_delay, which is 250 microseconds on most IBM Quantum backends. Note that lowering rep_delay decreases the total QPU execution time, but at the expense of increased state preparation error rate; see the Dynamic repetition rate execution guide for more information.
  • <circuit length> is the total instruction length. Each instruction takes a different amount of time on the QPU, so the total length varies from circuit to circuit. A measurement, for example, can take 56 times longer than an x gate. backend.target[<instruction>][<qubit>].duration can be used to find the exact duration for each instruction. A typical circuit length is likely between 50 and 100 microseconds. If you are using error suppression or mitigation techniques with the primitives, extra instructions might be inserted into your circuit, which would increase the total circuit length.
    Note

    The experimental scheduler_timing option returns the total circuit time, but this is NOT the time used for billing.

  • <num executions> is the total number of circuits times the number of shots, where the circuits are those generated after PUB elements are broadcast.
    • If you are using error mitigation techniques with the primitives, extra circuits might be run as part of the mitigation process, which would increase the total number of executions. Additionally, advanced error mitigation techniques such as PEA and PEC come with much higher overhead because they require running circuits for noise learning.
    • Estimator groups qubit-wise commuting observables, which reduces the number of executions.

If you aren't using any advanced error mitigation techniques or a custom rep_delay, you can use 2 + 0.00035 * <num executions> (in seconds) as a quick formula.
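
The baseline and quick formulas can be written as two short functions. This is a minimal sketch: the function and constant names are hypothetical, multiplying the overhead by a num_sub_jobs parameter is an assumption based on the "per sub-job" wording, and the 100-microsecond default circuit length is simply the upper end of the typical range, which together with the 250-microsecond default rep_delay matches the 0.00035 coefficient.

```python
import math

SUB_JOB_OVERHEAD_S = 2.0           # approximate overhead per sub-job
DEFAULT_REP_DELAY_S = 250e-6       # default rep_delay on most backends
TYPICAL_CIRCUIT_LENGTH_S = 100e-6  # assumed: upper end of the typical 50-100 us range


def estimate_usage_seconds(
    num_circuits: int,
    shots: int,
    circuit_length_s: float = TYPICAL_CIRCUIT_LENGTH_S,
    rep_delay_s: float = DEFAULT_REP_DELAY_S,
    num_sub_jobs: int = 1,
) -> float:
    """Baseline estimate: overhead + (rep_delay + circuit length) * executions."""
    num_executions = num_circuits * shots
    overhead = num_sub_jobs * SUB_JOB_OVERHEAD_S
    return overhead + (rep_delay_s + circuit_length_s) * num_executions


def quick_estimate_seconds(num_executions: int) -> float:
    """Quick formula for default rep_delay and no advanced error mitigation."""
    return 2 + 0.00035 * num_executions


# Example: 10 circuits (after PUB broadcasting) at 1000 shots each.
baseline = estimate_usage_seconds(num_circuits=10, shots=1000)
quick = quick_estimate_seconds(10 * 1000)
print(f"baseline: {baseline:.2f} s, quick: {quick:.2f} s")
```

With the default values the two functions agree. Passing a measured circuit_length_s or a custom rep_delay_s shows how far a specific workload departs from the quick formula.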

Estimate usage locally with Qiskit

This code example demonstrates how to use Qiskit to calculate circuit time:


import math

from qiskit.transpiler import generate_preset_pass_manager

# Assumes backend, sampler, isa_circuits, and shots are already defined.

# Schedule the circuit to get more accurate timing
pm = generate_preset_pass_manager(
    target=backend.target,
    optimization_level=0,
    scheduling_method="alap",
)

scheduled_circuits = pm.run(isa_circuits)

# All durations below are in seconds
init_duration = backend.target["reset"][(0,)].duration
rep_delay = sampler.options.execution.rep_delay or backend.default_rep_delay

# Accumulates the time for one shot of every circuit
circuit_duration = 0

for circuit in scheduled_circuits:
    # Estimate circuit length
    circuit_duration += circuit.estimate_duration(backend.target)

    # Add INIT time
    if sampler.options.execution.init_qubits:
        circuit_duration += init_duration

    # Add rep_delay
    circuit_duration += rep_delay

# Add the approximate 2-second overhead (assumes a single sub-job)
total_time = 2 + (circuit_duration * shots)
print(f"Total estimated usage is {math.ceil(total_time)} seconds")
