We now have a transmon at frequency
Every gate depends on a handful of physical parameters, each found by a dedicated sweep-and-fit experiment. Crucially, these parameters drift: frequencies wander with temperature and two-level-system noise on hour-to-day timescales, so the whole set is re-run periodically. Calibration and benchmarking are one closed cycle: you tune, you grade, and if the grade slips you tune again.
flowchart TD
A["Spectroscopy<br/>coarse w_q (~MHz)"] --> B["Ramsey<br/>fine w_q + T2*<br/>(~kHz)"]
B --> C["Rabi<br/>pi amplitude"]
C --> D["DRAG / AllXY<br/>drive phase,<br/>leakage"]
D --> E["Readout cal<br/>chi, freq,<br/>power, IQ"]
E --> F["RB / IRB<br/>grade the gates"]
F --> G{"r within target<br/>and near<br/>coherence floor?"}
G -->|"NO: drift<br/>re-tune"| B
G -->|"YES"| H["Run circuits"]
The individual steps:
- Qubit frequency $\omega_q$: coarse. Drive continuously while sweeping the drive frequency and watch the excited-state population. You get a Lorentzian; its center is $\omega_q/2\pi$, known to ~MHz precision when quoted in Hz.
- Qubit frequency $\omega_q$: fine, plus $T_2^*$ (Ramsey). Two $\pi/2$ pulses separated by a delay $\tau$ convert a small detuning into a beat. If the fitted beat is $\Delta f$ in Hz, then $\delta\omega=2\pi\Delta f$. This is kHz-level and is how you track drift.
- Pulse amplitude (Rabi). Sweep drive amplitude (or duration), fit the Rabi oscillation, and pick the amplitude giving exactly a $\pi$ rotation.
- DRAG & leakage. A transmon is only weakly anharmonic ($\alpha \sim -200$ MHz, illustrative), so a fast pulse has spectral weight at the $1\leftrightarrow 2$ transition and leaks into $|2\rangle$. The DRAG technique adds a quadrature component proportional to the derivative of the main pulse to cancel that leakage and the associated phase error; the DRAG coefficient is itself a calibrated knob.
- AllXY fine-tuning. A fixed sequence of 21 pairs of $X/Y$ pulses, each a $\pi$ or $\pi/2$ rotation, whose ideal outcome is a known staircase. Different error types (amplitude, detuning, DRAG phase) deform the staircase in characteristic, distinguishable ways, which makes it a cheap, sensitive diagnostic for the residuals Rabi/Ramsey miss.
- Readout. Calibrate $\chi$, choose the readout frequency and power that best separate the $|0\rangle$ / $|1\rangle$ pointer states in the IQ plane, and fit the discrimination boundary. (More below: the full story is the assignment matrix.)
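As a rough illustration of the DRAG step above, the sketch below builds a Gaussian in-phase envelope and a quadrature envelope proportional to its time derivative. The names `drag_envelopes`, `sigma_ns`, and `beta` are hypothetical, and the sign and scaling of the DRAG coefficient are convention-dependent assumptions; on hardware `beta` is found by calibration, not computed.

```python
import math

def drag_envelopes(n_samples=61, duration_ns=30.0, sigma_ns=7.5, beta=0.5):
    """Return (times, I, Q) for a Gaussian pulse with a derivative quadrature.

    I(t) is a Gaussian centred on the pulse and Q(t) = beta * dI/dt.
    beta (in ns here) stands in for the calibrated DRAG coefficient.
    """
    t0 = duration_ns / 2.0
    times = [duration_ns * k / (n_samples - 1) for k in range(n_samples)]
    i_env = [math.exp(-((t - t0) ** 2) / (2.0 * sigma_ns ** 2)) for t in times]
    # Analytic derivative of the Gaussian: dI/dt = -(t - t0) / sigma^2 * I(t)
    q_env = [beta * (-(t - t0) / sigma_ns ** 2) * i for t, i in zip(times, i_env)]
    return times, i_env, q_env

times, i_env, q_env = drag_envelopes()
peak_q = max(abs(q) for q in q_env)
print(f"peak I = {max(i_env):.3f}, peak |Q| = {peak_q:.3f}")
```

The quadrature envelope is antisymmetric about the pulse centre and vanishes there, which is the shape a derivative-based correction should have. Sweeping `beta` and watching the AllXY staircase or the $|2\rangle$ population is one way the calibrated value might be chosen.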
A 1% amplitude error is invisible in one
Intuition. Tuning a gate by eye is like checking a clock against one tick. Run it for an hour (apply the gate hundreds of times) and a tiny rate error becomes minutes of visible drift you can correct.
Step by step:
- The first $\pi/2$ pulse maps $|0\rangle$ to an equator superposition $(|0\rangle+|1\rangle)/\sqrt{2}$.
- During the delay $\tau$ the Bloch vector precesses in the rotating frame at the detuning $\delta\omega = \omega_{\text{drive}} - \omega_q = 2\pi\Delta f$, accumulating phase $\delta\omega\,\tau$.
- Dephasing randomizes that phase across the ensemble; averaging gives a contrast factor $e^{-\tau/T_2^*}$ (or a Gaussian $e^{-(\tau/T_2^*)^2}$ when slow $1/f$ noise dominates).
- The second $\pi/2$ pulse converts accumulated phase into population: projecting back gives $P=\tfrac12[1+e^{-\tau/T_2^*}\cos(\delta\omega\tau+\phi)]$.
- Fit: the oscillation frequency $\to |\Delta f|$ unless the IQ phase convention or a deliberately signed offset supplies the sign; the envelope $\to T_2^*$. With $\delta\omega=\omega_{\text{drive}}-\omega_q$, estimate $\omega_q=\omega_{\text{drive}}-\delta\omega$, so set $f_{d,\mathrm{new}}=f_d-\Delta f_{\rm signed}$. If the sign is unknown, test both directions or use phase-sensitive Ramsey.
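The Ramsey steps above can be sketched on synthetic, noise-free data. The function names, the 200 kHz detuning, the 20 µs $T_2^*$, and the grid-search estimator are all illustrative assumptions; a real analysis would normally use a nonlinear least-squares fit of the full fringe model, including the envelope.

```python
import math

def ramsey_population(tau_s, delta_f_hz, t2_star_s, phi=0.0):
    """Fringe model: P = 1/2 [1 + exp(-tau/T2*) cos(2 pi delta_f tau + phi)]."""
    contrast = math.exp(-tau_s / t2_star_s)
    return 0.5 * (1.0 + contrast * math.cos(2.0 * math.pi * delta_f_hz * tau_s + phi))

def estimate_abs_detuning(taus, pops, f_grid_hz):
    """Grid search for |delta_f|: project the fringe onto cos/sin at each trial frequency."""
    best_f, best_power = None, -1.0
    for f in f_grid_hz:
        c = sum((p - 0.5) * math.cos(2.0 * math.pi * f * t) for t, p in zip(taus, pops))
        s = sum((p - 0.5) * math.sin(2.0 * math.pi * f * t) for t, p in zip(taus, pops))
        power = c * c + s * s
        if power > best_power:
            best_f, best_power = f, power
    return best_f  # magnitude only; the sign needs a signed offset or phase-sensitive data

def updated_drive_frequency(f_drive_hz, delta_f_signed_hz):
    """f_new = f_d - delta_f_signed, with delta_f defined as f_drive - f_qubit."""
    return f_drive_hz - delta_f_signed_hz

taus = [k * 0.1e-6 for k in range(401)]          # delays from 0 to 40 us
pops = [ramsey_population(t, 200e3, 20e-6) for t in taus]
f_grid = [k * 5e3 for k in range(1, 201)]        # trial frequencies, 5 kHz to 1 MHz
est = estimate_abs_detuning(taus, pops, f_grid)
print(f"estimated |delta_f| = {est / 1e3:.0f} kHz")
```

The estimator returns only a magnitude, which mirrors the sign ambiguity noted in the last bullet.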
The naive way to grade a gate (run it, do tomography, compare to ideal) is contaminated by state-preparation and measurement (SPAM) errors, which can dwarf the gate error. RB sidesteps this.
The recipe. Choose a set of sequence lengths
Here
The deep reason RB works is twirling: for time-stationary, Markovian, trace-preserving errors that remain in the computational subspace and are gate-independent (or only weakly gate-dependent), averaging
- A general channel has many parameters (write it as a Pauli transfer matrix).
- Average it over the group (twirl).
- The Clifford group is a unitary 2-design, so by Schur's lemma the twirled channel must commute with every group element; on the traceless subspace it can only be a scalar multiple of the identity.
- Hence, under those assumptions, $\overline{\Lambda}$ is fixed by one number $p$: keep $\rho$ with probability $p$, replace it by $\mathbb{I}/d$ with probability $1-p$.
- Composing $m$ such steps multiplies the $p$'s $\Rightarrow p^m$. SPAM enters only as the constants $A$ (initial-state/readout contrast) and $B$ (asymptote, $\sim 1/d$).
- Leakage is outside this model; it must be measured separately, for example with leakage/seepage RB. With qutrit-sensitive readout, one often fits the leakage population as
  $$P_{\rm leak}(m)=P_\infty+\left[P_{\rm leak}(0)-P_\infty\right](1-L-S)^m,\qquad P_\infty=\frac{L}{L+S},$$
  where $L$ is the per-Clifford leakage rate out of the computational subspace and $S$ is the seepage rate back. AllXY is useful for phase/amplitude/DRAG residuals, but it is not a substitute for $P_2$ readout or leakage/seepage RB.
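The leakage curve in the last bullet can be sketched numerically. This assumes a simple two-rate model in which each Clifford leaks with probability `leak_rate` and seeps back with probability `seep_rate`, so the leaked population obeys $P(m+1)=L+(1-L-S)\,P(m)$; the function name and the rates used are illustrative only.

```python
def leakage_population(m, leak_rate, seep_rate, p_leak0=0.0):
    """Leaked population after m Cliffords under the two-rate model.

    Closed-form solution of P(m+1) = L + (1 - L - S) P(m),
    with steady state P_inf = L / (L + S).
    """
    p_inf = leak_rate / (leak_rate + seep_rate)
    return p_inf + (p_leak0 - p_inf) * (1.0 - leak_rate - seep_rate) ** m

L_RATE, S_RATE = 1e-4, 4e-4   # illustrative per-Clifford rates
for m in (0, 1, 1000, 20000):
    print(m, f"{leakage_population(m, L_RATE, S_RATE):.6f}")
```

After one Clifford the leaked population equals `leak_rate`, and at long sequence lengths it saturates at `leak_rate / (leak_rate + seep_rate)`.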
The average error per Clifford follows:
$$r=\frac{(d-1)(1-p)}{d}.$$
Pitfall. $r$ is per Clifford, not per physical gate. A single-qubit Clifford often compiles to ~1.5-2 native pulses, so the native-gate error is roughly $r$ divided by the average native-gates-per-Clifford. Always state the assumption.
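A minimal sketch of the conversion described in the pitfall, assuming the relation $r=(d-1)(1-p)/d$ used in the worked example later in this tutorial. The helper names are hypothetical, and the native-gates-per-Clifford figure is an input you must supply from your own compiler.

```python
def error_per_clifford(p, d=2):
    """Average error per Clifford from the RB decay parameter p."""
    return (d - 1) * (1.0 - p) / d

def per_native_gate_error(r_clifford, native_gates_per_clifford):
    """Rough per-gate error: spread the Clifford error evenly over its native gates."""
    return r_clifford / native_gates_per_clifford

r = error_per_clifford(0.999)              # single qubit, d = 2
r_gate = per_native_gate_error(r, 1.5)     # assumes 1.5 native gates per Clifford
print(f"r = {r:.2e} per Clifford, about {r_gate:.2e} per native gate")
```

Dividing evenly assumes every native gate contributes the same error, which is itself an approximation worth stating alongside the number.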
survival F(m)
1.0 |*.
| '*.. F(m) = A p^m + B
| '-*..
| o '-*-..._ good SPAM (large A)
| '·o._ '''*----*----*---- → B≈0.50
0.5 |.......'·--o.._.................... ← same p (parallel)
| dashed: o''--o----o----o---- worse SPAM (small A, high B)
| "same p, same r; static SPAM changes A,B"
+------------------------------------ m
0 50 100 150 200
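The claim in the plot above, that static SPAM changes $A$ and $B$ but not $p$, can be checked with a small noise-free sketch. It uses a three-point ratio trick rather than a real fit: differences of $F(m)$ cancel $B$ and their ratio cancels $A$. The SPAM values and helper names are illustrative assumptions; measured data would need a proper nonlinear fit over many sequence lengths.

```python
def survival(m, p, a, b):
    """RB survival model F(m) = A p^m + B."""
    return a * p ** m + b

def decay_from_three_lengths(f0, f1, f2, step):
    """Recover p from F at lengths m, m + step, m + 2*step.

    (f2 - f1) / (f1 - f0) = p^step, independent of A and B.
    """
    return ((f2 - f1) / (f1 - f0)) ** (1.0 / step)

P_TRUE = 0.995
spam_settings = {"good SPAM": (0.50, 0.50), "worse SPAM": (0.35, 0.55)}
estimates = {}
for label, (a, b) in spam_settings.items():
    f0, f1, f2 = (survival(m, P_TRUE, a, b) for m in (0, 50, 100))
    estimates[label] = decay_from_three_lengths(f0, f1, f2, step=50)
    print(label, f"p = {estimates[label]:.6f}")
```

Both settings return the same decay parameter, so they give the same $r$ even though the curves sit at different heights.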
Run reference RB (decay
The point estimate divides out the Clifford "carrier" error. But real errors aren't exactly depolarizing, so Magesan et al. (2012) give an explicit systematic bound
RB needs a group structure; for generic gates on many qubits it gets unwieldy. XEB, the metric behind "quantum supremacy", runs random circuits and compares the measured bitstring distribution to one an ideal simulator predicts:
- A random circuit produces a Porter-Thomas output distribution: ideal probabilities are exponentially distributed, so a few bitstrings are strongly favored ("speckle" from constructive interference).
- Ideal sampling lands preferentially on those favored strings: $\sum_x P_{\text{ideal}}(x)^2 \approx 2/2^n$. Uniform noise gives $\sum_x (1/2^n)P_{\text{ideal}}(x) = 1/2^n$.
- Defining $F_{\text{XEB}} = 2^n\langle P_{\text{ideal}}\rangle - 1$ sends ideal $\to 1$, uniform noise $\to 0$.
- Under a digital error model, each faulty gate scrambles weight into the uniform background, so surviving coherent weight multiplies: $F_{\text{XEB}} \approx \prod_g (1-e_g) \approx e^{-\sum_g e_g}$. This per-cycle product lets you predict full-circuit fidelity from individual gate errors and cross-check.
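The two limits and the product model in the bullets above can be checked with a toy calculation. Here the "ideal" distribution is not simulated from a circuit: it is drawn directly as normalized exponential weights, which is an assumption standing in for a Porter-Thomas output. All names and the choice of 14 qubits are illustrative.

```python
import random

def porter_thomas_probs(n_qubits, rng):
    """Stand-in for an ideal random-circuit distribution: normalized exponential weights."""
    weights = [rng.expovariate(1.0) for _ in range(2 ** n_qubits)]
    total = sum(weights)
    return [w / total for w in weights]

def linear_xeb(mean_ideal_prob, n_qubits):
    """F_XEB = 2^n <P_ideal> - 1."""
    return 2 ** n_qubits * mean_ideal_prob - 1.0

def predicted_fidelity(gate_errors):
    """Digital error model: fidelity is the product of (1 - e_g) over gates."""
    fidelity = 1.0
    for e in gate_errors:
        fidelity *= 1.0 - e
    return fidelity

rng = random.Random(0)
n = 14
probs = porter_thomas_probs(n, rng)
# An ideal sampler returns x with probability P(x), so <P_ideal> = sum_x P(x)^2.
f_ideal = linear_xeb(sum(p * p for p in probs), n)
# A uniform sampler returns every x equally often, so <P_ideal> = 1 / 2^n.
f_uniform = linear_xeb(sum(probs) / len(probs), n)
print(f"ideal sampler: {f_ideal:.3f}, uniform sampler: {f_uniform:.3f}")
```

The ideal-sampler value is close to 1 only up to finite-size fluctuations of the random distribution, while the uniform value is 0 up to rounding.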
Pitfall. XEB needs a trusted classical simulation of the ideal amplitudes and assumes the digital error model, and linear XEB is known to be spoofable for shallow circuits. It is a statistical test, not a proof of correctness.
The scalar readout fidelity is only the one-qubit shadow of a matrix. Prepare each computational basis state, histogram the discriminated outcomes, and stack those histograms as columns:
| | prepared $\lvert 0\rangle$ | prepared $\lvert 1\rangle$ |
|---|---|---|
| measured 0 | $P(0\mid 0)$ | $P(0\mid 1)$ |
| measured 1 | $P(1\mid 0)$ | $P(1\mid 1)$ |
With raw measured
Pitfall. Naive $M^{-1}$ can return negative probabilities and amplifies statistical noise, and the full matrix is $2^n\times 2^n$, exponential to calibrate. Use constrained least-squares / iterative unfolding (keep counts $\ge 0$) and tensor-product or subset approximations. Note also that $F_a$ is reported separately: RB deliberately cancels readout error from the gate number.
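For a single qubit the difference between naive and constrained inversion can be shown in a few lines. The sketch assumes an assignment matrix with columns indexed by the prepared state and error rates `e0` $=P(1\mid 0)$ and `e1` $=P(0\mid 1)$; the numbers are made up. With two normalized outcomes, the constrained least-squares solution reduces to clipping the naive estimate to $[0,1]$, which does not generalize to many qubits.

```python
def assignment_fidelity(e0, e1):
    """F_a = 1 - (1/2) [P(1|0) + P(0|1)]."""
    return 1.0 - 0.5 * (e0 + e1)

def naive_p0(y0, e0, e1):
    """Invert y0 = (1 - e0) p0 + e1 (1 - p0) for the true |0> population p0."""
    return (y0 - e1) / (1.0 - e0 - e1)

def constrained_p0(y0, e0, e1):
    """Least-squares estimate restricted to 0 <= p0 <= 1 (one-qubit special case)."""
    return min(1.0, max(0.0, naive_p0(y0, e0, e1)))

e0, e1 = 0.02, 0.05
y0 = 0.04   # measured fraction of "0" outcomes; shot noise put it below e1
print(f"F_a = {assignment_fidelity(e0, e1):.3f}")
print(f"naive p0 = {naive_p0(y0, e0, e1):.4f}")
print(f"constrained p0 = {constrained_p0(y0, e0, e1):.4f}")
```

The naive inverse returns a negative population here, while the constrained estimate stays physical.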
Two gates with identical
INCOHERENT (depolarize) COHERENT (over-rotation)
. - . . - .
/ ↓ ↓ ↓ \ / ↻ \ rigid tilt
| →• ← | shrunk sphere | •-↗ | by small angle
\ ↑ ↑ ↑ / \ /
' - ' ' - '
error ~ LINEAR in depth error ~ QUADRATIC, worst-case large
error | coherent (curve) The two have the SAME r
vs N | ,·' at small N but diverge:
| ,·' coherent accumulates faster.
| _,·'____ incoherent (line)
+----------------------------- N
- Incoherent errors randomize rather than apply a fixed rotation. Depolarization shrinks the Bloch sphere uniformly; dephasing shrinks the transverse components. In average fidelity they add roughly linearly with depth.
- Coherent (calibration/over-rotation) errors rotate the sphere rigidly; amplitudes can interfere constructively, so worst-case error can be much larger and accumulate quadratically. For a fixed average $r$, coherent errors are generally the more dangerous.
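The linear-versus-quadratic contrast in the two bullets can be made concrete for a single qubit. The sketch compares $N$ identical over-rotations by a small angle with $N$ depolarizing steps, matched to the same one-gate average infidelity. It assumes the worst case of the same gate repeated with no randomization, and uses the standard single-qubit relation $r=\tfrac23\sin^2(\theta/2)$ for a net rotation error $\theta$; the target value `R_SINGLE` is illustrative.

```python
import math

def coherent_infidelity(n_gates, eps):
    """Average infidelity after n identical over-rotations by eps (radians)."""
    return (2.0 / 3.0) * math.sin(n_gates * eps / 2.0) ** 2

def depolarizing_infidelity(n_gates, p):
    """Average infidelity after n single-qubit depolarizing steps with parameter p."""
    return (1.0 - p ** n_gates) / 2.0

R_SINGLE = 1e-4                                        # same one-gate infidelity for both
eps = 2.0 * math.asin(math.sqrt(1.5 * R_SINGLE))       # solves (2/3) sin^2(eps/2) = r
p = 1.0 - 2.0 * R_SINGLE                               # solves (1 - p)/2 = r
for n in (1, 10, 50):
    print(n, f"coherent {coherent_infidelity(n, eps):.2e}",
          f"incoherent {depolarizing_infidelity(n, p):.2e}")
```

At one gate the two numbers agree by construction; by fifty gates the coherent case is more than an order of magnitude worse, which is the divergence sketched in the plot.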
Standard RB reports coherent errors only through their average infidelity; it does not by itself tell you whether the error was coherent, stochastic, or dangerous in worst case. The SPAM-robust tool that separates them is unitarity (purity) RB: instead of survival probability, it tracks the purity of the output state vs sequence length. For the unital block
Purity RB fits a decay
Even with perfect control,
Model the gate as ideal unitary plus relaxation and total transverse decay over duration
Single qubit,
- Error per Clifford: $r=\dfrac{(d-1)(1-p)}{d}=\dfrac{(1)(0.001)}{2}=5\times10^{-4}$, a "99.95%" Clifford. Check: $F_{\text{avg}}=p+(1-p)/d=0.999+0.0005=0.9995=1-r$. ✓
- Per physical gate: if the compiler averages ~1.5 native gates/Clifford, per-gate error $\approx r/1.5 = 3.3\times10^{-4}$.
- Coherence-limited? Take $T_1=80\,\mu$s, $T_2=60\,\mu$s, $\tau_g=30$ ns: $r_{\lim}\approx\frac{30\text{ ns}}{6}\left(\frac{1}{80\,\mu\text{s}}+\frac{2}{60\,\mu\text{s}}\right)=(5.0\times10^{-9}\text{ s})(45833\text{ s}^{-1})=2.3\times10^{-4}$ per 30 ns physical gate. Compare like with like: the inferred per-gate error $3.3\times10^{-4}$ is $\sim1.4\times$ this per-gate coherence floor. Equivalently, a 1.5-pulse Clifford has $r_{\lim,C}\approx1.5(2.3\times10^{-4})=3.5\times10^{-4}$, so the measured $5\times10^{-4}$ per Clifford is also $\sim1.4\times$ the Clifford floor. Pushing the pulse harder buys little; longer $T_1/T_2$ is the lever.
- IRB add-on: $p_{\text{ref}}=0.999$, interleaved $X$-gate $p_{\overline C}=0.9982$ → $r_{\text{gate}}=\frac12\left(1-\frac{0.9982}{0.999}\right)=\frac12(0.0008)=4.0\times10^{-4}$, quoted with the Magesan bound $E$.
- XEB sketch: $n=20$, $2^n=1{,}048{,}576$. If $\langle P_{\text{ideal}}\rangle=1.9\times10^{-6}$, then $F_{\text{XEB}}=1{,}048{,}576\times1.9\times10^{-6}-1\approx1.992-1=0.992$. Uniform sampling ($\langle P\rangle=1/2^n$) gives exactly $0$.
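The coherence-floor and IRB arithmetic above can be reproduced in a few lines. The function names are hypothetical; the formulas are the ones used in the bullets, with times in seconds, and the IRB point estimate is returned without its systematic bound.

```python
def coherence_limit(tau_gate_s, t1_s, t2_s):
    """r_lim ~ (tau_g / 6) (1/T1 + 2/T2) for a single-qubit gate."""
    return tau_gate_s / 6.0 * (1.0 / t1_s + 2.0 / t2_s)

def irb_gate_error(p_ref, p_interleaved, d=2):
    """Interleaved-RB point estimate: ((d - 1)/d) (1 - p_interleaved / p_ref)."""
    return (d - 1) / d * (1.0 - p_interleaved / p_ref)

r_lim = coherence_limit(30e-9, 80e-6, 60e-6)
r_gate_measured = 5e-4 / 1.5          # per-Clifford error spread over 1.5 native gates
r_irb = irb_gate_error(0.999, 0.9982)
print(f"coherence floor per gate: {r_lim:.2e}")
print(f"measured / floor: {r_gate_measured / r_lim:.2f}")
print(f"IRB gate error: {r_irb:.2e}")
```

Keeping the floor and the measured error in the same units (per native gate, or per Clifford) is the point of the comparison; mixing them overstates or understates the headroom.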
For an ideal target unitary
| Symbol | Name | Formula | Note |
|---|---|---|---|
| $p$ | depolarizing / decay parameter | fit of $F(m)=Ap^m+B$ | SPAM-robust under the RB model |
| $F_{\text{avg}}$ | average gate fidelity | $p+(1-p)/d$ | |
| | process / entanglement fidelity | | not directly equal to $F_{\text{avg}}$ |
| $r$ | avg error per Clifford | $(d-1)(1-p)/d$ | per Clifford, not per gate |
| | per-gate (physical-gate) error | $\approx r/(\text{native gates per Clifford})$ | divide by compiling factor |
| $F_a$ | readout assignment fidelity | $1-\tfrac12[P(1\mid 0)+P(0\mid 1)]$ | |
| $F_{\text{XEB}}$ | linear-XEB estimator | $2^n\langle P_{\text{ideal}}\rangle-1$ | approximates circuit fidelity under the XEB noise model |
| Method | Measures | Needs | Scales? | Blind spots |
|---|---|---|---|---|
| Standard RB | avg error/Clifford $r$ | Clifford group + recovery | partial | coherent & worst-case errors |
| Interleaved RB | one gate's $r_{\text{gate}}$ | reference RB + interleaving | partial | systematic bound $E$ |
| XEB | full-circuit $F_{\text{XEB}}$ | random circuits + ideal sim | yes (until sim infeasible) | trusts error model; spoofable |
| Unitarity/Purity RB | coherence of the noise | purity estimation | partial | complements, not replaces, $r$ |
- "RB gives THE gate error." No: it is an average over the Clifford group, not a single physical gate and not worst-case. Divide by the average native-gates-per-Clifford for a rough per-gate error.
- "High fidelity = safe gate." RB alone does not diagnose coherent accumulation; two gates with the same $r$ can diverge in deep circuits. Use unitarity RB and remember the diamond norm exists.
- "$p$ is just readout error." In the standard RB model, static SPAM lives in $A$ and $B$, not in $p$. Drift, leakage, or model failure still need residual checks.
- "Readout fidelity is part of RB." It is separate ($F_a$ / the assignment matrix).
- "Just invert $M$." Use constrained least-squares / unfolding, not naive $M^{-1}$.
- "Good single-qubit RB ⇒ good multi-qubit." Isolated RB hides crosstalk; run simultaneous/correlated RB.
- Calibration is a closed loop: spectroscopy → Ramsey → Rabi → DRAG/AllXY → readout → RB/IRB, re-run to track drift; error amplification exposes errors below the noise floor.
- RB reports a SPAM-robust average error per Clifford because the twirl (Clifford = unitary 2-design) collapses in-subspace Markovian error to one depolarizing $p$.
- IRB isolates one gate (with a systematic bound $E$); XEB scales to large random circuits via the Porter-Thomas product model (but needs simulation and is spoofable).
- A fidelity is meaningful only with context: averaged not worst-case, floored by $T_1/T_2$ ($r_{\lim}$), separate from readout ($F_a$) and crosstalk, and not diagnostic of coherent vs stochastic errors unless you run unitarity RB.
- E. Magesan, J. M. Gambetta, J. Emerson, Scalable and Robust Randomized Benchmarking of Quantum Processes, Phys. Rev. Lett. 106, 180504 (2011), arXiv:1009.3639.
- E. Magesan et al., Efficient Measurement of Quantum Gate Error by Interleaved Randomized Benchmarking, Phys. Rev. Lett. 109, 080505 (2012), arXiv:1203.4550.
- J. Wallman, C. Granade, R. Harper, S. T. Flammia, Estimating the Coherence of Noise, New J. Phys. 17, 113020 (2015), arXiv:1503.07865 (unitarity / purity RB).
- S. Boixo et al., Characterizing Quantum Supremacy in Near-Term Devices, Nat. Phys. 14, 595 (2018), arXiv:1608.00263 (XEB).
- F. Arute et al. (Google AI Quantum), Quantum supremacy using a programmable superconducting processor, Nature 574, 505 (2019), DOI:10.1038/s41586-019-1666-5.
- P. Krantz et al., A Quantum Engineer's Guide to Superconducting Qubits, Appl. Phys. Rev. 6, 021318 (2019), arXiv:1904.06560.
- B. Barak, C.-N. Chou, X. Gao, Spoofing Linear Cross-Entropy Benchmarking in Shallow Quantum Circuits, arXiv:2005.02421.