
Benchmarks API

Runner

runner

Benchmark runner: dispatches problems to registered solvers.

BenchmarkRunner

Runs benchmark problems against registered solvers.

Usage

    runner = BenchmarkRunner()
    runner.register(SolverEntry("qaoa-p3-x", "quantum", my_qaoa_fn))
    runner.register(SolverEntry("cvxpy-mv", "classical", my_mv_fn))
    results = runner.run_problem(portfolio_small())

run_problem(problem)

Run all registered solvers on a single problem.

run_all(problems)

Run all solvers on all problems.
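The register-then-dispatch pattern described above can be sketched roughly as follows. This is a self-contained stand-in, not the library's implementation: `SketchRunner`, `SketchEntry`, the field names `kind` and `fn`, and the dict-shaped result rows are all hypothetical. The source only states that solvers are registered and then run against one or many problems.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class SketchEntry:
    """Hypothetical stand-in for SolverEntry: a name, a kind, and a callable."""
    name: str
    kind: str  # e.g. "quantum" or "classical"
    fn: Callable[[Any], float]


class SketchRunner:
    """Hypothetical stand-in for BenchmarkRunner."""

    def __init__(self) -> None:
        self._entries: list[SketchEntry] = []

    def register(self, entry: SketchEntry) -> None:
        self._entries.append(entry)

    def run_problem(self, problem: Any) -> list[dict]:
        # One result row per registered solver.
        return [
            {"solver": e.name, "kind": e.kind, "objective": e.fn(problem)}
            for e in self._entries
        ]

    def run_all(self, problems: list) -> list[dict]:
        # Flatten the per-problem rows into a single list.
        return [row for p in problems for row in self.run_problem(p)]


# Toy "problems" are plain lists of numbers; toy "solvers" reduce them.
runner = SketchRunner()
runner.register(SketchEntry("sum-solver", "classical", lambda p: float(sum(p))))
runner.register(SketchEntry("max-solver", "classical", lambda p: float(max(p))))
rows = runner.run_all([[1, 2, 3], [4, 5]])
```

With two solvers and two problems, `rows` holds four entries, ordered problem by problem.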

BenchmarkRow dataclass

Single benchmark result row.

SolverEntry dataclass

A registered solver.

Problems

problems

Standardized benchmark problem definitions.

Provides pre-built portfolio, option, and credit risk problems at small/medium/large scales with known classical reference solutions.

Problem dataclass

A benchmark problem specification.

PortfolioProblem dataclass

Bases: Problem

Portfolio optimization benchmark problem.

portfolio_small(seed=42)

15-asset benchmark: S&P sector ETFs + indices.

Cardinality K=5, 3 sectors with caps.

portfolio_medium(seed=42)

25-asset benchmark: NIFTY50 subset. K=8.

portfolio_large(seed=42)

50-asset benchmark: S&P 500 subset. K=15.

all_problems()

Return all registered benchmark problems.

Leaderboard

leaderboard

Benchmark leaderboard: tabular summary of solver results.

Generates Markdown or CSV leaderboards from BenchmarkRow results, ranked by objective value or approximation ratio.

generate_leaderboard(rows, sort_by='objective', ascending=True)

Generate a leaderboard table from benchmark rows.

Returns list of row dicts with string-formatted values.
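As a rough illustration of the sort-then-format behaviour described above, the sketch below ranks plain dicts by one key and converts every value to a string. The function name `rank_rows`, the added `rank` column, and the four-decimal float format are assumptions made for the example; the real `generate_leaderboard` operates on `BenchmarkRow` objects whose fields are not listed here.

```python
def rank_rows(rows, sort_by="objective", ascending=True):
    """Sort result dicts by one numeric key and format all values as strings."""
    ordered = sorted(rows, key=lambda r: r[sort_by], reverse=not ascending)
    table = []
    for rank, row in enumerate(ordered, start=1):
        formatted = {"rank": str(rank)}
        for key, value in row.items():
            # Floats get a fixed precision; everything else is stringified as-is.
            formatted[key] = f"{value:.4f}" if isinstance(value, float) else str(value)
        table.append(formatted)
    return table


# Illustrative values only, not real benchmark results.
sample = [
    {"solver": "solver-a", "objective": 0.25},
    {"solver": "solver-b", "objective": 0.125},
]
board = rank_rows(sample)
```

With `ascending=True` the smallest objective ranks first, which suits minimization problems; pass `ascending=False` when larger is better, for example when ranking by approximation ratio.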

to_markdown(rows, **kwargs)

Generate a Markdown leaderboard table.

to_csv(rows, **kwargs)

Generate a CSV leaderboard.
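One way to render a list of string-valued row dicts as Markdown or CSV is sketched below using only the standard library. The helper names `rows_to_markdown` and `rows_to_csv` are hypothetical, and the column order simply follows the keys of the first row; the actual `to_markdown` and `to_csv` accept `**kwargs` whose meaning is not documented here.

```python
import csv
import io


def rows_to_markdown(rows):
    """Render row dicts as a pipe-delimited Markdown table."""
    headers = list(rows[0].keys())
    lines = [
        "| " + " | ".join(headers) + " |",
        "| " + " | ".join("---" for _ in headers) + " |",
    ]
    for row in rows:
        lines.append("| " + " | ".join(str(row[h]) for h in headers) + " |")
    return "\n".join(lines)


def rows_to_csv(rows):
    """Render row dicts as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Illustrative row, not a real benchmark result.
table = [{"rank": "1", "solver": "solver-b", "objective": "0.1250"}]
markdown_text = rows_to_markdown(table)
csv_text = rows_to_csv(table)
```

Using `csv.DictWriter` rather than manual string joins means values containing commas or quotes are escaped correctly.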

Manifest

manifest

Reproducibility manifest: RNG seeds, dependency versions, git commit.

Captures all information needed to reproduce a benchmark run.

Manifest dataclass

Reproducibility manifest for a benchmark run.

build_manifest(problem_ids=None, solver_names=None, seeds=None)

Build a reproducibility manifest for the current environment.
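The kind of information a reproducibility manifest captures (seeds, dependency versions, git commit) can be gathered with the standard library alone. The sketch below is an assumption about how that might look, not the library's code: `sketch_manifest`, its `packages` parameter, and the key names in the returned dict are all invented for illustration.

```python
import platform
import subprocess
from importlib import metadata


def sketch_manifest(problem_ids=None, solver_names=None, seeds=None,
                    packages=("numpy",)):
    """Collect environment details into a plain dict (hypothetical layout)."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed in this environment

    try:
        commit = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            capture_output=True, text=True, check=True,
        ).stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        commit = None  # git is unavailable, or this is not a git checkout

    return {
        "problem_ids": list(problem_ids or []),
        "solver_names": list(solver_names or []),
        "seeds": list(seeds or []),
        "python": platform.python_version(),
        "platform": platform.platform(),
        "packages": versions,
        "git_commit": commit,
    }


manifest = sketch_manifest(problem_ids=["portfolio_small"], seeds=[42])
```

Recording `None` instead of raising keeps the manifest usable on machines where git or an optional dependency is missing.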

Hardware Benchmarks

hardware_benchmarks

Hardware benchmark framework for quantum finance validation campaigns.

Provides configurable benchmark runners for QAOA and QAE circuits on real hardware (IBM, IonQ via Braket) with statistical analysis, error mitigation comparison, and reproducibility manifests.

HardwareBenchmarkConfig dataclass

Configuration for a hardware benchmark campaign.

Parameters

target_devices : list[str]
    Backend identifiers to benchmark on.
qubit_counts : list[int]
    Number of qubits to test at each scale.
qaoa_depths : list[int]
    QAOA circuit depths (p values) to benchmark.
qae_precisions : list[int]
    QAE precision levels (number of evaluation qubits).
shots : int
    Number of measurement shots per circuit.
n_runs : int
    Number of independent runs for statistical analysis.
mitigation_methods : list[str]
    Error mitigation methods to apply: "none", "zne", "trex", "readout".
seed : int | None
    Base random seed for reproducibility.
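For orientation, a dataclass with the documented fields might look like the sketch below. It is a hypothetical mirror, not the real `HardwareBenchmarkConfig`: the class name, every default value, and the validation in `__post_init__` are assumptions, and the device identifier is a placeholder.

```python
from dataclasses import dataclass, field
from typing import Optional

# The four method names listed in the parameter documentation.
ALLOWED_MITIGATION = {"none", "zne", "trex", "readout"}


@dataclass
class SketchHardwareConfig:
    """Hypothetical mirror of the documented fields; defaults are invented."""
    target_devices: list[str]
    qubit_counts: list[int]
    qaoa_depths: list[int]
    qae_precisions: list[int]
    shots: int = 1000
    n_runs: int = 5
    mitigation_methods: list[str] = field(default_factory=lambda: ["none"])
    seed: Optional[int] = None

    def __post_init__(self) -> None:
        # Reject method names outside the documented set (assumed behaviour).
        unknown = set(self.mitigation_methods) - ALLOWED_MITIGATION
        if unknown:
            raise ValueError(f"unknown mitigation methods: {sorted(unknown)}")


config = SketchHardwareConfig(
    target_devices=["device-a"],  # placeholder identifier
    qubit_counts=[4, 6],
    qaoa_depths=[1, 2],
    qae_precisions=[3],
    mitigation_methods=["none", "zne"],
    seed=42,
)
```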

HardwareBenchmarkResult dataclass

Result from a single hardware benchmark run.

Parameters

device_id : str
    Backend identifier used.
circuit_type : str
    Type of circuit: "qaoa", "qae", "mitigation".
n_qubits : int
    Number of qubits in the circuit.
depth : int
    Circuit depth.
approximation_ratio : float
    Ratio of achieved vs optimal objective (for QAOA).
success_probability : float
    Probability of measuring the correct/target state.
wall_clock_time : float
    Wall-clock time in seconds.
raw_results : dict[str, Any]
    Raw measurement counts and metadata.
mitigated_results : dict[str, Any]
    Results after error mitigation (if applied).
confidence_interval : tuple[float, float]
    95% confidence interval for the primary metric.
metadata : dict[str, Any]
    Additional benchmark metadata.

HardwareBenchmarkRunner

Runner for hardware validation benchmark campaigns.

Executes QAOA and QAE circuits on specified backends, collects statistics over multiple runs, and generates reproducibility manifests.

config property

Return the benchmark configuration.

results property

Return collected results.

run_qaoa_benchmark(problem, backend, config=None)

Run QAOA benchmarks at various qubit counts and depths.

Parameters

problem : Problem or dict
    Problem specification with QUBO parameters.
backend : Backend
    Quantum backend to execute on.
config : HardwareBenchmarkConfig | None
    Override config; uses self._config if None.

Returns

List of HardwareBenchmarkResult, one per (qubit_count, depth, run).
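Since one result is returned per (qubit_count, depth, run) combination, the size of a campaign grows multiplicatively. The sketch below enumerates such a grid with `itertools.product`; the concrete values and the dict keys are illustrative assumptions, and the real runner's iteration order is not documented here.

```python
from itertools import product

# Illustrative settings, standing in for qubit_counts, qaoa_depths and n_runs.
qubit_counts = [4, 6]
qaoa_depths = [1, 2, 3]
n_runs = 5

# One entry per (qubit_count, depth, run) combination.
grid = [
    {"n_qubits": n, "depth": p, "run": r}
    for n, p, r in product(qubit_counts, qaoa_depths, range(n_runs))
]

expected_results = len(qubit_counts) * len(qaoa_depths) * n_runs
```

Here 2 qubit counts, 3 depths and 5 runs give 30 circuit executions per device, which is worth checking against shot budgets before launching on paid hardware.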

run_qae_benchmark(backend, config=None)

Run QAE benchmarks at various precision levels.

Parameters

backend : Backend
    Quantum backend to execute on.
config : HardwareBenchmarkConfig | None
    Override config; uses self._config if None.

Returns

List of HardwareBenchmarkResult, one per (precision, run).

run_mitigation_comparison(circuit, backend)

Compare raw and mitigated results for a circuit.

Tests each mitigation method configured in self._config and collects comparative results.

Parameters

circuit : QuantumCircuit
    Circuit to benchmark (without measurements).
backend : Backend
    Noisy backend to execute on.

Returns

List of HardwareBenchmarkResult, one per mitigation method.

generate_manifest(results=None)

Generate a reproducibility manifest for benchmark results.

Parameters

results : list[HardwareBenchmarkResult] | None
    Results to include. Uses self._results if None.

Returns

Dict with hardware info, config, and summary statistics.

statistical_analysis(results=None)

Compute summary statistics over benchmark results.

Parameters

results : list[HardwareBenchmarkResult] | None
    Results to analyze. Uses self._results if None.

Returns

Dict with mean, std, 95% CI for key metrics, grouped by circuit type.
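A minimal sketch of this kind of summary, grouped by circuit type, is shown below. It assumes results are plain dicts with a `circuit_type` key and uses a normal-approximation interval (mean ± 1.96 · std / √n); the function name `summarize`, the output keys, and the choice of interval method are assumptions, since the source does not say how the 95% CI is computed.

```python
import math
import statistics
from collections import defaultdict


def summarize(results, metric="approximation_ratio", z=1.96):
    """Mean, sample std and a normal-approximation 95% CI per circuit type."""
    groups = defaultdict(list)
    for result in results:
        groups[result["circuit_type"]].append(result[metric])

    summary = {}
    for circuit_type, values in groups.items():
        mean = statistics.fmean(values)
        # Sample standard deviation needs at least two points.
        std = statistics.stdev(values) if len(values) > 1 else 0.0
        half_width = z * std / math.sqrt(len(values))
        summary[circuit_type] = {
            "mean": mean,
            "std": std,
            "ci95": (mean - half_width, mean + half_width),
            "n": len(values),
        }
    return summary


# Illustrative values only, not measured data.
stats = summarize([
    {"circuit_type": "qaoa", "approximation_ratio": 0.5},
    {"circuit_type": "qaoa", "approximation_ratio": 0.75},
    {"circuit_type": "qaoa", "approximation_ratio": 1.0},
    {"circuit_type": "qae", "approximation_ratio": 0.9},
])
```

For the small run counts typical of hardware campaigns, a Student's t quantile in place of 1.96 gives a wider and more honest interval.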

save_results(results=None, path='benchmarks/hardware')

Save benchmark results as JSON.

Parameters

results : list[HardwareBenchmarkResult] | None
    Results to save. Uses self._results if None.
path : str | Path
    Directory to save results in.

Returns

Path to the saved JSON file.
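Serializing dataclass results to JSON inside a target directory can be sketched as follows. The `SketchResult` class carries only a few of the documented fields, and the output file name `results.json` is an assumption; how the real `save_results` names its file is not stated in the source.

```python
import json
import tempfile
from dataclasses import asdict, dataclass, field
from pathlib import Path


@dataclass
class SketchResult:
    """Hypothetical, trimmed-down stand-in for HardwareBenchmarkResult."""
    device_id: str
    circuit_type: str
    n_qubits: int
    confidence_interval: tuple
    raw_results: dict = field(default_factory=dict)


def save_sketch(results, path):
    """Write results as a JSON array and return the file path."""
    out_dir = Path(path)
    out_dir.mkdir(parents=True, exist_ok=True)  # create the directory if missing
    out_file = out_dir / "results.json"  # assumed file name
    out_file.write_text(json.dumps([asdict(r) for r in results], indent=2))
    return out_file


# Write into a temporary directory so the example leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    saved = save_sketch(
        [SketchResult("device-a", "qaoa", 4, (0.6, 0.8), {"00": 10})],
        Path(tmp) / "benchmarks" / "hardware",
    )
    loaded = json.loads(saved.read_text())
```

Note that JSON has no tuple type, so `confidence_interval` comes back as a list when the file is reloaded.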

IonQBenchmarkRunner

Bases: HardwareBenchmarkRunner

Benchmark runner specialized for IonQ devices via Amazon Braket.

Extends HardwareBenchmarkRunner with IonQ-specific metrics: circuit depth comparison (native vs transpiled), 2Q gate count analysis, and cost estimation.

run_qaoa_benchmark(problem, backend, config=None)

Run QAOA benchmark with IonQ cost analysis.

Extends the base runner with 2Q gate counting and cost estimates for IonQ Harmony/Aria.

cost_analysis(results=None)

Compute total estimated cost for all benchmark runs.

Returns

Dict with total_cost_usd, cost_per_run, total_shots.
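A cost summary with these three keys can be sketched under a simple per-task plus per-shot pricing model. Both the pricing model and the prices used below are placeholders chosen for easy arithmetic, not actual Amazon Braket or IonQ rates, and `estimate_cost` is a hypothetical helper rather than the library's `cost_analysis`.

```python
def estimate_cost(runs, price_per_task_usd, price_per_shot_usd):
    """Total cost, average cost per run, and total shots for a list of runs."""
    total_shots = sum(run["shots"] for run in runs)
    total_cost = len(runs) * price_per_task_usd + total_shots * price_per_shot_usd
    return {
        "total_cost_usd": total_cost,
        # Guard against division by zero when no runs were recorded.
        "cost_per_run": total_cost / len(runs) if runs else 0.0,
        "total_shots": total_shots,
    }


# Placeholder prices; substitute the provider's current published rates.
report = estimate_cost(
    [{"shots": 100}, {"shots": 300}],
    price_per_task_usd=0.5,
    price_per_shot_usd=0.25,
)
```

In this toy case the two runs total 400 shots and 101.0 USD, or 50.5 USD per run.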