
Useful GPU-Hour Framework

Direct answer: Useful GPU-hour cost is the better comparison unit when GPU providers differ in utilization, queueing, reliability, storage behavior, or operational model.

Decision rule
  • A higher listed GPU rate can be cheaper if it produces more completed work per paid hour.
  • Use provider pricing pages and your own bill or quote before making a purchase or migration decision.


Definition

useful GPU-hour

A useful GPU-hour is one paid accelerator hour that actually advances the workload, excluding idle time, queue time, failed jobs, retries, and blocked data staging.

Useful GPU-hour cost = total GPU-related job cost / completed useful GPU-hours.
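The formula above can be sketched directly. A minimal illustration, with all dollar and hour figures hypothetical:

```python
def useful_gpu_hour_cost(total_job_cost, useful_gpu_hours):
    """Total GPU-related job cost divided by completed useful GPU-hours."""
    if useful_gpu_hours <= 0:
        raise ValueError("no useful GPU-hours completed")
    return total_job_cost / useful_gpu_hours

# Example: 100 paid hours at $2.00/hr, but only 80 hours actually
# advanced the workload (the rest was idle, queued, or retried).
cost = useful_gpu_hour_cost(total_job_cost=200.0, useful_gpu_hours=80.0)
print(f"${cost:.2f} per useful GPU-hour")  # $2.50, not the listed $2.00
```

Note how the effective rate rises above the listed rate as soon as any paid hour fails to advance the workload.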

Example scenarios

Training retry

A cheap GPU with frequent failed runs can cost more per completed run than a higher-priced reliable environment.

Inference baseline

Provisioned GPU capacity with low traffic has a high useful GPU-hour cost even if the listed rate is low.

Data staging bottleneck

A GPU waiting on storage or transfer is paid time without useful model progress.
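The training-retry scenario above can be made concrete with a simple expected-cost model. This is a hedged sketch: the rates and success probabilities are hypothetical, and the geometric retry model (expected attempts = 1 / success rate) is an assumption, not a claim about any real provider.

```python
def cost_per_completed_run(rate_per_hour, hours_per_attempt, success_rate):
    """Expected paid cost to obtain one completed run.

    Assumes independent attempts, so expected attempts per
    success is 1 / success_rate (geometric model).
    """
    expected_attempts = 1.0 / success_rate
    return rate_per_hour * hours_per_attempt * expected_attempts

# Cheap but flaky: $1.50/hr, 10-hour runs, half of them fail.
cheap = cost_per_completed_run(1.50, 10, success_rate=0.5)      # $30.00
# Pricier but reliable: $2.50/hr, same runs, 95% succeed.
reliable = cost_per_completed_run(2.50, 10, success_rate=0.95)  # ~$26.32

print(f"cheap provider: ${cheap:.2f} per completed run")
print(f"reliable provider: ${reliable:.2f} per completed run")
```

Under these assumed numbers, the provider with the higher listed rate delivers the lower cost per completed run, which is exactly the decision-rule point above.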

Decision Table

Option | What it measures | Best use
Listed GPU-hour | Advertised hourly accelerator rate | Screening quotes
Paid GPU-hour | All billable GPU time | Understanding invoice exposure
Useful GPU-hour | Billable time that advances the workload | Comparing provider fit
Completed job cost | Full run cost including storage, transfer, retries, and support | Procurement decisions


FAQ

Why not compare GPU clouds by hourly rate?

Hourly rate ignores utilization, retries, storage, data transfer, support, and whether the workload completes reliably.

How do I estimate useful GPU-hours?

Start with paid GPU hours, then subtract idle time, queue time, failed jobs, retries, and time blocked by data movement.
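The subtraction described above can be written as a small worksheet. All hour values here are hypothetical figures of the kind you would pull from a billing report or scheduler logs:

```python
# Start from paid GPU hours, then subtract time that did not
# advance the workload (categories from the answer above).
paid_hours = 120.0
deductions = {
    "idle": 8.0,
    "queue": 4.0,
    "failed_jobs": 10.0,
    "retries": 6.0,
    "blocked_on_data_movement": 5.0,
}

useful_hours = paid_hours - sum(deductions.values())
utilization = useful_hours / paid_hours

print(f"useful GPU-hours: {useful_hours}")        # 87.0
print(f"useful fraction of paid time: {utilization:.0%}")
```

Dividing total GPU-related job cost by `useful_hours` then yields the useful GPU-hour cost defined earlier.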

Who should use useful GPU-hour cost?

Teams comparing H100, A100, L40S, or managed inference options should use it before choosing the cheapest listed rate.
