Before you start, ensure that you have an API key with the Availability -> Read permission.
Available Endpoints
The availability API provides two endpoints to query different types of resources:
- /api/v1/availability/gpus: Get GPU instance availability
- /api/v1/availability/disks: Get standalone disk availability
Retrieving GPU Availability Data
Suppose you want to check pricing options for a single H100 GPU, with the location restricted to the United States or Canada. To do this, send a request to the GPU availability endpoint (/api/v1/availability/gpus) with the filters described below.
Query Parameters
All availability endpoints support pagination and the following filters:
| Parameter | Type | Description |
|---|---|---|
| regions | List[string] | Filter by region(s) (e.g., united_states, canada, europe_west) |
| gpu_count | integer | Desired number of GPUs |
| gpu_type | string | GPU model (e.g., H100_80GB, A100_80GB) |
| socket | string | GPU socket type (e.g., PCIe, SXM) |
| security | string | Security type: secure_cloud or community_cloud |
| data_center_id | string | Filter by specific data center ID |
| cloud_id | string | Filter by provider’s cloud ID |
| disks | List[string] | Filter by disk IDs (see Filtering by User Disks) |
| page | integer | Page number (default: 1, min: 1) |
| page_size | integer | Results per page (default: 100, max: 100) |
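As a rough illustration, the single-H100 query for the United States or Canada could be assembled as follows. This is a sketch only: the host `https://api.example.com` is a placeholder, the helper `build_availability_url` is hypothetical, and encoding list filters as repeated query keys (`regions=a&regions=b`) is an assumption. Check the endpoint reference for the actual encoding and for how to pass your API key.

```python
from urllib.parse import urlencode

# Placeholder host (assumption); substitute the real API base URL.
BASE_URL = "https://api.example.com"
GPU_AVAILABILITY_PATH = "/api/v1/availability/gpus"


def build_availability_url(path: str, **filters) -> str:
    """Return the full URL for an availability query, skipping unset filters."""
    params = {name: value for name, value in filters.items() if value is not None}
    # doseq=True expands list values into repeated keys (assumed encoding).
    return f"{BASE_URL}{path}?{urlencode(params, doseq=True)}"


url = build_availability_url(
    GPU_AVAILABILITY_PATH,
    regions=["united_states", "canada"],
    gpu_count=1,
    gpu_type="H100_80GB",
    page=1,
    page_size=100,
)
print(url)
```

Filters left as `None` are dropped, so the same helper can serve narrower or broader queries.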
Understanding the GPU Response Object
The GPU availability endpoint returns a paginated response containing a list of available GPU configurations and the total count.
For a full breakdown of the response schema, see the Get Gpu Availability Endpoint documentation.
Response Structure
items
An array containing the GPU availability records for the current page. Each item represents a unique GPU configuration from a provider.
totalCount
The total number of GPU configurations matching your filters across all pages.
For example, if totalCount is 247 and page_size is 100, you know there are 3 pages total.
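The page count in this example is a ceiling division of totalCount by page_size. A minimal sketch (the helper name `page_count` is illustrative, not part of the API):

```python
import math


def page_count(total_count: int, page_size: int = 100) -> int:
    """Number of pages needed to list total_count results."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    return math.ceil(total_count / page_size)


print(page_count(247, 100))  # 3
```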
Let’s walk through the most important fields in each GPU availability item:
provider
Indicates the company or platform providing the GPU. You may encounter multiple offers from the same provider.
cloudId
A unique identifier provided by the GPU provider. This value is required if you plan to provision the GPU through our provisioning API.
region
The geographic region where the GPU is located (e.g., united_states, europe_west). This helps you select resources closer to your users.
dataCenter
An optional field, but one you must use when it is present in the response. If the provider operates multiple data centers with the same cloudId, you must specify which data center to use when provisioning the GPU.
Resource Specifications: disk, sharedDisk, vcpu, memory
Each resource field contains detailed specification information with the following attributes:
- minCount: Minimum allowed value for this resource
- maxCount: Maximum allowed value for this resource
- defaultCount: Default value provided by the provider
- pricePerUnit: Cost per unit per hour (if customizable)
- step: Increment step for adjusting the value
- defaultIncludedInPrice: Whether the default value is included in the base price
- additionalInfo: Extra information about pricing or constraints
disk: Instance-local disk storage in GB. This storage is attached to your GPU instance.
vcpu: Number of virtual CPUs allocated to the instance.
memory: RAM size available in GB.
Cost Calculation Example
Consider a server that includes 16 vCPUs and 251 GB of RAM, with customizable disk space. If the disk space is adjustable, its cost is calculated separately from the prices property.
You can choose any value between minCount and maxCount, adjustable by step, and you pay pricePerUnit per hour for every unit you use.
Some providers offer predefined server configurations, with the defaultIncludedInPrice set to true. This means the defaultCount is included in the base price, but any changes (even reductions) will incur additional charges.
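The rules above can be sketched as a small helper. The spec values below are invented for illustration, and two points are assumptions rather than documented behavior: that valid values lie on a grid starting at minCount, and that any non-default count is billed in full at pricePerUnit when defaultIncludedInPrice is true.

```python
def extra_hourly_cost(spec: dict, count: int) -> float:
    """Hourly surcharge for choosing `count` units of one resource."""
    low, high = spec["minCount"], spec["maxCount"]
    step = spec.get("step") or 1
    if not low <= count <= high:
        raise ValueError(f"count must be between {low} and {high}")
    # Assumption: allowed values are minCount, minCount + step, ...
    if (count - low) % step != 0:
        raise ValueError(f"count must move in steps of {step} from {low}")
    # The default amount costs nothing extra when it is part of the base price.
    if spec.get("defaultIncludedInPrice") and count == spec.get("defaultCount"):
        return 0.0
    # Assumption: otherwise every unit is billed at pricePerUnit per hour.
    return count * spec["pricePerUnit"]


# Hypothetical disk spec, not taken from a real response.
disk_spec = {
    "minCount": 100,
    "maxCount": 2000,
    "defaultCount": 500,
    "step": 50,
    "pricePerUnit": 0.0001,
    "defaultIncludedInPrice": False,
}
print(f"{extra_hourly_cost(disk_spec, 500):.4f}")  # 0.0500
```

Add the result to the base instance price to estimate the total hourly cost.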
interconnect and interconnectType
For multi-GPU setups, these fields specify the interconnect speed (in Gbps) and technology type (e.g., InfiniBand, NVLink) used for GPU-to-GPU communication.
provisioningTime
Estimated time in minutes required to provision and start the GPU instance.
prices and security
There are two types of offers:
- Secure Cloud: GPUs are offered by a provider that maintains security standards and are hosted in secured data centers.
- Community Cloud: GPUs are provided by the community or come from sources where we have limited information about the data center.
If the security type is set to secure_cloud, the price is defined in prices.onDemand. Otherwise, community pricing is stored in prices.communityPrice.
Some providers may also use dynamic pricing for their instances. If the field isVariable is set to true, the price may fluctuate based on demand or currency conversion.
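Selecting the applicable price for an item might look like the sketch below. It assumes that `security` and `prices` are top-level fields of each item and that `onDemand` and `communityPrice` hold plain numbers; the helper name `hourly_price` is illustrative.

```python
def hourly_price(item: dict) -> float:
    """Pick the price field that matches the item's security type."""
    prices = item["prices"]
    if item["security"] == "secure_cloud":
        return prices["onDemand"]
    # Any other security type is treated as community pricing.
    return prices["communityPrice"]


# Hypothetical items, not taken from a real response.
secure_offer = {"security": "secure_cloud", "prices": {"onDemand": 2.5}}
community_offer = {"security": "community_cloud", "prices": {"communityPrice": 1.75}}
print(hourly_price(secure_offer), hourly_price(community_offer))  # 2.5 1.75
```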
prepaidTime
If set, you will be pre-charged for the total number of hours specified in this field at the time of ordering, regardless of actual usage. After the prepaid time expires, the standard hourly rate will apply.
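As a quick worked example of the pre-charge, the amount billed at order time is the hourly price multiplied by prepaidTime. The figures below are invented, and the helper name `upfront_charge` is illustrative.

```python
from typing import Optional


def upfront_charge(hourly_price: float, prepaid_hours: Optional[int]) -> float:
    """Amount charged at order time; 0 when no prepaid time is set."""
    if not prepaid_hours:
        return 0.0
    return hourly_price * prepaid_hours


print(upfront_charge(2.0, 24))    # 48.0
print(upfront_charge(2.0, None))  # 0.0
```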
Filtering by User Disks
The disks query parameter allows you to filter GPU availability based on your existing disks with persisted data. This is particularly useful when you want to provision a GPU instance that can attach to disks where you’ve already stored your datasets, models, or other important data.
How It Works
When you provide disk IDs via the disks parameter, the API:
- Fetches the disk information for the provided disk IDs
- Filters GPU availability to show only instances that can be provisioned in locations where your disks are available for attachment
- Applies the rest of the filters to narrow down the results
Example Request
Suppose you have two disks with IDs clhxy6aw80000j8080gdf8kqv and clhxy9bz50001j8080hdf9lrw containing your training datasets in different data centers. You can find GPUs that can attach to either disk by passing both IDs in the disks parameter.
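A sketch of such a request URL follows. The host is a placeholder, and passing the list as repeated `disks` keys is an assumption about how the API expects list parameters to be encoded.

```python
from urllib.parse import urlencode

disk_ids = ["clhxy6aw80000j8080gdf8kqv", "clhxy9bz50001j8080hdf9lrw"]

# doseq=True emits one "disks=" pair per ID (assumed list encoding).
query = urlencode({"disks": disk_ids}, doseq=True)

# Placeholder host (assumption); substitute the real API base URL.
url = f"https://api.example.com/api/v1/availability/gpus?{query}"
print(url)
```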
Use Cases
- Data Persistence: Reuse datasets, models, and checkpoints stored on existing disks
- Quick Provisioning: Attach pre-loaded data disks to new GPU instances without re-uploading
Checking Disk Availability
The disk availability endpoint allows you to query standalone network-attached storage options across different providers and data centers.
Endpoint
/api/v1/availability/disks
Example Request
Query Parameters for Disk Availability
| Parameter | Type | Description |
|---|---|---|
| regions | List[string] | Filter by region(s) |
| data_center_id | string | Filter by specific data center ID |
| cloud_id | string | Filter by provider’s cloud ID |
| page | integer | Page number (default: 1, min: 1) |
| page_size | integer | Results per page (default: 100, max: 100) |
Disk Availability Response
The disk endpoint returns a paginated response with disk availability objects and the total count.
Key Fields in Disk Response
spec
Contains the disk size specification with the same structure as resource specifications in GPU responses:
- minCount: Minimum disk size in GB
- maxCount: Maximum disk size in GB
- pricePerUnit: Cost per GB per hour
- step: Size increment (usually 1 GB)
isMultinode
Indicates whether this disk can be attached to multiple instances simultaneously. This is useful for shared data scenarios where multiple GPU instances need access to the same dataset.
security
Similar to GPU instances, disks can be classified as secure_cloud or community_cloud based on the provider’s infrastructure.
Disk Cost Calculation
Disk costs are straightforward to calculate: the hourly cost is the disk size in GB multiplied by pricePerUnit (for example, pricePerUnit: 0.00015).
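A worked example with that rate is sketched below. The 1000 GB size and the 30-day billing window are illustrative choices, not values from the API.

```python
HOURS_PER_30_DAYS = 24 * 30  # 720 hours, an illustrative billing window


def disk_hourly_cost(size_gb: int, price_per_unit: float) -> float:
    """Hourly cost of a disk: size in GB times the per-GB hourly rate."""
    return size_gb * price_per_unit


hourly = disk_hourly_cost(1000, 0.00015)
print(f"{hourly:.2f} per hour")                           # 0.15 per hour
print(f"{hourly * HOURS_PER_30_DAYS:.2f} per 30 days")    # 108.00 per 30 days
```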
Pagination
All availability endpoints support pagination to handle large result sets efficiently:
- Use page to specify which page of results to retrieve (1-indexed)
- Use page_size to control how many results per page (max 100)
- Results are returned in consistent order for predictable pagination
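One way to walk through every page is sketched below. The HTTP call is left to a caller-supplied `fetch_page(page, page_size)` function, which is hypothetical and expected to return the parsed JSON body with its items and totalCount fields. A stand-in, `fake_fetch`, lets the sketch run offline.

```python
from typing import Callable, Iterator


def iter_all_items(
    fetch_page: Callable[[int, int], dict], page_size: int = 100
) -> Iterator[dict]:
    """Yield every item across all pages, starting from page 1."""
    page = 1
    seen = 0
    while True:
        body = fetch_page(page, page_size)
        items = body["items"]
        yield from items
        seen += len(items)
        # Stop on an empty page or once totalCount items have been seen.
        if not items or seen >= body["totalCount"]:
            break
        page += 1


def fake_fetch(page: int, page_size: int) -> dict:
    """Stand-in for a real request: serves 247 dummy records in pages."""
    data = [{"id": i} for i in range(247)]
    start = (page - 1) * page_size
    return {"items": data[start:start + page_size], "totalCount": len(data)}


print(len(list(iter_all_items(fake_fetch))))  # 247
```

Stopping on an empty page guards against looping forever if totalCount changes between requests.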