Before you start, ensure that you have an API key with the Availability -> Read permission

Available Endpoints

The availability API provides two endpoints to query different types of resources:
  • /api/v1/availability/gpus - Get GPU instance availability
  • /api/v1/availability/disks - Get standalone disk availability

Retrieving GPU Availability Data

Suppose you want to check pricing options for a single H100 GPU, with the location restricted to the United States or Canada. To do this, send a request to our availability endpoint as shown below:
curl --request GET \
  --url 'https://api.primeintellect.ai/api/v1/availability/gpus?regions=united_states&regions=canada&gpu_count=1&gpu_type=H100_80GB' \
  --header 'Authorization: Bearer your_api_key'
You can generate request samples using our interactive Availability API documentation.
This is a GET request, and it requires an Authorization header with your API key as a Bearer token. The request accepts query parameters, allowing you to filter GPUs by region, GPU count, GPU type, and other criteria. In this case, the filters are regions, gpu_count, and gpu_type.
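If you are calling the API from code rather than the command line, the same request looks roughly like the sketch below. It uses the Python requests library; the your_api_key placeholder and the printed summary are our own, and the response field totalCount follows the example response shown later on this page.

import requests

# The repeated regions parameter is passed as a list; requests serializes it
# as regions=united_states&regions=canada.
response = requests.get(
    "https://api.primeintellect.ai/api/v1/availability/gpus",
    headers={"Authorization": "Bearer your_api_key"},
    params={
        "regions": ["united_states", "canada"],
        "gpu_count": 1,
        "gpu_type": "H100_80GB",
    },
)
response.raise_for_status()
print(response.json()["totalCount"], "matching GPU configurations")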

Query Parameters

All availability endpoints support pagination and the following filters:
| Parameter | Type | Description |
| --- | --- | --- |
| regions | List[string] | Filter by region(s) (e.g., united_states, canada, europe_west) |
| gpu_count | integer | Desired number of GPUs |
| gpu_type | string | GPU model (e.g., H100_80GB, A100_80GB) |
| socket | string | GPU socket type (e.g., PCIe, SXM) |
| security | string | Security type: secure_cloud or community_cloud |
| data_center_id | string | Filter by specific data center ID |
| cloud_id | string | Filter by provider’s cloud ID |
| disks | List[string] | Filter by disk IDs (see Disk Filtering section) |
| page | integer | Page number (default: 1, min: 1) |
| page_size | integer | Results per page (default: 100, max: 100) |

Understanding the GPU Response Object

The GPU availability endpoint returns a paginated response with a list of available GPU configurations and the total count. Here is an example of the JSON response you might receive:
Example response
{
  "items": [
    {
      "cloudId": "NVIDIA H100 PCIe",
      "gpuType": "H100_80GB",
      "socket": "PCIe",
      "provider": "runpod",
      "region": "united_states",
      "dataCenter": "US-KS-2",
      "country": "US",
      "gpuCount": 1,
      "gpuMemory": 80,
      "disk": {
        "minCount": 80,
        "defaultCount": 80,
        "maxCount": 1000,
        "pricePerUnit": 0.00014,
        "step": 1,
        "defaultIncludedInPrice": false,
        "additionalInfo": null
      },
      "vcpu": {
        "defaultCount": 16
      },
      "memory": {
        "defaultCount": 251
      },
      "internetSpeed": null,
      "interconnect": null,
      "interconnectType": null,
      "provisioningTime": null,
      "stockStatus": "Low",
      "security": "secure_cloud",
      "prices": {
        "onDemand": 2.69,
        "communityPrice": null,
        "isVariable": false,
        "currency": "USD"
      },
      "images": [
        "ubuntu_22_cuda_12",
        "cuda_12_1_pytorch_2_2",
        "cuda_11_8_pytorch_2_1",
        "stable_diffusion",
        "flux",
        "axolotl",
        "bittensor",
        "vllm_llama_8b",
        "vllm_llama_70b",
        "vllm_llama_405b"
      ],
      "isSpot": null,
      "prepaidTime": null
    }
  ],
  "totalCount": 247
}
For a full breakdown of the response schema, see the Get Gpu Availability Endpoint documentation.

Response Structure

items

An array containing the GPU availability records for the current page. Each item represents a unique GPU configuration from a provider.

totalCount

The total number of GPU configurations matching your filters across all pages. For example, if totalCount is 247 and page_size is 100, there are 3 pages in total.

Let’s walk through the most important fields in each GPU availability item:

provider

Indicates the company or platform providing the GPU. You may encounter multiple offers from the same provider.

cloudId

A unique identifier provided by the GPU provider. This value is required if you plan to provision the GPU through our provisioning API.

region

The geographic region where the GPU is located (e.g., united_states, europe_west). This helps you select resources closer to your users.

dataCenter

This field is optional, but when it is present in the response you must use it: if the provider operates multiple datacenters with the same cloudId, you must specify which data center to use when provisioning the GPU.

Resource Specifications: disk, sharedDisk, vcpu, memory

Each resource field contains detailed specification information with the following attributes:
  • minCount: Minimum allowed value for this resource
  • maxCount: Maximum allowed value for this resource
  • defaultCount: Default value provided by the provider
  • pricePerUnit: Cost per unit per hour (if customizable)
  • step: Increment step for adjusting the value
  • defaultIncludedInPrice: Whether the default value is included in the base price
  • additionalInfo: Extra information about pricing or constraints
  • disk: Instance-local disk storage in GB. This storage is attached to your GPU instance.
  • vcpu: Number of virtual CPUs allocated to the instance.
  • memory: RAM size available in GB.
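To illustrate how minCount, maxCount, and step work together, here is a small sketch that checks whether a requested disk size satisfies a spec. The helper is hypothetical (not part of the API); the field values come from the example response above.

def is_valid_disk_size(spec: dict, requested_gb: int) -> bool:
    # A size is valid if it lies within [minCount, maxCount] and is reachable
    # from minCount in whole increments of step.
    if requested_gb < spec["minCount"] or requested_gb > spec["maxCount"]:
        return False
    return (requested_gb - spec["minCount"]) % spec["step"] == 0

# Disk spec taken from the example response above
disk_spec = {"minCount": 80, "defaultCount": 80, "maxCount": 1000,
             "pricePerUnit": 0.00014, "step": 1}
print(is_valid_disk_size(disk_spec, 200))   # True
print(is_valid_disk_size(disk_spec, 2000))  # False, exceeds maxCount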

Cost Calculation Example

In the example above, your server will include 16 vCPUs and 251GB of RAM, with customizable disk space. If the disk space is adjustable, its cost is calculated separately from the prices property. Here’s an example breakdown of total costs:
GPU cost: $2.69
vcpu cost: $0.00
memory cost: $0.00
disk cost: $0.0112 (80 units * $0.00014)

Total cost: $2.7012
The total cost varies based on the disk space you request through the provisioning API. The value can range from minCount to maxCount in increments of step, and you pay pricePerUnit per hour for every unit you use. Some providers offer predefined server configurations with defaultIncludedInPrice set to true. This means the defaultCount is included in the base price, but any change (even a reduction) will incur additional charges.
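As a sketch, the same calculation in Python, assuming an adjustable disk that is not included in the base price (as in the example item above); the hourly_cost helper is our own:

def hourly_cost(item: dict, disk_gb: int) -> float:
    # GPU on-demand price plus the adjustable disk,
    # billed at pricePerUnit per GB per hour.
    return item["prices"]["onDemand"] + disk_gb * item["disk"]["pricePerUnit"]

# Values from the example response above
item = {"prices": {"onDemand": 2.69}, "disk": {"pricePerUnit": 0.00014}}
print(round(hourly_cost(item, 80), 4))  # 2.7012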

interconnect and interconnectType

For multi-GPU setups, these fields specify the interconnect speed (in Gbps) and technology type (e.g., InfiniBand, NVLink) used for GPU-to-GPU communication.

provisioningTime

Estimated time in minutes required to provision and start the GPU instance.

prices and security

There are two types of offers:
  • Secure Cloud: the GPU is offered by a provider that maintains security standards and hosts its GPUs in secured datacenters.
  • Community Cloud: the GPUs are provided by the community or come from sources where we have limited information about the data center.
If the security type is set to secure_cloud, then the price is defined in prices.onDemand. Otherwise, community pricing is stored in prices.communityPrice. Some providers may also use dynamic pricing for their instances. If the field isVariable is set to true, the price may fluctuate based on demand or currency conversion.
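In code, picking the relevant price field might look like the following sketch; the hourly_price helper is hypothetical, and the field names follow the example response above:

def hourly_price(item: dict) -> float:
    # secure_cloud offers are priced in prices.onDemand; community offers
    # use prices.communityPrice. If prices.isVariable is true, treat the
    # returned value as an estimate rather than a fixed rate.
    prices = item["prices"]
    if item["security"] == "secure_cloud":
        return prices["onDemand"]
    return prices["communityPrice"]

print(hourly_price({"security": "secure_cloud",
                    "prices": {"onDemand": 2.69, "communityPrice": None}}))  # 2.69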

prepaidTime

If set, you will be pre-charged for the total number of hours specified in this field at the time of ordering, regardless of actual usage. After the prepaid time expires, the standard hourly rate will apply.

Filtering by User Disks

The disks query parameter allows you to filter GPU availability based on your existing disks with persisted data. This is particularly useful when you want to provision a GPU instance that can attach to disks where you’ve already stored your datasets, models, or other important data.

How It Works

When you provide disk IDs via the disks parameter, the API:
  1. Fetches the disk information for the provided disk IDs
  2. Filters GPU availability to show only instances that can be provisioned in locations where your disks are available for attachment
  3. Applies the rest of the filters to narrow down the results

Example Request

Suppose you have two disks with IDs clhxy6aw80000j8080gdf8kqv and clhxy9bz50001j8080hdf9lrw containing your training datasets in different datacenters. You can find GPUs that can attach to either disk:
curl --request GET \
  --url 'https://api.primeintellect.ai/api/v1/availability/gpus?disks=clhxy6aw80000j8080gdf8kqv&disks=clhxy9bz50001j8080hdf9lrw&gpu_type=H100_80GB' \
  --header 'Authorization: Bearer your_api_key'
This will return H100 GPUs available in the same locations where your disks are stored, making it easy to reuse your persisted data.

Use Cases

  • Data Persistence: Reuse datasets, models, and checkpoints stored on existing disks
  • Quick Provisioning: Attach pre-loaded data disks to new GPU instances without re-uploading

Checking Disk Availability

The disk availability endpoint allows you to query standalone network-attached storage options across different providers and datacenters.

Endpoint

GET https://api.primeintellect.ai/api/v1/availability/disks

Example Request

curl --request GET \
  --url 'https://api.primeintellect.ai/api/v1/availability/disks?regions=united_states&page=1&page_size=50' \
  --header 'Authorization: Bearer your_api_key'

Query Parameters for Disk Availability

| Parameter | Type | Description |
| --- | --- | --- |
| regions | List[string] | Filter by region(s) |
| data_center_id | string | Filter by specific data center ID |
| cloud_id | string | Filter by provider’s cloud ID |
| page | integer | Page number (default: 1, min: 1) |
| page_size | integer | Results per page (default: 100, max: 100) |

Disk Availability Response

The disk endpoint returns a paginated response with disk availability objects and the total count:
Example disk response
{
  "items": [
    {
      "cloudId": null,
      "provider": "hyperstack",
      "dataCenter": "US-1",
      "country": "US",
      "region": "united_states",
      "spec": {
        "minCount": 0,
        "defaultCount": 0,
        "maxCount": 100000,
        "pricePerUnit": 0.00015,
        "step": 1,
        "defaultIncludedInPrice": false,
        "additionalInfo": null
      },
      "stockStatus": "Available",
      "security": "secure_cloud",
      "isMultinode": false
    }
  ],
  "totalCount": 15
}

Key Fields in Disk Response

spec

Contains the disk size specification with the same structure as resource specifications in GPU responses:
  • minCount: Minimum disk size in GB
  • maxCount: Maximum disk size in GB
  • pricePerUnit: Cost per GB per hour
  • step: Size increment (usually 1 GB)

isMultinode

Indicates whether this disk can be attached to multiple instances simultaneously. This is useful for shared data scenarios where multiple GPU instances need access to the same dataset.

security

Similar to GPU instances, disks can be classified as secure_cloud or community_cloud based on the provider’s infrastructure.

Disk Cost Calculation

Disk costs are straightforward to calculate:
Disk cost per hour = disk_size_gb * pricePerUnit
For example, a 500GB disk with pricePerUnit: 0.00015:
500 GB * $0.00015/GB/hr = $0.075/hr
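The same arithmetic as a quick Python check, using the spec from the example disk response above:

spec = {"pricePerUnit": 0.00015}  # from the example disk response
disk_size_gb = 500
print(round(disk_size_gb * spec["pricePerUnit"], 4), "USD per hour")  # 0.075 USD per hour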

Pagination

All availability endpoints support pagination to handle large result sets efficiently:
  • Use page to specify which page of results to retrieve (1-indexed)
  • Use page_size to control how many results per page (max 100)
  • Results are returned in consistent order for predictable pagination
Example with pagination:
curl --request GET \
  --url 'https://api.primeintellect.ai/api/v1/availability/gpus?gpu_type=A100_80GB&page=2&page_size=50' \
  --header 'Authorization: Bearer your_api_key'
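To collect every matching offer across pages, you can keep incrementing page until you have gathered totalCount items. A minimal sketch using the Python requests library (the fetch_all helper is our own; field names follow the responses shown above):

import requests

BASE_URL = "https://api.primeintellect.ai/api/v1/availability/gpus"
HEADERS = {"Authorization": "Bearer your_api_key"}

def fetch_all(filters: dict) -> list:
    # Walk pages until every item counted by totalCount has been collected.
    items, page = [], 1
    while True:
        resp = requests.get(BASE_URL, headers=HEADERS,
                            params={**filters, "page": page, "page_size": 100})
        resp.raise_for_status()
        data = resp.json()
        items.extend(data["items"])
        if not data["items"] or len(items) >= data["totalCount"]:
            break
        page += 1
    return items

offers = fetch_all({"gpu_type": "A100_80GB"})
print(f"Fetched {len(offers)} offers")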