Before you start, ensure that you have an API key with the Availability -> Read permission

Available Endpoints

The availability API provides two endpoints to query different types of resources:
  • /api/v1/availability/gpus - Get GPU instance availability
  • /api/v1/availability/disks - Get standalone disk availability

Retrieving GPU Availability Data

Suppose you want to check pricing options for a single H100 GPU, with the location restricted to the United States or Canada. To do this, send a request to our availability endpoint as shown below:
curl --request GET \
  --url 'https://api.primeintellect.ai/api/v1/availability/gpus?regions=united_states&regions=canada&gpu_count=1&gpu_type=H100_80GB' \
  --header 'Authorization: Bearer your_api_key'
You can generate request samples using our interactive Availability API documentation.
This is a GET request, and it requires an Authorization header with your API key as a Bearer token. The request accepts query parameters, allowing you to filter GPUs by region, GPU count, GPU type, and other criteria. In this case, the filters are regions, gpu_count, and gpu_type.
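If you are calling the API from code rather than the command line, the same request looks roughly like the sketch below. It uses the Python requests library; the your_api_key placeholder and the printed summary are our own, and the response field totalCount follows the example response shown later on this page.

import requests

# The repeated regions parameter is passed as a list; requests serializes it
# as regions=united_states&regions=canada.
response = requests.get(
    "https://api.primeintellect.ai/api/v1/availability/gpus",
    headers={"Authorization": "Bearer your_api_key"},
    params={
        "regions": ["united_states", "canada"],
        "gpu_count": 1,
        "gpu_type": "H100_80GB",
    },
)
response.raise_for_status()
print(response.json()["totalCount"], "matching GPU configurations")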

Query Parameters

All availability endpoints support pagination and the following filters:
| Parameter | Type | Description |
| --- | --- | --- |
| regions | List[string] | Filter by region(s) (e.g., united_states, canada, europe_west) |
| gpu_count | integer | Desired number of GPUs |
| gpu_type | string | GPU model (e.g., H100_80GB, A100_80GB) |
| socket | string | GPU socket type (e.g., PCIe, SXM) |
| security | string | Security type: secure_cloud or community_cloud |
| data_center_id | string | Filter by specific data center ID |
| cloud_id | string | Filter by provider’s cloud ID |
| disks | List[string] | Filter by disk IDs (see Disk Filtering section) |
| page | integer | Page number (default: 1, min: 1) |
| page_size | integer | Results per page (default: 100, max: 100) |

Understanding the GPU Response Object

The GPU availability endpoint returns a paginated response with a list of available GPU configurations and the total count. Here is an example of the JSON response you might receive:
Example response
{
  "items": [
    {
      "cloudId": "NVIDIA H100 PCIe",
      "gpuType": "H100_80GB",
      "socket": "PCIe",
      "provider": "runpod",
      "region": "united_states",
      "dataCenter": "US-KS-2",
      "country": "US",
      "gpuCount": 1,
      "gpuMemory": 80,
      "disk": {
        "minCount": 80,
        "defaultCount": 80,
        "maxCount": 1000,
        "pricePerUnit": 0.00014,
        "step": 1,
        "defaultIncludedInPrice": false,
        "additionalInfo": null
      },
      "vcpu": {
        "defaultCount": 16
      },
      "memory": {
        "defaultCount": 251
      },
      "internetSpeed": null,
      "interconnect": null,
      "interconnectType": null,
      "provisioningTime": null,
      "stockStatus": "Low",
      "security": "secure_cloud",
      "prices": {
        "onDemand": 2.69,
        "communityPrice": null,
        "isVariable": false,
        "currency": "USD"
      },
      "images": [
        "ubuntu_22_cuda_12",
        "cuda_12_1_pytorch_2_2",
        "cuda_11_8_pytorch_2_1",
        "stable_diffusion",
        "flux",
        "axolotl",
        "bittensor",
        "vllm_llama_8b",
        "vllm_llama_70b",
        "vllm_llama_405b"
      ],
      "isSpot": null,
      "prepaidTime": null
    }
  ],
  "totalCount": 247
}
For a full breakdown of the response schema, see the Get Gpu Availability Endpoint documentation.

Response Structure

items

An array containing the GPU availability records for the current page. Each item represents a unique GPU configuration from a provider.

totalCount

The total number of GPU configurations matching your filters across all pages. For example, if totalCount is 247 and page_size is 100, there are 3 pages in total.

Let’s walk through the most important fields in each GPU availability item:

provider

Indicates the company or platform providing the GPU. You may encounter multiple offers from the same provider.

cloudId

A unique identifier provided by the GPU provider. This value is required if you plan to provision the GPU through our provisioning API.

region

The geographic region where the GPU is located (e.g., united_states, europe_west). This helps you select resources closer to your users.

dataCenter

This field is optional, but when it is present in the response you must use it: if the provider operates multiple datacenters with the same cloudId, you must specify which data center to use when provisioning the GPU.

Resource Specifications: disk, sharedDisk, vcpu, memory

Each resource field contains detailed specification information with the following attributes:
  • minCount: Minimum allowed value for this resource
  • maxCount: Maximum allowed value for this resource
  • defaultCount: Default value provided by the provider
  • pricePerUnit: Cost per unit per hour (if customizable)
  • step: Increment step for adjusting the value
  • defaultIncludedInPrice: Whether the default value is included in the base price
  • additionalInfo: Extra information about pricing or constraints
  • disk: Instance-local disk storage in GB. This storage is attached to your GPU instance.
  • vcpu: Number of virtual CPUs allocated to the instance.
  • memory: RAM size available in GB.
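To illustrate how minCount, maxCount, and step work together, here is a small sketch that checks whether a requested disk size satisfies a spec. The helper is hypothetical (not part of the API); the field values come from the example response above.

def is_valid_disk_size(spec: dict, requested_gb: int) -> bool:
    # A size is valid if it lies within [minCount, maxCount] and is reachable
    # from minCount in whole increments of step.
    if requested_gb < spec["minCount"] or requested_gb > spec["maxCount"]:
        return False
    return (requested_gb - spec["minCount"]) % spec["step"] == 0

# Disk spec taken from the example response above
disk_spec = {"minCount": 80, "defaultCount": 80, "maxCount": 1000,
             "pricePerUnit": 0.00014, "step": 1}
print(is_valid_disk_size(disk_spec, 200))   # True
print(is_valid_disk_size(disk_spec, 2000))  # False, exceeds maxCount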

Cost Calculation Example

In the example above, your server will include 16 vCPUs and 251GB of RAM, with customizable disk space. If the disk space is adjustable, its cost is calculated separately from the prices property. Here’s an example breakdown of total costs:
GPU cost: $2.69
vcpu cost: $0.00
memory cost: $0.00
disk cost: $0.0112 (80 units * $0.00014)

Total cost: $2.7012
The total cost varies based on the disk space you request through the provisioning API. The value can range from minCount to maxCount in increments of step, and you pay pricePerUnit per hour for every unit you use. Some providers offer predefined server configurations with defaultIncludedInPrice set to true. This means the defaultCount is included in the base price, but any change (even a reduction) will incur additional charges.
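As a sketch, the same calculation in Python, assuming an adjustable disk that is not included in the base price (as in the example item above); the hourly_cost helper is our own:

def hourly_cost(item: dict, disk_gb: int) -> float:
    # GPU on-demand price plus the adjustable disk,
    # billed at pricePerUnit per GB per hour.
    return item["prices"]["onDemand"] + disk_gb * item["disk"]["pricePerUnit"]

# Values from the example response above
item = {"prices": {"onDemand": 2.69}, "disk": {"pricePerUnit": 0.00014}}
print(round(hourly_cost(item, 80), 4))  # 2.7012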

interconnect and interconnectType

For multi-GPU setups, these fields specify the interconnect speed (in Gbps) and technology type (e.g., InfiniBand, NVLink) used for GPU-to-GPU communication.

provisioningTime

Estimated time in minutes required to provision and start the GPU instance.

prices and security

There are two types of offers:
  • Secure Cloud: the GPU is offered by a provider that maintains security standards and hosts its GPUs in secured datacenters.
  • Community Cloud: the GPUs are provided by the community or come from sources where we have limited information about the data center.
If the security type is set to secure_cloud, then the price is defined in prices.onDemand. Otherwise, community pricing is stored in prices.communityPrice. Some providers may also use dynamic pricing for their instances. If the field isVariable is set to true, the price may fluctuate based on demand or currency conversion.
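In code, picking the relevant price field might look like the following sketch; the hourly_price helper is hypothetical, and the field names follow the example response above:

def hourly_price(item: dict) -> float:
    # secure_cloud offers are priced in prices.onDemand; community offers
    # use prices.communityPrice. If prices.isVariable is true, treat the
    # returned value as an estimate rather than a fixed rate.
    prices = item["prices"]
    if item["security"] == "secure_cloud":
        return prices["onDemand"]
    return prices["communityPrice"]

print(hourly_price({"security": "secure_cloud",
                    "prices": {"onDemand": 2.69, "communityPrice": None}}))  # 2.69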

prepaidTime

If set, you will be pre-charged for the total number of hours specified in this field at the time of ordering, regardless of actual usage. After the prepaid time expires, the standard hourly rate will apply.

Filtering by User Disks

The disks query parameter allows you to filter GPU availability based on your existing disks with persisted data. This is particularly useful when you want to provision a GPU instance that can attach to disks where you’ve already stored your datasets, models, or other important data.

How It Works

When you provide disk IDs via the disks parameter, the API:
  1. Fetches the disk information for the provided disk IDs
  2. Filters GPU availability to show only instances that can be provisioned in locations where your disks are available for attachment
  3. Applies the rest of the filters to narrow down the results

Example Request

Suppose you have two disks with IDs clhxy6aw80000j8080gdf8kqv and clhxy9bz50001j8080hdf9lrw containing your training datasets in different datacenters. You can find GPUs that can attach to either disk:
curl --request GET \
  --url 'https://api.primeintellect.ai/api/v1/availability/gpus?disks=clhxy6aw80000j8080gdf8kqv&disks=clhxy9bz50001j8080hdf9lrw&gpu_type=H100_80GB' \
  --header 'Authorization: Bearer your_api_key'
This will return H100 GPUs available in the same locations where your disks are stored, making it easy to reuse your persisted data.

Use Cases

  • Data Persistence: Reuse datasets, models, and checkpoints stored on existing disks
  • Quick Provisioning: Attach pre-loaded data disks to new GPU instances without re-uploading

Checking Disk Availability

The disk availability endpoint allows you to query standalone network-attached storage options across different providers and datacenters.

Endpoint

GET https://api.primeintellect.ai/api/v1/availability/disks

Example Request

curl --request GET \
  --url 'https://api.primeintellect.ai/api/v1/availability/disks?regions=united_states&page=1&page_size=50' \
  --header 'Authorization: Bearer your_api_key'

Query Parameters for Disk Availability

| Parameter | Type | Description |
| --- | --- | --- |
| regions | List[string] | Filter by region(s) |
| data_center_id | string | Filter by specific data center ID |
| cloud_id | string | Filter by provider’s cloud ID |
| page | integer | Page number (default: 1, min: 1) |
| page_size | integer | Results per page (default: 100, max: 100) |

Disk Availability Response

The disk endpoint returns a paginated response with disk availability objects and the total count:
Example disk response
{
  "items": [
    {
      "cloudId": null,
      "provider": "hyperstack",
      "dataCenter": "US-1",
      "country": "US",
      "region": "united_states",
      "spec": {
        "minCount": 0,
        "defaultCount": 0,
        "maxCount": 100000,
        "pricePerUnit": 0.00015,
        "step": 1,
        "defaultIncludedInPrice": false,
        "additionalInfo": null
      },
      "stockStatus": "Available",
      "security": "secure_cloud",
      "isMultinode": false
    }
  ],
  "totalCount": 15
}

Key Fields in Disk Response

spec

Contains the disk size specification with the same structure as resource specifications in GPU responses:
  • minCount: Minimum disk size in GB
  • maxCount: Maximum disk size in GB
  • pricePerUnit: Cost per GB per hour
  • step: Size increment (usually 1 GB)

isMultinode

Indicates whether this disk can be attached to multiple instances simultaneously. This is useful for shared data scenarios where multiple GPU instances need access to the same dataset.

security

Similar to GPU instances, disks can be classified as secure_cloud or community_cloud based on the provider’s infrastructure.

Disk Cost Calculation

Disk costs are straightforward to calculate:
Disk cost per hour = disk_size_gb * pricePerUnit
For example, a 500GB disk with pricePerUnit: 0.00015:
500 GB * $0.00015/GB/hr = $0.075/hr
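The same arithmetic as a quick Python check, using the spec from the example disk response above:

spec = {"pricePerUnit": 0.00015}  # from the example disk response
disk_size_gb = 500
print(round(disk_size_gb * spec["pricePerUnit"], 4), "USD per hour")  # 0.075 USD per hour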

Pagination

All availability endpoints support pagination to handle large result sets efficiently:
  • Use page to specify which page of results to retrieve (1-indexed)
  • Use page_size to control how many results per page (max 100)
  • Results are returned in consistent order for predictable pagination
Example with pagination:
curl --request GET \
  --url 'https://api.primeintellect.ai/api/v1/availability/gpus?gpu_type=A100_80GB&page=2&page_size=50' \
  --header 'Authorization: Bearer your_api_key'
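To collect every matching offer across pages, you can keep incrementing page until you have gathered totalCount items. A minimal sketch using the Python requests library (the fetch_all helper is our own; field names follow the responses shown above):

import requests

BASE_URL = "https://api.primeintellect.ai/api/v1/availability/gpus"
HEADERS = {"Authorization": "Bearer your_api_key"}

def fetch_all(filters: dict) -> list:
    # Walk pages until every item counted by totalCount has been collected.
    items, page = [], 1
    while True:
        resp = requests.get(BASE_URL, headers=HEADERS,
                            params={**filters, "page": page, "page_size": 100})
        resp.raise_for_status()
        data = resp.json()
        items.extend(data["items"])
        if not data["items"] or len(items) >= data["totalCount"]:
            break
        page += 1
    return items

offers = fetch_all({"gpu_type": "A100_80GB"})
print(f"Fetched {len(offers)} offers")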