# Full Fine-Tuning (Beta)

> Dedicated full-parameter RL training on Hosted Training

<Note>
  Full fine-tuning is in **closed beta**. Access is gated per-team — reach out to us to get enabled.
</Note>

Full fine-tuning updates every parameter of the model on a dedicated cluster reserved for your run, instead of training a LoRA adapter on top of a shared deployment.

## Config

Full-FT runs use the native [prime-rl](https://github.com/PrimeIntellect-ai/prime-rl) config schema. Set `type = "full_finetune"` at the top of your TOML and size the run with `[deployment]` — `num_train_gpus` / `num_infer_gpus` for single-node, `num_train_nodes` / `num_infer_nodes` for multi-node.

Minimal single-node example (1 trainer GPU + 1 inference GPU):

```toml
type = "full_finetune"
name = "reverse-text-full-ft"
max_steps = 100
seq_len = 2048

[model]
name = "PrimeIntellect/Qwen3-0.6B-Reverse-Text-SFT"

[deployment]
num_train_gpus = 1
num_infer_gpus = 1

[trainer.optim]
lr = 3e-6

[orchestrator]
batch_size = 64
rollouts_per_example = 8

[orchestrator.train.sampling]
max_completion_tokens = 512

[[orchestrator.train.env]]
id = "primeintellect/reverse-text"
name = "reverse-text"

[orchestrator.renderer]
name = "default"

[inference]
```
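Whether a run is single-node or multi-node follows from which `[deployment]` keys the config sets. As a rough pre-flight check, the sketch below reports the mode for a given TOML file. `deployment_mode` is a hypothetical helper, not part of the `prime` CLI, and it uses a plain `grep` rather than a TOML parser, so it assumes simple configs like the ones on this page.

```bash
# Hypothetical pre-flight helper (not part of the prime CLI).
# Prints "multi-node" if the config sets num_*_nodes, "single-node" if it
# sets num_*_gpus, and "unknown" (with a non-zero exit) otherwise.
# Assumption: keys sit at the start of a line, as in the examples here.
deployment_mode() {
  local config="$1"
  if grep -Eq '^[[:space:]]*num_(train|infer)_nodes[[:space:]]*=' "$config"; then
    echo "multi-node"
  elif grep -Eq '^[[:space:]]*num_(train|infer)_gpus[[:space:]]*=' "$config"; then
    echo "single-node"
  else
    echo "unknown"
    return 1
  fi
}
```

Calling `deployment_mode configs/full-ft.toml` on the single-node example above would print `single-node`.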

Multi-node example (2 train nodes + 2 inference nodes, each a full 8-GPU node):

```toml
type = "full_finetune"
name = "qwen30b-math"
seq_len = 32768

[model]
name = "Qwen/Qwen3-30B-A3B-Thinking-2507"

[deployment]
num_train_nodes = 2
num_infer_nodes = 2

[trainer.model]
impl = "custom"
attn = "flash_attention_3"
ep = 8                            # expert parallel (MoE)

[trainer.optim]
type = "adamw"
lr = 1e-6

[orchestrator]
batch_size = 512
oversampling_factor = 2
max_off_policy_steps = 8

[orchestrator.train.sampling]
max_completion_tokens = 32768

[[orchestrator.train.env]]
id = "primeintellect/math-env"
name = "math"

[inference.parallel]
tp = 8                            # tensor parallel inside each inference replica
```
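To make the sizing of this example concrete, the arithmetic below works out the GPU counts from the values in the config. The replica count is an assumption: it supposes the inference GPUs are divided evenly into replicas of `tp` GPUs each, which the config does not state explicitly.

```bash
# Values taken from the multi-node example above.
num_train_nodes=2
num_infer_nodes=2
gpus_per_node=8   # "each a full 8-GPU node"
tp=8              # [inference.parallel] tp

train_gpus=$(( num_train_nodes * gpus_per_node ))
infer_gpus=$(( num_infer_nodes * gpus_per_node ))
# Assumption: inference GPUs are split evenly into replicas of `tp` GPUs.
infer_replicas=$(( infer_gpus / tp ))

echo "train GPUs: $train_gpus"
echo "inference GPUs: $infer_gpus"
echo "inference replicas: $infer_replicas"
```

Under these assumptions the run occupies 16 trainer GPUs and 16 inference GPUs, with the inference side forming 2 replicas of 8 GPUs each.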

<Tip>
  Multi-node runs broadcast weights over NCCL by default and auto-discover the cluster's RDMA devices — no extra config needed.
</Tip>

See the [prime-rl docs](/prime-rl/configuration) and [config examples](https://github.com/PrimeIntellect-ai/prime-rl/tree/main/configs) for the full schema.

## Launching a run

Launch with the same CLI you use for LoRA runs; `prime train` auto-detects the run type from the shape of the config:

```bash
prime train run configs/full-ft.toml
```

On dispatch you get a run ID:

```
Dispatched hosted run wn2cjdrzdo6bmfqajoeuu30p
```
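If you script around the launch step, you will likely want the run ID in a variable. The sketch below extracts it from a line of the form shown above. `extract_run_id` is a hypothetical helper, and it assumes the dispatch message keeps exactly this wording and is written to standard output.

```bash
# Hypothetical helper: read text on stdin and print the run ID from the
# first line of the form "Dispatched hosted run <id>".
# Assumption: the message format matches the example output above.
extract_run_id() {
  awk '/^Dispatched hosted run / { print $4; exit }'
}

# Usage sketch (not executed here; assumes the message goes to stdout):
#   RUN_ID="$(prime train run configs/full-ft.toml | extract_run_id)"
#   prime train logs "$RUN_ID" -c trainer
```

Piping the launch output through a helper like this keeps later commands free of copy-pasted IDs.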

<Tip>
  Runs use the `main` tag of the prime-rl image by default. Pin a specific build with `--image-tag v0.5.1` on the CLI or `image_tag = "v0.5.1"` in the TOML. If both are set, the CLI flag takes precedence.
</Tip>

## Monitoring

A full-FT run has several distinct components, each with its own logs. Select one with `-c` / `--component`, or use `--env` to read the env-server logs for a specific environment:

```bash
prime train logs <run-id>                  # orchestrator (default)
prime train logs <run-id> -c trainer       # trainer (FSDP / torchrun)
prime train logs <run-id> -c inference     # vLLM inference server
prime train logs <run-id> --env <env-name> # env-server for a specific env
```

List the orchestrator and env-server components for a run:

```bash
prime train components <run-id>
```

Follow and filter logs the same way as for LoRA runs, using `-f`, `--search`, `--regex`, `--level`, and `--since`. See [Monitoring](/hosted-training/end-to-end-run#step-6-monitor-the-run) for details.

The dashboard works as it does for LoRA runs: reward curves, rubric scores, and individual rollouts at `https://app.primeintellect.ai/dashboard/training/<run-id>`.
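For scripts that print or open the dashboard link, the URL can be built from the run ID using the pattern above. `dashboard_url` is a hypothetical helper name used only for illustration.

```bash
# Hypothetical helper: build the dashboard URL for a run ID, following the
# URL pattern given in the text above.
dashboard_url() {
  printf 'https://app.primeintellect.ai/dashboard/training/%s\n' "$1"
}

dashboard_url "wn2cjdrzdo6bmfqajoeuu30p"
```

With the example run ID from the launch step, this prints `https://app.primeintellect.ai/dashboard/training/wn2cjdrzdo6bmfqajoeuu30p`.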

<CardGroup cols={2}>
  <Card title="End-to-End Run" icon="rocket" href="/hosted-training/end-to-end-run">
    LoRA walkthrough — most workflow steps apply identically.
  </Card>

  <Card title="prime-rl Configuration" icon="gear" href="/prime-rl/configuration">
    Full reference for the underlying training framework config schema.
  </Card>
</CardGroup>
