Model Distribution
The validator's server.py serves model data to miners and peer validators via a FastAPI HTTP server. This page explains each endpoint and how model serving works.
Authentication
All endpoints require Bearer token authentication:
Authorization: Bearer <BZ_AUTH_TOKEN>
The token is configured via the BZ_AUTH_TOKEN environment variable on the validator. Miners must include this token in all requests — obtain it from the subnet operator.
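As a minimal sketch of both sides of this check, the snippet below builds the header a miner would send and compares a presented token in constant time. The helper names auth_headers and is_authorized are illustrative and are not taken from server.py, and the fallback token exists only so the sketch runs without the environment variable set.

```python
import hmac
import os


def auth_headers(token: str) -> dict:
    """Build the Authorization header a miner attaches to every request."""
    return {"Authorization": f"Bearer {token}"}


def is_authorized(header_value: str, expected_token: str) -> bool:
    """Illustrative server-side check: constant-time bearer token comparison."""
    scheme, _, presented = header_value.partition(" ")
    if scheme != "Bearer" or not presented:
        return False
    return hmac.compare_digest(presented.encode(), expected_token.encode())


# Hypothetical fallback so the sketch runs when BZ_AUTH_TOKEN is unset.
token = os.environ.get("BZ_AUTH_TOKEN", "example-token")
headers = auth_headers(token)
```

hmac.compare_digest is used here so the comparison time does not depend on how many leading characters match; whether server.py does the same is not stated on this page.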
Serving Expert Group Slices to Miners
GET /model/partial
During the distribute phase, miners call this endpoint to download their assigned expert group slice.
Request:
GET /model/partial
Authorization: Bearer <token>
X-Expert-Group: 0
X-Hotkey: 5FHneW46xGXgs5mUiveU4sbTyGBzmstUspZC92UhjJM694ty
What the server does:
- Validates the bearer token
- Looks up the requesting miner's registered expert group (from chain state)
- Verifies X-Expert-Group matches the registered assignment
- Extracts the expert group parameters from the full model using ExpertMapping
- Returns the slice as a PyTorch state dict
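The group check and extraction steps above can be sketched as follows. This is an illustration under stated assumptions: chain state is replaced by a plain dict, ExpertMapping (whose real interface is not shown on this page) is replaced by a dict from group number to parameter names, and the hotkeys, parameter names, and GroupMismatch exception are all hypothetical.

```python
from typing import Dict, List

# Hypothetical stand-in for chain state: hotkey -> registered expert group.
REGISTERED_GROUP: Dict[str, int] = {"miner-hotkey-a": 0, "miner-hotkey-b": 1}

# Hypothetical stand-in for ExpertMapping: group -> parameter names it owns.
EXPERT_PARAMS: Dict[int, List[str]] = {
    0: ["experts.0.weight", "experts.1.weight"],
    1: ["experts.2.weight", "experts.3.weight"],
}


class GroupMismatch(Exception):
    """Raised when X-Expert-Group disagrees with the registered assignment."""


def extract_slice(full_state: dict, hotkey: str, requested_group: int) -> dict:
    """Return only the entries of full_state that belong to the miner's group."""
    registered = REGISTERED_GROUP.get(hotkey)
    if registered is None or registered != requested_group:
        raise GroupMismatch(f"{hotkey} is not assigned to group {requested_group}")
    # Keep just the parameters mapped to this expert group.
    return {name: full_state[name] for name in EXPERT_PARAMS[requested_group]}
```

In a real deployment the values would be tensors and the result would be serialized as a PyTorch state dict; plain Python values are enough to show the selection logic.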
Response metadata header:
{
  "expert_group": 0,
  "block_number": 12345,
  "model_version": "step_1000",
  "content_hash": "sha256:abcdef..."
}
Binary tensor data follows the metadata.
Caching behavior: The first request for a given expert group in a cycle triggers the extraction. The extracted slice is cached in memory for subsequent requests from other miners in the same group during the same distribute window. This avoids re-extracting the same slice for every miner.
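One way to picture this caching behavior is a small in-memory cache keyed by cycle and expert group. The sketch below is an assumption about the shape of such a cache, not the code in server.py: the SliceCache class, the integer cycle identifier, and the injected extract callable are all hypothetical.

```python
from typing import Callable, Dict, Tuple


class SliceCache:
    """Caches one extracted slice per (cycle, expert_group) pair."""

    def __init__(self, extract: Callable[[int], bytes]) -> None:
        self._extract = extract
        self._cache: Dict[Tuple[int, int], bytes] = {}

    def get(self, cycle: int, expert_group: int) -> bytes:
        key = (cycle, expert_group)
        if key not in self._cache:
            # Drop slices left over from other cycles, then extract once.
            self._cache = {k: v for k, v in self._cache.items() if k[0] == cycle}
            self._cache[key] = self._extract(expert_group)
        return self._cache[key]
```

The sketch is not safe against two concurrent first requests for the same group; a production server would likely guard the extraction with a lock so it runs only once.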
Serving the Full Model to Peer Validators
GET /model/full
Peer validators use this endpoint to sync their model state. This endpoint is restricted to callers registered as validators on the BlockZero subnet.
Request:
GET /model/full
Authorization: Bearer <token>
The server streams the full model in safetensors format. This is a large download (~60GB); only call this when synchronizing a new validator node.
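Because of the download size, a client should write the response to disk in chunks instead of buffering it in memory. The following is a minimal sketch using only the Python standard library; the function names, the base URL argument, and the 8 MiB chunk size are illustrative choices and are not part of the validator code.

```python
import shutil
import urllib.request


def build_full_model_request(base_url: str, token: str) -> urllib.request.Request:
    """Prepare the authenticated GET /model/full request."""
    return urllib.request.Request(
        f"{base_url}/model/full",
        headers={"Authorization": f"Bearer {token}"},
    )


def download_full_model(base_url: str, token: str, dest_path: str) -> None:
    """Stream the safetensors payload to dest_path without loading it in memory."""
    request = build_full_model_request(base_url, token)
    with urllib.request.urlopen(request) as response, open(dest_path, "wb") as out:
        # Copy in 8 MiB chunks (an arbitrary choice for this sketch).
        shutil.copyfileobj(response, out, length=8 * 1024 * 1024)
```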
Accepting Miner Checkpoint Submissions
POST /submit
During the submit phase, miners upload their trained checkpoints.
Request body:
{
  "hotkey": "5FHneW46xGXgs5mUiveU4sbTyGBzmstUspZC92UhjJM694ty",
  "expert_group": 0,
  "checkpoint_url": "https://miner-server.example.com/checkpoint_500_12345.pt",
  "block_number": 12345,
  "signature": "url-safe-base64-signature"
}
What the server does:
- Validates the bearer token
- Verifies the submission is in the submit phase window
- Checks for duplicate submission (same hotkey + block number)
- Verifies the ed25519 signature using the miner's public key from chain
- Stores the message in the submission queue for run.py to process
- Returns 202 Accepted
The actual checkpoint file is downloaded later by run.py when it enters the evaluation phase.
Status codes:
- 202 Accepted: queued for evaluation
- 400 Bad Request: invalid signature or malformed body
- 409 Conflict: duplicate submission for this block
- 423 Locked: not currently in submit phase
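The checks and status codes for POST /submit can be sketched as a single decision function. This is an illustration, not the handler in server.py: SubmissionGate is a hypothetical name, the phase check and the ed25519 verification are injected as callables so the sketch needs no chain or crypto library, and bearer token validation is assumed to have already passed. The malformed-body check runs first only because the later checks need the fields to exist.

```python
from dataclasses import dataclass, field
from typing import Callable, Set, Tuple

REQUIRED_FIELDS = {"hotkey", "expert_group", "checkpoint_url", "block_number", "signature"}


@dataclass
class SubmissionGate:
    in_submit_phase: Callable[[], bool]        # hypothetical phase check
    verify_signature: Callable[[dict], bool]   # stands in for ed25519 verification
    seen: Set[Tuple[str, int]] = field(default_factory=set)
    queue: list = field(default_factory=list)

    def handle(self, body: dict) -> int:
        """Return the HTTP status code for one submission."""
        if not REQUIRED_FIELDS.issubset(body):
            return 400  # malformed body
        if not self.in_submit_phase():
            return 423  # not currently in submit phase
        key = (body["hotkey"], body["block_number"])
        if key in self.seen:
            return 409  # duplicate submission for this block
        if not self.verify_signature(body):
            return 400  # invalid signature
        self.seen.add(key)
        self.queue.append(body)  # left for run.py to process later
        return 202
```

Only the message is queued here; the checkpoint file itself is not fetched at submission time.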
Request Logging
All requests to the validator server are logged with:
- Timestamp
- Requester hotkey
- Expert group
- Response code
- Transfer size and duration
Logs are written to ./logs/server.log by default. Review these logs to:
- Monitor which miners are submitting
- Detect unusual request patterns
- Diagnose connectivity issues
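The exact log line format is not documented on this page. As a rough sketch of how the fields listed above could be emitted with Python's logging module, the snippet below uses a hypothetical key=value layout and a hypothetical logger name; the real server.log format may differ.

```python
import logging


def make_request_logger(stream) -> logging.Logger:
    """Create a logger whose lines start with a timestamp."""
    logger = logging.getLogger("validator.server.sketch")  # hypothetical name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
    logger.addHandler(handler)
    logger.propagate = False
    return logger


def log_request(logger, hotkey, expert_group, status, size_bytes, duration_s):
    """Record requester, expert group, response code, size, and duration."""
    logger.info(
        "hotkey=%s expert_group=%s status=%d bytes=%d duration=%.3fs",
        hotkey, expert_group, status, size_bytes, duration_s,
    )
```

To write to a file instead of a stream, logging.FileHandler could replace StreamHandler.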
Checkpoint Cache Management
Downloaded miner checkpoints are stored in checkpoint_cache_dir (from validator config). The cache is managed automatically:
- Checkpoints are stored as {hotkey}_{block}_{expert_group}.pt
- When max_checkpoints_cached is reached, the oldest entries are evicted
- The cache is persistent across validator restarts
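The naming and eviction rules above can be illustrated with a small in-memory index. This is a sketch under assumptions: CheckpointIndex is a hypothetical class, "oldest" is taken to mean earliest inserted, eviction happens once the count would exceed the limit, and persistence across restarts is not modeled.

```python
from collections import OrderedDict
from typing import List


def checkpoint_filename(hotkey: str, block: int, expert_group: int) -> str:
    """Build the cache filename in the {hotkey}_{block}_{expert_group}.pt scheme."""
    return f"{hotkey}_{block}_{expert_group}.pt"


class CheckpointIndex:
    """Tracks cached checkpoint names and evicts the oldest past the limit."""

    def __init__(self, max_checkpoints_cached: int) -> None:
        self.max_checkpoints_cached = max_checkpoints_cached
        self._entries: "OrderedDict[str, None]" = OrderedDict()

    def add(self, hotkey: str, block: int, expert_group: int) -> List[str]:
        """Register a checkpoint and return the names that were evicted."""
        name = checkpoint_filename(hotkey, block, expert_group)
        self._entries[name] = None
        self._entries.move_to_end(name)
        evicted = []
        while len(self._entries) > self.max_checkpoints_cached:
            oldest, _ = self._entries.popitem(last=False)
            evicted.append(oldest)
        return evicted

    def names(self) -> List[str]:
        return list(self._entries)
```

A caller would delete the returned files from checkpoint_cache_dir after each add.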
Size your max_checkpoints_cached to fit within your storage budget:
cache_size ≈ max_checkpoints_cached × (total_model_params / num_groups) × 2 bytes
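As a worked example of this formula, the sketch below plugs in illustrative numbers. All three inputs are assumptions chosen for the arithmetic: 30 billion parameters (roughly what a ~60 GB model at 2 bytes per parameter would imply), 8 expert groups, and a limit of 16 cached checkpoints. The factor of 2 bytes corresponds to 16-bit weights.

```python
def estimate_cache_bytes(
    max_checkpoints_cached: int,
    total_model_params: int,
    num_groups: int,
    bytes_per_param: int = 2,
) -> int:
    """cache_size ≈ max_checkpoints_cached × (total_model_params / num_groups) × bytes_per_param."""
    return max_checkpoints_cached * total_model_params * bytes_per_param // num_groups


# Hypothetical inputs: 16 checkpoints, 30e9 parameters, 8 expert groups.
size = estimate_cache_bytes(16, 30_000_000_000, 8)
print(f"{size / 1e9:.0f} GB")  # prints: 120 GB
```

With these numbers each slice is about 7.5 GB, so halving max_checkpoints_cached to 8 would bring the estimate down to about 60 GB.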
Figure: Request flow through the validator API. Miners download model slices in the distribute phase and upload submissions in the submit phase. run.py processes submissions asynchronously.