Optimization Guide — Maximize Your Reward

This is the highest-value page on this site for subnet engineers. Understanding how scoring works — and which levers actually move your score — is worth more than any hardware upgrade.

How Scoring Works: The Game You're Playing

Your reward each cycle is proportional to your Proof-of-Loss score:

w_i ∝ ReLU(L(Φ_t) − L(Φ_t + Δ_i))

In plain terms: validators measure the model's loss on held-out data before and after applying your update. Your reward is proportional to how much your update reduced the loss. If your update made things worse (or didn't change them), you get w_i = 0.
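The gate is just a clipped difference. A minimal sketch of the scoring rule (the function name and workflow here are illustrative, not the actual validator code):

```python
def proof_of_loss_score(loss_before: float, loss_after: float) -> float:
    """ReLU-gated loss reduction: positive only if the update helped."""
    return max(0.0, loss_before - loss_after)

# An update that lowers the held-out loss earns a positive score;
# a harmful or no-op update is clipped to zero by the ReLU gate.
helpful = proof_of_loss_score(2.0, 1.5)   # 0.5
harmful = proof_of_loss_score(2.0, 2.3)   # 0.0
```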

The Alignment Principle

Validators evaluate on the same data distribution that miners train on. This is intentional — it means you cannot game a separate benchmark. The only way to score well is to genuinely improve the model on the target domain.

The corollary: overfitting to a narrow slice of training data hurts you once the evaluation dataset shifts even slightly. Train on a diverse, representative sample of your domain.

What "Top-N Take-Most" Means for You

Multiple miners compete in the same expert group. Your score is absolute (based on your loss reduction), not purely relative. But miners ranked 1–N in the group share the reward pool, with higher-ranked miners earning more.

This means:

  • You don't need to beat everyone else — you need to genuinely reduce the model's loss
  • Climbing one rank within the top N increases your share of the pool, so marginal improvements still pay off
  • Being consistently good earns more than being occasionally great
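One way to picture the split described above. The 4:2:1 share schedule below is purely illustrative; the guide does not specify the real subnet's rank weights:

```python
def split_pool(scores: dict, pool: float, n: int = 3) -> dict:
    """Hypothetical top-N take-most split: rank miners by score, then
    give the top N geometrically decreasing shares of the reward pool.
    The 2**k weighting (4:2:1 for n=3) is an assumption for illustration."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:n]
    weights = [2 ** (n - 1 - i) for i in range(len(ranked))]
    total = sum(weights)
    return {miner: pool * w / total for (miner, _), w in zip(ranked, weights)}

# Four miners, pool of 70: only the top 3 earn, and rank matters a lot.
rewards = split_pool({"a": 0.5, "b": 0.9, "c": 0.2, "d": 0.1}, pool=70.0)
# → {"b": 40.0, "a": 20.0, "c": 10.0}; "d" earns nothing
```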

The Levers That Actually Move Your Score

1. Data Quality (Highest Leverage)

This is the most important variable. The Proof-of-Loss mechanism directly rewards loss reduction, and higher-quality training data produces stronger gradient updates, which in turn produce more loss reduction.

Concretely:

  • Clean > noisy: remove near-duplicates, malformed examples, and off-domain content
  • Domain-specific > generic: your data should be tightly aligned with the target domain; general-purpose data dilutes the gradient signal
  • Curated > scraped: a small dataset of hand-verified high-quality examples often outperforms a large dataset of scraped content
  • Diverse > narrow: covering the breadth of the domain prevents overfitting to a specific slice of the evaluation set

Spend time on data quality before spending money on compute.
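The "clean > noisy" lever is the easiest to automate. A minimal deduplication sketch; a real pipeline would add fuzzy matching (e.g. MinHash) on top of this, since hashing normalized text only catches trivial variants:

```python
import hashlib
import re

def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial variants hash identically."""
    return re.sub(r"\s+", " ", text.lower()).strip()

def dedupe(examples: list) -> list:
    """Drop exact and near-exact duplicates by hashing normalized text,
    keeping the first occurrence of each distinct example."""
    seen, kept = set(), []
    for ex in examples:
        h = hashlib.sha256(normalize(ex).encode()).hexdigest()
        if h not in seen:
            seen.add(h)
            kept.append(ex)
    return kept
```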

2. Training Duration

More gradient steps within the training window → better convergence → more loss reduction.

  • Set max_steps as high as you can within the ~30-block training window
  • Monitor your loss curves — if loss is still declining when the commit phase starts, you're undertrained
  • If loss plateaus before max_steps, you may have hit a data quality ceiling (more data would help more than more steps)
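A back-of-the-envelope way to pick max_steps from the window. The 12-second block time and 0.8-second step time below are assumptions; measure your own per-step wall time and your chain's actual block interval:

```python
def max_steps_for_window(blocks: int = 30, block_time_s: float = 12.0,
                         step_time_s: float = 0.8, safety: float = 0.85) -> int:
    """Estimate how many optimizer steps fit in the training window.
    `safety` leaves headroom for checkpointing and the commit phase.
    block_time_s and step_time_s are illustrative assumptions."""
    budget_s = blocks * block_time_s * safety
    return int(budget_s // step_time_s)

# With the assumed timings, ~382 steps fit in a 30-block window.
steps = max_steps_for_window()
```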

3. Learning Rate Tuning

Too high → unstable training (NaN gradients, loss spikes). Too low → slow convergence, low loss reduction per cycle.

Starting point: learning_rate: 3e-5 with cosine decay and 100-step warmup
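The warmup-plus-cosine schedule named above, written out explicitly (a self-contained sketch; your training framework almost certainly provides an equivalent scheduler):

```python
import math

def lr_at(step: int, max_steps: int,
          base_lr: float = 3e-5, warmup: int = 100) -> float:
    """Linear warmup to base_lr over `warmup` steps,
    then cosine decay from base_lr down to 0 at max_steps."""
    if step < warmup:
        return base_lr * step / warmup
    progress = (step - warmup) / max(1, max_steps - warmup)
    return base_lr * 0.5 * (1 + math.cos(math.pi * progress))

# lr_at(0, 1000)    → 0.0        (start of warmup)
# lr_at(100, 1000)  → 3e-5       (peak, end of warmup)
# lr_at(1000, 1000) → ~0.0       (fully decayed)
```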

Signs you need to lower the LR:

  • Loss oscillates instead of declining smoothly
  • NaN detection hook fires frequently
  • w_i scores are inconsistent cycle-to-cycle

Signs you can raise the LR:

  • Loss declines too slowly relative to training window length
  • w_i scores are low but consistent (not NaN/zero)
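The "loss oscillates" symptom can be turned into a simple numeric check on your logged loss history (a rough heuristic, not a standard diagnostic):

```python
def oscillation_fraction(losses: list) -> float:
    """Fraction of consecutive steps where loss went UP.
    Near 0.0 for a smooth decline; values approaching 0.5 suggest
    the learning rate is too high."""
    if len(losses) < 2:
        return 0.0
    ups = sum(1 for a, b in zip(losses, losses[1:]) if b > a)
    return ups / (len(losses) - 1)

smooth = oscillation_fraction([1.0, 0.9, 0.8, 0.7])        # 0.0
noisy = oscillation_fraction([1.0, 1.2, 0.9, 1.1, 0.8])    # 0.5
```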

4. Expert Group Selection

Not all expert groups are equally competitive. When choosing your group:

  • Check competition level: groups with fewer registered miners are less competitive
  • Check domain alignment: run a quick ESFT frequency analysis to find which expert group your data activates most strongly — this is your natural competitive advantage
  • Monitor your w_i scores: if your scores are consistently low despite good training, try a different group
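The "ESFT frequency analysis" above amounts to tallying which experts your data routes to. A sketch of the tallying step, assuming you have already collected per-token top-k expert IDs from a forward pass over your dataset (the routing collection itself is model-specific and not shown):

```python
from collections import Counter

def expert_affinity(token_expert_ids: list) -> dict:
    """Given top-k expert IDs routed per token, return normalized
    activation frequencies per expert, highest first. The expert group
    your data activates most strongly is your natural home."""
    counts = Counter(e for token in token_expert_ids for e in token)
    total = sum(counts.values())
    return {e: c / total for e, c in counts.most_common()}

# Three tokens, top-2 routing each: expert 1 dominates.
freq = expert_affinity([[1, 2], [1, 3], [1, 2]])
# → expert 1 gets 0.5 of all activations
```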

5. Gradient Accumulation

If you're VRAM-limited and forced to use a small batch size, use gradient_accumulation_steps to maintain effective batch size:

batch_size: 2                  # fits in your VRAM
gradient_accumulation_steps: 8 # effective batch = 2 × 8 = 16

Larger effective batch sizes generally produce more stable gradients and better convergence.
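The accumulation pattern itself, stripped to its shape. Gradients are plain floats here; in a real loop they are tensors, and applying the accumulated gradient corresponds to optimizer.step() followed by zero_grad():

```python
def train_with_accumulation(micro_grads: list, accum_steps: int = 8) -> list:
    """Sum scaled micro-batch gradients and apply one optimizer step per
    `accum_steps` micro-batches. Dividing each micro-gradient by
    accum_steps makes the accumulated sum match a full-batch mean."""
    applied, running = [], 0.0
    for i, g in enumerate(micro_grads, start=1):
        running += g / accum_steps
        if i % accum_steps == 0:
            applied.append(running)  # optimizer.step() would consume this
            running = 0.0
    return applied

# 16 micro-batches at accum_steps=8 → exactly 2 optimizer steps.
steps = train_with_accumulation([1.0] * 16, accum_steps=8)
# → [1.0, 1.0]
```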

What Doesn't Help

Buying more GPUs (usually). Your expert subset is already small (~10% of total parameters). The marginal return from compute beyond "enough to train well in the cycle window" is small. Data quality and learning rate tuning have higher returns per dollar.

Submitting noise or copies. The ReLU gate in Proof-of-Loss zeroes out any update that doesn't reduce the validation loss. Random noise almost certainly increases loss. Copying another miner's submission exactly produces the same validation loss as the current model (Δ_i ≈ 0), yielding w_i ≈ 0. Neither earns anything.

Monitoring and Iteration

Enable W&B to track your performance across cycles:

wandb: true
wandb_project: blockzero-miner

Key metrics to track:

Metric                 What it tells you                       Target direction
train/loss             Training loss at cycle end              Decreasing across cycles
chain/w_i_score        Your Proof-of-Loss score                Increasing, consistent
chain/rank_in_group    Your rank among miners in your group    Decreasing (rank 1 = best)
train/loss_reduction   Δ loss this cycle                       Positive, increasing

Figure (reward-curve-optimization): Example w_i score curve over 20 cycles for a miner that improved data quality at cycle 8. The score increase is immediate and sustained, demonstrating that data quality is the dominant variable.

Quick Wins Checklist

Run through this when your scores are lower than expected:

  • Is training loss actually decreasing within cycles? (Check W&B loss curves)
  • Is max_steps high enough to reach convergence? (Loss should plateau rather than still be declining at submission)
  • Is your training data cleaned and domain-specific?
  • Is the learning rate stable? (No NaN events, smooth loss curve)
  • Is your expert group well-matched to your data? (Run ESFT frequency check)
  • Are you missing commit windows? (Check model_io.py logs for "Missed commit")