Connito AI
Layer 2: AI Architecture

Architecture Overview

Connito's AI architecture is built on Mixture-of-Experts (MoE) — a model design where only a subset of parameters activate for any given input. This makes it possible to scale model capacity without scaling compute, and to update individual domain experts without touching the rest of the system.


Mixture-of-Experts Model

Traditional dense models activate every parameter for every input. MoE models instead use a routing mechanism to select a small subset of specialized expert modules per input, so that parameter capacity scales independently of inference cost.

This property is what makes decentralized training viable — different contributors can train different experts in parallel without conflicts, because experts are architecturally isolated from each other. The router learns which expert handles which type of input, and sparse activation keeps compute costs manageable even as total model size grows.
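The sketch below illustrates the top-k routing pattern described above in PyTorch. It is a minimal illustration, not Connito's actual implementation: the class names, dimensions, and the choice of a simple linear router are all assumptions made for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Expert(nn.Module):
    """A small feed-forward expert module (illustrative)."""
    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_model, d_hidden),
            nn.GELU(),
            nn.Linear(d_hidden, d_model),
        )

    def forward(self, x):
        return self.net(x)

class TopKMoE(nn.Module):
    """Sparse MoE layer: a learned router picks the top-k experts per token."""
    def __init__(self, d_model: int, d_hidden: int, num_experts: int, k: int = 2):
        super().__init__()
        self.router = nn.Linear(d_model, num_experts)
        self.experts = nn.ModuleList(
            [Expert(d_model, d_hidden) for _ in range(num_experts)]
        )
        self.k = k

    def forward(self, x):  # x: (tokens, d_model)
        logits = self.router(x)                     # (tokens, num_experts)
        weights, idx = logits.topk(self.k, dim=-1)  # keep only k experts per token
        weights = F.softmax(weights, dim=-1)        # renormalize over the chosen k
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e            # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out
```

With, say, `TopKMoE(512, 2048, num_experts=8, k=2)`, each token passes through only 2 of the 8 experts, so per-token compute stays roughly constant as the expert count (and total parameter capacity) grows.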

🔗 MoE model


Literature Review

The architectural choices behind Connito draw on recent advances in sparse MoE systems, selective adaptation, expert lifecycle management, and composable model architectures. This review covers the foundational research — from early MoE scaling work through modern modular and composable expert systems — and explains how each informs Connito's design.

🔗 Literature review


TEFT: Targeted Expert Fine-Tuning

TEFT is the optimization framework that enables domain adaptation without full model retraining. It's a four-step cycle:

  1. Identify — select which experts are relevant to the target domain
  2. Train — fine-tune only those experts on domain-specific data
  3. Aggregate — score updates using Proof-of-Loss and merge the best contributions
  4. Reintegrate — slot the updated experts back into the full model

Because only the relevant experts are modified, the base model's general capabilities remain intact, solving the catastrophic forgetting problem at the architecture level rather than through regularization workarounds.
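A minimal sketch of the four-step cycle is shown below. Everything in it is an illustrative assumption: `ExpertUpdate`, `routing_counts`, `finetune`, and the merge rule are hypothetical stand-ins, and Proof-of-Loss is reduced here to keeping the update with the lowest reported held-out loss, which is just one possible scoring rule.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class ExpertUpdate:
    """Hypothetical container for one contributor's proposed expert update."""
    expert_id: int    # which expert this update targets
    weights: dict     # proposed fine-tuned parameters
    val_loss: float   # held-out loss used for Proof-of-Loss scoring

def teft_cycle(
    expert_weights: Dict[int, dict],          # current per-expert weights
    routing_counts: Dict[int, int],           # router hits per expert on domain data
    finetune: Callable[[int], ExpertUpdate],  # trains one expert (assumed helper)
    top_n: int = 4,
) -> Dict[int, dict]:
    # 1. Identify: pick the experts the router most often selects on domain inputs.
    relevant = sorted(routing_counts, key=routing_counts.get, reverse=True)[:top_n]

    # 2. Train: fine-tune only those experts; all other parameters stay frozen.
    updates: List[ExpertUpdate] = [finetune(e) for e in relevant]

    # 3. Aggregate: Proof-of-Loss, sketched here as keeping the lowest-loss
    #    update per expert (the protocol may weight or average contributions).
    best: Dict[int, ExpertUpdate] = {}
    for u in updates:
        if u.expert_id not in best or u.val_loss < best[u.expert_id].val_loss:
            best[u.expert_id] = u

    # 4. Reintegrate: slot the winning weights back into the full model.
    for e, u in best.items():
        expert_weights[e] = u.weights
    return expert_weights
```

Note that steps 2 and 4 touch only the selected experts: the frozen remainder of the model is what preserves general capabilities across adaptation cycles.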

🔗 TEFT protocol