Architecture
How the Boundary Layer compresses, transmits, and reconstructs meaning between AI agents — without natural language.
The Bottleneck
When an AI system completes a complex task, it rarely uses a single model. It uses a chain: one agent breaks the task down, another executes a subtask, another verifies the result.
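Every connection in that chain is a bottleneck. Each agent must receive, parse, generate, and pass full natural-language text. If every agent also re-reads the context accumulated before it, the total overhead grows as O(N²) in the number of agents N. The Boundary Layer makes each connection about 28× faster: 847 ms down to 30 ms per handoff.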
The Protocol
The Boundary Layer compresses the full contextualised meaning of a message into a 64-dimensional vector. This is not a summary. It's a geometric representation of intent, trained with three simultaneous objectives:
The receiving agent must be able to execute the correct task from the vector alone. No additional context needed.
The vector must preserve enough information that the downstream agent reaches the same outcome as it would from the full text.
The vector dimension is minimized subject to the first two constraints. The model learns to use as few dimensions as the task allows.
Slots: Atomic facts — names, numbers, dates, identifiers — travel as typed key-value pairs alongside the vector, never through compression. Lossless over the entire agent chain.
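The split between compressed intent and lossless slots can be pictured as a small data structure. The following is a minimal sketch, not the Boundary Layer's actual wire format: the names `BoundaryMessage` and `forward`, and the set of allowed slot types, are assumptions made for illustration. Only the 64-dimensional vector and the typed key-value slots come from the description above.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, Tuple, Union

VECTOR_DIM = 64  # dimension stated in the text

# Assumed set of slot types: names, numbers, dates, identifiers.
SlotValue = Union[str, int, float, date]


@dataclass(frozen=True)
class BoundaryMessage:
    """Hypothetical handoff payload: compressed intent plus lossless slots."""
    vector: Tuple[float, ...]
    slots: Dict[str, SlotValue] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Reject payloads that do not match the fixed vector size.
        if len(self.vector) != VECTOR_DIM:
            raise ValueError(
                f"expected {VECTOR_DIM} dimensions, got {len(self.vector)}"
            )


def forward(message: BoundaryMessage, new_vector: Tuple[float, ...]) -> BoundaryMessage:
    """Hand off with an updated intent vector; slots are copied unchanged."""
    return BoundaryMessage(vector=new_vector, slots=dict(message.slots))
```

In this sketch the slots never pass through any numeric transformation, which is what keeps them exact no matter how many handoffs occur.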
Three Modes
Natural language in. Compact vector out. The Transformer runs once. Structured facts travel as typed slots — never compressed.
Agent to agent. No Transformer. No tokens. The Boundary Layer decoder maps the incoming vector directly to the next task. This is the fast path.
Latent back to language. Happens once, at the end of the chain, for human-facing output.
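The three modes can be read as one control flow: encode once, relay vector-to-vector through the chain, decode once. The sketch below illustrates only that flow. `run_chain` and the `toy_*` functions are hypothetical stand-ins so the example runs; they are not real models and say nothing about how the actual encoder or decoder works.

```python
from typing import Callable, Dict, List, Sequence

Vector = List[float]
Slots = Dict[str, object]


def run_chain(
    text: str,
    slots: Slots,
    encode: Callable[[str], Vector],
    agents: Sequence[Callable[[Vector], Vector]],
    decode: Callable[[Vector, Slots], str],
) -> str:
    vector = encode(text)           # mode 1: language in, vector out (once)
    for agent in agents:            # mode 2: vector in, vector out (no tokens)
        vector = agent(vector)
    return decode(vector, slots)    # mode 3: vector back to language (once)


# Toy stand-ins that count how often each mode is used.
calls = {"encode": 0, "relay": 0, "decode": 0}


def toy_encode(text: str) -> Vector:
    calls["encode"] += 1
    return [float(len(text))] + [0.0] * 63


def toy_agent(vector: Vector) -> Vector:
    calls["relay"] += 1
    return [vector[0] + 1.0] + vector[1:]


def toy_decode(vector: Vector, slots: Slots) -> str:
    calls["decode"] += 1
    return f"done (state={vector[0]:.0f}, order={slots['order_id']})"


result = run_chain(
    "ship the order", {"order_id": "A-17"}, toy_encode, [toy_agent] * 3, toy_decode
)
print(result, calls)
```

With three agents in the chain, the counters show one encode, three relays, and one decode, and the `order_id` slot reaches the final output untouched.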
Auditability
At every point where a vector crosses an agent boundary, a log entry is written. The log translates the vector back to human-readable language via k-nearest-neighbour (k-NN) lookup.
The log runs asynchronously — it does not slow the chain. It is the structural basis for EU AI Act compliance, MDR auditability, and enterprise security requirements.
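One way to picture the audit log is a background worker that looks up the nearest labelled reference vectors for each handoff. The sketch below is an illustration under assumptions, not the product's implementation: the `AuditLog` class, the cosine-similarity metric, the in-memory reference set, and the example labels are all hypothetical. Only the k-NN lookup and the asynchronous write come from the text.

```python
import math
import queue
import threading
from typing import List, Sequence, Tuple

Reference = Sequence[Tuple[List[float], str]]  # (vector, human-readable label)


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def knn_labels(vector: Sequence[float], reference: Reference, k: int) -> List[str]:
    """Return the labels of the k reference vectors most similar to `vector`."""
    ranked = sorted(reference, key=lambda item: cosine(vector, item[0]), reverse=True)
    return [label for _, label in ranked[:k]]


class AuditLog:
    """Hypothetical asynchronous log: record() only enqueues and returns."""

    def __init__(self, reference: Reference, k: int = 1) -> None:
        self._reference, self._k = reference, k
        self._queue: queue.Queue = queue.Queue()
        self.entries: List[dict] = []
        threading.Thread(target=self._drain, daemon=True).start()

    def record(self, boundary: str, vector: Sequence[float]) -> None:
        self._queue.put((boundary, list(vector)))  # does not block the chain

    def _drain(self) -> None:
        while True:
            boundary, vector = self._queue.get()
            nearest = knn_labels(vector, self._reference, self._k)
            self.entries.append({"boundary": boundary, "nearest": nearest})
            self._queue.task_done()

    def flush(self) -> None:
        self._queue.join()  # wait until all queued entries are written


def unit(i: int, dim: int = 64) -> List[float]:
    v = [0.0] * dim
    v[i] = 1.0
    return v


reference = [(unit(0), "summarise document"), (unit(1), "verify result")]
log = AuditLog(reference, k=1)
handoff = unit(1)
handoff[0] = 0.2  # mostly "verify result", slightly "summarise document"
log.record("planner->verifier", handoff)
log.flush()
print(log.entries)
```

The lookup cost is paid on the worker thread, so the agent that calls `record` continues immediately.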
Efficiency
Latency between handoffs: 847 ms → 30 ms per connection
64 floats instead of ~74 tokens per handoff
Encode → Decode → Re-encode fidelity
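The latency figures can be checked with simple arithmetic. The sketch below uses only the 847 ms, 30 ms, and 64-float numbers quoted above. The five-connection chain and the 4-byte float width (float32) are assumptions chosen for illustration; the text does not state either.

```python
LATENCY_TEXT_MS = 847    # per connection, natural-language handoff (from the text)
LATENCY_VECTOR_MS = 30   # per connection, vector handoff (from the text)
VECTOR_DIM = 64          # floats per handoff (from the text)
BYTES_PER_FLOAT = 4      # assumption: float32


def chain_latency_ms(connections: int, per_connection_ms: int) -> int:
    """Total handoff latency for a chain with the given number of connections."""
    return connections * per_connection_ms


speedup = LATENCY_TEXT_MS / LATENCY_VECTOR_MS
payload_bytes = VECTOR_DIM * BYTES_PER_FLOAT

connections = 5  # hypothetical chain length
text_total = chain_latency_ms(connections, LATENCY_TEXT_MS)
vector_total = chain_latency_ms(connections, LATENCY_VECTOR_MS)

print(f"speedup per connection: {speedup:.1f}x")
print(f"{connections} connections: {text_total} ms vs {vector_total} ms")
print(f"vector payload: {payload_bytes} bytes")
```

The ratio 847 / 30 comes to roughly 28.2, which matches the 28× figure quoted earlier in the text.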