Lesson 08: Authentication in Distributed Systems
API Keys, JWTs, and Device Pairing
Prerequisites: Lesson 05 — Signaling Protocol Design, Lesson 07 — Application Protocol Design
Key source files:
- registry/src/auth.rs: AuthState, JWT issuance and validation, TURN credential generation
- registry/src/pairing.rs: invite code generation, redemption, approval pipeline
- registry/src/main.rs: WebSocket upgrade with JWT validation
1. Authentication vs Authorization
Every distributed system must answer two questions per request:
- Authentication (AuthN): Who are you? Prove your identity.
- Authorization (AuthZ): What are you allowed to do?
In a mesh network, the stakes are high — every authenticated peer can communicate with every other peer. Without authentication, attackers join the mesh. Without authorization, a legitimate peer can submit tasks to agents it should not control or exfiltrate data.
chatixia-mesh implements authentication at the HTTP and WebSocket layers but has intentional gaps in authorization. Understanding where those gaps are is a core objective of this lesson.
2. API Key to JWT Exchange
Agents authenticate via a two-step process: present a long-lived credential, receive a short-lived JWT.
Agent                                Registry
  |  POST /api/token                   |
  |  Header: X-API-Key: ak_dev_001     |
  |----------------------------------->|
  |                                    |  Look up key -> peer_id + role
  |                                    |  Sign JWT (exp = now + 300s)
  |  { token, peer_id, role }          |
  |<-----------------------------------|
  |  GET /ws?token=eyJ...              |
  |----------------------------------->|  Validate JWT on upgrade
  |  <websocket established>           |
The handler (registry/src/auth.rs) checks X-API-Key first, then X-Device-Token (for paired agents). If neither is valid, it returns 401.
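The lookup order described above can be sketched as follows. This is a minimal illustration, not the code in registry/src/auth.rs: `CredentialStore`, `Identity`, `resolve`, and `status_for` are hypothetical names, and the in-memory maps are an assumption about how credentials might be stored.

```rust
use std::collections::HashMap;

/// Identity resolved from a long-lived credential.
#[derive(Debug, Clone, PartialEq)]
pub struct Identity {
    pub peer_id: String,
    pub role: String,
}

/// Hypothetical in-memory credential store (assumption; the real AuthState may differ).
pub struct CredentialStore {
    pub api_keys: HashMap<String, Identity>,
    pub device_tokens: HashMap<String, Identity>,
}

impl CredentialStore {
    /// Try the X-API-Key value first, then the X-Device-Token value.
    /// `None` means neither credential was valid.
    pub fn resolve(&self, api_key: Option<&str>, device_token: Option<&str>) -> Option<Identity> {
        api_key
            .and_then(|k| self.api_keys.get(k))
            .or_else(|| device_token.and_then(|t| self.device_tokens.get(t)))
            .cloned()
    }
}

/// HTTP status the token endpoint would return for a lookup result.
pub fn status_for(identity: &Option<Identity>) -> u16 {
    if identity.is_some() { 200 } else { 401 }
}
```

In this sketch an invalid API key does not short-circuit the request; the device token is still tried, which matches the "if neither is valid" wording.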
Short-Lived Tokens
The JWT has a 5-minute TTL (exp = now + 300). If intercepted from logs or network traffic, the attacker has at most 5 minutes. The sidecar re-exchanges transparently on expiry.
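The timing arithmetic behind the short TTL can be made concrete with a small sketch. The function names and the `margin_secs` parameter (refreshing slightly before expiry) are assumptions for illustration; the sidecar's actual refresh logic is not shown in this lesson.

```rust
/// Token lifetime from the lesson: 5 minutes.
pub const TOKEN_TTL_SECS: u64 = 300;

/// Expiry timestamp (Unix seconds) for a token issued at `now`.
pub fn expiry_for(now: u64) -> u64 {
    now + TOKEN_TTL_SECS
}

/// Whether a client should re-exchange its long-lived credential.
/// `margin_secs` is a hypothetical safety margin so the refresh happens
/// shortly before the token actually expires.
pub fn needs_refresh(exp: u64, now: u64, margin_secs: u64) -> bool {
    now.saturating_add(margin_secs) >= exp
}

/// Seconds an intercepted token remains usable after being captured.
pub fn exposure_window(exp: u64, captured_at: u64) -> u64 {
    exp.saturating_sub(captured_at)
}
```

A token captured 100 seconds after issuance is usable for another 200 seconds; one read from a log after it has expired is worth nothing.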
Claims
pub struct Claims {
    pub sub: String,  // peer_id
    pub role: String, // "agent"
    pub exp: usize,   // expiry, Unix seconds (iat + 300)
    pub iat: usize,   // issued-at, Unix seconds
}
The sub field is critical for sender verification: the registry checks that every signaling message’s peer_id matches the JWT’s sub claim.
3. JWT Mechanics
A JWT has three Base64url-encoded parts: Header (algorithm), Payload (claims), and Signature.
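To make the three-part structure tangible, the sketch below splits a token on its dots and decodes one part from Base64url using only the standard library. `split_jwt` and `base64url_decode` are hypothetical helpers written for this example; a production registry would normally rely on a JWT library, and this decoder is deliberately lenient (it does not reject impossible input lengths).

```rust
/// Split a compact JWT into (header, payload, signature).
/// Returns None unless there are exactly three dot-separated parts.
pub fn split_jwt(token: &str) -> Option<(&str, &str, &str)> {
    let mut parts = token.split('.');
    match (parts.next(), parts.next(), parts.next(), parts.next()) {
        (Some(h), Some(p), Some(s), None) => Some((h, p, s)),
        _ => None,
    }
}

/// Decode Base64url (RFC 4648, section 5), with or without trailing padding.
/// Returns None if a character outside the URL-safe alphabet is found.
pub fn base64url_decode(input: &str) -> Option<Vec<u8>> {
    let mut out = Vec::with_capacity(input.len() * 3 / 4);
    let mut buffer: u32 = 0; // holds bits not yet emitted
    let mut bits: u32 = 0;   // number of valid bits in `buffer`
    for byte in input.trim_end_matches('=').bytes() {
        let value = match byte {
            b'A'..=b'Z' => byte - b'A',
            b'a'..=b'z' => byte - b'a' + 26,
            b'0'..=b'9' => byte - b'0' + 52,
            b'-' => 62,
            b'_' => 63,
            _ => return None,
        };
        buffer = (buffer << 6) | u32::from(value);
        bits += 6;
        if bits >= 8 {
            bits -= 8;
            out.push((buffer >> bits) as u8);
            buffer &= (1 << bits) - 1; // keep only the leftover bits
        }
    }
    Some(out)
}
```

Decoding only reveals the header and claims; it proves nothing about authenticity. Anyone can read a JWT payload, which is why the signature check matters.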
chatixia-mesh uses HMAC-SHA256 (symmetric): the same SIGNALING_SECRET signs and verifies tokens. Only the registry holds this secret. If it leaks, anyone can forge tokens.
On WebSocket upgrade, the registry validates:
- Signature: is the HMAC valid?
- Expiration: is exp in the future?
- Structure: does the payload deserialize to Claims?
The token is passed as a query parameter (/ws?token=...) because browser WebSocket APIs do not allow setting custom headers. The trade-off: tokens appear in server and proxy logs. The 5-minute TTL limits the damage, since a token read from a log has usually expired by the time anyone sees it.
After upgrade, sender verification continues on every message:
if sm.peer_id != peer_id {
    // Drop any message whose claimed sender differs from the JWT's sub.
    error!("[WS] peer_id mismatch: expected={}, got={}", peer_id, sm.peer_id);
    continue;
}
4. Device Pairing
API keys require manual provisioning. Device pairing provides a zero-configuration onboarding flow using a 6-digit invite code.
The Flow
Admin                    Registry                  New Agent
  |                         |                         |
  |  POST /generate-code    |                         |
  |------------------------>|                         |
  |  { code: "482917" }     |                         |
  |<------------------------|                         |
  |                         |                         |
  |  (tell code to agent)   |                         |
  |                         |  POST /pair             |
  |                         |  { code, agent_name }   |
  |                         |<------------------------|
  |                         |  rate limit + validate  |
  |                         |  -> pending_approval    |
  |                         |------------------------>|
  |                         |                         |
  |  POST /approve          |                         |
  |------------------------>|                         |
  |                         |  generate device_token  |
  |  { device_token }       |                         |
  |<------------------------|                         |
  |                         |                         |
  |  (deliver token)        |  POST /api/token        |
  |                         |  X-Device-Token: dt_... |
  |                         |<------------------------|
  |                         |  { JWT, peer_id }       |
  |                         |------------------------>|
Step 1: Admin generates a 6-digit code (300s TTL, single-use).
Step 2: Agent redeems the code via the unauthenticated /pair endpoint. Three checks: rate limiting (5 attempts/IP/60s), format validation, code consumption. On success, a peer_id is assigned and status is pending_approval.
Step 3: Admin approves. The registry generates a device token (dt_ + 32 hex chars = 128 bits of randomness).
Step 4: Agent exchanges the device token for JWTs, same as API key agents.
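Steps 2 and 3 can be sketched with the standard library alone. Everything here is illustrative: `InviteCodes`, `RedeemError`, `format_device_token`, and `is_well_formed_device_token` are hypothetical names, the in-memory map is an assumed storage model, and lowercase hex is an assumed encoding. The 16 random bytes must come from a cryptographically secure generator, which is outside the scope of this sketch.

```rust
use std::collections::HashMap;

/// Invite code lifetime from Step 1.
pub const CODE_TTL_SECS: u64 = 300;

#[derive(Debug, PartialEq)]
pub enum RedeemError {
    BadFormat,
    UnknownOrUsed,
    Expired,
}

/// Hypothetical store mapping each code to its creation time (Unix seconds).
#[derive(Default)]
pub struct InviteCodes {
    issued: HashMap<String, u64>,
}

impl InviteCodes {
    pub fn issue(&mut self, code: &str, now: u64) {
        self.issued.insert(code.to_string(), now);
    }

    /// Format validation first, then consumption. Rate limiting would run before both.
    pub fn redeem(&mut self, code: &str, now: u64) -> Result<(), RedeemError> {
        if code.len() != 6 || !code.bytes().all(|b| b.is_ascii_digit()) {
            return Err(RedeemError::BadFormat);
        }
        // `remove` makes the code single-use: a second attempt finds nothing.
        let issued_at = self.issued.remove(code).ok_or(RedeemError::UnknownOrUsed)?;
        if now.saturating_sub(issued_at) >= CODE_TTL_SECS {
            return Err(RedeemError::Expired);
        }
        Ok(())
    }
}

/// Format 16 random bytes as `dt_` followed by 32 hex characters (128 bits).
pub fn format_device_token(random: [u8; 16]) -> String {
    let mut token = String::with_capacity(3 + 32);
    token.push_str("dt_");
    for byte in random.iter() {
        token.push_str(&format!("{:02x}", byte));
    }
    token
}

/// Cheap shape check before any lookup: prefix plus exactly 32 hex digits.
pub fn is_well_formed_device_token(token: &str) -> bool {
    match token.strip_prefix("dt_") {
        Some(hex) => hex.len() == 32 && hex.bytes().all(|b| b.is_ascii_hexdigit()),
        None => false,
    }
}
```

Removing the code from the map during redemption, rather than flagging it afterwards, means even an expired or failed redemption burns the code, so a guess can never be retried against the same entry.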
Lifecycle States
pending_approval --approve--> approved --revoke--> revoked
                 \--reject--> rejected
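Revocation takes effect at the next JWT exchange: the revoked device token is rejected and no new JWT is issued. Because WebSocket validation checks only signature, expiration, and structure, a JWT issued before revocation remains usable until its exp, at most 5 minutes later.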
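The lifecycle above is a small state machine, and encoding it as one makes illegal transitions unrepresentable at the call site. The type and method names below are hypothetical and chosen for this sketch only.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PairingState {
    PendingApproval,
    Approved,
    Rejected,
    Revoked,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AdminAction {
    Approve,
    Reject,
    Revoke,
}

impl PairingState {
    /// Apply an admin action. Returns None for any transition the diagram does not allow.
    pub fn apply(self, action: AdminAction) -> Option<PairingState> {
        use AdminAction::*;
        use PairingState::*;
        match (self, action) {
            (PendingApproval, Approve) => Some(Approved),
            (PendingApproval, Reject) => Some(Rejected),
            (Approved, Revoke) => Some(Revoked),
            _ => None, // rejected and revoked are terminal
        }
    }

    /// Only an approved device may exchange its device token for a JWT.
    pub fn may_exchange_token(self) -> bool {
        self == PairingState::Approved
    }
}
```

Checking `may_exchange_token` on every exchange is what makes revocation bite at the next token request.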
Rate Limiting
The pairing endpoint is the only unauthenticated path to mesh access. With 5 attempts per IP per 60 seconds and a 5-minute code TTL, an attacker on a single IP gets at most 5 × 5 = 25 guesses against 1,000,000 possible codes, a success probability of 25 / 1,000,000 = 0.0025%. The bound is per IP, so an attacker spreading guesses across N addresses gets up to 25N. Even a correct guess only reaches pending_approval; admin approval is still required.
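A sliding-window limiter and the guess bound it implies can be sketched as below. `RateLimiter`, `max_guesses`, and `success_probability` are hypothetical names; whether the registry uses a sliding window, a fixed window, or something else is not stated in this lesson, so treat the windowing strategy as an assumption.

```rust
use std::collections::{HashMap, VecDeque};
use std::net::IpAddr;

/// Sliding-window limiter: at most `max_attempts` per IP in any `window_secs` span.
pub struct RateLimiter {
    max_attempts: usize,
    window_secs: u64,
    attempts: HashMap<IpAddr, VecDeque<u64>>,
}

impl RateLimiter {
    pub fn new(max_attempts: usize, window_secs: u64) -> Self {
        Self { max_attempts, window_secs, attempts: HashMap::new() }
    }

    /// Record an attempt at `now` (Unix seconds). Returns false if the IP is over its limit.
    pub fn allow(&mut self, ip: IpAddr, now: u64) -> bool {
        let recent = self.attempts.entry(ip).or_default();
        // Forget attempts that have aged out of the window.
        while let Some(&oldest) = recent.front() {
            if now.saturating_sub(oldest) >= self.window_secs {
                recent.pop_front();
            } else {
                break;
            }
        }
        if recent.len() >= self.max_attempts {
            return false;
        }
        recent.push_back(now);
        true
    }
}

/// Upper bound on guesses from one IP while a code is alive.
pub fn max_guesses(max_attempts: u64, window_secs: u64, code_ttl_secs: u64) -> u64 {
    let windows = (code_ttl_secs + window_secs - 1) / window_secs; // round up
    max_attempts * windows
}

/// Chance that at least one of `guesses` distinct guesses hits one valid code.
pub fn success_probability(guesses: u64, code_space: u64) -> f64 {
    guesses as f64 / code_space as f64
}
```

With the lesson's parameters, `max_guesses(5, 60, 300)` gives 25 and `success_probability(25, 1_000_000)` gives 0.000025, which is the 0.0025% quoted above.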
5. Ephemeral TURN Credentials
When peers cannot connect directly, they relay through a TURN server. chatixia-mesh uses the coturn use-auth-secret pattern to generate short-lived credentials without a user database.
The registry and TURN server share TURN_SECRET. The registry generates credentials:
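// Simplified: Base64(...) and HMAC-SHA1(...) stand in for the encoding and MAC steps.
fn generate_turn_credentials(secret: &str, ttl_secs: u64) -> (String, String) {
    let expiry = now_secs() + ttl_secs;         // Unix time at which the credential expires
    let username = format!("{}:mesh", expiry);  // "<expiry>:<label>"
    let password = Base64(HMAC-SHA1(secret, username));
    (username, password)
}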
The TURN server validates independently: parse expiry from the username, check it is not past, recompute HMAC-SHA1, compare. No shared database needed.
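The server-side half of that check, minus the HMAC itself, might look like the sketch below. It is not coturn's code: `check_turn_username`, `TurnAuthError`, and `constant_time_eq` are hypothetical helpers, and the HMAC-SHA1 recomputation is left out because the standard library has no MAC primitives. The comparison helper shows how the recomputed and presented values could be compared without leaking timing information.

```rust
#[derive(Debug, PartialEq)]
pub enum TurnAuthError {
    Malformed,
    Expired,
}

/// Parse a `<expiry>:<label>` username and reject it if the expiry has passed.
/// Returns the expiry (Unix seconds) on success.
pub fn check_turn_username(username: &str, now: u64) -> Result<u64, TurnAuthError> {
    let (expiry, _label) = username.split_once(':').ok_or(TurnAuthError::Malformed)?;
    let expiry: u64 = expiry.parse().map_err(|_| TurnAuthError::Malformed)?;
    if now > expiry {
        return Err(TurnAuthError::Expired);
    }
    Ok(expiry)
}

/// Compare two byte strings without an early exit on the first difference.
/// Intended for comparing the recomputed MAC with the presented password.
pub fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    a.iter().zip(b).fold(0u8, |acc, (x, y)| acc | (x ^ y)) == 0
}
```

Because the expiry is part of the MACed username, a client cannot extend its own credential: changing the timestamp invalidates the password.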
The 24-hour TTL balances security (leaked credentials expire) against operational convenience (agents do not need frequent re-fetches).
6. End-to-End Encryption via DTLS
WebRTC DataChannels are encrypted by DTLS (the UDP equivalent of TLS), providing confidentiality, integrity, and authentication — all without a PKI. Each sidecar generates a self-signed certificate on startup; the fingerprint is included in the SDP exchanged during signaling.
The critical architectural property: the registry cannot read DataChannel messages. It relays SDP/ICE during setup but never sees the symmetric keys negotiated during the DTLS handshake. Even a compromised registry cannot decrypt data-plane traffic (though it could perform a man-in-the-middle by substituting fingerprints during signaling — threat T3).
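The fingerprint travels in the SDP as an `a=fingerprint:` attribute, which is exactly the field a malicious relay would have to rewrite. The sketch below extracts it and compares two SDP blobs; the function names are hypothetical, and how a peer would obtain a trusted copy of the other side's SDP for comparison is not covered by this lesson.

```rust
/// Extract (hash function, fingerprint) from the first `a=fingerprint:` line of an SDP.
pub fn dtls_fingerprint(sdp: &str) -> Option<(&str, &str)> {
    sdp.lines()
        .filter_map(|line| line.trim().strip_prefix("a=fingerprint:"))
        .find_map(|rest| rest.split_once(' '))
}

/// True if both SDPs carry the same fingerprint (hex compared case-insensitively).
/// A mismatch between what one peer sent and what the other received would
/// indicate that something on the signaling path rewrote the attribute.
pub fn fingerprint_unchanged(sent_sdp: &str, received_sdp: &str) -> bool {
    match (dtls_fingerprint(sent_sdp), dtls_fingerprint(received_sdp)) {
        (Some((h1, f1)), Some((h2, f2))) => {
            h1.eq_ignore_ascii_case(h2) && f1.eq_ignore_ascii_case(f2)
        }
        _ => false,
    }
}
```

The DTLS handshake itself only proves that the remote certificate matches the fingerprint in the SDP the peer received, so the integrity of that SDP is what the whole guarantee rests on.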
This contrasts with HTTP/gRPC models where the server terminates TLS and sees all plaintext. WebRTC’s end-to-end encryption means the system’s own infrastructure cannot inspect the data plane.
7. Threat Modeling
chatixia-mesh uses the STRIDE model. Key threats and their status:
Implemented Mitigations
| Threat | Mitigation |
|---|---|
| Spoofing (T1): Unauthorized signaling access | JWT on WebSocket upgrade + sender verification |
| Tampering (T5): Task queue poisoning | Skill-based assignment + TTL expiration |
| Information Disclosure: DataChannel eavesdropping | DTLS end-to-end encryption |
Known Gaps
| Threat | Gap |
|---|---|
| T9: Information Disclosure | Registry GET endpoints (/api/registry/agents, /api/registry/route) are unauthenticated |
| T4: Denial of Service | No rate limiting except on pairing endpoint |
| T8: Tampering | DELETE /api/registry/agents/{id} is unauthenticated — any client can deregister any agent |
| T5: Authorization | Any authenticated agent can submit tasks to any other agent |
| T7: Elevation of Privilege | No sanitization of task payloads before LLM processing (prompt injection risk) |
Authentication Boundary Summary
Authenticated:                       Unauthenticated:
- POST /api/token (key/token)        - GET /api/registry/agents
- GET /ws (JWT required)             - DELETE /api/registry/agents/{id}
- POST /generate-code (API key)      - POST /api/hub/tasks
- POST /pair (rate-limited)          - POST /api/pairing/{id}/approve
                                     - GET /api/hub/topology
The unauthenticated column is the current attack surface.
Summary
Core authentication flow: Long-lived credentials (API keys or device tokens) are exchanged for short-lived JWTs (5-minute TTL) that authenticate WebSocket connections and enable sender verification.
Device pairing provides zero-config onboarding: 6-digit invite code (rate-limited, single-use) leads to admin approval, then a device token that works like an API key.
TURN credentials use HMAC-SHA1 shared-secret validation with 24-hour TTL, requiring no user database.
DTLS provides end-to-end encryption on DataChannels without PKI — even the registry cannot read data-plane traffic.
STRIDE threat modeling reveals both implemented protections (JWT auth, rate-limited pairing, DTLS) and gaps (unauthenticated APIs, no task submission ACLs, prompt injection risk).