chatixia blog
Fundamentals March 5, 2026 · 8 min read

Peer-to-Peer Networking: When Servers Get Out of the Way


p2p · networking · distributed-systems

Prerequisites

What You’ll Learn

  • Why most internet traffic uses client-server and when P2P is a better fit
  • What NATs are, why they exist, and how they block inbound connections
  • How STUN, TURN, and UDP hole-punching overcome NAT barriers
  • How ICE orchestrates candidate gathering, connectivity checks, and path selection

1. Client-Server vs Peer-to-Peer

The vast majority of internet traffic follows the client-server model: your browser sends a request to a server with a stable, publicly routable IP address, and the server responds. Clients can be anywhere — behind home routers, on cellular networks, on corporate VPNs.

Peer-to-peer (P2P) removes the central server from the data path. Peers talk directly to each other. P2P is worth the added complexity when:

  1. Latency matters. Routing through a server adds an extra hop to every packet. In video calls and real-time collaboration, that added delay is noticeable.
  2. Bandwidth is expensive at the center. BitTorrent moves terabytes without any single server paying the bandwidth bill.
  3. Privacy or trust. P2P with end-to-end encryption (DTLS in WebRTC) means only the communicating peers see the plaintext.
  4. Resilience. No central server means no single point of failure.

| Use Case | Protocol | Why P2P |
|---|---|---|
| File sharing | BitTorrent | Bandwidth distributed across peers |
| Video/voice calls | WebRTC | Low latency, end-to-end encryption |
| Blockchain | Bitcoin/Ethereum | No central authority |
| Agent mesh | chatixia-mesh | Direct agent-to-agent, registry not in data path |

The trade-off is complexity. Establishing a direct connection between two peers behind NATs is substantially harder than connecting to a known server.


2. The NAT Problem

Why NATs exist

IPv4 provides roughly 4.3 billion addresses (2^32), which is not enough for the modern internet. Network Address Translation (NAT) began as a short-term fix and became the standard workaround: your home router assigns private IPs to local devices (e.g., 192.168.1.x) and shares a single public IP for all outbound traffic. The router maintains a mapping table to route replies back to the correct device.

Why NAT breaks inbound connections

NAT works for client-server traffic: the client initiates the connection, the NAT creates a mapping, and replies flow back. But if an external device tries to initiate a connection to your machine, there is no mapping — the router drops the packet.

This is the fundamental P2P problem: both peers are typically behind NATs, and neither can accept incoming connections from the other.
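
The mapping table described above can be sketched as a toy model. This is an illustration only, not how any particular router is implemented: the `ToyNat` class, its method names, the starting port 40000, and the example addresses are all made up for the example. It shows why replies to outbound traffic get through while unsolicited inbound packets have nowhere to go.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]  # (ip, port)


@dataclass
class ToyNat:
    """Hypothetical NAT model: one public IP, one mapping per private endpoint."""
    public_ip: str
    next_port: int = 40000  # arbitrary starting port for the example
    outbound: Dict[Endpoint, int] = field(default_factory=dict)  # private endpoint -> public port
    inbound: Dict[int, Endpoint] = field(default_factory=dict)   # public port -> private endpoint

    def send_out(self, private_src: Endpoint) -> Endpoint:
        """Rewrite the source of an outbound packet, creating a mapping if needed."""
        if private_src not in self.outbound:
            self.outbound[private_src] = self.next_port
            self.inbound[self.next_port] = private_src
            self.next_port += 1
        return (self.public_ip, self.outbound[private_src])

    def receive_in(self, public_port: int) -> Optional[Endpoint]:
        """Return the private destination for an inbound packet, or None if it is dropped."""
        return self.inbound.get(public_port)


nat = ToyNat(public_ip="203.0.113.7")
laptop = ("192.168.1.10", 51000)

mapped = nat.send_out(laptop)             # outbound traffic creates the mapping
reply_target = nat.receive_in(mapped[1])  # a reply to the mapped port reaches the laptop
unsolicited = nat.receive_in(50000)       # no mapping for this port: dropped (None)
```

The model ignores filtering by remote address, which is what distinguishes the NAT types covered next.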

Types of NAT

NATs differ in how strictly they filter inbound packets:

  • Full Cone (least restrictive) — Once a mapping is created, any external host can send to that mapped port. Easiest to traverse.
  • Restricted Cone — Only accepts inbound from IP addresses the internal host has previously sent to.
  • Port-Restricted Cone — Only the exact IP:port pair the internal host contacted can send back.
  • Symmetric (most restrictive) — A new mapping is created for every unique destination, with different external ports. STUN sees a different port than a peer would, making hole-punching very difficult. Common in enterprise networks.
  • Carrier-Grade NAT (CGNAT) — The ISP adds a second layer of NAT. Two levels of translation make hole-punching unreliable; TURN is often required.
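
The filtering rules in the list above can be written down as a small decision function. This is a simplified sketch, not a faithful model of real devices: `NatType`, `accepts_inbound`, and `mapping_key` are hypothetical names, and CGNAT is left out because it describes where the NAT sits rather than a filtering rule.

```python
from enum import Enum


class NatType(Enum):
    FULL_CONE = "full cone"
    RESTRICTED_CONE = "restricted cone"
    PORT_RESTRICTED_CONE = "port-restricted cone"
    SYMMETRIC = "symmetric"


def accepts_inbound(nat_type, contacted, sender):
    """Decide whether a packet from `sender` passes an existing mapping.

    contacted: set of (ip, port) destinations the internal host has sent to
               through this mapping.
    sender:    (ip, port) of the external host sending the packet.
    """
    if not contacted:
        return False  # no outbound traffic yet, so no mapping exists
    if nat_type is NatType.FULL_CONE:
        return True
    if nat_type is NatType.RESTRICTED_CONE:
        return sender[0] in {ip for ip, _ in contacted}
    # Port-restricted and symmetric NATs both require an exact ip:port match.
    return sender in contacted


def mapping_key(nat_type, internal, destination):
    """Cone NATs reuse one mapping per internal endpoint; symmetric NATs
    allocate a separate mapping for every destination."""
    if nat_type is NatType.SYMMETRIC:
        return (internal, destination)
    return (internal,)


stun_server = ("198.51.100.5", 3478)
peer = ("198.51.100.99", 6000)
contacted = {stun_server}

full_cone_ok = accepts_inbound(NatType.FULL_CONE, contacted, peer)
restricted_ok = accepts_inbound(NatType.RESTRICTED_CONE, contacted, peer)
same_ip_other_port = ("198.51.100.5", 9999)
restricted_same_ip = accepts_inbound(NatType.RESTRICTED_CONE, contacted, same_ip_other_port)
port_restricted_same_ip = accepts_inbound(NatType.PORT_RESTRICTED_CONE, contacted, same_ip_other_port)
```

The `mapping_key` difference is what defeats STUN on symmetric NATs: the mapping created for the STUN server is not the one a peer's packets would hit.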

3. NAT Traversal Techniques

UDP Hole-Punching

Both peers send UDP packets to each other’s public address simultaneously. Both NATs create mappings for the outbound traffic, and subsequent packets flow through. The trick: both peers need to know each other’s public IP and port first. Each peer learns its own public address from a STUN server and passes it to the other peer through a signaling server.

Hole-punching works with full cone, restricted cone, and port-restricted cone NATs. It generally fails with symmetric NAT because the port assigned for the STUN query differs from the port assigned for the peer.
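
To make the timing concrete, here is a small simulation of hole-punching between two port-restricted cone NATs. It involves no real sockets: `PortRestrictedNat` is a hypothetical class that models a single host behind each NAT, and the addresses are documentation examples.

```python
class PortRestrictedNat:
    """Toy port-restricted cone NAT with a single internal host."""

    def __init__(self, public_ip, public_port):
        self.public = (public_ip, public_port)  # the host's one external mapping
        self.allowed = set()                    # remote (ip, port) pairs the host has sent to

    def outbound(self, dest):
        """Host sends to `dest`; the NAT records it and returns the visible source."""
        self.allowed.add(dest)
        return self.public

    def inbound(self, src):
        """True if a packet from `src` is let through to the host."""
        return src in self.allowed


nat_a = PortRestrictedNat("203.0.113.7", 40000)
nat_b = PortRestrictedNat("198.51.100.9", 52000)

# Before any punching, a packet from B's public address is dropped by A's NAT.
before = nat_a.inbound(nat_b.public)

# A sends first. B's NAT has no matching outbound entry yet, so this packet is lost,
# but it opens a hole in A's NAT for B's public address.
src_a = nat_a.outbound(nat_b.public)
first_from_a_delivered = nat_b.inbound(src_a)

# B sends to A. A's NAT already allows B's address, so the packet gets through,
# and B's NAT now allows A as well.
src_b = nat_b.outbound(nat_a.public)
from_b_delivered = nat_a.inbound(src_b)

# A's retransmission now passes B's NAT: the path is open in both directions.
retry_from_a_delivered = nat_b.inbound(src_a)
```

The lost first packet is why real implementations retransmit until a response arrives.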

STUN: Session Traversal Utilities for NAT

STUN (RFC 8489) answers one question: “What is my public IP address and port?” A peer sends a Binding Request to a STUN server on the public internet; the server returns the source IP:port it sees after NAT translation. This gives the peer its server-reflexive candidate.

STUN servers are cheap to operate (small request/response pairs, no relay). Google operates free STUN servers (stun:stun.l.google.com:19302). STUN alone fails when the peer is behind symmetric NAT or UDP is entirely blocked.
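
The Binding Request and its response are simple enough to build and parse by hand. The following is a minimal sketch of the RFC 8489 wire format for IPv4 only. It does not verify the transaction ID, and it handles no retransmission, authentication, or IPv6. The function names are made up for the example, and `query_stun` is defined but not called, since it needs network access.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
XOR_MAPPED_ADDRESS = 0x0020


def build_binding_request(transaction_id=None):
    """20-byte STUN header: type, length (no attributes), magic cookie, 12-byte transaction ID."""
    tid = transaction_id or os.urandom(12)
    return struct.pack("!HHI12s", BINDING_REQUEST, 0, MAGIC_COOKIE, tid)


def parse_xor_mapped_address(message):
    """Extract (ip, port) from the XOR-MAPPED-ADDRESS attribute of a Binding success response."""
    msg_type, msg_len, cookie = struct.unpack("!HHI", message[:8])
    if msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE:
        raise ValueError("not a STUN Binding success response")
    offset, end = 20, 20 + msg_len
    while offset + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", message[offset:offset + 4])
        value = message[offset + 4:offset + 4 + attr_len]
        if attr_type == XOR_MAPPED_ADDRESS and value[1] == 0x01:  # family 0x01 = IPv4
            xport, xaddr = struct.unpack("!HI", value[2:8])
            port = xport ^ (MAGIC_COOKIE >> 16)          # port is XORed with the cookie's top 16 bits
            addr = socket.inet_ntoa(struct.pack("!I", xaddr ^ MAGIC_COOKIE))
            return addr, port
        offset += 4 + attr_len + (-attr_len % 4)         # attribute values are padded to 4 bytes
    raise ValueError("no IPv4 XOR-MAPPED-ADDRESS attribute found")


def query_stun(host="stun.l.google.com", port=19302, timeout=2.0):
    """Send one Binding Request over UDP and return the server-reflexive (ip, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_binding_request(), (host, port))
        data, _ = sock.recvfrom(2048)
    return parse_xor_mapped_address(data)
```

Calling `query_stun()` from a machine behind a NAT should return the public address and port of the mapping the NAT created for that socket.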

TURN: Traversal Using Relays around NAT

TURN (RFC 8656) is the fallback when direct connectivity is impossible. A TURN server allocates a public relay address and forwards all traffic between peers through itself. It is expensive (relays all data) and partially negates the P2P advantage, but preserves end-to-end encryption (DTLS) and ensures connectivity when nothing else works.

TURN servers typically use ephemeral credentials: the signaling server generates time-limited username/credential pairs via HMAC-SHA1, which the TURN server validates without a database lookup.
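
A minimal sketch of that credential scheme follows, based on the shared-secret convention coturn supports: the username embeds an expiry timestamp, and the credential is the Base64-encoded HMAC-SHA1 of the username under the shared secret. The function names, the `user_id` format, and the one-hour default lifetime are assumptions for illustration, not chatixia-mesh's actual code.

```python
import base64
import hashlib
import hmac
import time


def turn_credentials(secret, user_id, ttl_seconds=3600, now=None):
    """Return (username, credential) valid for `ttl_seconds` from `now`."""
    issued = int(now if now is not None else time.time())
    username = f"{issued + ttl_seconds}:{user_id}"  # "<expiry unix time>:<user id>"
    digest = hmac.new(secret.encode(), username.encode(), hashlib.sha1).digest()
    return username, base64.b64encode(digest).decode()


def verify_turn_credentials(secret, username, credential, now=None):
    """What the TURN server does: recompute the HMAC and check the expiry. No database needed."""
    expiry = int(username.split(":", 1)[0])
    current = now if now is not None else time.time()
    if expiry < current:
        return False
    digest = hmac.new(secret.encode(), username.encode(), hashlib.sha1).digest()
    expected = base64.b64encode(digest).decode()
    return hmac.compare_digest(expected, credential)


username, credential = turn_credentials("shared-secret", "agent-a", ttl_seconds=60, now=1000)
valid_now = verify_turn_credentials("shared-secret", username, credential, now=1000)
expired_later = verify_turn_credentials("shared-secret", username, credential, now=2000)
wrong_secret = verify_turn_credentials("other-secret", username, credential, now=1000)
```

Because the server only needs the shared secret to validate a credential, the signaling side can mint credentials freely, and a leaked credential expires on its own.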

In chatixia-mesh

The registry serves ICE server configuration via GET /api/config, always including STUN and optionally TURN. The sidecar reads TURN_URL and TURN_SECRET from environment variables and generates ephemeral credentials using the same HMAC-SHA1 algorithm as coturn.


4. ICE: Interactive Connectivity Establishment

ICE (RFC 8445) ties STUN, TURN, and direct connectivity together. It is an orchestration protocol that gathers all possible ways to reach a peer, tests them, and selects the best one.

Candidate gathering

ICE gathers three types of candidates:

| Type | Name | Source | Priority |
|---|---|---|---|
| host | Host candidate | Local network interface | Highest |
| srflx | Server-reflexive | STUN server response | Medium |
| relay | Relay candidate | TURN server allocation | Lowest |
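
The Highest/Medium/Lowest ordering comes from a numeric priority that each candidate carries. The sketch below uses the formula and the recommended type preferences from RFC 8445 (host 126, srflx 100, relay 0). The function name and the default local preference of 65535 are choices made for this example, and real implementations may use other values.

```python
# Recommended type preferences from RFC 8445 (peer-reflexive, 110, omitted here).
TYPE_PREFERENCE = {"host": 126, "srflx": 100, "relay": 0}


def candidate_priority(cand_type, local_preference=65535, component_id=1):
    """priority = 2^24 * type preference + 2^8 * local preference + (256 - component ID)"""
    return (
        (TYPE_PREFERENCE[cand_type] << 24)
        + (local_preference << 8)
        + (256 - component_id)
    )


host_priority = candidate_priority("host")    # 2130706431
srflx_priority = candidate_priority("srflx")  # 1694498815
relay_priority = candidate_priority("relay")  # 16777215
```

The type preference occupies the top bits, so a host candidate always outranks a server-reflexive one regardless of interface or component.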

Connectivity checks

Once both peers exchange candidates via signaling, ICE forms candidate pairs (every local candidate paired with every remote candidate) and tests them via STUN Binding Requests. Pairs are tested in priority order, so the most direct paths get the first chance to succeed.
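
Pair formation and ordering can be sketched in a few lines. The pair priority formula is the one given in RFC 8445, where G is the controlling agent's candidate priority and D is the controlled agent's. Everything else is simplified for illustration: real ICE only pairs candidates of the same component and address family, prunes redundant pairs, and paces its checks. `ordered_pairs`, `select_pair`, and the `check` callback are hypothetical names.

```python
from itertools import product


def pair_priority(controlling, controlled):
    """pair priority = 2^32 * min(G, D) + 2 * max(G, D) + (1 if G > D else 0)"""
    g, d = controlling, controlled
    return (min(g, d) << 32) + 2 * max(g, d) + (1 if g > d else 0)


def ordered_pairs(local, remote):
    """Pair every local candidate with every remote one, highest priority first.

    local/remote: dicts mapping a candidate label to its candidate priority.
    The local side is assumed to be the controlling agent.
    """
    pairs = [
        ((l_name, r_name), pair_priority(l_prio, r_prio))
        for (l_name, l_prio), (r_name, r_prio) in product(local.items(), remote.items())
    ]
    pairs.sort(key=lambda item: item[1], reverse=True)
    return [names for names, _ in pairs]


def select_pair(pairs, check):
    """Return the highest-priority pair whose connectivity check succeeds, or None."""
    for pair in pairs:
        if check(pair):
            return pair
    return None


# Candidate priorities as computed from the RFC 8445 candidate formula.
candidates = {"host": 2130706431, "srflx": 1694498815, "relay": 16777215}
checklist = ordered_pairs(candidates, candidates)

# Pretend host addresses are unreachable (peers on different private networks).
selected = select_pair(checklist, lambda pair: "host" not in pair)
```

In this run the checklist starts with host-to-host and ends with relay-to-relay, and the simulated failure of every host pair leaves srflx-to-srflx as the selected path.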

Path selection

ICE ranks pairs by candidate type (host > srflx > relay), network interface, and component ID. The controlling agent then nominates a pair that passed its connectivity check, normally the highest-priority one, and that pair becomes the selected pair. If the path degrades later, an ICE restart gathers fresh candidates and can switch to a better pair.

In chatixia-mesh

The sidecar’s webrtc_peer.rs creates an RTCPeerConnection with ICE servers from the environment. As candidates are gathered, each is sent to the remote peer through the signaling server. The full flow:

  Sidecar A               Registry (signaling)                  Sidecar B
     |                              |                               |
     |-- register ----------------->|<---------------- register ----|
     |<-- peer_list [B] ------------|-- peer_list [A] ------------->|
     |-- SDP offer (target: B) ---->|-- SDP offer ----------------->|
     |<-------------- SDP answer ---|<-- SDP answer (target: A) ----|
     |-- ICE candidates ----------->|-- ICE candidates ------------>|
     |<-------------- ICE cand. ----|<-- ICE candidates ------------|
     |                              |                               |
     |<======= DataChannel (direct P2P, DTLS) =====================>|

5. Signaling: The Bootstrap Problem

To establish a P2P connection, two peers need to exchange SDP offers/answers and ICE candidates. But they cannot exchange information until they have a connection. This is the bootstrap problem.

A signaling server is a rendezvous point where peers discover each other and exchange connection metadata. It is not part of the data path — once the P2P connection is up, the signaling server could shut down and existing connections would continue.

The signaling mechanism is intentionally left unspecified by WebRTC. Any transport works: WebSocket, HTTP long-polling, carrier pigeon.
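
Since the transport is unspecified, the core of a signaling server is just "deliver this message to that peer". Here is an in-memory sketch in which lists stand in for WebSocket connections. The class, the method names, and the `type`, `target`, `from`, and `payload` fields are assumptions for illustration. The message type names mirror the ones chatixia-mesh uses, but this is not its wire format.

```python
class SignalingRelay:
    """Toy rendezvous point: tracks peers and forwards opaque messages between them."""

    def __init__(self):
        self.inboxes = {}  # peer_id -> list of delivered messages (stand-in for a socket)

    def register(self, peer_id):
        """Add a peer and send everyone an updated list of the other peers."""
        self.inboxes[peer_id] = []
        for pid, inbox in self.inboxes.items():
            others = sorted(p for p in self.inboxes if p != pid)
            inbox.append({"type": "peer_list", "peers": others})

    def relay(self, sender, message):
        """Forward a message to its target without inspecting the payload."""
        target = message["target"]
        if target not in self.inboxes:
            return False  # unknown or disconnected peer
        self.inboxes[target].append({**message, "from": sender})
        return True


relay = SignalingRelay()
relay.register("A")
relay.register("B")

relay.relay("A", {"type": "sdp_offer", "target": "B", "payload": "<offer SDP>"})
relay.relay("B", {"type": "sdp_answer", "target": "A", "payload": "<answer SDP>"})
relay.relay("A", {"type": "ice_candidate", "target": "B", "payload": "<candidate>"})

delivered_to_b = [m["type"] for m in relay.inboxes["B"]]
delivered_to_a = [m["type"] for m in relay.inboxes["A"]]
```

The relay never parses SDP or candidates, so it stays small and sits outside the data path.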

In chatixia-mesh

The registry (signaling.rs) serves as the signaling server. Sidecars connect via WebSocket, authenticated with JWT. The registry tracks peers, sends peer_list messages, and relays sdp_offer, sdp_answer, and ice_candidate messages between peers. The three connectivity tiers degrade gracefully:

| Tier | Path | Latency | When used |
|---|---|---|---|
| 1 | Direct P2P DataChannel | <100ms | Open UDP path (same LAN or cooperating NATs) |
| 2 | TURN relay | 50-200ms | NAT/firewall blocks direct UDP |
| 3 | HTTP task queue (via registry) | 3-15s | All UDP blocked, no TURN configured |

The system degrades in speed rather than failing outright, as long as the registry itself stays reachable over HTTP.
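
The fallback order can be summarized as a small decision function. This is a conceptual sketch only. In practice ICE chooses between the direct and relayed paths on its own, and how chatixia-mesh decides to fall back to HTTP is not shown here. The function and parameter names are hypothetical.

```python
TIER_NAMES = {
    1: "Direct P2P DataChannel",
    2: "TURN relay",
    3: "HTTP task queue (via registry)",
}


def pick_tier(direct_udp_ok, turn_configured, turn_reachable):
    """Return the best available tier number, preferring lower latency."""
    if direct_udp_ok:
        return 1
    if turn_configured and turn_reachable:
        return 2
    return 3  # always available while the registry answers HTTP


same_lan = pick_tier(direct_udp_ok=True, turn_configured=False, turn_reachable=False)
strict_nat = pick_tier(direct_udp_ok=False, turn_configured=True, turn_reachable=True)
udp_blocked = pick_tier(direct_udp_ok=False, turn_configured=True, turn_reachable=False)
```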


Exercises

  1. Determine your NAT type. Use a WebRTC test page or STUN client to query stun.l.google.com:19302. What is your public IP:port? Does the port change when you query a second STUN server from the same local socket (indicating symmetric NAT)?

  2. ICE on the same LAN. Trace the ICE sequence for two agents on 192.168.1.x. Which candidate pair succeeds first? Does STUN/TURN play any role?

  3. ICE across networks. Trace ICE for one agent behind a home NAT and another on a VPS with a public IP. Which candidate pair wins? When would TURN be needed?

  4. Why TURN is necessary. Give two network configurations where hole-punching fails. Compare running a TURN server vs. accepting Tier 3 HTTP latency for different workload types.


  • Lesson 01: Networking Foundations — prerequisite.
  • Lesson 03: WebRTC Deep Dive — DTLS, SCTP, DataChannels, the full WebRTC stack.
  • Lesson 04: The Sidecar Pattern — why chatixia-mesh separates WebRTC into a Rust sidecar.
