IPC Design -- Bridging Languages with Unix Sockets
Lesson 06: IPC Design — Bridging Languages with Unix Sockets
Prerequisites: Lesson 01 — Why Distributed Systems, Lesson 04 — Async Programming Patterns
Key source files:
| File | What to study |
|---|---|
| sidecar/src/ipc.rs | Rust side: Unix socket server, read loop, command dispatch |
| sidecar/src/protocol.rs | IpcMessage struct, ipc_types constants |
| agent/chatixia/core/mesh_client.py | Python side: MeshClient, _send_ipc, _listen_loop, request() |
Introduction
The chatixia-mesh sidecar pattern splits the system into two processes: a Rust sidecar (WebRTC networking) and a Python agent (application logic). These processes need to communicate across a language and runtime boundary. The solution is Unix domain sockets with a JSON-lines protocol.
1. Why IPC?
When the sidecar pattern creates a process boundary, you need an inter-process communication mechanism. Cross-process calls require serialization, a transport, error handling, and a shared protocol.
chatixia-mesh uses Unix domain sockets because:
- Same machine, always. The sidecar is a companion process on the same host. No TCP overhead needed.
- Bidirectional. Both sides initiate messages — the agent sends commands, the sidecar pushes events.
- Language independent. Any language that can open a socket can participate.
- Low overhead. Unix sockets bypass the TCP/IP stack — no routing, checksums, or network interface processing.
Alternatives were rejected for specific reasons: shared memory has incompatible memory models across Rust/Python; pipes are awkward for bidirectional use; TCP adds unnecessary network stack overhead for localhost; gRPC adds .proto management and code generation overhead disproportionate to the protocol’s 8 message types.
2. Unix Domain Sockets
A Unix domain socket is a communication endpoint in the filesystem:
/tmp/chatixia-sidecar.sock
It is a special socket file (type s in ls -la) that the kernel uses as a rendezvous point between processes. Key differences from TCP:
| Property | TCP (localhost) | Unix domain socket |
|---|---|---|
| Addressing | IP:port | Filesystem path |
| Kernel path | Full network stack | Direct kernel buffer copy |
| Latency | ~10 µs | ~1 µs |
| Cross-machine | Yes | No |
| Security | IP-based ACLs | File permissions (owner, group, mode) |
Unix sockets inherit filesystem security — setting permissions to 0600 restricts access to the owner. This is simpler than firewall rules for same-machine communication.
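To make the permission point concrete, here is a minimal Python sketch of binding a Unix socket that only its owner can reach. The function name `bind_private_socket` and the temporary path are illustrative assumptions, not code from chatixia-mesh (where the Rust sidecar owns the listening socket).

```python
import os
import socket
import stat
import tempfile


def bind_private_socket(path: str) -> socket.socket:
    """Bind a Unix stream socket at `path`, accessible only to its owner.

    Hypothetical helper for illustration; not part of chatixia-mesh.
    """
    # Remove a stale socket file left behind by a previous run, if any.
    try:
        os.unlink(path)
    except FileNotFoundError:
        pass

    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    # Tighten the umask during bind() so the file is never briefly
    # accessible to other users before chmod() runs.
    old_umask = os.umask(0o177)
    try:
        server.bind(path)
    finally:
        os.umask(old_umask)
    os.chmod(path, 0o600)  # owner read/write only
    server.listen(1)       # one agent per sidecar
    return server


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sock_path = os.path.join(tmp, "demo.sock")
        srv = bind_private_socket(sock_path)
        mode = os.stat(sock_path).st_mode
        print(stat.S_ISSOCK(mode), oct(stat.S_IMODE(mode)))
        srv.close()
```

Changing the umask around `bind()` is a design choice in this sketch: it closes the short window in which the socket file would otherwise exist with default permissions.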
3. JSON-Lines Protocol
Unix sockets provide a byte stream without message boundaries. You need a framing protocol to mark where one message ends and the next begins. chatixia-mesh uses JSON-lines: one JSON object per line, terminated by \n.
The parsing logic on both sides is simply:
loop:
    line = read_until('\n')
    message = json_parse(line)
    handle(message)
JSON-lines was chosen for debuggability (readable with standard tools), language independence (every language has a JSON parser), and simplicity (no code generation). The trade-off — slower parsing than binary formats — is negligible for chatixia-mesh’s low-frequency control messages.
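The framing rule can be sketched in a few lines of Python. The names `encode_message` and `LineDecoder` are hypothetical; the sketch only assumes the rule stated above, one JSON object per line terminated by `\n`, and shows why a receive buffer is needed when a read returns only part of a line.

```python
import json


def encode_message(message: dict) -> bytes:
    """Serialize one message as a single JSON line."""
    # json.dumps escapes newlines inside strings, so the only raw "\n"
    # on the wire is the terminator appended here.
    return json.dumps(message, separators=(",", ":")).encode("utf-8") + b"\n"


class LineDecoder:
    """Accumulates bytes from a stream and yields complete JSON messages."""

    def __init__(self) -> None:
        self._buffer = b""

    def feed(self, chunk: bytes) -> list:
        """Add received bytes; return every message completed so far."""
        self._buffer += chunk
        messages = []
        while b"\n" in self._buffer:
            line, self._buffer = self._buffer.split(b"\n", 1)
            if line.strip():  # tolerate blank lines
                messages.append(json.loads(line))
        return messages


decoder = LineDecoder()
wire = encode_message({"type": "list_peers", "payload": {}})
first = decoder.feed(wire[:10])   # partial line: no message yet
rest = decoder.feed(wire[10:])    # terminator arrives: one message
```

Because a stream socket may deliver a line in several pieces, or several lines in one read, the decoder keeps leftover bytes between calls instead of assuming one read equals one message.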
4. The IPC Protocol
The IPC protocol defines eight message types: four commands (agent to sidecar) and four events (sidecar to agent).
Message Structure
Every IPC message has the same shape:
{"type": "<message_type>", "payload": { ... }}
In Rust (sidecar/src/protocol.rs):
#[derive(Debug, Serialize, Deserialize, Clone)]
pub struct IpcMessage {
    #[serde(rename = "type")]
    pub msg_type: String,
    #[serde(default)]
    pub payload: serde_json::Value,
}
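For comparison, a Python mirror of this struct might look like the sketch below. It is an illustration, not the class used in mesh_client.py; the method names `from_line` and `to_line` are assumptions. A missing payload maps to `None` because `#[serde(default)]` on a `serde_json::Value` field yields JSON `null`.

```python
import json
from dataclasses import dataclass
from typing import Any


@dataclass
class IpcMessage:
    """Hypothetical Python counterpart of the Rust IpcMessage struct."""

    msg_type: str        # serialized under the JSON key "type"
    payload: Any = None  # absent payload behaves like JSON null

    @classmethod
    def from_line(cls, line: str) -> "IpcMessage":
        """Parse one JSON line; a missing "payload" key becomes None."""
        raw = json.loads(line)
        return cls(msg_type=raw["type"], payload=raw.get("payload"))

    def to_line(self) -> str:
        """Serialize to one newline-terminated JSON line."""
        return json.dumps({"type": self.msg_type, "payload": self.payload}) + "\n"


msg = IpcMessage.from_line('{"type": "list_peers"}')
round_trip = IpcMessage.from_line(
    IpcMessage("connect", {"target_peer_id": "peer-abc"}).to_line()
)
```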
Agent-to-Sidecar Commands
| Command | Example | Purpose |
|---|---|---|
| send | {"type":"send","payload":{"target_peer":"peer-abc","message":{...}}} | Send to a specific peer |
| broadcast | {"type":"broadcast","payload":{"message":{...}}} | Send to all peers |
| list_peers | {"type":"list_peers","payload":{}} | Request connected peer list |
| connect | {"type":"connect","payload":{"target_peer_id":"peer-abc"}} | Initiate WebRTC connection |
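The four commands in the table can be built with small helper functions. The sketch below is a hypothetical illustration of the wire shapes; the helper names are not taken from mesh_client.py, and the payload field names come from the table above.

```python
import json


def send_command(target_peer: str, message: dict) -> dict:
    """Build a "send" command addressed to one peer."""
    return {"type": "send", "payload": {"target_peer": target_peer, "message": message}}


def broadcast_command(message: dict) -> dict:
    """Build a "broadcast" command for all connected peers."""
    return {"type": "broadcast", "payload": {"message": message}}


def list_peers_command() -> dict:
    """Build a "list_peers" command; the payload is empty."""
    return {"type": "list_peers", "payload": {}}


def connect_command(target_peer_id: str) -> dict:
    """Build a "connect" command; note the key is target_peer_id."""
    return {"type": "connect", "payload": {"target_peer_id": target_peer_id}}


def to_wire(command: dict) -> bytes:
    """Encode a command as one newline-terminated JSON line."""
    return (json.dumps(command, separators=(",", ":")) + "\n").encode("utf-8")


wire = to_wire(connect_command("peer-abc"))
```

One detail worth noticing in the table: `send` addresses the peer as `target_peer`, while `connect` uses `target_peer_id`. Keeping each shape in a single helper avoids mixing the two up.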
Sidecar-to-Agent Events
| Event | Example | Purpose |
|---|---|---|
| message | {"type":"message","payload":{"from_peer":"peer-abc","message":{...}}} | Received a message from a peer |
| peer_connected | {"type":"peer_connected","payload":{"peer_id":"peer-abc"}} | DataChannel established |
| peer_disconnected | {"type":"peer_disconnected","payload":{"peer_id":"peer-abc"}} | DataChannel closed |
| peer_list | {"type":"peer_list","payload":{"peers":["peer-abc","peer-def"]}} | Response to list_peers |
Rust-Side Processing
The serve function in sidecar/src/ipc.rs accepts a single connection (one agent per sidecar), splits the socket into read/write halves, and runs two concurrent tasks: the read path processes agent commands in the main loop, while a spawned task forwards sidecar events via an mpsc channel.
Key design points: stale socket files are cleaned up before bind() to prevent “address already in use” errors. The handle_agent_command function dispatches by type — send extracts the target and message for the DataChannel, list_peers pushes a peer_list event back through the write channel.
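The two-task structure is easier to see in code. The sketch below is a Python asyncio analogue, not a translation of ipc.rs: an `asyncio.Queue` stands in for the mpsc channel, and `handle_agent`, `demo`, and the peer names are assumptions made for illustration.

```python
import asyncio
import json
import os
import tempfile


async def handle_agent(reader, writer, peers):
    """Serve one agent connection: read commands, push events."""
    events: asyncio.Queue = asyncio.Queue()  # stands in for the mpsc channel

    async def forward_events():
        # Writer task: the only place that writes to the socket.
        while True:
            event = await events.get()
            writer.write((json.dumps(event) + "\n").encode("utf-8"))
            await writer.drain()

    forwarder = asyncio.create_task(forward_events())
    try:
        # Read loop: one JSON object per line until the agent closes the socket.
        while line := await reader.readline():
            command = json.loads(line)
            if command.get("type") == "list_peers":
                reply = {"type": "peer_list", "payload": {"peers": list(peers)}}
                await events.put(reply)
            # "send", "broadcast", and "connect" would be dispatched here.
    finally:
        forwarder.cancel()
        writer.close()


async def demo():
    """Start the server, send list_peers, and return the reply."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.sock")
        server = await asyncio.start_unix_server(
            lambda r, w: handle_agent(r, w, ["peer-abc", "peer-def"]), path=path
        )
        reader, writer = await asyncio.open_unix_connection(path)
        writer.write(b'{"type":"list_peers","payload":{}}\n')
        await writer.drain()
        reply = json.loads(await asyncio.wait_for(reader.readline(), timeout=2.0))
        writer.close()
        await writer.wait_closed()
        server.close()
        return reply


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Routing every outgoing event through one queue and one writer task means command handlers never write to the socket directly, so two events cannot interleave in the middle of a line.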
5. Request/Response Correlation
The IPC protocol is message-oriented, but the Python agent sometimes needs request-response semantics. MeshClient.request() implements this using request_id correlation:
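async def request(self, target_peer, message, timeout=30.0):
    # Assign a correlation ID unless the caller already set one.
    if not message.request_id:
        message.request_id = uuid.uuid4().hex[:12]
    # `loop` is not defined in this excerpt; the running event loop is assumed.
    loop = asyncio.get_running_loop()
    future = loop.create_future()
    self._pending_responses[message.request_id] = future
    await self.send(target_peer, message)
    try:
        return await asyncio.wait_for(future, timeout=timeout)
    finally:
        # Drop the pending entry on success, timeout, or exception.
        self._pending_responses.pop(message.request_id, None)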
The flow: (1) generate a 12-character hex ID if the message does not already carry a request_id, (2) create an asyncio.Future and store it in _pending_responses keyed by that ID, (3) send the message, (4) await the future with a timeout (30 seconds by default).
When a response arrives, _dispatch checks for a matching request_id:
if req_id and req_id in self._pending_responses:
    self._pending_responses[req_id].set_result(inner)
    return
This resolves the future, waking the coroutine that called request(). The finally block cleans up the pending entry regardless of success, timeout, or exception.
This is a standard pattern — HTTP/2 uses stream IDs, JSON-RPC uses an id field, AMQP has correlation_id.
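The correlation pattern can be exercised without any socket. The sketch below is a self-contained illustration using plain dicts; `Correlator`, `fake_send`, and the message fields other than `request_id` are hypothetical and do not reproduce MeshClient.

```python
import asyncio
import uuid


class Correlator:
    """Matches responses to in-flight requests by request_id."""

    def __init__(self) -> None:
        self._pending = {}  # request_id -> asyncio.Future

    async def request(self, send, message: dict, timeout: float = 30.0):
        """Send `message` and wait for the response carrying the same ID."""
        request_id = message.setdefault("request_id", uuid.uuid4().hex[:12])
        future = asyncio.get_running_loop().create_future()
        self._pending[request_id] = future
        try:
            await send(message)
            return await asyncio.wait_for(future, timeout=timeout)
        finally:
            # Runs on success, timeout, or a failed send.
            self._pending.pop(request_id, None)

    def dispatch(self, response: dict) -> bool:
        """Resolve the matching future; return False if nobody is waiting."""
        future = self._pending.get(response.get("request_id"))
        if future is None or future.done():
            return False
        future.set_result(response)
        return True


async def demo():
    correlator = Correlator()

    async def fake_send(message):
        # Stand-in for the remote peer: reply on the next loop iteration.
        reply = {"request_id": message["request_id"], "body": "pong"}
        asyncio.get_running_loop().call_soon(correlator.dispatch, reply)

    return await correlator.request(fake_send, {"body": "ping"}, timeout=1.0)


if __name__ == "__main__":
    print(asyncio.run(demo())["body"])
```

In this sketch the send happens inside the `try` block, so a failed send also removes the pending entry. That is a choice of the sketch rather than a description of the original method.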
6. Lifecycle Management
The Python agent manages the sidecar’s full lifecycle.
Startup
MeshClient.start() orchestrates three phases:
- Cleanup: Remove stale socket files from previous crashes.
- Spawn and wait: Resolve the sidecar binary (configured path, then SIDECAR_BINARY env var, then PATH lookup), spawn it via subprocess.Popen, then poll for the socket file every 100ms for up to 5 seconds. If the sidecar crashes during startup, its stderr is captured and included in the error.
- Connect: Open an asyncio Unix connection and start the listen loop as a background task.
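The readiness check in the second phase can be sketched as a polling loop. `wait_for_socket` is a hypothetical helper written for illustration; the 100 ms interval and 5 second limit come from the description above, while the exception types and message wording are assumptions.

```python
import os
import subprocess
import time
from typing import Optional


def wait_for_socket(
    path: str,
    proc: Optional[subprocess.Popen] = None,
    timeout: float = 5.0,
    interval: float = 0.1,
) -> None:
    """Block until `path` exists, the process exits, or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if os.path.exists(path):
            return  # the sidecar has bound its socket
        if proc is not None and proc.poll() is not None:
            # The sidecar died during startup: surface its stderr.
            stderr = b""
            if proc.stderr is not None:
                stderr = proc.stderr.read()
            detail = stderr.decode("utf-8", errors="replace").strip()
            raise RuntimeError(
                f"sidecar exited with code {proc.returncode}: {detail}"
            )
        time.sleep(interval)
    raise TimeoutError(f"socket {path} did not appear within {timeout}s")
```

Capturing stderr requires spawning the process with `stderr=subprocess.PIPE`. This version blocks the calling thread; inside a coroutine the same loop would use `await asyncio.sleep(interval)` instead of `time.sleep`.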
Binary Resolution
The three-stage lookup serves different deployments:
| Stage | Source | Use case |
|---|---|---|
| 1. Configured path | agent.yaml | Development: target/release/chatixia-sidecar |
| 2. Environment variable | SIDECAR_BINARY | Docker/CI |
| 3. PATH lookup | shutil.which() | Production: /usr/local/bin |
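A three-stage lookup of this kind might be written as follows. This is a sketch under assumptions: the function name `resolve_sidecar_binary` is hypothetical, the default binary name is inferred from the path in the table, and falling through to the next stage when a configured path is not executable is a choice of the sketch.

```python
import os
import shutil
from typing import Optional


def resolve_sidecar_binary(
    configured_path: Optional[str] = None,
    binary_name: str = "chatixia-sidecar",
) -> str:
    """Return the sidecar path: config first, then env var, then PATH."""
    # Stages 1 and 2: explicit paths, in priority order.
    for candidate in (configured_path, os.environ.get("SIDECAR_BINARY")):
        if candidate and os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    # Stage 3: search the directories listed in PATH.
    found = shutil.which(binary_name)
    if found:
        return found
    raise FileNotFoundError(
        f"could not find {binary_name}: set the configured path, "
        "SIDECAR_BINARY, or install it on PATH"
    )
```

Checking that each candidate is an executable file before returning it turns a misconfigured path into a clear error at resolution time rather than a failure inside subprocess.Popen.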
Shutdown
stop() proceeds in order: set _connected = False, cancel the listen task, close the socket writer (sidecar sees EOF), then terminate() the sidecar process with a 5-second wait.
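The final step, terminating the process with a bounded wait, can be sketched as below. `stop_sidecar` is a hypothetical helper; escalating to `kill()` after the 5-second wait is an assumption of this sketch and is not stated for the original stop().

```python
import subprocess


def stop_sidecar(proc: subprocess.Popen, timeout: float = 5.0) -> int:
    """Ask the process to exit, wait up to `timeout`, and return its exit code."""
    if proc.poll() is None:      # still running
        proc.terminate()         # SIGTERM on POSIX
        try:
            proc.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            # Assumption: force-kill a sidecar that ignores SIGTERM.
            proc.kill()
            proc.wait()
    return proc.returncode
```

Closing the socket writer before this step gives the sidecar a chance to see EOF and exit on its own, in which case `poll()` already reports an exit code and no signal is sent.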
Putting It All Together
The IPC layer bridges two worlds: Rust with WebRTC APIs, and Python with AI logic. It consists of:
- A Unix domain socket for low-latency, same-machine communication
- A JSON-lines framing protocol for simplicity and debuggability
- An 8-message IPC protocol (4 commands, 4 events) with clean separation of concerns
- Request/response correlation via request_id and asyncio.Future
- Lifecycle management for binary resolution, spawning, readiness detection, and shutdown
A single request() call hides the whole path: the agent generates a request ID, serializes the message to JSON, and writes it to the Unix socket; the sidecar forwards it over a DataChannel; the remote peer responds; the sidecar writes the response back over IPC; the listen loop matches the request ID; and the Future resolves. All of this happens asynchronously, without the agent knowing about WebRTC, DTLS, or ICE.