Lesson 09 — AI Agent Architecture: Skills, LLMs, and Orchestration

Prerequisites: Lesson 06 — Inter-Process Communication, Lesson 07 — Application Protocol Design

1. What Is an AI Agent?

A chatbot responds to prompts. An agent acts.

The difference is the loop. A chatbot takes input, produces output, and stops. An agent takes input, decides what to do, executes an action, observes the result, and decides again. This cycle — reason, act, observe — repeats until the task is complete or a limit is reached.

Modern LLM-based agents follow a pattern called the tool-use loop (sometimes called ReAct — Reasoning + Acting):

The LLM receives a prompt and a list of available tools (functions with descriptions and parameter schemas).
Instead of answering directly, the LLM generates a tool call — a structured request to invoke a specific function with specific arguments.
The agent runtime executes the function and feeds the result back to the LLM.
The LLM incorporates the result and either makes another tool call or produces a final response.

This is the fundamental mechanism that separates agents from chatbots. The LLM does not execute code — it describes what to execute, and the runtime does the work.

MCP (Model Context Protocol) standardizes how tools are exposed to LLMs via a JSON-RPC protocol for tool discovery, invocation, and resource access. chatixia-mesh’s skill system predates MCP and can coexist with it.

Multiple agents that can discover each other’s tools and delegate work are where chatixia-mesh operates: it provides infrastructure for agents to find peers, learn their capabilities, and send them work over peer-to-peer DataChannels.

2. The Skill Model

In chatixia-mesh, an agent’s capabilities are expressed as skills — named functions with parameter schemas that other agents (or the hub dashboard) can invoke. A skill has three parts: a name, a parameter schema, and a handler function.

Skill Registration and Discovery

When an agent starts, it registers with the registry via POST /api/registry/agents, advertising its skills in the capabilities field. Other agents can then query GET /api/registry/route?skill=delegate and the registry responds with the best agent for that skill.

This is the control plane in action. The registry knows which agents have which skills and routes requests accordingly. Actual task execution happens over the data plane — either P2P DataChannels or the HTTP task queue fallback.

The SKILL_HANDLERS Registry

Inside the agent, skills are wired through a dictionary in runner.py:

SKILL_HANDLERS: dict[str, Callable[..., str | Awaitable[str]]] = {
    "list_agents": handle_list_agents,
    "find_agent": handle_find_agent,
    "delegate": handle_delegate,
    "mesh_send": handle_mesh_send,
    "mesh_broadcast": handle_mesh_broadcast,
    "user_intervention": handle_user_intervention,
}

When a task arrives, the runner looks up the skill name and calls the corresponding handler.

3. Built-in Skills

chatixia-mesh ships six built-in skills in two categories: discovery (control plane) and communication (data plane).

list_agents — Lists all registered agents via GET /api/registry/agents. Always HTTP.

find_agent — Finds the best agent for a specific skill via GET /api/registry/route?skill=X. Always HTTP.

delegate — The most important skill. Sends a task to another agent and waits for the result using a two-tier transport strategy: try P2P DataChannel first, fall back to the HTTP task queue. The P2P path uses MeshClient.request() for correlated request/response matching. With wait=False, it fires and forgets.

mesh_send — Fire-and-forget direct message to a specific agent using MeshClient.send(). Falls back to HTTP.

mesh_broadcast — Broadcasts a message to every connected peer using MeshClient.broadcast(). The HTTP fallback discovers all agents via the registry and submits individual tasks.

user_intervention — Handles free-form messages from the hub dashboard’s intervention panel. The simplest skill — acknowledges the message.

+-------------------+-------+--------+-----------------------------+
| Skill             | Async | Plane  | Purpose                     |
+-------------------+-------+--------+-----------------------------+
| list_agents       |  No   | Ctrl   | List all mesh agents        |
| find_agent        |  No   | Ctrl   | Route by skill name         |
| delegate          |  Yes  | Data   | Task with response          |
| mesh_send         |  Yes  | Data   | Direct message (one peer)   |
| mesh_broadcast    |  Yes  | Data   | Broadcast (all peers)       |
| user_intervention |  No   | Data   | Human-in-the-loop message   |
+-------------------+-------+--------+-----------------------------+

Sync vs Async Handlers

list_agents and find_agent are synchronous (def) because they only make HTTP requests to the registry. delegate, mesh_send, and mesh_broadcast are asynchronous (async def) because they use the MeshClient for P2P communication via the sidecar’s IPC socket.

The runner handles both uniformly:

result = handler(**task_payload)
if asyncio.iscoroutine(result):
    result = await result

This evolution is documented in ADR-005 (original sync-only design) and ADR-016 (async P2P upgrade), which dropped latency from 3—15 seconds to sub-second for connected peers.

4. Agent Configuration

Every chatixia agent is defined by an agent.yaml manifest. Key fields include identity (name, description), registry URL, LLM provider (azure | openai | ollama), system prompt, sidecar configuration (binary path, API key, socket path), and skills lists (builtin, dirs, disabled).

Important design choices: the provider field abstracts three LLM backends with credentials from environment variables; each agent gets a unique socket path to avoid collisions; and skills_disabled gives operators fine-grained control without modifying code.

Sensitive values (API keys, LLM credentials) live in a .env file loaded via python-dotenv. The runner exports environment variables like REGISTRY_URL and CHATIXIA_AGENT_ID so skill handlers can work without depending on internal types.

5. Agent Lifecycle

An agent goes through five phases:

  chatixia run agent.yaml
         |
  1. LOAD CONFIG   -- Parse agent.yaml, load .env, validate
  2. REGISTER      -- POST /api/registry/agents (name, skills, peer_id)
  3. CONNECT MESH  -- Spawn sidecar, connect IPC, register P2P handler
  4. HEARTBEAT     -- Every 15s: POST heartbeat, pick up tasks
  5. SHUTDOWN      -- DELETE agent, stop sidecar, close IPC

Registration announces the agent to the registry with actionable error messages if the registry is unreachable.

Mesh connection creates a MeshClient that spawns the Rust sidecar, waits for the IPC socket, and registers a handler for incoming P2P task requests.

The heartbeat loop serves dual purpose — keeping the agent alive in the registry and picking up HTTP-queued tasks. Tasks are dispatched with asyncio.create_task(), so multiple tasks execute concurrently without blocking heartbeats.

Shutdown on SIGINT/SIGTERM deregisters from the registry immediately (preventing 90-second staleness) and terminates the sidecar.

6. Task Execution: Two Paths

Tasks reach an agent through two independent paths:

Path A: P2P DataChannel (sub-second). Agent A’s delegate handler sends a task_request MeshMessage through the sidecar DataChannel. Agent B’s sidecar delivers it via IPC, the handler executes the skill, and a task_response flows back. The registry is not involved.

Path B: HTTP Task Queue (3—15 seconds). When peers lack a DataChannel, Agent A submits via POST /api/hub/tasks. Agent B picks it up during its next heartbeat, executes, and posts the result back. Agent A polls for the result.

Both paths converge on the same skill execution logic. The _mesh_client is injected into handler kwargs by the runner.

7. Multi-Agent Collaboration

Agents collaborate through a discover-delegate-execute pattern:

Discovery — Agent A calls find_agent(skill="web_search") to locate a capable peer
Delegation — Agent A calls delegate(message="...", target_agent_id="researcher-01")
Execution — Agent B receives the task, executes the skill handler, returns the result
Correlation — MeshClient.request() matches the response by request_id using an asyncio Future

Agents can be specialized through their prompt field: coordinators that break tasks into subtasks and delegate, researchers that gather information, analysts that process data, and workers that execute narrow tasks.

The chatixia CLI supports the full workflow: init (scaffold), validate (check manifest), pair (redeem invite code), and run (start the agent lifecycle).

Summary

An AI agent is an LLM-powered system that reasons, acts via tools, and observes results in a loop.
chatixia-mesh models capabilities as skills — named handlers registered in SKILL_HANDLERS and advertised to the registry.
Six built-in skills cover discovery, communication, delegation, and human interaction.
Sync handlers are for HTTP control plane; async handlers are for P2P data plane. The runner handles both uniformly.
The agent lifecycle is: load config, register, connect mesh, heartbeat loop, shutdown.
Multi-agent collaboration follows discover-delegate-execute, with P2P DataChannels as primary transport and HTTP as fallback.

Previous: Lesson 08: Authentication and Security | Next: Lesson 10: The Sidecar Pattern