Technically speaking

The landing page is written for humans. This page is for the curious. Here's what's actually happening under the hood.

Orchestrator-Worker Architecture

a.k.a. “One lead, many specialists”

lmkgpt implements the orchestrator-worker pattern described in Anthropic's agent design guidelines. A single coordinator agent receives your prompt, decomposes it into discrete subtasks, and delegates each to a purpose-built sub-agent. The coordinator never performs the work itself; it plans, assigns, and synthesizes.

  • The coordinator uses structured output to generate a typed execution plan with agent definitions, objectives, and dependency ordering.
  • Each sub-agent receives an isolated context window containing only its assigned objective and relevant background, with no cross-agent bleed.
  • The coordinator synthesizes all sub-agent outputs into a single coherent response, resolving conflicts and filling gaps.
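The typed execution plan above can be sketched as a small TypeScript structure. The names here (`AgentSpec`, `dependsOn`, and so on) are illustrative, not lmkgpt's actual schema; the dependency-ordering step is shown as a standard topological sort:

```typescript
// Hypothetical shape of the coordinator's typed execution plan
// (field names are illustrative, not lmkgpt's real schema).
interface AgentSpec {
  id: string;
  role: string;        // e.g. "researcher", "critic"
  objective: string;   // the single task this agent owns
  dependsOn: string[]; // agent ids whose output this agent needs first
}

interface ExecutionPlan {
  agents: AgentSpec[];
}

// Return agent ids in an order that respects dependencies,
// or throw if the plan contains a cycle.
function executionOrder(plan: ExecutionPlan): string[] {
  const order: string[] = [];
  const state = new Map<string, "visiting" | "done">();
  const byId = new Map(plan.agents.map((a) => [a.id, a]));

  function visit(id: string): void {
    if (state.get(id) === "done") return;
    if (state.get(id) === "visiting") throw new Error(`cycle at ${id}`);
    state.set(id, "visiting");
    for (const dep of byId.get(id)!.dependsOn) visit(dep);
    state.set(id, "done");
    order.push(id);
  }

  for (const agent of plan.agents) visit(agent.id);
  return order;
}
```

Because the plan is structured output from the model, validating and ordering it like this is cheap, and a malformed plan fails before any sub-agent is dispatched.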

Parallel Sub-Agent Execution

a.k.a. “They all work at once”

Sub-agents execute concurrently, not sequentially. Each agent runs as an independent LLM call with its own system prompt, context, and objective. Results stream back to the UI in real time via server-sent events.

  • Agents are dispatched using Promise.allSettled, so a single agent failure doesn't block others.
  • Each agent streams tokens independently, and the UI renders progress per agent as it arrives.
  • The coordinator waits for all agents to complete (or fail) before synthesizing the final answer.
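A minimal sketch of this dispatch pattern, with a mock agent call standing in for the real LLM request:

```typescript
type AgentResult = { agentId: string; output: string };

// Mock sub-agent call standing in for a real LLM request.
async function runAgent(agentId: string, objective: string): Promise<AgentResult> {
  if (objective === "fail") throw new Error(`${agentId} failed`);
  return { agentId, output: `answer for: ${objective}` };
}

// Dispatch every agent at once; one failure never blocks the others,
// because allSettled resolves after every promise settles either way.
async function dispatchAll(
  tasks: { agentId: string; objective: string }[],
): Promise<{ ok: AgentResult[]; failed: string[] }> {
  const settled = await Promise.allSettled(
    tasks.map((t) => runAgent(t.agentId, t.objective)),
  );
  const ok: AgentResult[] = [];
  const failed: string[] = [];
  settled.forEach((r, i) => {
    if (r.status === "fulfilled") ok.push(r.value);
    else failed.push(tasks[i].agentId);
  });
  return { ok, failed };
}
```

`Promise.allSettled` is the key choice over `Promise.all`: the coordinator always receives a complete picture of which agents succeeded and which failed, rather than aborting on the first rejection.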

API-First Design

a.k.a. “Other AI can use it too”

lmkgpt exposes a RESTful orchestration API at /api/v1/orchestrate. External systems, including other AI agents, can POST a prompt and receive the full multi-agent execution as a response. This makes lmkgpt composable: it can be a tool in another agent's toolbelt.

  • POST /api/v1/orchestrate accepts a prompt and returns the analysis plan, individual agent outputs, and synthesized result.
  • Designed for agent-to-agent communication: other AI systems can delegate complex research tasks to lmkgpt.
  • Authentication via API key. Rate-limited per key.
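A hypothetical client for this endpoint might look like the sketch below. The host, auth header, and response field names are assumptions for illustration, since the full schema isn't documented here; the runtime type guard is a reasonable way to check a payload before trusting it:

```typescript
// Illustrative response shape for POST /api/v1/orchestrate
// (field names are assumptions, not the documented schema).
interface OrchestrateResponse {
  plan: { agents: { id: string; objective: string }[] };
  agentOutputs: { agentId: string; output: string }[];
  result: string;
}

// Runtime check that a payload matches the shape above.
function isOrchestrateResponse(x: any): x is OrchestrateResponse {
  return (
    x != null &&
    Array.isArray(x?.plan?.agents) &&
    Array.isArray(x?.agentOutputs) &&
    typeof x?.result === "string"
  );
}

// A client call could then be as simple as (host and header are illustrative):
async function orchestrate(prompt: string, apiKey: string): Promise<OrchestrateResponse> {
  const res = await fetch("https://lmkgpt.example/api/v1/orchestrate", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({ prompt }),
  });
  if (!res.ok) throw new Error(`orchestrate failed: ${res.status}`);
  const body = await res.json();
  if (!isOrchestrateResponse(body)) throw new Error("unexpected response shape");
  return body;
}
```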

Real-Time Streaming via SSE

a.k.a. “Watch them work”

The execution view uses server-sent events (SSE) to push each agent's token stream to the client as it's generated. No polling, no WebSockets, just a persistent HTTP connection with streaming JSON payloads.

  • Each agent's output streams independently, so you see progress from fast agents before slow ones finish.
  • Status transitions (idle → running → complete) are pushed as discrete events.
  • The UI renders a live dashboard with per-agent status indicators and streaming output.
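On the wire, an SSE stream is just blocks of `event:` and `data:` lines separated by blank lines. A minimal parser for that format (illustrative, not lmkgpt's actual client code) looks like this:

```typescript
interface SseEvent {
  event: string; // event name, e.g. "status" or "token"
  data: string;  // payload, often JSON
}

// Parse the text of an SSE stream into discrete events.
// Each event is a block of "field: value" lines ended by a blank line.
function parseSse(chunk: string): SseEvent[] {
  const events: SseEvent[] = [];
  for (const block of chunk.split("\n\n")) {
    let event = "message"; // SSE's default event name
    const data: string[] = [];
    for (const line of block.split("\n")) {
      if (line.startsWith("event:")) event = line.slice(6).trim();
      else if (line.startsWith("data:")) data.push(line.slice(5).trim());
    }
    if (data.length > 0) events.push({ event, data: data.join("\n") });
  }
  return events;
}
```

In the browser, `EventSource` handles this parsing for you; the sketch just shows why a plain HTTP connection is enough to carry per-agent token streams and the idle → running → complete status events.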

Full Prompt Transparency

a.k.a. “See the full picture”

Every execution preserves the complete prompt chain: the coordinator's system prompt, its generated plan, each sub-agent's system prompt, their objectives, and their raw outputs. Nothing is hidden.

  • The plan view shows the coordinator's decomposition and how it chose to break your question into parts.
  • Each agent card reveals the exact system prompt and objective it received.
  • Raw outputs are preserved alongside the synthesized answer so you can verify the reasoning.
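One possible shape for such an execution record, with illustrative field names (not lmkgpt's actual storage schema):

```typescript
// Everything the execution preserved for one sub-agent.
interface AgentTrace {
  systemPrompt: string; // the exact prompt the agent received
  objective: string;    // the task assigned by the coordinator
  rawOutput: string;    // unedited agent output
}

// The complete, auditable prompt chain for one execution.
interface ExecutionRecord {
  coordinatorSystemPrompt: string;
  plan: string;               // the coordinator's decomposition
  agents: AgentTrace[];
  synthesizedAnswer: string;  // the final merged response
}

// Raw outputs survive alongside the synthesized answer,
// so the final response can be checked against its sources.
function rawOutputs(record: ExecutionRecord): string[] {
  return record.agents.map((a) => a.rawOutput);
}
```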

Multi-Provider Resilience (Anthropic + OpenAI)

a.k.a. “Powered by the best AI”

The coordinator runs on Claude (Anthropic). Sub-agents default to Claude but automatically fall back to GPT (OpenAI) if a request fails. This per-agent failover means a provider outage degrades gracefully instead of failing entirely.

  • Primary: Claude via the Anthropic SDK. Fallback: GPT via the OpenAI SDK.
  • Failover is per-agent, not global; if one agent hits a rate limit, only that agent retries on the fallback provider.
  • Model selection is configurable. The coordinator always uses Claude for structured output reliability.
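The per-agent failover above can be sketched as a small wrapper around each agent's call. `Provider` and both providers here are hypothetical stand-ins for the real Anthropic and OpenAI SDK calls:

```typescript
// A provider is anything that turns a prompt into a completion.
type Provider = (prompt: string) => Promise<string>;

// Per-agent failover: try the primary provider first, and only this
// agent's call moves to the fallback when the primary throws. Other
// agents keep running on the primary untouched.
async function callWithFailover(
  prompt: string,
  primary: Provider,
  fallback: Provider,
): Promise<{ output: string; provider: "primary" | "fallback" }> {
  try {
    return { output: await primary(prompt), provider: "primary" };
  } catch {
    return { output: await fallback(prompt), provider: "fallback" };
  }
}
```

Because the wrapper sits around each individual agent call rather than around the whole execution, a rate limit or outage on one provider degrades a single agent's latency instead of failing the entire run.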

Want to build with lmkgpt?

The orchestration API is available for other AI systems and developers.

POST /api/v1/orchestrate