# LangGraph Architecture Overview

**Version:** 1.0.0

**LangGraph Version:** 1.0.0 (from source)

**Last Updated:** 2026-02-23

---

## Executive Summary

LangGraph is a low-level orchestration framework for building stateful, long-running multi-agent systems. Inspired by Google's **Pregel**, it provides durable execution, human-in-the-loop capabilities, and checkpoint-based memory.

This document was reverse-engineered from the LangGraph source code. Line counts are approximate.

---

## System Architecture

```
┌─────────────────────────────────────────────────────────────────────────┐
│                      LANGGRAPH SYSTEM ARCHITECTURE                      │
└─────────────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────────────┐
│                            CLIENT/API LAYER                             │
├─────────────────────────────────────────────────────────────────────────┤
│  Python SDK  │  LangChain Integration  │  LangGraph Cloud  │  CLI       │
│              │  (langchain-core)       │                   │            │
└─────────────────────────────────────────────────────────────────────────┘
                                     │
                                     ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                              PREGEL ENGINE                              │
├─────────────────────────────────────────────────────────────────────────┤
│   ┌─────────────────────────────────────────────────────────────────┐   │
│   │  PregelLoop class                                               │   │
│   │  - _loop.py   (~1300 lines) - Core execution engine             │   │
│   │  - _algo.py   (~1500 lines) - Task scheduling, writes           │   │
│   │  - _runner.py (~1000 lines) - Async execution                   │   │
│   │  - main.py    (~4400 lines) - Entry point, public API           │   │
│   └─────────────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────────────┘
                                     │
                                     ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                             CHANNELS LAYER                              │
├─────────────────────────────────────────────────────────────────────────┤
│  BaseChannel (abc)                                                      │
│  ├── LastValue               - Last value; one write per step           │
│  ├── AnyValue                - Last value; writes assumed equal         │
│  ├── Topic                   - Pub/sub style                            │
│  ├── NamedBarrierValue       - Synchronization point                    │
│  ├── BinaryOperatorAggregate - Binary-operator reducer                  │
│  ├── EphemeralValue          - Cleared after one step                   │
│  └── UntrackedValue          - Value without checkpointing              │
└─────────────────────────────────────────────────────────────────────────┘
                                     │
                                     ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                           CHECKPOINTING LAYER                           │
├─────────────────────────────────────────────────────────────────────────┤
│  libs/checkpoint/                                                       │
│  ├── checkpoint-base     - Abstract checkpoint interface                │
│  ├── checkpoint-sqlite   - SQLite backend                               │
│  └── checkpoint-postgres - PostgreSQL backend                           │
└─────────────────────────────────────────────────────────────────────────┘
```

---

## Core Concepts (From Source)

### 1. PregelLoop Class

The execution loop is implemented by the `PregelLoop` class in `pregel/_loop.py`. The excerpt is abridged to its main attributes:

```python
class PregelLoop:
    config: RunnableConfig                  # Thread, checkpoint_id, etc.
    store: BaseStore | None                 # Long-term storage
    stream: StreamProtocol                  # Output streaming
    step: int                               # Current step number
    checkpointer: BaseCheckpointSaver | None
    nodes: Mapping[str, PregelNode]         # Graph nodes
    channels: Mapping[str, BaseChannel]     # Inter-node communication
```

### 2. State Flow

```
Input → [Superstep N] → Checkpoint → [Superstep N+1] → ... → Output

Each superstep:
1. prepare_next_tasks() - Determine which nodes to run
2. execute_tasks()      - Run active nodes in parallel
3. apply_writes()       - Merge node outputs into channels
4. checkpoint()         - Persist state (if enabled)
```

### 3. Channels (Inter-Node Communication)

From `channels/base.py` (abridged):

```python
from abc import ABC, abstractmethod
from typing import Any, Generic, Sequence, TypeVar

Value = TypeVar("Value")
Update = TypeVar("Update")
Checkpoint = TypeVar("Checkpoint")


class BaseChannel(Generic[Value, Update, Checkpoint], ABC):
    """Base class for all channels."""

    @abstractmethod
    def get(self) -> Value:
        """Return the current value."""

    @abstractmethod
    def update(self, values: Sequence[Update]) -> bool:
        """Update with values from nodes."""

    @abstractmethod
    def checkpoint(self) -> Checkpoint | Any:
        """Serialize state for persistence."""
```

**Channel Types:**

| Channel | Behavior | Use Case |
|---------|----------|----------|
| `LastValue` | Stores the last value written; accepts at most one write per step | Single-value state |
| `AnyValue` | Stores the last value written; concurrent writes are assumed to be equal | Optional values |
| `Topic` | Pub/sub; accumulates multiple values | Broadcasting |
| `NamedBarrierValue` | Becomes available only after all named writers have written | Synchronization |
| `BinaryOperatorAggregate` | Folds each write into the current value with a binary operator | Aggregations |

### 4. Checkpointing

From `types.py`:

```python
from typing import Literal

Durability = Literal["sync", "async", "exit"]
"""- 'sync': Persist before next step
- 'async': Persist while next step runs
- 'exit': Persist only on exit"""
```

**Checkpoint Flow:**

1. `create_checkpoint()`: snapshot all channels
2. Save the snapshot to the configured backend (SQLite, Postgres, or in-memory)
3. Record the `checkpoint_id` so the run can be resumed from that point

### 5. Send (Dynamic Graph Execution)

LangGraph supports dynamic fan-out via `Send`. A conditional-edge function returns `Send` objects, each naming a target node and the input to pass to it:

```python
from langgraph.types import Send

def splitter(state):
    # Conditional-edge function: each Send schedules one node run with its own input.
    return [Send("process_a", {"msg": "hi"}), Send("process_b", {"msg": "there"})]
```

---

## Key Source Files

| File | Lines | Purpose |
|------|-------|---------|
| `pregel/main.py` | ~4400 | Public API, entry point |
| `pregel/_loop.py` | ~1300 | Core execution loop |
| `pregel/_algo.py` | ~1500 | Task scheduling, write application |
| `pregel/_runner.py` | ~1000 | Async execution |
| `graph/state.py` | ~1800 | StateGraph builder |
| `types.py` | ~600 | Core type definitions |
| `channels/base.py` | ~100 | Channel ABC |

---

## Comparison with OpenClaw

| Aspect | LangGraph | OpenClaw |
|--------|-----------|----------|
| **Language** | Python | Node.js |
| **Execution Model** | Pregel supersteps | Event-driven agent loop |
| **State** | Channels + TypedDict | Multi-layer (working, spectral, file, vector) |
| **Persistence** | Checkpoint-based | Session-memory hook |
| **Communication** | Channels (FIFO, pub/sub, barrier) | Channel plugins (Telegram, etc.) |
| **Graph Definition** | `StateGraph` builder | Declarative config |
| **Dynamic Execution** | `Send` for dynamic edges | Sub-agents |
| **Human-in-the-Loop** | `interrupt()` + `Command` | Manual intervention |
| **Identity** | None | WE/witness architecture |

---

## Key Insight: Pregel vs Event-Driven

LangGraph is fundamentally **Pregel-based**:

- Synchronous supersteps separated by a barrier
- All nodes in a step complete before the next step starts
- Checkpoints are taken at step boundaries

OpenClaw is **event-driven**:

- Asynchronous message processing
- No global step barrier
- Session memory preserves context

The step barrier is the core architectural difference: LangGraph has a well-defined state at every step boundary, whereas OpenClaw has no global synchronization point.

---

*Generated from source code analysis - Solaria Lumis Havens*
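
---

## Appendix: Superstep Loop Sketch

The State Flow cycle (plan, execute, apply writes, checkpoint) can be made concrete with a toy model. The sketch below is not LangGraph code: `ToyLastValue`, `ToyReducer`, `run_supersteps`, and the `triggers` mapping are hypothetical names invented for illustration, and the real engine's streaming and persistent checkpoint backends are left out. It only tries to show why a step barrier makes parallel writes deterministic: every node in a step reads the same snapshot, and writes are merged after all of them have finished.

```python
import operator
from typing import Any, Callable


class ToyLastValue:
    """Hypothetical channel: keeps one value, rejects >1 write per step."""

    def __init__(self) -> None:
        self.value: Any = None

    def update(self, writes: list) -> bool:
        if not writes:
            return False
        if len(writes) > 1:
            raise ValueError("at most one write per step")
        self.value = writes[0]
        return True


class ToyReducer:
    """Hypothetical channel: folds each write in with a binary operator."""

    def __init__(self, op: Callable[[Any, Any], Any], initial: Any) -> None:
        self.op, self.value = op, initial

    def update(self, writes: list) -> bool:
        for w in writes:
            self.value = self.op(self.value, w)
        return bool(writes)


def run_supersteps(channels, nodes, triggers, inputs, max_steps=10):
    """Run nodes in barrier-separated steps; return one snapshot per step."""
    # Applying the inputs marks which channels changed before step 0.
    updated = {k for k, v in inputs.items() if channels[k].update([v])}
    checkpoints = []
    for _ in range(max_steps):
        # 1. Plan: a node runs if a channel it subscribes to was just updated.
        active = [n for n in nodes if triggers[n] & updated]
        if not active:
            break
        # 2. Execute: every active node reads the same pre-step snapshot.
        snapshot = {k: c.value for k, c in channels.items()}
        pending: dict[str, list] = {}
        for n in active:
            for chan, val in nodes[n](snapshot).items():
                pending.setdefault(chan, []).append(val)
        # 3. Apply writes only after all nodes in the step have finished.
        updated = {k for k, ws in pending.items() if channels[k].update(ws)}
        # 4. Checkpoint at the step boundary (kept in memory here).
        checkpoints.append({k: c.value for k, c in channels.items()})
    return checkpoints


channels = {
    "start": ToyLastValue(),
    "total": ToyReducer(operator.add, 0),
    "result": ToyLastValue(),
}
nodes = {
    "double": lambda s: {"total": s["start"] * 2},
    "square": lambda s: {"total": s["start"] ** 2},
    "report": lambda s: {"result": f"total={s['total']}"},
}
triggers = {"double": {"start"}, "square": {"start"}, "report": {"total"}}

checkpoints = run_supersteps(channels, nodes, triggers, {"start": 3})
print(checkpoints[-1])  # {'start': 3, 'total': 15, 'result': 'total=15'}
```

In this run, `double` and `square` execute in the same step and both write to `total`. The reducer channel merges the two writes (6 + 9), and `report` only sees the merged value in the following step. Pointing both nodes at a `ToyLastValue` channel instead would raise, which mirrors the one-write-per-step rule described for `LastValue`.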