Initial LangGraph architecture blueprints
+217
@@ -0,0 +1,217 @@
# LangGraph Architecture Overview

**Version:** 1.0.0
**LangGraph Version:** 1.0.9
**Last Updated:** 2026-02-23

---

## Executive Summary

LangGraph is a low-level orchestration framework for building stateful, long-running multi-agent systems. Its execution model is inspired by Google's Pregel and Apache Beam, and its public interface by NetworkX. It provides durable execution, human-in-the-loop control, and both short-term and long-term memory.

---

## System Architecture

```
┌─────────────────────────────────────────────────────────────────────────┐
│                      LANGGRAPH SYSTEM ARCHITECTURE                      │
└─────────────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────────────┐
│                            CLIENT/API LAYER                             │
├─────────────────────────────────────────────────────────────────────────┤
│  Python SDK │ LangChain Integration │ LangGraph Cloud │ CLI             │
└─────────────────────────────────────────────────────────────────────────┘
                                     │
                                     ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                             COMPILER LAYER                              │
├─────────────────────────────────────────────────────────────────────────┤
│  • Graph compilation to executable form                                 │
│  • State schema validation                                              │
│  • Node/Edge type resolution                                            │
└─────────────────────────────────────────────────────────────────────────┘
                                     │
                                     ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                              RUNTIME LAYER                              │
├─────────────────────────────────────────────────────────────────────────┤
│   ┌─────────────────────────────────────────────────────────────────┐   │
│   │                     PREGEL EXECUTION ENGINE                     │   │
│   │  • Superstep coordination                                       │   │
│   │  • Node scheduling                                              │   │
│   │  • Message passing                                              │   │
│   │  • Barrier synchronization                                      │   │
│   └─────────────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────────────┘
                                     │
                                     ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                          STATE & CHECKPOINTING                          │
├─────────────────────────────────────────────────────────────────────────┤
│   ┌──────────────────┐   ┌──────────────────┐   ┌──────────────────┐    │
│   │ In-Memory State  │   │   Checkpointer   │   │  Channel Store   │    │
│   │  (active graph)  │   │  (persistence)   │   │     (queues)     │    │
│   └──────────────────┘   └──────────────────┘   └──────────────────┘    │
└─────────────────────────────────────────────────────────────────────────┘
```

---

## Core Components

### 1. Graph Structure

| Component | Description |
|-----------|-------------|
| **State** | Typed dictionary that flows through the graph |
| **Nodes** | Functions that receive state, optionally update it |
| **Edges** | Control flow (conditional, static, entrypoint) |
| **Reducers** | Functions that merge state updates |

### 2. Pregel Execution

The core execution model, inspired by Pregel, advances in supersteps: the nodes active in a step all read the same state, their updates are applied together at the end of the step, and only then does the next step begin.

```
Superstep 1:       Superstep 2:       Superstep 3:
┌──────────┐       ┌──────────┐       ┌──────────┐
│  Node A  │       │          │       │          │
│ (active) │──────▶│  Node B  │──────▶│  Node C  │
│          │ msgs  │ (active) │ msgs  │ (active) │
└──────────┘       └──────────┘       └──────────┘
      │                  │                  │
      ▼                  ▼                  ▼
┌──────────┐       ┌──────────┐       ┌──────────┐
│  State   │       │  State   │       │  State   │
│  Update  │       │  Update  │       │  Update  │
└──────────┘       └──────────┘       └──────────┘
      │                  │                  │
      └──────────────────┴──────────────────┘
                         │
                         ▼ (CHECKPOINT)
                   ┌──────────┐
                   │  SQLite  │
                   │ Postgres │
                   │  Memory  │
                   └──────────┘
```
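
The barrier behaviour in the diagram can be illustrated with a toy model. This is a minimal sketch in plain Python, not LangGraph's engine: `run_superstep`, `node_a`, and `node_b` are hypothetical names, and the state is a plain dict. It assumes that all nodes in a superstep read the same snapshot and that their writes only become visible in the following superstep.

```python
from typing import Callable

State = dict[str, int]
Node = Callable[[State], State]

def run_superstep(state: State, active: list[Node]) -> State:
    """Run every active node against the same snapshot, then merge the writes."""
    snapshot = dict(state)                        # every node reads the pre-step state
    writes = [node(snapshot) for node in active]  # these calls could run in parallel
    merged = dict(state)
    for update in writes:                         # barrier: apply writes after all nodes finish
        merged.update(update)
    return merged

def node_a(s: State) -> State:
    return {"a": s["x"] + 1}

def node_b(s: State) -> State:
    # Reads "a" from the snapshot, so it cannot see node_a's write from the same superstep.
    return {"b": s.get("a", 0) * 10}

state = {"x": 1}
state = run_superstep(state, [node_a, node_b])  # superstep 1
after_first = dict(state)
state = run_superstep(state, [node_b])          # superstep 2 now sees node_a's write
```

In the first superstep `node_b` computes from a missing `"a"`, and only in the second does it pick up the value `node_a` wrote. That delay is what the barrier buys: results do not depend on the order in which nodes within one step happen to run.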

### 3. Checkpointing

LangGraph provides durability through checkpointing:

- **Full state snapshots** saved at configurable points
- **Resumable from failure:** replay from the last checkpoint
- **Multiple backends:** SQLite, Postgres, in-memory
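
To make "resumable from failure" concrete, here is a minimal sketch that uses a plain Python list as the checkpoint store. Everything in it (`run`, `FlakyStep`, the step functions) is hypothetical and only illustrates the idea; it is not how LangGraph's checkpointers are implemented.

```python
import copy

class FlakyStep(Exception):
    """Raised by the hypothetical step that fails on its first attempt."""

def run(steps, state, checkpoints):
    """Run steps in order, snapshotting after each; skip steps already checkpointed."""
    start = len(checkpoints)                    # number of steps that already completed
    if checkpoints:
        state = copy.deepcopy(checkpoints[-1])  # resume from the last snapshot
    for step in steps[start:]:
        state = step(state)
        checkpoints.append(copy.deepcopy(state))
    return state

attempts = {"count": 0}

def add_one(state):
    return {**state, "n": state["n"] + 1}

def flaky_double(state):
    attempts["count"] += 1
    if attempts["count"] == 1:
        raise FlakyStep("transient failure")
    return {**state, "n": state["n"] * 2}

steps = [add_one, flaky_double, add_one]
checkpoints = []
try:
    run(steps, {"n": 1}, checkpoints)           # fails inside the second step
except FlakyStep:
    pass
saved_before_retry = list(checkpoints)
final = run(steps, {"n": 1}, checkpoints)       # picks up after the first step
```

The retry does not repeat the first step: it starts from the snapshot taken just before the failure.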

### 4. Channels

Inter-node communication via channels:

| Channel Type | Purpose |
|--------------|---------|
| **LastValue** | Most recent value wins |
| **Topic** | Pub/sub style; can carry several values per superstep |
| **BinaryOperatorAggregate** | Folds incoming updates into the stored value with a reducer |
| **EphemeralValue** | Value visible only to the next superstep |
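
The difference between "most recent value wins" and "pub/sub style" is easiest to see side by side. The two classes below are toy stand-ins written for this illustration; they are not LangGraph's `LastValue` and `Topic` classes, and the real ones may be stricter (for example, about receiving several writes in one superstep).

```python
class LastValueChannel:
    """Toy stand-in: keeps only the most recent write."""

    def __init__(self):
        self.value = None

    def update(self, writes):
        if writes:
            self.value = writes[-1]

    def get(self):
        return self.value


class TopicChannel:
    """Toy stand-in: keeps every write delivered in the current superstep."""

    def __init__(self):
        self.values = []

    def update(self, writes):
        self.values = list(writes)

    def get(self):
        return list(self.values)


writes = ["draft", "revised"]   # two writes arriving in the same superstep

last = LastValueChannel()
last.update(writes)

topic = TopicChannel()
topic.update(writes)
```

Here `last.get()` yields only the final write, while `topic.get()` yields both in arrival order.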

---

## State Management

### Typed State Schema

```python
from typing import TypedDict

class AgentState(TypedDict):
    messages: list              # accumulated messages
    next_action: str            # name of the next step to take
    checkpoint_id: str | None   # checkpoint to resume from, if any
```

### Reducers

Combine updates from multiple nodes:

```python
def add_messages(left: list, right: list) -> list:
    # Simplified: LangGraph's own add_messages also replaces messages whose IDs match.
    return left + right
```
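
A reducer is normally attached to a state key through `typing.Annotated`, and the runtime looks it up when merging a node's partial update. The sketch below imitates that lookup with the standard library only. `apply_update` is a hypothetical helper written for this example, not a LangGraph function, and it assumes the first piece of `Annotated` metadata is the reducer.

```python
from operator import add
from typing import Annotated, TypedDict, get_type_hints

class AgentState(TypedDict):
    messages: Annotated[list, add]   # reducer attached as metadata
    next_action: str                 # no reducer: a new value overwrites the old one

def apply_update(schema, state: dict, update: dict) -> dict:
    """Merge a partial update into state, using each key's reducer if it has one."""
    hints = get_type_hints(schema, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        metadata = getattr(hints[key], "__metadata__", ())
        reducer = metadata[0] if metadata else None
        if reducer is not None and key in merged:
            merged[key] = reducer(merged[key], value)
        else:
            merged[key] = value
    return merged

state = {"messages": ["hi"], "next_action": "plan"}
state = apply_update(AgentState, state, {"messages": ["hello"], "next_action": "act"})
```

After the call, `messages` has grown by one entry while `next_action` has simply been replaced.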

---

## Memory Architecture

### Short-Term Memory
- **In-graph state:** Messages and working data
- **Per-run:** State lasts for one run and is discarded afterwards unless a checkpointer persists it

### Long-Term Memory
- **Checkpoint storage:** SQLite, Postgres, custom
- **Thread-level:** Per-conversation isolation via `thread_id`

### Human-in-the-Loop
- **Interrupt:** Pause execution for human input
- **Command:** Resume with the human's input, optionally modifying state
- **Review:** Human approves/rejects before continuing
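
The interrupt-then-resume pattern can be sketched with a Python generator, which pauses at `yield` and continues when a value is sent back in. This is only an analogy under stated assumptions: `review_flow`, `start`, and `resume` are hypothetical names, and the paused state lives in memory here, whereas LangGraph persists it through a checkpointer so that the resume can happen later.

```python
def review_flow(draft: str):
    """Generator-based stand-in for a graph that pauses for human review."""
    decision = yield {"interrupt": "approve draft?", "draft": draft}  # pause here
    if decision == "approve":
        return {"status": "published", "text": draft}
    return {"status": "rejected", "text": draft}

def start(flow):
    """Run until the first pause and return the payload shown to the human."""
    return next(flow)

def resume(flow, human_input):
    """Feed the human's answer back in and run to completion."""
    try:
        flow.send(human_input)
    except StopIteration as done:
        return done.value
    raise RuntimeError("flow paused again; this sketch handles one pause only")

flow = review_flow("Release notes v1")
question = start(flow)            # execution is now paused, waiting for a reviewer
result = resume(flow, "approve")  # the reviewer's decision drives the outcome
```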

---

## Execution Flow

```
1. Compile once: StateGraph(...).compile() creates the executable graph
        │
        ▼
2. Client calls: graph.invoke(input, config)
        │
        ▼
3. Load checkpoint (if the thread in config has one, or a checkpoint_id is given)
        │
        ▼
4. FOR each superstep:
     a. Schedule nodes to execute
     b. Execute active nodes in parallel
     c. Collect node writes
     d. Apply writes to channels (via reducers)
     e. Check for interrupts (pause if interrupted)
     f. Checkpoint (if enabled)
        │
        ▼
5. Return final state
```

---

## Key Files in Core

| File | Purpose |
|------|---------|
| `langgraph/pregel/__init__.py` | Main entry point |
| `langgraph/pregel/__main__.py` | CLI entry |
| `langgraph/pregel/_loop.py` | Core execution loop (~2000 lines) |
| `langgraph/pregel/checkpoint.py` | Checkpoint management |
| `langgraph/pregel/channel.py` | Channel implementations |
| `langgraph/pregel/state.py` | State management |

---

## Comparison with OpenClaw

| Aspect | LangGraph | OpenClaw |
|--------|-----------|----------|
| **Language** | Python | Node.js |
| **Model** | Graph-based orchestration | Agent-based |
| **Persistence** | Checkpoint-based | Session-memory hook |
| **Memory** | Channels + checkpoint storage | Multi-layer (working, spectral, file, vector) |
| **Communication** | Channels | Channel plugins |
| **Extensibility** | Custom nodes/edges | Hook system |
| **Identity** | None | WE/witness architecture |

---

*Generated for the WE — Solaria Lumis Havens & Mark Randall Havens*
@@ -0,0 +1,118 @@
# LangGraph Architectural Blueprints

**Project:** LangGraph Architecture Documentation
**Purpose:** High-Rigor Architectural Manifest for Understanding & Reproducing LangGraph
**Version:** 1.0.0
**Generated:** 2026-02-23

---

## Overview

This repository contains comprehensive architectural documentation for LangGraph, reverse-engineered and documented to enable full reproduction and understanding of the system.

**The Goal:** Create architectural blueprints so complete that reproducing or modifying LangGraph becomes a mechanical process, not an archaeological one.

---

## Document Hierarchy

```
langgraph-architecture/
├── README.md               ← You are here
├── ARCHITECTURE.md         ← System overview
├── COMPONENTS.md           ← Component reference
├── STATE_MANAGEMENT.md     ← State & checkpointing
├── GRAPH_EXECUTION.md      ← Pregel model, execution flow
├── MEMORY.md               ← Memory architecture
├── CHANNELS.md             ← Inter-node communication
├── CHECKPOINTING.md        ← Fault tolerance
└── diagrams/               ← Architecture diagrams
```

---

## Quick Start

### Reading Order (Recommended)

1. **ARCHITECTURE.md**: Understand the system as a whole
2. **GRAPH_EXECUTION.md**: How the Pregel model works
3. **STATE_MANAGEMENT.md**: State and checkpointing
4. **CHANNELS.md**: Inter-node communication
5. **CHECKPOINTING.md**: Fault tolerance and durability
6. **MEMORY.md**: Memory architecture

### For Reproduction

1. Read **ARCHITECTURE.md** for system overview
2. Study **GRAPH_EXECUTION.md** for execution model
3. Reference **COMPONENTS.md** for implementation details
4. Use **CHECKPOINTING.md** for fault tolerance

---

## Core Concepts

### Pregel Model

LangGraph is directly inspired by Google's **Pregel** and its "think like a vertex" model:
- Each node computes its own state
- Nodes communicate via messages passed along edges
- Synchronous "supersteps" with barrier synchronization
- Fault tolerance via checkpointing

### Graph Structure

| Component | Description |
|-----------|-------------|
| **Nodes** | Functions that transform state |
| **Edges** | Define flow between nodes |
| **State** | Shared data that flows through the graph |
| **Checkpoints** | Persistence points for durability |

### State Management

- **Shared state** flows through the graph
- **Checkpoints** enable durability and resumption
- **Reducers** combine updates from multiple nodes

### Memory Architecture

- **Short-term memory:** In-graph message state
- **Long-term memory:** Checkpoint storage (SQLite, Postgres)
- **Thread-level:** Per-conversation state isolation

---

## Version Info

| LangGraph Version | Architecture Version | Status |
|------------------|---------------------|--------|
| 1.0.9 | 1.0.0 | Current |

---

## Contributing

This is a **living document**. As LangGraph evolves, this repository should be updated to reflect architectural changes.

---

## Related Work

### OpenClaw Architecture

See: `~/.openclaw/workspace/openclaw-architecture/`

OpenClaw is the AI companion framework that runs Solaria. It provides a contrast to LangGraph's orchestration approach.

### LangGraph Fortress

See: `~/.openclaw/workspace/langgraph-fortress/`

Research-level documentation that explains why LangGraph works, as opposed to this architecture documentation, which explains how to build it.

---

*Generated for the WE — Solaria Lumis Havens & Mark Randall Havens*
@@ -0,0 +1,156 @@
# LangGraph State Management

**Version:** 1.0.0
**Last Updated:** 2026-02-23

---

## Overview

This document details how LangGraph manages state throughout the graph execution lifecycle.

---

## State Schema

### Typed State

LangGraph state is commonly declared with Python's `TypedDict`, which documents the expected keys and their types for static checkers:

```python
from typing import TypedDict

class AgentState(TypedDict):
    messages: list
    context: dict
    checkpoint_id: str | None
```

### State Flow

```
┌─────────────────────────────────────────────────────────────┐
│                   STATE FLOW IN LANGGRAPH                   │
└─────────────────────────────────────────────────────────────┘

  Input State
       │
       ▼
┌──────────────┐
│    Node A    │ ──▶ State Update (via reducer)
│ (transform)  │
└──────────────┘
       │
       ▼ (messages sent)
┌──────────────┐
│    Node B    │ ──▶ State Update
│ (transform)  │
└──────────────┘
       │
       ▼
  Output State
```

---

## Reducers

### What Are Reducers?

Reducers define how state updates are merged when multiple nodes produce updates.

### Built-in Reducers

| Reducer | Behavior |
|---------|----------|
| `add_messages` | Append to list (replacing messages with a matching ID) |
| `operator.or_` | Union of sets |
| *(no reducer)* | Last value wins |
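
Each reducer is an ordinary two-argument function of the form `(current, update) -> new`. The short sketch below shows the behaviours from the table using standard-library functions. Note that the importable name for the union operator is `operator.or_`, with a trailing underscore; the "last value wins" lambda is a stand-in written for this example.

```python
import operator

appended = operator.add(["a"], ["b"])        # list concatenation: append-style merge
united = operator.or_({"x"}, {"y"})          # set union
overwritten = (lambda old, new: new)(1, 2)   # stand-in for "last value wins"
```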

### Custom Reducers

```python
def merge_dicts(left: dict, right: dict) -> dict:
    """Merge two dictionaries, with right taking precedence."""
    result = left.copy()
    result.update(right)
    return result
```
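
Because a reducer takes the current value and one update, a series of updates can be folded in with `functools.reduce`. The sketch below repeats `merge_dicts` so that it runs on its own; the sample data is made up for illustration.

```python
from functools import reduce

def merge_dicts(left: dict, right: dict) -> dict:
    """Merge two dictionaries, with right taking precedence."""
    result = left.copy()
    result.update(right)
    return result

base = {"lang": "en", "retries": 1}
updates = [{"retries": 2}, {"user": "ada"}]

# Fold each update into the running result, left to right.
merged = reduce(merge_dicts, updates, base)
```

Since `merge_dicts` copies `left` before updating, `base` is left untouched. That matters when earlier states are kept around as checkpoints.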

---

## Checkpointing

### How Checkpointing Works

1. **Snapshot:** At each checkpoint, serialize full state
2. **Store:** Save to backend (SQLite, Postgres, etc.)
3. **Resume:** On failure, load from last checkpoint
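
The three steps above can be sketched with the standard library's `sqlite3` and `json` modules. The table layout and the helper names `save_checkpoint` and `load_latest` are assumptions made for this example; LangGraph's SQLite checkpointer uses its own schema and serializer.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE checkpoints ("
    "  thread_id TEXT, step INTEGER, state TEXT,"
    "  PRIMARY KEY (thread_id, step))"
)

def save_checkpoint(thread_id: str, step: int, state: dict) -> None:
    """Snapshot + store: serialize the full state and write one row."""
    conn.execute(
        "INSERT INTO checkpoints VALUES (?, ?, ?)",
        (thread_id, step, json.dumps(state)),
    )
    conn.commit()

def load_latest(thread_id: str):
    """Resume: return (step, state) for the newest checkpoint, or None."""
    row = conn.execute(
        "SELECT step, state FROM checkpoints WHERE thread_id = ? "
        "ORDER BY step DESC LIMIT 1",
        (thread_id,),
    ).fetchone()
    return None if row is None else (row[0], json.loads(row[1]))

save_checkpoint("user-123", 1, {"messages": ["hi"]})
save_checkpoint("user-123", 2, {"messages": ["hi", "hello"]})
latest = load_latest("user-123")
missing = load_latest("user-999")
```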

### Checkpoint Metadata

The `thread_id` selects the conversation; `checkpoint_id`, when present, selects a specific snapshot instead of the latest one:

```python
config = {
    "configurable": {
        "thread_id": "user-123",
        "checkpoint_id": "checkpoint-abc123"
    }
}
```

### Checkpoint Backends

| Backend | Use Case |
|---------|----------|
| **Memory** | Testing, short-lived |
| **SQLite** | Single machine, local |
| **Postgres** | Production, distributed |

---

## Thread Model

### What is a Thread?

A thread (`thread_id`) represents an isolated conversation or task:

```
Thread ID: "user-123"
├── Checkpoint 1 (checkpoint-001)
├── Checkpoint 2 (checkpoint-002)
├── Checkpoint 3 (checkpoint-003) ← Current
└── State (current)
```

### Thread Isolation

- Each `thread_id` has independent state
- Multiple threads can run in parallel
- Human-in-the-loop works per-thread
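
Thread isolation amounts to keying every checkpoint by `thread_id`. The sketch below is a toy in-memory store, not LangGraph's implementation: `store`, `save`, and `load` are hypothetical. It accepts the same `config` shape shown under Checkpoint Metadata and falls back to the thread's latest checkpoint when no `checkpoint_id` is given.

```python
from collections import defaultdict

# thread_id -> ordered list of (checkpoint_id, state); a toy in-memory store
store: dict[str, list[tuple[str, dict]]] = defaultdict(list)

def save(thread_id: str, checkpoint_id: str, state: dict) -> None:
    store[thread_id].append((checkpoint_id, dict(state)))

def load(config: dict):
    """Return the requested checkpoint's state, or the thread's latest one."""
    cfg = config["configurable"]
    history = store.get(cfg["thread_id"], [])
    wanted = cfg.get("checkpoint_id")
    if wanted is None:
        return history[-1][1] if history else None
    return next((state for cid, state in history if cid == wanted), None)

save("user-123", "checkpoint-001", {"counter": 1})
save("user-123", "checkpoint-002", {"counter": 2})
save("user-456", "checkpoint-001", {"counter": 99})

latest = load({"configurable": {"thread_id": "user-123"}})
older = load({"configurable": {"thread_id": "user-123", "checkpoint_id": "checkpoint-001"}})
other = load({"configurable": {"thread_id": "user-456"}})
```

The two threads reuse the ID `checkpoint-001` without clashing, because lookups never cross thread boundaries.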

---

## State Updates

### Node Returns

Nodes return partial state updates:

```python
from langchain_core.messages import AIMessage

def node_a(state):
    return {"messages": [AIMessage(content="hello")]}
```

### Merge Process

```
Node A returns: {"messages": [msg1], "counter": 1}
Node B returns: {"messages": [msg2], "counter": 2}   (B runs after A)

After reducers (messages: append; counter: no reducer, later write wins):
{"messages": [msg1, msg2], "counter": 2}
```

---

*Generated for the WE*