
Your machine.
Your AI.
Your rules.
SocketBase is an OpenClaw-style, local-first AI agent — a single command centre for chat, tools, automation, RAG, and visual flow design. Fully offline, fully yours.

I looked at OpenClaw and thought: “This is exactly what I want — but I want to build it in the languages and frameworks I use every single day.”
So I did. Next.js for the UI. Fastify for the gateway. TypeScript end-to-end. PostgreSQL in Docker for storage — sessions, messages, cron jobs, skills, embeddings, and flow definitions all in one place. No unfamiliar toolchain, no mystery runtime — just the stack I already ship production code with, repurposed into a local AI command centre that I actually understand top to bottom.
If you live in the JS/TS ecosystem, this is the agent framework that feels like home.
Docker Compose brings up Postgres, pgvector & the gateway in seconds
$ docker compose up -d
Creating socketbase-postgres … done
PostgreSQL 16 + pgvector · port 5432
Sessions, messages, cron jobs, skills, RAG embeddings
Creating socketbase-gateway … done
Fastify gateway · port 3010
$ npm run dev
Next.js UI · http://localhost:3005
Ready — PostgreSQL, RAG, Flow Engine, LM Studio all connected
Postgres + pgvector
Runs in its own Docker container. Stores sessions, messages, cron jobs, skills — and vector embeddings for RAG.
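As a sketch, the RAG store needs a table with both text and a vector column; the table layout, column names, and embedding dimension (768) here are illustrative assumptions, not SocketBase's actual schema.

```typescript
// Hypothetical DDL for the chunk store. pgvector's `vector(n)` column type
// holds the embedding; everything else is ordinary Postgres.
const createChunksTable = `
  CREATE TABLE IF NOT EXISTS rag_chunks (
    id         BIGSERIAL PRIMARY KEY,
    source     TEXT NOT NULL,
    content    TEXT NOT NULL,
    embedding  VECTOR(768)
  );
`;
```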
RAG Pipeline
Ingest docs, chunk, embed, and retrieve — all local. The agent queries pgvector for relevant context before every response.
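The chunking step above can be sketched in a few lines; the chunk size and overlap values are illustrative defaults, not SocketBase's actual configuration.

```typescript
// Fixed-size chunking with overlap, so context isn't lost at chunk borders.
// Each chunk starts (chunkSize - overlap) characters after the previous one.
function chunkText(text: string, chunkSize = 400, overlap = 50): string[] {
  const chunks: string[] = [];
  const step = chunkSize - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last chunk reached the end
  }
  return chunks;
}
```

Each chunk would then be embedded by the local model and stored in pgvector.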
Flow Design Engine
Visual canvas to build multi-step execution paths. Start, run, call tool, pause, end — import/export as JSON.
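As a sketch, an exported flow might look like the following JSON shape; the field names are assumptions drawn from the node kinds listed above (start, run, call tool, pause, end), not SocketBase's real schema.

```typescript
// Hypothetical flow-definition types; only the node kinds come from the docs.
type FlowNode =
  | { id: string; kind: "start" }
  | { id: string; kind: "run"; script: string }
  | { id: string; kind: "call_tool"; tool: string; args: Record<string, unknown> }
  | { id: string; kind: "pause"; ms: number }
  | { id: string; kind: "end" };

interface FlowDefinition {
  name: string;
  nodes: FlowNode[];
  edges: Array<{ from: string; to: string }>;
}

const demoFlow: FlowDefinition = {
  name: "search-and-finish",
  nodes: [
    { id: "a", kind: "start" },
    { id: "b", kind: "call_tool", tool: "web_search", args: { query: "pgvector" } },
    { id: "c", kind: "end" },
  ],
  edges: [{ from: "a", to: "b" }, { from: "b", to: "c" }],
};

// Export/import is plain JSON serialisation.
const exported = JSON.stringify(demoFlow, null, 2);
```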
Qwen3-30B-A3B · thinking-2507
The Rig
RTX 3090
24 GB VRAM — the entire model sits in GPU memory
32 GB
System RAM — Postgres, Next.js, the gateway & the RAG pipeline, all at once
Intel i9
CPU — embeddings, tool execution & Docker containers at full speed
No data centre. No cloud bill. A 30B reasoning model, a Docker-hosted Postgres with pgvector, RAG, a flow engine, and the full agent stack — all on a single desktop under the desk.
$ lms load qwen3-30b-a3b-thinking-2507
Loading model… 30B params, ~3B active (MoE)
GPU: NVIDIA RTX 3090 · 24 GB VRAM · model fully offloaded
Context: 32 768 tokens · Built-in reasoning traces
Server listening on http://localhost:1234
$ # SocketBase points here — zero cloud, full reasoning
$ SOCKETBASE_LLM_URL=http://localhost:1234/v1
Qwen3-30B-A3B is a Mixture-of-Experts model: 30 billion parameters, but only ~3 billion are active per token. That means it thinks deeply without melting your GPU. The thinking-2507 variant adds built-in chain-of-thought reasoning — it shows its work before answering.
On an RTX 3090 with 24 GB VRAM, the entire model loads straight into GPU memory — no CPU offloading, no quantisation compromises. Pair that with 32 GB system RAM and an i9, and you have more than enough headroom to run LM Studio, a Docker Postgres with pgvector for RAG, the Fastify gateway, the Next.js UI, and a full Playwright browser — all at the same time, on one machine.
Load it in LM Studio and you get a local OpenAI-compatible API at localhost:1234. SocketBase connects to it out of the box. No API keys. No rate limits. No data leaving your network. Just point, run, and let a reasoning model power your agent loop on hardware you already own.
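Because the API is OpenAI-compatible, talking to it is a single POST. Here is a minimal sketch of building that request; the model id is whatever LM Studio reports for the loaded model, and the helper name is ours.

```typescript
// Base URL from the same env var shown above, with the local default.
const baseUrl = process.env.SOCKETBASE_LLM_URL ?? "http://localhost:1234/v1";

// Build the standard OpenAI-style chat completion request.
// Sending it is a plain fetch() against the local server.
function buildChatRequest(prompt: string) {
  return {
    url: `${baseUrl}/chat/completions`,
    body: {
      model: "qwen3-30b-a3b-thinking-2507", // id as loaded in LM Studio
      messages: [{ role: "user" as const, content: prompt }],
    },
  };
}
```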
Postgres 16 + pgvector in a container. Sessions, messages, cron jobs, skills, and vector embeddings — docker compose up and you're done.
Ingest documents, chunk, embed with your local model, store in pgvector, and retrieve relevant context before every LLM call. Fully offline.
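Conceptually, retrieval ranks stored chunks by similarity to the query embedding. In SocketBase that ranking happens inside Postgres via pgvector's distance operators; this sketch shows the same maths in plain TypeScript.

```typescript
// Cosine similarity between two embedding vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Return the k chunks most similar to the query embedding.
function topK(
  query: number[],
  docs: Array<{ text: string; embedding: number[] }>,
  k = 3,
) {
  return [...docs]
    .sort((x, y) =>
      cosineSimilarity(query, y.embedding) - cosineSimilarity(query, x.embedding))
    .slice(0, k);
}
```

The winning chunks are pasted into the prompt as context before the LLM call.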
Visual canvas to design multi-step execution paths: start, run process, call tool, pause, end. Each flow becomes a tool the LLM can invoke with run_flow.
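A minimal sketch of how run_flow might walk a flow's steps in order; the step shape and tool registry here are illustrative, not SocketBase's real API.

```typescript
// Simplified step type: only the kinds needed to show the walk.
type Step =
  | { kind: "start" }
  | { kind: "call_tool"; tool: string; args: unknown }
  | { kind: "end" };

// Execute steps sequentially, dispatching tool calls to a registry.
async function runFlow(
  steps: Step[],
  tools: Record<string, (args: unknown) => Promise<unknown>>,
): Promise<unknown[]> {
  const results: unknown[] = [];
  for (const step of steps) {
    if (step.kind === "call_tool") {
      results.push(await tools[step.tool](step.args));
    }
  }
  return results;
}
```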
LLM-driven tool execution with automatic retries, learnings, and a shared LESSONS_LEARNED file that makes the agent smarter over time.
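The retry-with-learnings loop can be sketched as follows, with an in-memory log standing in for the shared LESSONS_LEARNED file; the helper name and shape are assumptions.

```typescript
// Stand-in for the shared LESSONS_LEARNED file.
const lessons: string[] = [];

// Run fn up to `attempts` times, recording a lesson on every failure.
async function withRetries<T>(fn: () => Promise<T>, attempts = 3): Promise<T> {
  let lastErr: unknown;
  for (let i = 1; i <= attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastErr = err;
      lessons.push(`attempt ${i} failed: ${String(err)}`);
    }
  }
  throw lastErr; // all attempts exhausted
}
```

In the real agent the recorded lessons flow back into later prompts, which is what makes it smarter over time.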
Markdown-based skills for tasks, weather, browser, and cron jobs — toggle in Settings. The agent can even write its own.
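A sketch of reading such a markdown skill file; the layout assumed here (an H1 title followed by a description paragraph) is illustrative, not SocketBase's actual skill schema.

```typescript
// Pull a skill's name and description out of its markdown source.
function parseSkill(markdown: string): { name: string; description: string } {
  const lines = markdown.split("\n").map((l) => l.trim());
  const name = lines.find((l) => l.startsWith("# "))?.slice(2) ?? "unnamed";
  const description =
    lines.find((l) => l.length > 0 && !l.startsWith("#")) ?? "";
  return { name, description };
}
```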
Chat from your phone via Baileys. Webhooks let external services trigger the agent. Same brain, any channel.
Playwright-powered: browse, fill forms, scrape data — all from a tool call. The agent sees the web like you do.
DuckDuckGo, Wikipedia, World Time, REST Countries — no API keys required. Useful tools out of the box.
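As one example of a keyless tool, Wikipedia's public REST summary endpoint can be queried with nothing but a URL; the endpoint is real, the helper name is ours.

```typescript
// Build the URL for Wikipedia's keyless page-summary endpoint.
function wikipediaSummaryUrl(title: string): string {
  return `https://en.wikipedia.org/api/rest_v1/page/summary/${encodeURIComponent(title)}`;
}
```

A plain fetch() of that URL returns a JSON summary, no credentials needed.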
Import flows by pasting JSON or syncing from a directory (e.g. data/flows). Version and share flows as simple JSON files.
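Pasted JSON should be validated before import; here is a minimal sketch, where the required fields (name, nodes) are assumptions based on the description above.

```typescript
// Parse and shape-check a pasted flow before accepting it.
function tryImportFlow(
  json: string,
): { ok: true; flow: { name: string; nodes: unknown[] } } | { ok: false; error: string } {
  try {
    const flow = JSON.parse(json);
    if (typeof flow.name !== "string" || !Array.isArray(flow.nodes)) {
      return { ok: false, error: "flow needs a name and a nodes array" };
    }
    return { ok: true, flow };
  } catch {
    return { ok: false, error: "invalid JSON" };
  }
}
```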