Name: CentrOS
Availability: LimitedAvailability
Author: Shren Patel

// why CentrOS exists

Three reasons agent demos don't survive production.

I kept hitting the same three walls when building agentic workflows for real ops. CentrOS is what falls out when you take those walls seriously instead of papering over them with prompts.

Most agents are stateless prompts. Real ops need durable state.

Crash recovery, checkpointing, replay, idempotent retries: these are not optional when an agent's action touches money, communication, or anything a human will be embarrassed by tomorrow. CentrOS contracts are PostgreSQL rows with explicit lifecycle states; after a runtime crash, a contract resumes from the last gate, not the beginning.
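
A minimal sketch of resume-after-crash with idempotent steps, assuming an in-memory Map stands in for the PostgreSQL contract row. The names Step, Checkpoint, checkpointStore, and runSteps are hypothetical and only illustrate the idea; they are not the CentrOS API.

```typescript
// Hypothetical sketch: a Map stands in for the PostgreSQL contract row.
type Step = { name: string; run: () => string };

interface Checkpoint {
  completed: string[];             // step names already finished
  outputs: Record<string, string>; // output recorded per finished step
}

const checkpointStore = new Map<string, Checkpoint>();

function runSteps(contractId: string, steps: Step[]): Checkpoint {
  // Load the last checkpoint for this contract, or start fresh.
  const saved = checkpointStore.get(contractId);
  const cp: Checkpoint = saved
    ? { completed: [...saved.completed], outputs: { ...saved.outputs } }
    : { completed: [], outputs: {} };

  for (const step of steps) {
    // Idempotent retry: a step already recorded as completed is skipped.
    if (cp.completed.includes(step.name)) continue;
    const output = step.run(); // may throw; earlier checkpoints are already saved
    cp.completed.push(step.name);
    cp.outputs[step.name] = output;
    // Persist after every step so a crash loses at most the step in flight.
    checkpointStore.set(contractId, {
      completed: [...cp.completed],
      outputs: { ...cp.outputs },
    });
  }
  return cp;
}
```

Calling runSteps again with the same contract id after a failure skips every step that already checkpointed and re-runs only the one that was in flight.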

Most demos auto-execute. Real ops need approval gates.

Legal sign-offs, monetary actions, irreversible writes — every workflow I actually run has a step where a human has to say "yes." CentrOS makes that step a first-class part of the runtime: policy lanes route high-blast-radius actions to an owner inbox, and an audit trail captures who approved what and on what evidence.

Most platforms ship one workflow. Real ops scale through reusable patterns.

When I add a new vertical, I don't want to build a new platform. CentrOS is structured so that a new vertical = new triggers + new atomic work units + new tool boundaries + new linters + new approval rules. The runtime, the gate engine, the owner inbox, and the observability layer don't change.

// what CentrOS does

Three pillars.

Everything in CentrOS rolls up into one of three concerns. If a feature can't be filed under one of these, I don't build it.

Durable runtime

Typed contracts, eval gates, rollback, crash recovery. Built on PostgreSQL durability — not in-memory state, not "we'll add persistence later."

Governed autonomy

Every action lands in a policy lane. Lanes B and C route to the owner inbox before any write. Audit trails are state, not afterthoughts.

Reusable workflows

Same shell (Workbench), different controls per vertical. Motel ops today, hospitality-adjacent ops next; the same primitives carry both.

// vocabulary

The five terms that run the platform.

If you read the rest of this page or browse the repo, these five terms are the backbone. Learn them once and the rest of the architecture explains itself.

Contract

A typed, atomic work unit with risk tier, policy lane, and required gates. Lives as a row in PostgreSQL with a strict lifecycle.

Why: makes agent intent inspectable, durable, and resumable across crashes.
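
A rough sketch of what a typed contract could look like, written in dependency-free TypeScript rather than Zod so it runs on its own. The intent, tier, lane, and gate values come from this page; the field names and the parseContract helper are assumptions made for illustration.

```typescript
// Allowed values as listed on this page; field names below are hypothetical.
const INTENTS = ["DO", "DECIDE", "DEVELOP", "DOCUMENT"] as const;
const TIERS = ["R1", "R2", "R3"] as const;
const LANES = ["A", "B", "C"] as const;
const GATES = ["Evidence", "Adversary", "Verifier", "Rollback", "Stop-Ship"] as const;

type Gate = (typeof GATES)[number];

type Contract = {
  id: string;
  intent: (typeof INTENTS)[number];
  riskTier: (typeof TIERS)[number];
  lane: (typeof LANES)[number];
  requiredGates: Gate[];
};

// Type guard: is v one of the allowed string literals?
function isOneOf<T extends string>(allowed: readonly T[], v: unknown): v is T {
  return typeof v === "string" && (allowed as readonly string[]).includes(v);
}

// Validate untrusted input before it becomes a contract row.
function parseContract(input: unknown): Contract {
  if (typeof input !== "object" || input === null) {
    throw new Error("contract must be an object");
  }
  const raw = input as Record<string, unknown>;
  const id = raw["id"];
  const intent = raw["intent"];
  const riskTier = raw["riskTier"];
  const lane = raw["lane"];
  const gates = raw["requiredGates"];
  if (typeof id !== "string" || id === "") throw new Error("id must be a non-empty string");
  if (!isOneOf(INTENTS, intent)) throw new Error("unknown intent");
  if (!isOneOf(TIERS, riskTier)) throw new Error("unknown risk tier");
  if (!isOneOf(LANES, lane)) throw new Error("unknown policy lane");
  if (!Array.isArray(gates)) throw new Error("requiredGates must be an array");
  const requiredGates: Gate[] = [];
  for (const g of gates) {
    if (!isOneOf(GATES, g)) throw new Error("unknown gate");
    requiredGates.push(g);
  }
  return { id, intent, riskTier, lane, requiredGates };
}
```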

Risk tier

R1 / R2 / R3 — escalates with the blast radius of the action. Communications < money < regulatory.

Why: tier maps to lane, so the runtime can refuse high-tier work on its own instead of relying on manually maintained escalation criteria.

Policy lane

A (auto), B (confirm), C (careful). Determines whether the executor proceeds, waits for owner sign-off, or pauses for adversarial review first.

Why: maps intent to autonomy level — not the other way around.
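
One way the tier-to-lane mapping could be expressed. The page does not publish the actual table, so the defaults here (R1 to A, R2 to B, R3 to C) and the rule that an override may only make a lane stricter are assumptions, as are the names DEFAULT_LANE, laneFor, and LANE_BEHAVIOUR.

```typescript
type RiskTier = "R1" | "R2" | "R3";
type Lane = "A" | "B" | "C";
type ExecutorBehaviour = "proceed" | "wait-for-owner" | "adversarial-review-first";

// Assumed default mapping; the real table is not given in the source.
const DEFAULT_LANE: Record<RiskTier, Lane> = { R1: "A", R2: "B", R3: "C" };

// Lane semantics as described above: auto, confirm, careful.
const LANE_BEHAVIOUR: Record<Lane, ExecutorBehaviour> = {
  A: "proceed",
  B: "wait-for-owner",
  C: "adversarial-review-first",
};

const STRICTNESS: Record<Lane, number> = { A: 0, B: 1, C: 2 };

// A vertical may request a stricter lane than the default, never a looser one.
function laneFor(tier: RiskTier, override?: Lane): Lane {
  const base = DEFAULT_LANE[tier];
  if (override !== undefined && STRICTNESS[override] > STRICTNESS[base]) {
    return override;
  }
  return base;
}
```

Under these assumptions, laneFor("R1", "B") returns "B" (a low-tier action pushed to owner confirmation), while laneFor("R3", "A") still returns "C".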

Gate

Evidence · Adversary · Verifier · Rollback · Stop-Ship. Every contract must pass its required gates before close. Failure is a first-class outcome with its own state.

Why: non-deterministic systems need deterministic acceptance criteria.
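
A small sketch of deterministic acceptance: a contract may close only when every required gate has a passing result. The gate names are the five listed above; GateResult, unmetGates, and canClose are hypothetical names, and treating a missing result as a failure is an assumption.

```typescript
type Gate = "Evidence" | "Adversary" | "Verifier" | "Rollback" | "Stop-Ship";
type GateResult = { gate: Gate; passed: boolean };

// Returns the required gates that do not have a passing result.
// The most recent result per gate wins; a gate with no result counts as unmet.
function unmetGates(required: Gate[], results: GateResult[]): Gate[] {
  const latest = new Map<Gate, boolean>();
  for (const r of results) latest.set(r.gate, r.passed);
  return required.filter((g) => latest.get(g) !== true);
}

function canClose(required: Gate[], results: GateResult[]): boolean {
  return unmetGates(required, results).length === 0;
}
```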

Workbench mode

Ask · Books · Legal · Plan. The same shell renders different tool boundaries and linters depending on which mode the operator opens.

Why: verticals share infrastructure but need domain-specific controls at the edges.

// the four-agent pipeline

Router · Executor · Adversary · Verifier.

Every contract moves through four agents. For each one, the entries below list what it reads, what it writes, what it can fail at, and what the next agent picks up.

The router reads the intake event and decides what it actually is. Then it produces a typed contract with the right risk tier and policy lane.

Reads: Raw intake (email body, webhook payload, calendar event, GitHub hook).
Writes: Contract row with intent classification (DO / DECIDE / DEVELOP / DOCUMENT), risk tier, lane, required gates.
Fails when: Intent is genuinely ambiguous → escalates to owner inbox as an unclassified contract rather than guessing.
Hands off to: Executor (lane A) or owner inbox (lane C straight to confirm).

Equivalent layer to: LangGraph state-machine entry node, Temporal workflow signal handler, MCP capability discovery.
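
A sketch of the router's "escalate rather than guess" rule, assuming the classifier returns scored candidate intents. The thresholds (0.7 minimum confidence, 0.15 minimum margin) and the names Candidate, RouterResult, and classify are hypothetical.

```typescript
type Intent = "DO" | "DECIDE" | "DEVELOP" | "DOCUMENT";
type Candidate = { intent: Intent; confidence: number };

type RouterResult =
  | { kind: "classified"; intent: Intent }
  | { kind: "unclassified"; routeTo: "owner-inbox"; reason: string };

// Accept the top intent only when it is both confident and clearly ahead
// of the runner-up; otherwise hand an unclassified contract to the owner.
function classify(
  candidates: Candidate[],
  minConfidence = 0.7, // assumed threshold
  minMargin = 0.15,    // assumed threshold
): RouterResult {
  const ranked = [...candidates].sort((a, b) => b.confidence - a.confidence);
  const top = ranked[0];
  const runnerUp = ranked[1];
  if (top === undefined) {
    return { kind: "unclassified", routeTo: "owner-inbox", reason: "no candidate intents" };
  }
  if (top.confidence < minConfidence) {
    return { kind: "unclassified", routeTo: "owner-inbox", reason: "low confidence" };
  }
  if (runnerUp !== undefined && top.confidence - runnerUp.confidence < minMargin) {
    return { kind: "unclassified", routeTo: "owner-inbox", reason: "ambiguous between two intents" };
  }
  return { kind: "classified", intent: top.intent };
}
```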

The executor does the actual work. It calls tools from a per-contract tool registry, pulls retrieval-backed context, and writes state at every checkpoint so the runtime can resume from the last gate if anything dies.

Reads: Contract spec + retrieval results from the RAG layer + tool registry permissions.
Writes: Tool-call evidence, intermediate state, checkpoints, structured outputs (Zod-validated).
Fails when: A tool returns an error, validation fails, or the contract enters an illegal state — the DB layer enforces it.
Hands off to: Adversary, with everything the executor did captured as evidence.

Equivalent layer to: LangChain / CrewAI tool-use loops, Temporal durable activities, agent-runtime function execution.
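
A minimal sketch of a per-contract tool registry that records evidence for every call, including refused ones. The ToolRegistry class, its methods, and the evidence shape are assumptions for illustration, not the actual CentrOS registry.

```typescript
type Tool = (args: Record<string, unknown>) => unknown;
type ToolCallEvidence = { contractId: string; tool: string; ok: boolean; error?: string };

class ToolRegistry {
  private readonly tools = new Map<string, Tool>();
  private readonly grants = new Map<string, Set<string>>();
  readonly evidence: ToolCallEvidence[] = [];

  register(name: string, tool: Tool): void {
    this.tools.set(name, tool);
  }

  // Allow one contract to call one tool.
  grant(contractId: string, toolName: string): void {
    const granted = this.grants.get(contractId) ?? new Set<string>();
    granted.add(toolName);
    this.grants.set(contractId, granted);
  }

  // Every call, allowed or not, leaves an evidence entry behind.
  call(contractId: string, name: string, args: Record<string, unknown>): unknown {
    const tool = this.tools.get(name);
    const allowed = this.grants.get(contractId)?.has(name) ?? false;
    if (tool === undefined || !allowed) {
      this.evidence.push({ contractId, tool: name, ok: false, error: "not permitted" });
      throw new Error(`tool ${name} is not permitted for contract ${contractId}`);
    }
    try {
      const result = tool(args);
      this.evidence.push({ contractId, tool: name, ok: true });
      return result;
    } catch (e) {
      const message = e instanceof Error ? e.message : String(e);
      this.evidence.push({ contractId, tool: name, ok: false, error: message });
      throw e;
    }
  }
}
```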

The adversary's job is to find what the executor missed. It reviews the evidence and asks: "what could go wrong if this ships?" If it finds a real issue, it escalates the contract — back to the executor for a retry, or straight to the owner inbox for a human call.

Reads: Executor evidence + contract acceptance criteria + adversarial prompts tuned per workflow.
Writes: Critique notes, severity axes, escalation recommendation.
Fails when: Critique is shallow (eval flags low confidence) — itself a trigger for human review.
Hands off to: Verifier, with adversary critique appended to the evidence trail.

Equivalent layer to: LLM-as-judge eval patterns, golden-dataset regression scorers, safety review checkpoints.

The verifier is the gate runner. It checks that all required gates passed, writes the contract's final acceptance state, and either closes the contract (lane A) or routes it to the owner inbox (lane B/C).

Reads: All evidence + adversary critique + required gate list from the contract spec.
Writes: Final gate-pass state, audit summary, close timestamp or escalation reason.
Fails when: Any required gate is unfulfilled — contract enters gate-pending until owner resolves it.
Hands off to: Owner inbox (lane B/C) or contract archive (lane A passed).

Equivalent layer to: LangFuse trace span aggregation, CI gate-pass checks, approval-workflow state machines.
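
The verifier's final decision can be sketched as a pure function of lane and gate status. The three outcomes mirror the text above (close, owner inbox, gate-pending); the Verdict type and the verify function are hypothetical names.

```typescript
type Lane = "A" | "B" | "C";

type Verdict =
  | { state: "archived"; closedAt: string }
  | { state: "owner-inbox"; reason: string }
  | { state: "gate-pending"; reason: string };

function verify(lane: Lane, requiredGates: string[], passedGates: string[], now: Date): Verdict {
  // Any unfulfilled required gate blocks both closing and owner routing.
  const missing = requiredGates.filter((g) => !passedGates.includes(g));
  if (missing.length > 0) {
    return { state: "gate-pending", reason: `unfulfilled gates: ${missing.join(", ")}` };
  }
  // Lane A auto-closes; lanes B and C wait for the owner.
  if (lane === "A") {
    return { state: "archived", closedAt: now.toISOString() };
  }
  return { state: "owner-inbox", reason: `lane ${lane} requires owner sign-off` };
}
```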

// contract lifecycle

Where every contract goes.

archived

Contract closed and durable. Evidence, tool calls, gate decisions, and approver (if any) are retained for audit. Read-only from this point — the audit trail outlives the runtime.
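
A sketch of how illegal lifecycle transitions could be rejected. Only "archived" and "gate-pending" are state names taken from this page; "drafted", "executing", and "awaiting-owner", along with the whole transition table, are hypothetical placeholders. In the real system the DB layer is what enforces this.

```typescript
// "archived" and "gate-pending" appear on this page; the rest are assumed names.
type State = "drafted" | "executing" | "gate-pending" | "awaiting-owner" | "archived";

// Assumed transition table. "archived" is terminal and read-only.
const ALLOWED: Record<State, readonly State[]> = {
  drafted: ["executing"],
  executing: ["gate-pending", "awaiting-owner", "archived"],
  "gate-pending": ["executing", "awaiting-owner", "archived"],
  "awaiting-owner": ["executing", "archived"],
  archived: [],
};

function transition(from: State, to: State): State {
  if (!ALLOWED[from].includes(to)) {
    throw new Error(`illegal transition: ${from} -> ${to}`);
  }
  return to;
}
```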

// owner inbox

Where humans actually decide.

Lane B and lane C contracts route here. The operator sees the contract title, a trust strip summarising agent confidence, the full evidence timeline, and three actions: Approve, Hold, Deny.

R2

Refund — Booking #4421 · $284 CAD

router → executor → adversary → verifier · 8 evidence items

R1

Reply to guest — late check-in for unit 12

router → executor → adversary → verifier · 5 evidence items

R3

Tax filing draft — quarterly remittance

adversary flagged: source doc >90d old · 11 evidence items

Evidence timeline · Refund #4421

router → classified as DECIDE · R2 · lane B
executor → fetched booking, cancellation policy, refund window
executor → drafted refund authorisation ($284, partial)
adversary → "matches policy; guest 1d outside full-refund window — partial is correct"
verifier → Evidence ✓ · Adversary ✓ · Verifier ✓ · waiting on owner

The owner inbox is the only place CentrOS lets agents act on R2/R3 work. Approvals are reversible by design: every action has a rollback path, and the audit trail captures who approved on what evidence. Lane A contracts never touch this surface — they auto-close once gates pass.
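
A sketch of recording an owner decision together with the evidence it was made on. The three actions come from the inbox description above; the AuditEntry shape, the recordDecision helper, and the rule that a decision must cite at least one evidence item are assumptions.

```typescript
type OwnerAction = "Approve" | "Hold" | "Deny";

type AuditEntry = Readonly<{
  contractId: string;
  approver: string;
  action: OwnerAction;
  evidenceIds: readonly string[];
  decidedAt: string;
}>;

// Append-only by convention: entries are added, never edited.
const auditTrail: AuditEntry[] = [];

function recordDecision(
  contractId: string,
  approver: string,
  action: OwnerAction,
  evidenceIds: string[],
  now: Date = new Date(),
): AuditEntry {
  // Assumed rule: a decision with no evidence behind it is not recordable.
  if (evidenceIds.length === 0) {
    throw new Error("a decision must cite the evidence it was made on");
  }
  const entry: AuditEntry = {
    contractId,
    approver,
    action,
    evidenceIds: [...evidenceIds], // copy so later caller mutations do not leak in
    decidedAt: now.toISOString(),
  };
  auditTrail.push(entry);
  return entry;
}
```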

// verticals

Where it runs in production.

CentrOS isn't a thought experiment. It runs on workflows where someone is actually waiting on an answer.

Hospitality operations

23 units · GTA · primary production vertical

Triggers: guest email/SMS, vendor invoices, maintenance tickets, compliance asks
Atomic work units: guest reply, refund decision, booking adjustment, vendor PO, compliance log entry
Lane B/C ratio: roughly 60% — most actions touch revenue or regulated communication
Linters: refund policy bounds, communication tone, regulatory date windows

Conscious Creations

Accommodations consulting · lighter-weight automations

Triggers: client guest enquiries, ops workflow asks
Built the public guest-triage demo here — Infinite-Summoner runPipeline pattern
Python/FastAPI automations alongside CentrOS contracts for shared workflows
Pairs with operator judgment — never fakes domain licensure

New vertical = new triggers, new atomic work units, new tool boundaries, new linters, new approval rules. Not a new platform.
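
To make the "new vertical = configuration" idea concrete, here is a sketch of a vertical as plain data plus linter functions. The triggers and work units are the hospitality ones listed above; the Vertical interface, the tool names, the approval lanes, and the 500 CAD refund cap are hypothetical values chosen for illustration.

```typescript
type Lane = "A" | "B" | "C";
type WorkUnit = { kind: string; amountCad?: number };
type Linter = (unit: WorkUnit) => string | null; // null means no finding

interface Vertical {
  name: string;
  triggers: string[];
  workUnits: string[];
  toolBoundaries: Record<string, string[]>; // work unit -> tools it may call
  linters: Linter[];
  approvalRules: Record<string, Lane>;      // work unit -> policy lane
}

const REFUND_CAP_CAD = 500; // hypothetical policy bound

const refundBounds: Linter = (unit) =>
  unit.kind === "refund decision" && (unit.amountCad ?? 0) > REFUND_CAP_CAD
    ? `refund exceeds the ${REFUND_CAP_CAD} CAD cap`
    : null;

const hospitality: Vertical = {
  name: "Hospitality operations",
  triggers: ["guest email/SMS", "vendor invoices", "maintenance tickets", "compliance asks"],
  workUnits: ["guest reply", "refund decision", "booking adjustment", "vendor PO", "compliance log entry"],
  toolBoundaries: { "refund decision": ["fetchBooking", "draftRefund"] }, // assumed tool names
  linters: [refundBounds],
  approvalRules: { "guest reply": "B", "refund decision": "B" },          // assumed lanes
};

// Run every linter of a vertical against one work unit and collect findings.
function lint(vertical: Vertical, unit: WorkUnit): string[] {
  if (!vertical.workUnits.includes(unit.kind)) {
    return [`unknown work unit: ${unit.kind}`];
  }
  return vertical.linters
    .map((linter) => linter(unit))
    .filter((finding): finding is string => finding !== null);
}
```

Adding a vertical under this sketch means writing another object of the same shape; the lint function and everything upstream of it stay untouched.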

// traction

Traction.

Built solo. Ships fast. Runs in production daily. Concrete signals below so the runtime story isn't only a version number.

3,980+

Vitest tests passing on main

347

test files across the runtime + workbench

23

hospitality units running CentrOS in production

4

vendor primitives consolidated into 1 governed layer

Production deployment: 23-unit hospitality vertical · daily live ops since Feb 2026 · CI-gated releases on main
Build cadence: Solo founder build · daily commits · 3,980+ tests across 347 files passing on main
Compounding practice: 3+ years of LLM tool-calling, multi-agent patterns, RAG, evals, and HITL approval design formalised into the runtime
Design-partner posture: Selective conversations with operators in regulated and high-stakes verticals · email for an intro

// vs the parts

One runtime, not five integrations.

All of these tools are good. Stitched together, they leave seams — different governance models, different observability surfaces, different state stores. CentrOS is one runtime where the seams don't exist.

What you'd stitch together	CentrOS gives you
LangGraph state machine	Router + policy lanes + contract state
Temporal durable execution	Contract lifecycle + gate runner + rollback gate
LangFuse traces & eval scores	Gate evidence + run history + owner inbox audit trail
Pydantic typed tool I/O	Zod contract schemas + Drizzle row schemas + linter passes
MCP tool servers	Tool registry + scoped auth + audit logs per call
n8n / Workato / Zapier connectors	Same trigger surface + risk lanes + approval gates + verifier
Separate approval-workflow tool	Owner inbox: Approve / Hold / Deny on the same contract row

// stack

What it's built on.

Data

PostgreSQL
Drizzle ORM
Zod schemas
Migration-managed

Runtime

TypeScript
Node.js
Hono
Vitest

Agents

LLM tool-calling
OpenAI SDK
Anthropic SDK
Structured outputs

UI

Expo · React Native
Workbench shell
Owner inbox
Modes: Ask · Books · Legal · Plan

CI / Eval

Vitest suite (3,980+)
Contract / gate tests
Synthetic eval cases
Gate-pass on main

// try it

Live demo — guest triage in 3 stages.

The same 3-stage pipeline from the main site, embedded here for convenience. Mock mode runs entirely in-browser. Classification → plan → draft, with HITL escalation surfaced in the JSON output.

Mock mode: in-browser, free, deterministic. Live mode: run cd demo && npm start with OPENAI_API_KEY set.

// status

Where this actually is.

Honest specs. CentrOS is built solo, ships fast, and runs in production.

Built since: Feb 2026 — solo founder build, daily commits, 3,980+ tests passing on main
Status: Production deployment on a 23-unit hospitality vertical · closed showcase · selective design-partner conversations
Practice behind it: 3+ years of LLM tool-calling, multi-agent patterns, RAG, HITL — formalised into CentrOS Feb 2026
Want access or to collaborate? Email me.

I'm Shren Patel — I built CentrOS.

Open to Canada-remote Staff AI Engineer, Agent Runtime, AI Platform, AI Architect, Workflow Automation, and Forward-Deployed roles. The platform is the proof; let's talk about how it maps to your problem.