Shren Patel

AI Agent Orchestration Engineer

Production AI orchestration. Multi-agent workflows with approval gates, audit trails, and the engineering rigor enterprises actually deploy.

LangGraph · Temporal · Langfuse · MCP · RAG · HITL · TypeScript · Python
Canada remote · NA-friendly · EST/PST · Sold IP to Wattpad WEBTOON Studios · 40M+ reads on AXED
Summary

AI engineer with 10+ years shipping software at Coursera, Faire, D2L, and as an independent builder. Also the creator of AXED, a WEBTOON Original whose IP sold to Wattpad WEBTOON Studios (40M+ reads; book and animated TV adaptations releasing in 2025).

I now build AI agent systems — software that uses LLMs to do real work (draft replies, process invoices, triage requests) with human approval before anything irreversible happens.

Currently building CentrOS, an AI orchestration platform that turns messy business workflows into reliable, auditable agent systems. Deployed in production at a 23-unit hospitality operation: daily live ops, 3,980+ tests on main.

Open to Canada-remote Staff AI Engineer, AI Platform, AI Architect, and Forward-Deployed roles.

3.5 yrs: LLMs across dev & ops workflows
2+ yrs: Tool-calling & agent patterns
10+ yrs: Shipping software since 2013
40M+: WEBTOON reads · AXED IP sold to Wattpad WEBTOON Studios
Experience

Founder, CentrOS — AI Agent Orchestration Platform

Remote · Canada · Feb 2026 – Present
  • Built the CentrOS production multi-agent orchestration runtime: typed tool contracts, eval gates, Langfuse traces, HITL approval interrupts; collapsed 4 vendor primitives into 1 governed layer.
  • Designed agent runtime on LangGraph state graphs + Temporal-style durability: durable execution, checkpointing, crash recovery, retries, versioned rollouts — 0 silent run loss across 6 months.
  • Built eval harness + Langfuse trace pipeline: golden datasets, task-level scorers, regression suites, prompt versioning, failure taxonomy; caught 22+ silent regressions before release.
  • Implemented hybrid RAG pipelines on pgvector + OpenSearch: query rewriting, source ranking, chunk hygiene, freshness boundaries, selective context packing — cut prompt token use 40% with grounding intact.
  • Designed HITL interrupt model: confidence-gated handoff, approval checkpoints before irreversible actions, reversible operations, audit trails, escalation policy — 100% of high-risk tool calls human-reviewed.
  • Built MCP-style tool registry with scoped OAuth permissions, JSON-schema validation, retry policy, side-effect logging — turned ad-hoc API calls into 100% audited typed agent tool invocations.
  • Deployed CentrOS in production at a 23-unit hospitality operation (guest comms, refunds, vendor invoices, compliance) — 3,980+ Vitest tests passing on main, CI-backed live ops since Feb 2026.
  • Shipped 5+ reusable platform primitives (connectors, secrets, approval gates, eval hooks, deploy templates) — cut new-workflow setup from weeks to 2 days while preserving runtime + security standards.

Stack: TypeScript · Hono · PostgreSQL · Drizzle · Vitest · LLM tool-calling
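
The tool-registry and approval-checkpoint bullets above can be illustrated with a small sketch. It is written in Python for brevity (the platform's listed stack is TypeScript) and is not CentrOS code: `Tool`, `ToolRegistry`, `ApprovalRequired`, and the `high_risk` flag are hypothetical names, and the per-argument type check is a simplified stand-in for real JSON-schema validation.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class Tool:
    """A registered tool: its handler plus the contract it must satisfy (hypothetical shape)."""
    name: str
    handler: Callable[..., Any]
    required_args: dict[str, type]  # simplified stand-in for a JSON schema
    high_risk: bool = False         # irreversible side effects need human sign-off


class ApprovalRequired(Exception):
    """Raised when a high-risk call arrives without a human approver."""


class ToolRegistry:
    def __init__(self) -> None:
        self._tools: dict[str, Tool] = {}
        self.audit_log: list[dict[str, Any]] = []

    def register(self, tool: Tool) -> None:
        self._tools[tool.name] = tool

    def invoke(self, name: str, args: dict[str, Any], approved_by: Optional[str] = None) -> Any:
        tool = self._tools[name]
        # Validate arguments against the declared contract before any side effect.
        for key, expected in tool.required_args.items():
            if not isinstance(args.get(key), expected):
                raise TypeError(f"{name}: argument {key!r} must be {expected.__name__}")
        # Approval checkpoint: block high-risk calls that nobody has signed off on.
        if tool.high_risk and approved_by is None:
            self.audit_log.append({"tool": name, "args": args, "outcome": "blocked"})
            raise ApprovalRequired(name)
        result = tool.handler(**args)
        # Every executed call is recorded together with who approved it.
        self.audit_log.append(
            {"tool": name, "args": args, "approved_by": approved_by, "outcome": "ok"}
        )
        return result
```

In this sketch a refund tool registered with `high_risk=True` raises `ApprovalRequired` until `invoke` is called with an `approved_by` value, and both the blocked attempt and the approved call land in `audit_log`.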

Vertical Automation & Ops Lead, Conscious Creations

Remote · Canada · Sep 2022 – Present
  • Prototyped accommodations workflows on CentrOS: 3-stage guest-triage pipeline (classify → plan → draft) in Python/FastAPI with HITL approval gates before any irreversible action.
  • Designed AI-to-human handoff: shared state contract, full conversation history, durable IDs, operator-readable summaries — 0 lost context across 100% of escalation events in live ops.

Stack: Python · FastAPI · LLM tool-calling · workflow automation
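
A minimal sketch of the classify → plan → draft shape described above, assuming a keyword rule in place of the LLM classifier and leaving out FastAPI entirely. `TriageState`, `run_pipeline`, and the stage function names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    AWAITING_APPROVAL = "awaiting_approval"
    SENT = "sent"


@dataclass
class TriageState:
    """Shared state handed from stage to stage (hypothetical shape)."""
    message: str
    category: str = ""
    steps: list[str] = field(default_factory=list)
    reply: str = ""
    irreversible: bool = False
    status: Status = Status.AWAITING_APPROVAL


def classify_stage(state: TriageState) -> TriageState:
    # Stand-in for an LLM classifier: a keyword rule keeps the sketch runnable.
    state.category = "refund" if "refund" in state.message.lower() else "general"
    return state


def plan_stage(state: TriageState) -> TriageState:
    # Refunds move money, so the plan is flagged as irreversible.
    if state.category == "refund":
        state.steps = ["look up booking", "issue refund", "confirm to guest"]
        state.irreversible = True
    else:
        state.steps = ["answer question"]
    return state


def draft_stage(state: TriageState) -> TriageState:
    state.reply = f"[{state.category}] reply covering: {', '.join(state.steps)}"
    return state


def run_pipeline(message: str, approved: bool = False) -> TriageState:
    """Run classify -> plan -> draft, then gate irreversible work on approval."""
    state = draft_stage(plan_stage(classify_stage(TriageState(message=message))))
    if state.irreversible and not approved:
        state.status = Status.AWAITING_APPROVAL  # parked until an operator signs off
    else:
        state.status = Status.SENT
    return state
```

Here a refund request stays in `AWAITING_APPROVAL` until `run_pipeline` is called with `approved=True`, while a general question is sent straight through.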

Operations & Portfolio Manager, Independent — Property Management

Greater Toronto Area · Sep 2022 – Present
  • Ran ops on 20-unit + 3-unit residential portfolio (intake, scheduling, follow-ups, exceptions) — mapped 12+ recurring workflows informing CentrOS agent-automation vs HITL boundaries.

Independent Engineer / Builder, Software, AI automation & creative IP

GTA · Remote · Sep 2022 – Jan 2026
  • Built LLM workflow automation since ChatGPT launch (Nov 2022): tool-calling/API automations from 2023; LangGraph multi-agent patterns from 2024; MCP integrations from late 2024 — 3+ years compounding.
  • Shipped AXED IP rights deal with Wattpad WEBTOON Studios — book + 2-season animated television adaptation (2025) — plus content automation tooling formalized into CentrOS platform Feb 2026.

Stack: Python · TypeScript · Node · LLM tool-calling · MCP · multi-agent orchestration

Senior Frontend Developer, Faire

Remote · Canada · Jun 2021 – Aug 2022
  • Drove apparel pre-order initiatives at Faire: $5M+ GMV across 3+ international markets over 6-month staged workstreams; React/TypeScript/MobX shipped behind A/B guardrails.
  • Drove site-wide perf gains via A/B testing + staged rollouts; same guardrail discipline now applied to agent versioning, shadow traffic, and canary releases at CentrOS — 5+ wins shipped.

Stack: React · TypeScript · MobX · Jest · Storybook · A/B testing

Creator, WEBTOON Original — AXED, WEBTOON

Remote · Jun 2018 – Dec 2020
  • Built Python/JS image-processing + consistency-check pipelines for 445K+ subscriber WEBTOON Original at 2x/week ship cadence — 10% fewer re-uploads, hundreds of operator hours saved.
  • Designed queue → tooling → review → ship operator workflow at 2M+ monthly views on WEBTOON Original — precursor to today's HITL agent + human production system at CentrOS.
  • Brokered AXED IP rights deal with Wattpad WEBTOON Studios — book + 2-season animated television adaptation releasing 2025 — multi-year multi-format commercial agreement structured solo.

Stack: Python · OpenCV · image processing

iOS Mobile & Web Developer, Coursera

Mountain View, CA · Jan 2017 – Apr 2017
  • Shipped iOS course-download feature in 4 months, lifting course completion 20%+ across Coursera's 24M+ learner base; Swift + offline-sync architecture.
  • Built React GUI-creation infrastructure at Coursera; cut new-view setup from days to 1 hour and compounded across 5+ teams as the company standard.
  • Built Python search tooling lifting course content discovery 60% (pre-LLM NLP era) — same retrieval problem CentrOS now solves with embeddings, hybrid RAG, and pgvector.

Stack: React · Python · iOS SDK · Swift · NLP/search

Software Designer / Web Developer, Desire2Learn (D2L)

Toronto · May 2016 – Aug 2016
  • Built D2L flashcard module: auto-question-generation + variable-interval testing — boosted student information retention up to 7× vs baseline review across pilot cohort.

Stack: JavaScript · Node.js · React · SQL

Mobile Developer, SpringBoard Data Management

Mississauga · Sep 2015 – Dec 2015
  • Shipped revamped iOS app with secure authentication + Objective-C overhaul; cut load times 80% on legacy codebase within a 4-month delivery window.
  • Built deductions PDF parser automation cutting client manual processing 90%+ — early business-workflow automation pattern carried forward into the LLM tool-calling era.

Stack: iOS SDK · Objective-C · PDF processing · auth systems

Software Developer, Minted LLC

San Francisco · Aug 2014 – Dec 2014
  • Built Python image-metadata extractor + printer-safe color-space converter on AWS pipeline — 50% reduction in reprint requests across Minted's print catalog.
  • Built conversion-loss trackers + shipped 3+ hotfixes during Cyber Monday peak (10x+ traffic spike) — early production-incident response discipline.

Stack: Python · image processing · color space conversion · AWS

Graphics & Portability Developer, TransGaming Inc

Toronto, ON · Jan 2014 – Apr 2014
  • Built C++ dynamic libraries emulating Windows on macOS for Cider wrapper engine; shipped 3+ QoL features into Guild Wars 2 + Eve Online (3M+ combined players).

Stack: C++ · Objective-C · Mac OSX SDK · Windows SDK

Three years of compounding AI practice — formalized into a production runtime.

Period | What I was building
2013 – 2022 | 10 years shipping production software: web, mobile, games, data tooling (Faire, Coursera, D2L, Minted, WEBTOON)
Nov 2022 → | LLMs deployed across dev & business workflows (ChatGPT → GPT-4 API) · 3.5 years
Jun 2023 → | Tool-calling pipelines + custom Python/JS automation runtimes · 3 years
2024 → | Graph-style multi-agent orchestration, RAG, evals, HITL approvals (LangGraph era) · 2+ years
Late 2024 → | MCP tool integrations · 1.5 years
Feb 2026 → | CentrOS: production AI orchestration platform deployed at a 23-unit hospitality vertical · daily live ops, 3,980+ tests on main

The compounding practice is in the runtime: 3+ years of LLM tool-calling, multi-agent workflows, RAG, evals, and HITL approval patterns, formalized into one governed production layer. I pair with domain experts (legal, accounting, ops) on regulated work so high-stakes decisions always get human sign-off at the approval step.

Skills

AI agent orchestration
  • Multi-agent workflow design & task decomposition
  • Tool/function calling, structured outputs, schema-validated arguments
  • Stateful workflow orchestration, checkpointing, crash recovery, replay
  • Human-in-the-loop approvals, escalation paths, audit trails, governance
  • RAG, retrieval pipelines, embeddings, context windows, prompt management
  • Agent evaluation, tracing, observability, latency & token-cost monitoring
  • MCP servers, webhooks, OAuth-scoped tool access, third-party API integrations
  • Router · executor · adversary · verifier patterns; confidence-gated handoff
  • POCs, demos, eval-gated rollout & production hardening for business workflows
Languages

Proficient

  • Python 3
  • JavaScript · TypeScript
  • C / C++
  • HTML5 · CSS

Familiar

  • Objective-C · C# · VB / VB.Net
  • Arduino · Processing
  • MIPS / ARM assembly · uC++
  • GML
Frameworks · libraries · SDKs
  • React + React Native
  • MobX + Redux
  • Jest + Storybook
  • Node.js + Express
  • FastAPI + Django + SQLAlchemy
  • TensorFlow + OpenCV
  • Hugging Face
  • Socket.IO + WebGL
  • Bootstrap + SASS / LESS
  • Selenium + Cocos2d-x
  • jQuery + Angular
  • Android SDK · iOS SDK
  • Windows SDK · Mac OSX SDK
  • Cocoa Frameworks
Databases & cloud

Databases

  • PostgreSQL · MySQL · MongoDB · SQLite
  • Vector DBs: pgvector · OpenSearch · FAISS

Cloud

  • GCP: BigQuery · Cloud Functions · Cloud Run · Vertex AI
  • AWS: Lambda · Step Functions · DynamoDB · S3 · SQS · EventBridge · API Gateway
  • Kubernetes · Docker · infrastructure-as-code
  • Serverless architectures · event-driven pipelines
  • Heroku · CI/CD deployment
Development tools
  • Git + Git-SVN
  • GitHub + Bitbucket
  • JIRA + Trello
  • Unix Terminal
  • Apache Server · MKS Integrity · RegEx
  • VSCode · PyCharm · XCode
  • Eclipse · Netbeans · Sublime Text
Integrations & ops
  • REST APIs · webhooks · OAuth
  • Slack · email · calendar · CRM-style workflows
  • Booking & hospitality / accommodations systems
  • PDF / document parsing pipelines
  • Git · CI · Unix · AWS · Heroku
Other
  • Forward-deployed mindset: gather requirements, ship POCs, iterate with stakeholders
  • Strong written communication for runbooks, eval reports & customer-facing demos
  • Background in high-scale web, mobile, and workflow automation since 2013
Education

University of Waterloo

Bachelor of Software Engineering (BSE) — Computer Software Engineering

References

Named references from past colleagues, managers, and operating partners available on request — typically furnished after a first intro call so I can pick the most relevant voice for your role and team.

  • Engineering reference — past senior IC or lead at Faire, Coursera, WEBTOON, or D2L · available on request
  • Operating reference — partner from the 23-unit hospitality vertical or Conscious Creations · available on request
  • Commercial reference — counterparty from the AXED IP rights deal (Wattpad WEBTOON Studios) or a Faire cross-functional partner · available on request

Full reference list: private, by request.

Role fit

Hiring for a specific role?

Each role below lists a tailored pitch and the capabilities that matter most for it.

Staff AI Engineer

Production agent code daily: LangGraph/Temporal-style stateful workflows, Langfuse-style observability, RAG, tool-calling, persistent runtimes, HITL. CentrOS is built around exactly these concerns: typed contracts, eval gates, owner inbox.

I'd bring orchestration-layer thinking into the team: durable multi-step execution, runtime state, traceability, retrieval quality, and the patterns that make agent behavior observable and trustworthy.

Strong match: Acquia Staff AI Engineer, Grafana Staff AI Engineer (RevOps)

What I bring

  • Stateful multi-agent workflows with typed state contracts
  • Non-deterministic eval harnesses with regression gates
  • RAG pipelines with retrieval ranking and freshness controls
  • HITL approval surfaces with audit trails
  • Langfuse-style observability over trace + cost + failure modes

Agent Runtime Engineer

The layer beneath agent behavior most teams underinvest in: state, execution, recovery, and production ownership. I treat agents as systems that need durable runtime semantics — not just clever prompts.

Session state, tool execution, retries, checkpointing, autoscaling, observability, developer ergonomics. CentrOS's contract lifecycle, gate runner, and rollback model translate directly to LLM streaming and crash recovery.

Strong match: Cresta Backend, AI Agent Runtime

What I bring

  • Crash-resumable execution with durable state on PostgreSQL
  • Gate runners: retries, checkpoints, rollback, stop-ship
  • Scoped tool-execution sandboxing with versioned deploys
  • Explicit failure taxonomy per stage — no silent corruption
  • Streaming LLM responses with cancellation and timeout policies

AI Platform Engineer

The capability layer that lets multiple teams deploy AI safely without bespoke setup: connectors, execution patterns, governance, tooling. CentrOS is built for repeatable platform leverage — golden paths over one-off demos.

Connector layer, execution environments, isolation boundaries, access controls, self-service onboarding, regulated deployment standards. Particularly relevant in fintech / public-sector settings where compliance rigor matters.

Strong match: Alpaca Senior AI Platform Engineer

What I bring

  • Tool registries with typed schemas and scoped permissions
  • Policy lanes (auto / confirm / careful) as an access-boundary model
  • Audit trails as first-class state, not afterthought logging
  • Self-service workflow templates with opinionated guardrails
  • CI-backed gate tests validating platform changes per merge

AI Architect

Translates ambiguous business processes into cloud-native AI systems with tool calling, retrieval, state management, and approval flows. The hands-on plus architecture-shaping zone where orchestration-layer thinking becomes most useful.

Reusable orchestration patterns for multi-turn, multi-agent workflows. Production engineering standards: CI/CD, logging, monitoring, secrets, testing, rollback. Good fit for consulting and product environments at once.

Strong match: AOT Technologies AI Architect, Acquia Enterprise AI Architect

What I bring

  • Reference architectures for agent workflows — blueprints, not one-offs
  • Framework-parity thinking: same primitives across LangGraph / Temporal / MCP
  • Risk-tiered governance language that stakeholders can use
  • Multi-vertical domain experience (hospitality, accommodations, ops)
  • Production engineering standards: CI / logging / monitoring / rollback

Workflow / GTM Automation Engineer

Treats agent systems as operational infrastructure, not side projects. Connects models, tools, retrieval, and humans into workflows that actually move work across Slack, CRMs, email, calendars, and internal APIs.

Modular tool access, orchestration patterns, runtime state, observability, escalation logic. Direct fit for MCP servers, multi-agent routing, and self-service workflow templates across Revenue Operations and similar GTM teams.

Strong match: Grafana Staff AI Engineer, Revenue Operations

What I bring

  • Router → executor patterns with specialized agents and fallbacks
  • MCP / tool-registry contracts with scoped permissions
  • RAG pipelines tuned for freshness and source attribution
  • Self-service workflow templates with guardrailed config
  • Narrow action scopes; approvals before any side effect

AI Agent Integrations Engineer

Sits at the boundary where agent systems become real products: integrations, context continuity, and reliable handoff between automation and humans. CentrOS handles that boundary by design.

Context propagation, session lifecycle, retries, fallbacks, and observability around the places where agents most often fail in production: API boundaries, action side-effects, and escalation points.

Strong match: Cresta Backend, AI Agent Integrations

What I bring

  • Evidence timelines preserved across AI → human handoff
  • Context-rich session transfer with no lost history
  • Audit-logged tool calls: idempotency, retries, rollback per integration
  • Explicit session lifecycle phases with state ownership transitions
  • Observability over handoff statistics and API failure patterns

Forward-Deployed AI Engineer

Customer workflow → production agent. POCs, integrations, live deployments. Pair with domain experts; ship the runtime; respect compliance. The platform was built running real accommodations ops — that's exactly the muscle.

Process mapping, action boundaries, approval nodes, target-state agent architecture. Strong on stakeholder translation: turn messy requirements into reliable systems that operate under business and compliance constraints.

Fit: any FDE/solutions role where the work is real customer workflows.

What I bring

  • Active production deployment: 23-unit hospitality vertical under direct management
  • POC → prod path through gated expansion, not big-bang launches
  • HITL demo surfaces tailored for stakeholder reviews
  • Domain-expert pairing (legal / accounting / ops) at confirmation gates
  • Domain-specific controls layered onto a shared workflow shell

Serverless Production AI Engineer

Python/Node, function calling, multi-agent orchestration, vector RAG, serverless AWS (Lambda · Step Functions · DynamoDB · SQS · EventBridge), Langfuse, synthetic testing. The production-engineering discipline that makes integration-heavy AI platforms reliable.

Event-driven systems, state machines, queues, contract tests, dependency mapping. CentrOS's contract/gate model maps directly to Step-Function-style state machines with explicit checkpoints.

Strong match: Robots & Pencils Staff AI Engineer

What I bring

  • Contract / gate state machines (Step Functions–style)
  • Tool registries with typed schemas and idempotent activities
  • Vector retrieval and streaming-response patterns
  • CI-backed contract/gate test suites running on main
  • Token-cost discipline: model routing, prompt design, caching
Contact

Canada remote · NA-friendly · EST/PST. Happy to walk through CentrOS architecture, framework parity, or a tailored eval/observability slice for your domain.