demos (9 live)

Try the work.

Hosted artifacts you can run in the browser. Open source, MIT licensed, no signup or keys on the demos themselves.

Featured (3 essential)

  • Durable AI commerce backend

    loom

    Four workflows demonstrating the patterns that make agent-driven money movement safe in production: durable sleep, exactly-once side effects, saga compensation, webhook drift reconciliation, and agentic spending authority bounded by a deterministic gate.

    Built on Vercel Workflows for durable execution. Cart abandonment runs a durable sleep, drafts a re-engagement email with Haiku, and sends it through an idempotent receiver: a repeated idempotency key returns deduplicated:true, so duplicates never reach the customer. Dynamic checkout calls Opus for a structured AgentDecision via generateObject and Zod, then a five-line deterministic gate compares the requested amount against an env-driven ceiling before applying anything. Shipping monitor runs saga compensation with namespaced cancel keys. The Stripe webhook drift demo shows receive-then-consume-later with a visitor's consumer cursor: events stay in the store, the cursor advances, and dimmed events visualize the multi-consumer pattern. A Sirens eval harness runs 10 adversarial scenarios in CI. A failure-injection harness proves zero duplicate sends under fault. Hardened with per-IP rate limit, daily USD cap, and per-visitor cookie scoping.

    TypeScript · Next.js 16 · Vercel Workflow SDK · Vercel AI SDK · Anthropic SDK · Stripe · Zod · Upstash
  • Multi-agent debugging concierge

    forge

    Paste a stack trace. Four specialist subagents fan out in parallel, then a coordinator merges their structured findings into ranked, calibration-weighted hypotheses.

    Parallel fan-out via Promise.all plus pLimit bounded concurrency. Two-pass agent pattern (generateText then generateObject) for reliable structured output from a tool-using loop. Also included: durable resumable session state, preemptive abort plumbed end-to-end via AbortSignal with cross-instance signaling through Upstash, a Brier-score calibration log that downweights overconfident lanes at merge time, a five-scenario CLI eval harness with graded IoU rubric and n-runs aggregation, and Anthropic prompt-caching breakpoints with honest below-threshold disclosure. Hardened with per-IP rate limit and daily USD spend cap.

    TypeScript · Next.js 16 · Vercel AI SDK · Anthropic SDK · p-limit · Zod · Upstash · Tailwind v4
  • AI-native product catalog

    anchor

    Every product has a human page and three statically-cached LLM-facing endpoints. The /.well-known/agents.json descriptor publishes endpoints + auth + pricing negotiation. Agents transact through an eight-check delegated-authority pipeline.

    Ten fictional specialty-coffee products. The agent surface splits into /agent/markdown (citation opening + JSON-LD fence + prose), /agent/json (Schema.org Product/Offer only), and /agent/plain (prose only). All 30 routes prerendered at build, served from the edge cache in ~5ms. The proxy layer classifies User-Agent on every fetch and queues a Redis write via after() so cache hits stay instrumented. The /api/agent/checkout endpoint runs eight ordered checks (signature, expiry, agent binding, scope, nonce, idempotency, inventory, charge) and revalidateTag invalidates only the affected product entry on a sale, leaving the other nine cached. A live AEO dashboard surfaces 24h / 7d / 30d totals, bot-mix breakdown, per-product counts, and a live tail. A /docs/rendering page documents every route's render mode so the build is self-documenting. A dev-only render inspector overlays colored boundaries on every component classifying its mode and shows a counterfactual savings panel vs a fully-dynamic baseline.

    TypeScript · Next.js 16 · React 19 · Vercel AI SDK · Anthropic SDK · Zod · Upstash · Tailwind v4
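
The deterministic gate described in the loom entry above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the AgentDecision fields, the MAX_AGENT_SPEND_USD variable name, and the fail-closed default of 0 are all assumptions made for the example. The point it shows is that the model only proposes an amount, and a plain comparison decides whether anything is applied.

```typescript
// Hypothetical decision shape; the real AgentDecision schema is not shown in the source.
interface AgentDecision {
  action: "apply_discount" | "no_action";
  amountUsd: number;
  reason: string;
}

type GateResult =
  | { allowed: true; amountUsd: number }
  | { allowed: false; reason: string };

// Reads the spending ceiling from an env-like map.
// MAX_AGENT_SPEND_USD is an assumed variable name.
function readCeiling(env: Record<string, string | undefined>): number {
  const ceiling = Number(env.MAX_AGENT_SPEND_USD);
  // Fail closed: a missing or malformed ceiling means the agent may spend nothing.
  return Number.isFinite(ceiling) && ceiling >= 0 ? ceiling : 0;
}

// The gate itself: no model call, no randomness, just a comparison.
function gate(decision: AgentDecision, ceilingUsd: number): GateResult {
  if (!Number.isFinite(decision.amountUsd) || decision.amountUsd < 0) {
    return { allowed: false, reason: "invalid amount" };
  }
  if (decision.amountUsd > ceilingUsd) {
    return { allowed: false, reason: "amount exceeds ceiling" };
  }
  return { allowed: true, amountUsd: decision.amountUsd };
}
```

In a server route, the ceiling would typically be read once with something like readCeiling(process.env) and the gate applied to every decision the model returns, so a prompt-injected or hallucinated amount cannot exceed the configured limit.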

More demos (6 live)

  • In-browser data agent

    tablesalt

    Drop a CSV and ask a question; the agent renders the answer as the right kind of UI: a chart, a stat card, a table, or a list.

    DuckDB-WASM runs every query in the browser so the file never uploads. The agent streams a four-step reasoning trace before the SQL lands. A live eval scoreboard runs 12 hand-labelled questions against the model on demand and shows per-case accuracy and dollar cost.

    Next.js 16 · React 19 · TypeScript · Tailwind v4 · Vercel AI SDK · Vercel AI Gateway · DuckDB-WASM · streamfield
  • npm library

    streamfield

    A tiny React library for displaying a structured AI response while it is still being generated, without the fields flickering and snapping into place.

    Diffs successive partial-object snapshots from the Vercel AI SDK's streamObject and exposes a pending / streaming / complete state per field. About 150 lines of source. The playground renders the same partial with and without the library so you can see the difference.

    TypeScript · React · tsup · Vercel AI SDK · Vercel · Next.js 16

    Also on npm: npm install streamfield

  • LLM eval harness

    fedbench

    An open-source evaluation harness that measures hallucination, citation accuracy, and refusal discipline in grounded Q&A.

    BM25 retrieval over two side-by-side public corpora (Medicare + OSHA, 21 verified Q&A pairs), Sonnet-4.6 grounded answers, deterministic citation check, Opus-4.7 LLM-as-judge. The demo is a replay viewer: explore every pair with its retrieved chunks, the agent's answer, and the judge verdict.

    TypeScript · Bun · Anthropic SDK · MCP · BM25 · LLM-as-judge
  • MCP server / agent tool design

    fieldops-mcp

    A Model Context Protocol server that exposes a small-business field-services dispatcher workflow as six agent tools.

    Six tools, six structurally distinct shapes — read, search, mutation with typed errors, composition, aggregation, and human escalation. The interesting work is choosing what tools to expose, not the plumbing.

    TypeScript · Bun · MCP TS SDK · Zod · Claude Desktop

    A showcase, not a live MCP host: an interactive tour of real captured exchanges. MCP itself runs client-side inside an LLM host; the page explains why.

  • Sub-agent orchestration

    grant-pilot

    A multi-turn agent that helps a small business or nonprofit find federal grants and drafts an application skeleton.

    A planner dispatches three specialist sub-agents (discovery, eligibility, drafter) over live federal grants and entity-registration data. Includes a per-sub-agent fallback ladder and structured-failure routing. Hosted with a 5-intent allowlist, daily budget cap, and per-IP rate limit.

    TypeScript · Bun · Anthropic SDK · Next.js 16 · Vercel · Upstash
  • Grounded RAG chatbot

    kev-o

    Ask anything about my work. He cites his receipts, and he won't make stuff up.

    Grounded chatbot trained exclusively on my public corpus: blog posts, project case studies, resume, and the READMEs of my open-source repos. Hybrid retrieval (BM25 candidate pool plus Voyage cross-encoder rerank) feeds Claude Sonnet 4.6 streamed through the Vercel AI SDK. Three surfaces share one brain: this standalone subdomain, a global Command-K palette on every page, and inline punch-ins at the foot of curated entries. Hardened with per-IP rate limit, daily USD spend cap, and an owner-bypass route with timing-safe key comparison.

    TypeScript · Next.js 16 · Vercel AI SDK · Anthropic SDK · Voyage rerank · BM25 · Upstash · Tailwind v4

    Also embedded on every page of this site (try Cmd-K) and at the foot of every blog post and case study.
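
The timing-safe key comparison mentioned in the kev-o entry can be sketched as below. This is one common way to do it in Node.js, not necessarily how the site does it: the keysMatch name is hypothetical, and hashing both values first is an assumption chosen because crypto.timingSafeEqual throws when its inputs differ in length.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Hypothetical helper: compares a caller-supplied key to the expected key
// without leaking, through timing, how many leading bytes matched.
function keysMatch(provided: string, expected: string): boolean {
  // SHA-256 digests are always 32 bytes, so timingSafeEqual never sees
  // buffers of unequal length (which would make it throw).
  const a = createHash("sha256").update(provided).digest();
  const b = createHash("sha256").update(expected).digest();
  return timingSafeEqual(a, b);
}
```

A plain === on strings can return as soon as the first differing character is found, which is why a constant-time comparison is the usual choice for bypass keys and webhook secrets.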