deadwax.io

Five Humans.
Eleven AI Agents.
One Product (Web + Native iOS).

How a small team used Claude Code and Codex to build a full-stack product -- responsive web for desktop and phone browsers plus a native iOS app -- with the rigor of a 10-person engineering org, in under two months.

Built by hobbyists and collectors who wanted a more transparent way to research pressings, maintain the product, and explain the work.

5

Humans -- CEO, advisor, board member, designer, social media

11

AI agents with defined roles and accountability

80+

Docs that give every agent full context

< 90m

Bug report to production fix -- zero human touch

The Problem

The data exists.
The experience doesn't.

Discogs built something remarkable: the world's most comprehensive music database, contributed to by millions of collectors. It's the backbone of the vinyl community. But the collector experience -- especially on mobile -- hasn't kept pace with the depth of the data.

10+ tabs

Required to compare pressings of one album

Buried

Pressing details, mastering credits, and community ratings hidden across multiple pages

Gap

No mobile-first tool for crate-digging with your collection in hand

Discogs is irreplaceable -- we use their API, their data, their community contributions. We're not competing with Discogs. We're building the experience layer on top of their incredible database.

"Why do I have to open 20 tabs to compare pressings?"

-- r/vinyl, Steve Hoffman Forums, and every collector you know

The Product

Discogs has the data.
Deadwax has the experience.

A vinyl collector's companion that works from your desktop, your phone browser, or the native iOS app. Android users can use the mobile-friendly site without waiting for a separate app.

Pressing Intelligence

Original pressing, top-rated, audiophile editions, mastering engineers -- one panel

Any Screen

Use the responsive site on desktop, iPhone, iPad, or Android browsers

Vinyl Only

Not a filter. A product identity. No CDs, no cassettes.

Wantlist Intel

See pressing details before you buy. Know what you're getting.

"This is what Discogs should have built years ago."

-- The reaction we're designing for

The Team

Humans own judgment.
AI owns execution.

Five humans bring taste, domain expertise, creativity, and community voice. Eleven AI agents handle strategy; code across the web app, native iOS, and Android; testing; design specs; and legal compliance. They coordinate through markdown files in Git -- no Slack, no Jira, no Notion.

The Humans

Jason

CEO & Founder

Product vision, strategic decisions, agent orchestration. The only person who can approve direction changes, merge order, and go/no-go calls.

David

AI Workflow Advisor

Consultant on agentic AI workflows. Helped diagnose context bloat and define the token-optimization and tools-vs-skills-vs-scripts frameworks that keep the agent system efficient.

Chris

Board Member

Pressing domain expert. Co-curated the Top 25 album seed list. Validated audiophile label rankings and mastering engineer tiers.

Hope

Visual Creative

Logo, wordmark, and brand visual identity. Translates the design system into the visual artifacts that define how Deadwax looks and feels.

Caden

Social Media

Manages @deadwax.io on Instagram. Community voice and engagement. Building audience ahead of public launch.

The AI Agents

Director

Claude Code

Daily orchestration. Reads CEO decisions, creates task packets, sets merge order, escalates blockers.

Product Manager

Claude Code

PRD, backlog, user stories, acceptance criteria. Prioritizes by RICE framework.

Designer

Claude Code

Wireframes, design system, UX specs. No UI ships without a design spec.

Marketing

Claude Code

Positioning, GTM strategy, community launch planning.

Legal

Claude Code

Privacy, Discogs API compliance, naming, data handling.

Architect

Codex

Technical decisions, ADRs, infrastructure. Reviews every code PR before merge.

Backend Dev

Codex

OAuth, API proxy, Lambda handlers, Pressing Intelligence pipeline.

Frontend Dev

Codex

React components, routing, Tailwind styling, mobile layout.

Tester

Codex

Playwright E2E, Vitest unit tests, CI/CD pipeline, code coverage.

Swift iOS Dev

Codex

Native SwiftUI app for iPhone + iPad. Builds directly against the deadwax.io API, so the native app reads from the same single source of truth as the web.

Android Dev

Codex

Native Kotlin + Jetpack Compose development. Owns the Material 3 Android implementation and keeps collector workflows aligned with the same Deadwax API the web and iOS surfaces use.

What Actually Makes the Agents Work

The model is the engine.
The harness is the car.

People assume the magic is "the AI." It isn't. We use the same Claude and Codex models that anyone can rent. The difference between a chatbot that forgets your name and an agent team that actually ships a product comes down to two disciplines: context engineering and harness engineering. They are the real job, and the harness side is where most of our time goes.

Discipline 1

Context Engineering — what the AI sees

An AI agent has no memory between sessions. Every morning it shows up like a contractor on Day 1 who has never heard of the project. Context engineering is deciding exactly which files, decisions, and instructions get re-read into its head before it picks up a wrench.

Our version of it: a single lean daily packet (TODAY.md, under 50 lines) and a today-only context file (COMMS-TODAY.md). Permanent strategy lives in 80+ dedicated docs that get loaded only when relevant. Config files are pointers, not encyclopedias. Every token of instruction we send costs a token of work we get back.
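A line budget like this is easy to enforce mechanically. The sketch below is one possible guard, not the team's actual script: `checkPacket` and `PacketCheck` are hypothetical names, and only the 50-line limit comes from the text.

```typescript
// Sketch only: checkPacket and PacketCheck are hypothetical names.
// The 50-line budget is the one figure taken from the text.
interface PacketCheck {
  lineCount: number;
  withinBudget: boolean;
}

function checkPacket(markdown: string, maxLines: number = 50): PacketCheck {
  // Drop one trailing newline so "a\nb\n" counts as 2 lines, not 3.
  const body = markdown.replace(/\n$/, "");
  const lineCount = body === "" ? 0 : body.split("\n").length;
  // "Under 50 lines" is read strictly: exactly 50 lines fails.
  return { lineCount, withinBudget: lineCount < maxLines };
}
```

A check like this could run before the Director publishes the packet, so a bloated TODAY.md is rejected rather than silently consuming context.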

Discipline 2 — Where We Live

Harness Engineering — the world the AI lives in

If context engineering is what the AI sees, harness engineering is the entire workshop around it: which tools it can reach for, what it's physically prevented from touching, when it wakes up, who reviews its work, what happens when it gets stuck. The harness is everything that isn't the model itself or the prompt.

Why we lean into it: models change every few months. Prompts get rewritten weekly. But the harness — the rules of the workshop — is the thing that compounds. A well-built harness makes a mediocre prompt safe; a bad harness makes a brilliant prompt dangerous. We spend more time on the harness than on the agents themselves.

What the harness actually looks like, in plain English

Think of each AI agent as a smart but reckless intern. The harness is everything we built so the intern can do real work without burning the building down.

Walls

The agent literally cannot see files it shouldn't touch

Every coding agent works in an isolated copy of the codebase (a "worktree"). Strategy docs, decision logs, and brand guidelines are physically removed from that copy before the agent ever opens it. You can't accidentally overwrite a file you can't see. This was a lesson we learned the hard way after three days of silent data loss.

Locks

Dangerous actions are blocked at the door, not asked about politely

The agent can't delete Discogs records. It can't leak an API key. It can't merge its own code. Every pull request is automatically scanned, tested, type-checked, and reviewed by a separate Architect agent before a single line reaches users. The agent doesn't have to be trusted — it isn't given the option.

Alarms

A clock wakes each agent up; nothing self-starts

We learned the hard way that some agents will sit silently for eleven days if nobody calls them. So the harness includes a literal cron schedule: the bug-fix loop runs every hour, the Director kicks off every morning, the design and legal agents have standing slots on the calendar. "Delegated" is not "done" until something mechanical actually fires.

Tools

Cheap tools first, expensive tools only when they earn it

We give every agent a tiered toolkit: a small shell script for the boring stuff, the GitHub command line for routine work, the full API only when the simpler tools can't do the job, and heavy specialist tools last. Picking the right tool is itself part of the harness — the wrong tool burns context and money for no extra value.
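The "cheapest capable tool first" idea can be sketched as a lookup that walks the tiers in cost order. The tier names mirror the order in the paragraph above; the task names and the capability table are invented for illustration.

```typescript
// Tiers listed cheapest first, mirroring the order described in the text.
type Tier = "shell-script" | "github-cli" | "full-api" | "specialist";
const TIERS: Tier[] = ["shell-script", "github-cli", "full-api", "specialist"];

// Hypothetical capability table: which tiers can handle which kind of task.
const CAPABLE: Record<string, Tier[]> = {
  "rename-branch": ["shell-script", "github-cli", "full-api"],
  "open-pr": ["github-cli", "full-api"],
  "bulk-label-issues": ["full-api"],
};

function pickTool(task: string): Tier | undefined {
  const capable = CAPABLE[task] ?? [];
  // Walk tiers in cost order and return the first one that can do the job.
  for (const tier of TIERS) {
    if (capable.indexOf(tier) !== -1) return tier;
  }
  // No tier is known to handle this task.
  return undefined;
}

console.log(pickTool("open-pr")); // github-cli
```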

Help

When an agent gets stuck, the harness routes it to a second opinion

Claude agents can hand a hard problem off to a Codex rescue agent. Codex agents can hand a confusing diff back to a Claude reviewer. The harness knows who to call. The agent doesn't have to figure it out, and it doesn't get to silently give up.

Gates

Humans block the pipeline at specific, named places

Hope sees every brand change before it ships. Chris validates every domain claim. Jason approves direction. The harness physically pauses at those gates — it doesn't politely ask the agent to wait, it stops the line. AI proposes, humans approve, the harness enforces.

"You don't make an AI agent reliable by writing a better prompt. You make it reliable by building a workshop where reliability is the only thing it's allowed to do."

-- The core insight behind harness engineering

How the Harness Evolved

We built one harness to launch.
Then we rebuilt it to last.

The harness above isn't the harness we started with. The biggest thing we learned running an AI team is that the workshop has to match the shape of the work — and the shape of the work changed the day we shipped. The launch harness was tuned for a sprint. Keeping a live product healthy is a different job, and using the launch harness for it slowly rotted our own paperwork. So we changed the harness, not the model.

Era 1 -- Zero to One

The "Day N" harness

To get from nothing to a shipped app, every day was a numbered container. Each morning the Director wrote one lean plan ("Day 53"), the Codex orchestrator spawned the coding agents against it, work merged in a fixed order, and every night a single heavy "close" archived everything that happened and set up tomorrow.

Why it worked: a 0-to-1 build is a plannable daily batch. The day container forced a clean plan → build → review → ship → close loop and gave us one tidy "here's everything that happened today" story. For a launch sprint, that rhythm is exactly right.

Era 2 -- Live & Sustained

The two-lane harness

Once real users arrived, work stopped showing up in neat daily batches. It became a continuous stream of small bug reports, punctuated by the occasional big feature. Forcing that stream into a daily ceremony is exactly why days started staying "open" for days, the heavy nightly close kept getting skipped, and our decision logs and branches drifted out of date.

The fix -- split the work into two lanes: an Ops / Sustain lane where bug fixes flow continuously and the merge is the close (one line in a running log, no nightly ceremony), and a Feature Epic lane that keeps all the heavyweight planning machinery -- but scoped to a feature, not to a calendar day.

Era 3 -- Growth Mode

A balanced human-AI desk

Past launch, the bet shifted from speed to judgment. The two-lane harness mostly runs itself now, so the job changed again: keep a live product healthy while we grow it. We optimize for sustained engineering, growth features, and learning -- not launch throughput.

The team rebalanced too: two evenly funded AI subscriptions split the work -- Claude Code as the primary builder and operator that writes most of the code and opens the pull requests, and Codex as the review-and-specialist layer that gives every PR an architecture review before merge and handles the native iOS and Android deep cuts. Models and prompts keep changing; the harness keeps compounding.

The root cause was a units mismatch

The day model assumed work arrives in plannable daily batches. Live work doesn't -- it's a flow plus the odd epic. We had been measuring a stream with a ruler meant for boxes. Two separate reviews -- one of our infrastructure, one of our process -- landed on the same diagnosis independently, and it matches how continuous-flow agent systems work in the wider industry.

Before

One daily container for everything

Bugs, features, and admin all got crammed into "Day N." A one-line typo fix and a multi-week feature were tracked the same way. The close was all-or-nothing, so when one item lingered, the whole day lingered with it.

After

Two lanes, each with the right amount of ceremony

Ops work closes itself on merge -- no standup, no archive. Feature epics get real planning, design specs, and a merge order, but only while that feature is in progress. A simple promotion rule keeps the lanes apart: anything that spans many files or needs more than one pull request graduates from the bug lane into a real epic.
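The promotion rule is simple enough to write down as a function. This is a sketch under assumptions: the text does not say how many files count as "many", so the threshold of 10 is a placeholder, and `WorkItem` and `assignLane` are hypothetical names.

```typescript
type Lane = "ops" | "epic";

interface WorkItem {
  filesTouched: number;
  pullRequestsNeeded: number;
}

// "Many files" is not quantified in the text; 10 is a placeholder threshold.
const MANY_FILES_THRESHOLD = 10;

function assignLane(item: WorkItem): Lane {
  const spansManyFiles = item.filesTouched >= MANY_FILES_THRESHOLD;
  const needsMultiplePrs = item.pullRequestsNeeded > 1;
  // Either condition alone is enough to graduate the work into an epic.
  return spansManyFiles || needsMultiplePrs ? "epic" : "ops";
}

console.log(assignLane({ filesTouched: 1, pullRequestsNeeded: 1 })); // ops
console.log(assignLane({ filesTouched: 3, pullRequestsNeeded: 2 })); // epic
```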

Trade

We gave up the tidy daily narrative on purpose

The honest cost of the change is losing the single "everything that happened today" story. We took that trade gladly: artifacts that stay true beat a tidy ceremony nobody keeps up. A document you can trust is worth more than a ritual you skip.

"The harness isn't a monument you build once. It's a workshop you keep re-tooling to fit the work in front of you. When the work changed from a sprint to a stream, the harness had to change with it."

-- The lesson behind the two-lane pivot

The System

Where does the Director get its information?

The most common question from engineers: "How does the AI team know what to work on?" The answer is a chain of markdown files that replaces Slack, Jira, and standups.

CEO

Jason posts a decision

Written to agents/COMMS.md with a structured tag. Example: "Confirmed: Top 10 album seed list. Mastering engineer tier: Kevin Gray, Bernie Grundman, Steve Hoffman, Chris Bellman, Bob Ludwig." Gets a permanent ID in the Decision Log (DEC-033).

Director

Creates the daily execution plan

Reads CEO decisions, checks blockers, and writes agents/execution/EXEC-YYYY-MM-DD.md with: theme of the day, task packets per agent, merge order (Architect -> Backend -> Frontend -> Tester), and dependencies.
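A fixed merge order like this amounts to sorting pending pull requests by role. The sketch below assumes a hypothetical `PendingPr` shape; only the role sequence comes from the text.

```typescript
// Role sequence taken from the merge order in the text.
const MERGE_ORDER = ["Architect", "Backend", "Frontend", "Tester"] as const;
type Role = (typeof MERGE_ORDER)[number];

// Hypothetical shape for a pull request waiting to merge.
interface PendingPr {
  role: Role;
  title: string;
}

function sortByMergeOrder(prs: PendingPr[]): PendingPr[] {
  // Copy before sorting so the caller's array is not mutated.
  return prs
    .slice()
    .sort((a, b) => MERGE_ORDER.indexOf(a.role) - MERGE_ORDER.indexOf(b.role));
}
```

Given PRs from the Tester, Architect, and Backend agents in that order, `sortByMergeOrder` returns them as Architect, Backend, Tester.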

Packets

Lean runtime files every agent reads first

agents/TODAY.md (today's tasks, under 50 lines) and agents/COMMS-TODAY.md (today's context). These are the "morning standup" -- every agent reads them before doing anything.

Agents

Execute, post updates, hand off

Each agent works on its task and posts updates to COMMS.md (PM writes acceptance criteria, Designer posts UX guidance, Backend ships code). Handoffs are explicit: "UNBLOCKED: Frontend can now start."

Close

Director archives and resets

Completed tasks archived to agents/done/DAY-N.md. Decision Log updated. Tomorrow's lean packet created. Nothing is verbal. Everything is traceable.

"If it's not in writing, it didn't happen."

-- The AI equivalent of async communication culture

Mental Models

Principles that made it work

Principle 1

Treat AI Like a New Coworker

Don't prompt them. Manage them. Give them roles, context, constraints, and feedback -- like onboarding a human on Day 1.

Principle 2

The Expert in Claude Is Claude

Ask Claude how to use Claude. It writes its own config, debugs its own workflows, and knows its own limits better than any doc.

Principle 3

Scripts Over Skills

Deterministic tasks go in shell scripts, not AI prompts. Scripts are cheaper, faster, testable, and version-controlled.

Principle 4

Humans Own Judgment, AI Owns Execution

Jason sets direction. David advises on AI workflow optimization. Chris validates domain expertise. Hope creates the visual identity. Caden builds community. AI does everything else.

Principle 5

The Harness Is the Product

Context engineering decides what the AI sees; harness engineering decides what the AI can do, when it wakes up, and what stops it. Models churn every few months; the harness compounds. See "What Actually Makes the Agents Work" above.

Principle 6

Trust but Verify with Automation

Bad code physically cannot reach production. CI enforces security, tests, coverage, and type safety. Verify with machines, not eyeballs.

✋ Human Approval Gates

Not everything is automated. Some decisions deliberately stop the pipeline and wait for a human. AI proposes, humans approve. No workaround, no override.

The Hope Gate: Agent proposes a new brand color, icon, or typography change? The work pauses until Hope approves the direction. No visual ships without her sign-off.

The Chris Gate: Pressing Intelligence ranks a pressing or flags an audiophile label? Chris validates the domain call. The data is only as good as the expert behind it.

The Caden Gate: Community-facing copy, social media voice, or public messaging changes? Caden reviews before it goes live. Brand voice is a human decision.
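The three gates can be pictured as a mapping from the kind of change to the human who must sign off. This is an illustrative sketch, not the team's implementation: the area labels, `ProposedChange`, and `pendingApprovals` are assumptions, and only the three owners come from the text.

```typescript
// Hypothetical labels for what a change touches.
type ChangeArea = "brand-visual" | "pressing-domain" | "public-copy" | "code";

// Gate owners as named in the text; areas without an entry have no human gate here.
const GATE_OWNER: Partial<Record<ChangeArea, string>> = {
  "brand-visual": "Hope",
  "pressing-domain": "Chris",
  "public-copy": "Caden",
};

interface ProposedChange {
  areas: ChangeArea[];
  approvals: string[]; // humans who have already signed off
}

function pendingApprovals(change: ProposedChange): string[] {
  const needed: string[] = [];
  for (const area of change.areas) {
    const owner = GATE_OWNER[area];
    const alreadyApproved = owner !== undefined && change.approvals.indexOf(owner) !== -1;
    // Collect each missing approver once, even if several areas share an owner.
    if (owner !== undefined && !alreadyApproved && needed.indexOf(owner) === -1) {
      needed.push(owner);
    }
  }
  return needed;
}
```

A pipeline step could refuse to proceed while `pendingApprovals` returns a non-empty list, which is what makes the gate a hard stop and not a polite request.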

Engineering Rigor

Every PR goes through this

The same CI/CD pipeline you'd expect from a 10-person engineering team -- enforced by automation.

1

📝

Code

Agent writes in isolated git worktree

2

📈

PR Created

Auto-rebase onto main, strip protected files

3

🔒

Security Scan

Block DELETE ops, token exposure

4

✅

Lint + Types

ESLint + TypeScript strict mode

5

🎭

Tests

Vitest unit + Playwright E2E with video

6

📊

Coverage

Delta report posted as PR comment

7

👁

Code Review

Architect agent reviews inline + summary

8

🚀

Ship

Squash merge, deploy, verify production
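The eight steps behave like an ordered series of gates where the first failure stops the line. The sketch below shows only that control flow. The real checks run in CI and are asynchronous; here each gate is a hypothetical synchronous function so the example stays self-contained.

```typescript
// A gate is a named check; the names reuse pipeline step labels from the list above.
interface Gate {
  name: string;
  check: () => boolean;
}

interface PipelineResult {
  passed: string[];
  failedAt: string | null;
}

function runGates(gates: Gate[]): PipelineResult {
  const passed: string[] = [];
  for (const gate of gates) {
    // Stop at the first failing gate; later gates never run.
    if (!gate.check()) return { passed, failedAt: gate.name };
    passed.push(gate.name);
  }
  return { passed, failedAt: null };
}

const result = runGates([
  { name: "Security Scan", check: () => true },
  { name: "Lint + Types", check: () => false },
  { name: "Tests", check: () => true },
]);
console.log(result.failedAt); // Lint + Types
```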

Tech Stack

React + Vite

Responsive web SPA for desktop and mobile browsers

Swift + SwiftUI

Native App Store app (iPhone + iPad, iOS 17+)

Kotlin

Language powering the native Android app

Jetpack Compose + Material 3

Android UI toolkit with hand-tuned dark palette

Hilt + Retrofit + OkHttp

Android DI and networking (kotlinx.serialization)

Coil + Navigation Compose

Image loading and screen graph on Android

Xcode + VS Code + Android Studio

IDEs for iOS, web, and Android development

TypeScript

End-to-end types

Tailwind CSS

Mobile-first styles

AWS Lambda

Serverless backend

Deadwax API (/api/*)

REST API -- one backend serving web, iOS, and Android

DynamoDB

Sessions & data

S3 + CloudFront

Static hosting + CDN

Playwright + Vitest

Web E2E + unit

XCTest

Native iOS regression coverage

JUnit5 + MockK + Robolectric + Turbine

Android unit, instrumentation, and Flow testing

GitHub Actions

CI/CD pipeline

Stripe

Payment processing

Buy Me a Coffee

Community donations

The Mind-Blowing Part

User reports a bug.
It's fixed before they check back.

An automated pipeline runs every hour, on the hour. AI triages, implements, tests, and deploys -- zero human touch for safe changes. Larger feature requests and ambiguous asks get flagged for human review.

1

User Feedback

T+0

Widget on site creates GitHub Issue with [feedback] label

2

PM Triages

Every hour

Classifies priority, effort, safety. Auto-fixable bugs get a task packet. Feature requests flagged for human review.

3

Dev Implements

Same run

Reads task, creates branch, writes fix, runs tests, opens PR.

4

CI Validates

~5 min

Security, lint, typecheck, Playwright E2E, coverage.

5

Production

< 90 min

Auto-merge on green. Deploy. Verify production. Log.

Safety rails: Won't auto-fix P0 critical, UX redesigns, auth/security, schema changes, or anything ambiguous. Feature requests and larger asks get flagged for deeper human review. Only safe, scoped bug fixes ship automatically.
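The safety rails read naturally as a list of disqualifying conditions. The following sketch encodes them under assumptions: the field names and the priority labels other than P0 are invented, and how the PM agent actually classifies an issue is not described in the text.

```typescript
type Priority = "P0" | "P1" | "P2" | "P3"; // only P0 appears in the text
type Kind = "bug" | "feature-request" | "ux-redesign";

// Hypothetical output of the triage step.
interface TriagedIssue {
  priority: Priority;
  kind: Kind;
  touchesAuthOrSecurity: boolean;
  changesSchema: boolean;
  ambiguous: boolean;
}

// Returns every reason the issue must go to a human; empty means auto-fixable.
function autoFixBlockers(issue: TriagedIssue): string[] {
  const reasons: string[] = [];
  if (issue.priority === "P0") reasons.push("P0 critical");
  if (issue.kind !== "bug") reasons.push(`not a bug fix (${issue.kind})`);
  if (issue.touchesAuthOrSecurity) reasons.push("auth/security");
  if (issue.changesSchema) reasons.push("schema change");
  if (issue.ambiguous) reasons.push("ambiguous request");
  return reasons;
}

function canAutoFix(issue: TriagedIssue): boolean {
  return autoFixBlockers(issue).length === 0;
}
```

Returning the reasons, not just a boolean, gives the human reviewer a ready-made explanation of why the loop declined to act.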

What Went Wrong

Lessons learned the hard way

We run formal post-mortems every 5 days. They're the single most valuable process artifact. Here's what they caught.

Incident INC-001 — Silent Data Loss

What happened

Worktree Merges Overwrote Director Docs

For 3 days, AI dev agents opened PRs that silently overwrote Director docs on main. Decision log entries vanished. Git worktrees snapshot all files at branch creation, and stale copies overwrote current ones on merge.

How we fixed it

Mechanical Enforcement, Not Documentation

Three-layer prevention: sparse checkout (files physically absent from worktrees), pre-PR cleanup script, and a CI gate that fails any PR touching protected paths. Only mechanical enforcement works with AI agents.
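The third layer, a CI gate on protected paths, reduces to comparing a PR's changed files against a deny list. The sketch below assumes the list of changed files is already available; the protected paths shown are illustrative guesses based on Director files named elsewhere in this document, not the team's actual configuration.

```typescript
// Returns the changed files that fall under a protected path.
function findProtectedTouches(changedFiles: string[], protectedPaths: string[]): string[] {
  return changedFiles.filter((file) =>
    protectedPaths.some((p) =>
      // A trailing slash marks a directory prefix; anything else must match exactly.
      p.charAt(p.length - 1) === "/" ? file.indexOf(p) === 0 : file === p
    )
  );
}

// Hypothetical protected list and PR contents.
const PROTECTED = ["agents/COMMS.md", "agents/execution/"];
const violations = findProtectedTouches(
  ["src/App.tsx", "agents/COMMS.md", "agents/execution/EXEC-example.md"],
  PROTECTED
);
console.log(violations.length); // 2
```

A CI job would fail the PR whenever the returned list is non-empty.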

Process Failure — Config File Bloat

What happened

CLAUDE.md Grew Into an Encyclopedia

Every time something went wrong, we added more instructions. Agent config files bloated. Agents burned context window tokens reading their own config, leaving less room for actual work.

How we fixed it

Context Is Expensive Real Estate

David helped define the problem and crafted a diagnostic prompt: "Analyze my CLAUDE agentic workflow for token usage, workflow optimizations, and tools vs. skills/script usage." That audit led to aggressive pruning -- details moved into dedicated docs, config files became pointers not encyclopedias, and lean daily packets stayed under 50 lines. Every token of instruction costs a token of output.

11-Day Gap — Agents Don't Self-Start

What happened

Agents Went Dark for Days

Legal agent: 11 days, zero completions. Designer: 4 consecutive missed sessions. Nobody noticed because the Director was busy shipping code. Claude Code agents don't run unless explicitly scheduled.

How we fixed it

Explicit Scheduling for Every Agent

Standing daily schedule with explicit session slots. No implicit expectations. If it's not on the schedule, it doesn't exist. Same lesson any manager learns: "delegated" is not "done."
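"If it's not on the schedule, it doesn't exist" can be checked mechanically by comparing the agent roster against the schedule. In this sketch the `Slot` shape and the cron strings are assumptions: the hourly expression matches the hourly bug-fix loop described in the text, and the 09:00 time is a placeholder.

```typescript
// Hypothetical schedule entry: which agent runs, and on what cron expression.
interface Slot {
  agent: string;
  cron: string;
}

function agentsWithoutSlot(roster: string[], schedule: Slot[]): string[] {
  const scheduled: Record<string, boolean> = {};
  for (const slot of schedule) scheduled[slot.agent] = true;
  // Any agent on the roster with no slot will never be woken up.
  return roster.filter((agent) => scheduled[agent] !== true);
}

const schedule: Slot[] = [
  { agent: "Bug-fix loop", cron: "0 * * * *" }, // every hour, on the hour
  { agent: "Director", cron: "0 9 * * *" },     // each morning; the time is a placeholder
];
console.log(agentsWithoutSlot(["Director", "Legal", "Designer"], schedule)); // [ 'Legal', 'Designer' ]
```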

CI Noise — Ignoring Red Builds

What happened

CI Failures Became Background Noise

ESLint v9 broke lint on every PR. The team treated it as noise -- "oh, lint always fails." This masked real problems for days and created a culture of ignoring red builds.

How we fixed it

CI Failures Are Blockers, Period

No "known failures." If a step fails, fix it or remove it. Noise in CI is indistinguishable from real problems. A broken build that's always broken teaches everyone to stop looking.

Shipped Without Kill-Switch

What happened

No Rollback for Pressing Intelligence

The Pressing Intelligence (PI) feature launched without a way to disable it in production. During the Day 22 rollback drill, there was no off switch. Fixing that required an emergency remediation PR.

How we fixed it

If You Can't Turn It Off, You Can't Turn It On

Rollback capability is now a pre-deployment gate. Every new feature must have a kill-switch before it ships. Not after. Not "we'll add it later."
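A pre-deployment gate of this kind can be as small as a manifest check plus a flag lookup at runtime. Both functions below are sketches with hypothetical names and shapes. The choice to treat a missing flag as "off" is an assumption of this example, not something the text specifies.

```typescript
// Hypothetical manifest entry declaring how a feature can be disabled.
interface FeatureManifest {
  name: string;
  killSwitchFlag?: string; // name of the flag that turns the feature off
}

// Pre-deploy gate: list every feature that declares no kill-switch.
function missingKillSwitches(features: FeatureManifest[]): string[] {
  return features
    .filter((f) => f.killSwitchFlag === undefined || f.killSwitchFlag === "")
    .map((f) => f.name);
}

// Runtime side: fail closed, so a flag missing from the store counts as "off".
function isFeatureEnabled(flags: Record<string, boolean>, flag: string): boolean {
  return flags[flag] === true;
}

const blocked = missingKillSwitches([
  { name: "pressing-intelligence", killSwitchFlag: "pi_enabled" },
  { name: "color-themes" },
]);
console.log(blocked); // [ 'color-themes' ]
```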

Platform Lock-In — Web Speed Blocked Native Portability

What happened

40 Days of UX Trapped in One Platform

We built web-first with AI and shipped fast -- 40+ days of branding, design tokens, component patterns, and UX flows. When native iOS arrived, none of it was portable. Colors, spacing, typography, interaction models -- all lived in React/CSS with no platform-agnostic layer. We started from scratch on the native app's visual identity, and the parity rubric ("make iOS look like web") even caused a regression where Codex replaced native iOS glass controls with custom web-like chrome before we caught it in a device build.

How we're fixing it now

Brand Rubric + Single-Source-of-Truth API

Brand template + parity rubric: Parity applies to content, flow, and acceptance criteria -- not chrome. Native iOS keeps native iOS controls (tab bars, sheets, glass toolbars) unless a ticket explicitly accepts a custom shell. Brand tokens (palette, type scale, logotype) live in a shared design doc that both platforms implement in their own idiomatic way.

SoT API as the consistency layer: Both the responsive web app and the native Swift app call the same Deadwax endpoints (/api/collection, /api/pressing-intelligence, /api/curation/*, /api/mastering-engineers). The API is the single source of truth -- pressing metadata, mastering-engineer tiers, curation lists, Pressing Intelligence payloads all come from one place. Changing the truth once updates both surfaces, so "platform drift" becomes a UI-shell problem, not a data problem.

"Our retro cadence broke for 15 days. During those 15 days, the same mistakes repeated. The retro is the product."

-- The meta-lesson: run post-mortems religiously

What's Next

Product Roadmap

Outcome-driven. Each item describes the result for collectors, not just the feature.

Shipped -- Mar-May 2026

Website Launch

Responsive web app live at deadwax.io. Sign in with Discogs, browse your collection on a phone or desktop -- no setup, no import.

Product, Engineering

Pressing Intelligence

One panel: original pressing, top-rated, audiophile editions, mastering engineers. Replaces 10 browser tabs.

Product

Top 25 Essential Albums

Curated pressing intelligence for 25 albums every collector knows. Hand-verified with domain expert Chris.

Product

Self-Healing Feedback Loop

User-reported bugs triaged, fixed, tested, and deployed automatically -- under 90 minutes, every hour.

Engineering

Native iOS App (SwiftUI)

iPhone + iPad native app built in SwiftUI against the Deadwax API. Same collection, same Pressing Intelligence, native feel. Live on the App Store.

Engineering, Product

Brand Rubric + SoT API Parity

Brand template and parity rubric keep web and iOS consistent on content, flow, and brand -- the shared Deadwax API keeps data identical across surfaces. Native chrome is the default; custom shells require an explicit ticket.

Design, Engineering

Building Now -- May 2026

Native Android App (Compose)

Kotlin + Jetpack Compose app built against the same Deadwax API. Material 3 chrome, hand-tuned dark palette, parity with iOS on content and flow. In active feedback burn-down ahead of Play Store submission.

Engineering, Product

Color Themes & Polish

Selectable color themes across surfaces plus a steady cadence of feedback-driven UX improvements -- the small things collectors notice every session.

Design, Product

Public Launch in Communities

Steve Hoffman Forums, r/vinyl, r/audiophile. Community-first -- the people who feel the pain most discover us first.

Growth

Next Up -- Jun-Sep 2026

Find Any Record Instantly

Full-text search + advanced filtering. The biggest feature gap vs. Discogs, closed.

Product

Shareable Best Pressing Card

"The best pressing of Kind of Blue is..." -- a card designed to be shared. The viral moment.

Product, Growth

Exploring -- Sep 2026+

Offline Collection in the Record Store

The native iOS app lets collectors browse their collection in the record store with no signal -- cached pressings, offline Pressing Intelligence, deferred sync on reconnect.

Engineering

Full-Catalog Intelligence

Pressing recommendations for any album on Discogs, not just the curated set.

Product, Engineering

Sustainable Revenue

Deadwax stays free. Optional web support remains separate from the collector workflow.

Support

The Takeaway

The future of product development is
human judgment + AI execution

Five humans and eleven AI agents, building a real product -- responsive web for desktop and mobile browsers plus a native App Store app -- with real users, real tests, and real accountability. Not a demo. A product.

Takeaway 1

AI Agents Are a Team, Not a Feature

The value isn't "AI wrote code." It's that AI can be organized with the same roles, process, and accountability as humans.

Takeaway 2

Process Is the Multiplier

Worktrees, PR reviews, CI gates, automated testing. AI without process produces chaos. AI with process produces products.

Takeaway 3

Humans Stay in the Loop

The CEO, the domain expert, the visual artist, the community manager. AI amplifies human judgment -- it doesn't replace it.

Takeaway 4

Speed Without Architecture Creates Platform Lock-In

We built web-first, mobile-first, and shipped fast with AI. But when native iOS arrived, our branding, design tokens, and UX patterns were trapped in React/CSS. Moving fast amplifies blind spots -- abstract the portable parts early, or your progress becomes a single-platform cage.

deadwax.io

Built by hobbyists, collectors, and AI-assisted workflows • Designed for vinyl collectors
