deadwax.io
How a small team used Claude Code and Codex to build a full-stack product with the rigor of a 10-person engineering org -- in 4 weeks.
The Problem
Discogs built something remarkable: the world's most comprehensive music database, contributed to by millions of collectors. It's the backbone of the vinyl community. But the collector experience -- especially on mobile -- hasn't kept pace with the depth of the data.
Discogs is irreplaceable -- we use their API, their data, their community contributions. We're not competing with Discogs. We're building the experience layer on top of their incredible database.
"Why do I have to open 20 tabs to compare pressings?"
The Product
A mobile-first vinyl collector's companion. Browse your collection, discover the best pressings, and never open 10 tabs again.
"This is what Discogs should have built years ago."
The Team
Five humans bring taste, domain expertise, creativity, and community voice. Nine AI agents handle strategy, code, testing, design specs, and legal compliance. They coordinate through markdown files in Git -- no Slack, no Jira, no Notion.
The System
The most common question from engineers: "How does the AI team know what to work on?" The answer is a chain of markdown files that replaces Slack, Jira, and standups.
Decisions land in agents/COMMS.md with a structured tag. Example: "Confirmed: Top 10 album seed list. Mastering engineer tier: Kevin Gray, Bernie Grundman, Steve Hoffman, Chris Bellman, Bob Ludwig." Each confirmed decision gets a permanent ID in the Decision Log (DEC-033).
The day's plan lives in agents/execution/EXEC-YYYY-MM-DD.md: theme of the day, task packets per agent, merge order (Architect -> Backend -> Frontend -> Tester), and dependencies.
Every session starts from agents/TODAY.md (today's tasks, under 50 lines) and agents/COMMS-TODAY.md (today's context). These are the "morning standup" -- every agent reads them before doing anything.
At day's end, completed work is archived to agents/done/DAY-N.md, the Decision Log is updated, and tomorrow's lean packet is created. Nothing is verbal. Everything is traceable.
"If it's not in writing, it didn't happen."
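The daily cycle can be sketched as a small script. The file names come from the workflow described above; the assembly logic itself is a hypothetical illustration, not the team's actual tooling.

```shell
#!/bin/sh
# Hypothetical sketch: assemble the "morning standup" packet from the day's
# execution plan. File names match the workflow above; logic is illustrative.
set -eu

DAY=$(date +%F)
EXEC="agents/execution/EXEC-$DAY.md"

# Make sure today's execution plan exists (normally written by the Director).
mkdir -p agents/execution
[ -f "$EXEC" ] || printf '# EXEC %s\n\nTheme: (set by Director)\n' "$DAY" > "$EXEC"

# agents/TODAY.md holds only today's tasks -- kept under 50 lines.
head -n 50 "$EXEC" > agents/TODAY.md

# agents/COMMS-TODAY.md carries today's context pointer.
printf '# Context for %s\nRead agents/TODAY.md first.\n' "$DAY" > agents/COMMS-TODAY.md
echo "packet ready for $DAY"
```

Every agent session would then start by reading these two files before touching any code.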
Mental Models
Don't prompt them. Manage them. Give them roles, context, constraints, and feedback -- like onboarding a human on Day 1.
Ask Claude how to use Claude. It writes its own config, debugs its own workflows, and knows its own limits better than any doc.
Deterministic tasks go in shell scripts, not AI prompts. Scripts are cheaper, faster, testable, and version-controlled.
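For instance, a check like "every agent config starts with a title line" needs no model call at all. A hypothetical sketch (the file layout is invented for the demo):

```shell
#!/bin/sh
# Hypothetical deterministic check kept in a shell script rather than an AI
# prompt: verify every markdown file under agents/ starts with a '# ' title.
# Cheap, fast, testable, version-controlled.
set -u

mkdir -p agents
printf '# Backend Agent\nRole: build the API.\n' > agents/backend.md  # sample file for the demo

bad=0
for f in agents/*.md; do
  head -n 1 "$f" | grep -q '^# ' || { echo "missing title: $f" >&2; bad=$((bad+1)); }
done
echo "files failing check: $bad"
```

A script like this runs in milliseconds on every commit; an agent asked to "check the configs" would burn tokens doing the same thing less reliably.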
Jason sets direction. David advises on AI workflow optimization. Chris validates domain expertise. Hope creates the visual identity. Caden builds community. AI does everything else.
80+ markdown files exist to re-teach agents full context every session. Getting context right is the actual job.
Bad code physically cannot reach production. CI enforces security, tests, coverage, and type safety. Verify with machines, not eyeballs.
Not everything is automated. Some decisions deliberately stop the pipeline and wait for a human. AI proposes, humans approve. No workaround, no override.
Engineering Rigor
The same CI/CD pipeline you'd expect from a 10-person engineering team -- enforced by automation.
The Mind-Blowing Part
An automated pipeline runs every hour, on the hour. AI triages, implements, tests, and deploys -- zero human touch for safe changes. Larger feature requests and ambiguous asks get flagged for human review.
Safety rails: the pipeline won't auto-fix P0 criticals, UX redesigns, auth/security changes, schema changes, or anything ambiguous. Only safe, scoped bug fixes ship automatically.
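A minimal sketch of how such a rail might be expressed. The label names and the `triage` function are invented for illustration; only the categories come from the text above.

```shell
#!/bin/sh
# Hypothetical triage rail: given an issue's labels, decide whether the
# hourly pipeline may auto-fix it or must flag it for a human.
# Label names are invented for this sketch.
triage() {
  labels="$1"
  case "$labels" in
    *p0-critical*|*security*|*auth*|*schema*|*ux-redesign*|*feature-request*)
      echo "human-review" ;;    # never auto-fixed, no override
    *bug*)
      echo "auto-fix" ;;        # safe, scoped bug fixes ship automatically
    *)
      echo "human-review" ;;    # anything ambiguous stops the pipeline
  esac
}

triage "bug,ui"           # -> auto-fix
triage "bug,security"     # -> human-review
triage "feature-request"  # -> human-review
```

Note the ordering: the dangerous categories are matched first, so a bug that also touches security still goes to a human.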
What Went Wrong
We run formal post-mortems every 5 days. They're the single most valuable process artifact. Here's what they caught.
For 3 days, AI dev agents opened PRs that silently overwrote Director docs on main. Decision log entries vanished. The root cause: Git worktrees snapshot all files at branch creation, so stale copies overwrote current ones on merge.
Three-layer prevention: sparse checkout (files physically absent from worktrees), pre-PR cleanup script, and a CI gate that fails any PR touching protected paths. Only mechanical enforcement works with AI agents.
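A sketch of what two of those mechanical layers might look like, assuming `agents/` stands in for the protected Director docs (the paths and the `guard` helper are illustrative):

```shell
#!/bin/sh
# Hypothetical sketch of the mechanical layers described above.
# 'agents/' stands in for the protected Director doc paths.

# Layer 1 (sparse checkout) would run once per agent worktree, e.g.:
#   git -C "$WORKTREE" sparse-checkout set --no-cone '/*' '!agents/'
# so protected docs are physically absent and stale copies cannot exist.

# Layer 3 (CI gate): fail any PR whose changed files touch protected paths.
guard() {
  # $1: newline-separated list of changed files (e.g. from git diff --name-only)
  if printf '%s\n' "$1" | grep -q '^agents/'; then
    echo "blocked"
  else
    echo "ok"
  fi
}

guard "src/app.ts"        # -> ok
guard "agents/COMMS.md"   # -> blocked
```

The point of the gate is that it runs on every PR with no judgment involved: an agent cannot be persuaded, forget, or rationalize its way past a grep.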
Every time something went wrong, we added more instructions. Agent config files bloated. Agents burned context window tokens reading their own config, leaving less room for actual work.
David helped define the problem and crafted a diagnostic prompt: "Analyze my CLAUDE agentic workflow for token usage, workflow optimizations, and tools vs. skills/script usage." That audit led to aggressive pruning -- details moved into dedicated docs, config files became pointers not encyclopedias, and lean daily packets stayed under 50 lines. Every token of instruction costs a token of output.
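One guard that falls naturally out of this: enforce the packet budget mechanically instead of by discipline. A hypothetical sketch (the file name comes from the workflow above; the check itself is invented):

```shell
#!/bin/sh
# Hypothetical lean-packet guard: flag the daily packet if it grows past
# 50 lines. The 50-line budget is the one described in the text.
set -u
mkdir -p agents
printf 'Theme: pressing compare\nTask: fix search debounce\n' > agents/TODAY.md  # sample packet

lines=$(wc -l < agents/TODAY.md | tr -d ' ')
if [ "$lines" -gt 50 ]; then
  echo "TODAY.md is $lines lines -- trim it below 50"
else
  echo "packet ok ($lines lines)"
fi
```

Run in CI, a check like this turns "keep configs lean" from a resolution into a rule.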
Legal agent: 11 days, zero completions. Designer: 4 consecutive missed sessions. Nobody noticed because the Director was busy shipping code. Claude Code agents don't run unless explicitly scheduled.
Standing daily schedule with explicit session slots. No implicit expectations. If it's not on the schedule, it doesn't exist. Same lesson any manager learns: "delegated" is not "done."
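An explicit schedule might look like a plain crontab -- the times, paths, and `run-agent.sh` wrapper here are invented for illustration:

```shell
# Hypothetical crontab sketch of a standing daily schedule with explicit
# session slots. If a session is not on this schedule, it does not exist.
# min hour dom mon dow  command
0  9  * * 1-5  /opt/deadwax/run-agent.sh architect
0 11  * * 1-5  /opt/deadwax/run-agent.sh backend
0 14  * * 1-5  /opt/deadwax/run-agent.sh designer
0 16  * * 1-5  /opt/deadwax/run-agent.sh legal     # no more silent 11-day gaps
```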
ESLint v9 broke lint on every PR. The team treated it as noise -- "oh, lint always fails." This masked real problems for days and created a culture of ignoring red builds.
No "known failures." If a step fails, fix it or remove it. Noise in CI is indistinguishable from real problems. A broken build that's always broken teaches everyone to stop looking.
The PI feature launched without a way to disable it in production. During the Day 22 rollback drill, there was no off switch, and an emergency remediation PR was required.
Rollback capability is now a pre-deployment gate. Every new feature must have a kill-switch before it ships. Not after. Not "we'll add it later."
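A minimal sketch of such a gate, assuming a flag-registry file -- the registry format, feature name, and `has_kill_switch` helper are all invented for illustration:

```shell
#!/bin/sh
# Hypothetical pre-deployment gate: refuse to ship a feature that has no
# registered kill-switch. The flag-registry format is invented for this sketch.
set -u
mkdir -p config
printf 'pressing-compare=on\n' > config/flags.env   # sample flag registry

has_kill_switch() {
  grep -q "^$1=" config/flags.env
}

for feature in "pressing-compare"; do
  if has_kill_switch "$feature"; then
    echo "gate passed: $feature has a kill-switch"
  else
    echo "gate FAILED: $feature needs a kill-switch before it ships"
  fi
done
```

With the gate in the deploy path, "we'll add it later" stops being an option: the deploy simply doesn't run.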
"Our retro cadence broke for 15 days. During those 15 days, the same mistakes repeated. The retro is the product."
What's Next
The roadmap is outcome-driven: each item describes the result for collectors, not just the feature.
The Takeaway
Five humans and nine AI agents, building a real product with real users, real tests, and real accountability. Not a demo. A product.
The value isn't "AI wrote code." It's that AI can be organized with the same roles, process, and accountability as humans.
Worktrees, PR reviews, CI gates, automated testing. AI without process produces chaos. AI with process produces products.
The CEO, the domain expert, the visual artist, the community manager. AI amplifies human judgment -- it doesn't replace it.
Built by humans + AI • Designed for vinyl collectors
Read the Full Technical White Paper →