June 2026
About this meeting
4 Years of Side Project Society — and a Deep Dive into Building AI That Actually Works
June’s meetup marked a special milestone: four years of Side Project Society. What started as a small idea in the Improving Houston office has grown into a community of hundreds of builders who’ve shared, learned, launched projects, and even co-founded companies together. A huge thank you to everyone who has been a part of this journey — and to Improving for being our home base and sponsor since day one.
To celebrate, we had a fitting speaker: Devlin Liles, Chief AI Officer at Improving, walked us through seven years of building DevBox — an AI-powered software team that can plan, write, test, review, and ship code autonomously. It was a rare look at the real, hard-won lessons behind building reliable AI systems for enterprise software.
Devlin framed the whole talk around one problem he’s been chasing for years: compounding error. Even a model that is 94% accurate at each step fails most of the time across a 20-step workflow: if the steps are independent, all 20 succeed only about 29% of the time (0.94^20 ≈ 0.29). That’s the core challenge of autonomous AI software development, and it shapes every decision in DevBox.
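To make the compounding concrete, here is a minimal sketch of the arithmetic. It assumes every step succeeds or fails independently with the same accuracy, which is a simplification and not something the talk specified; the function names are hypothetical and are not part of DevBox.

```python
def workflow_success_rate(step_accuracy: float, steps: int) -> float:
    """Chance that every step succeeds, assuming independent steps."""
    return step_accuracy ** steps


def required_step_accuracy(target_success: float, steps: int) -> float:
    """Per-step accuracy needed to hit a target end-to-end success rate."""
    return target_success ** (1.0 / steps)


# 94% per step over 20 steps
print(f"{workflow_success_rate(0.94, 20):.0%}")   # 29%

# Per-step accuracy needed for a 50% end-to-end success rate over 20 steps
print(f"{required_step_accuracy(0.5, 20):.1%}")   # 96.6%
```

Under these assumptions, a 20-step workflow at 94% per step finishes cleanly about 29% of the time, and even a coin-flip outcome needs roughly 96.6% accuracy at every step.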
Key takeaways from his talk:
- The “harness” matters more than the model: the orchestration layer around your AI (how you plan, validate, retry, and route between agents) is the primary differentiator between high and low success rates. The model itself is just one piece.
- Break tasks small, but not too early: decomposing work into tiny, verifiable pieces reduces error compounding, but over-decomposing upfront wastes effort. Defer granular planning until the what and how are clear. Agile principles still apply.
- Keep the human in the loop (strategically): AI agents that talk directly to each other without human checkpoints see quality degrade quickly. For now, a human orchestrator acting as a “traffic cop” is needed to keep multi-agent workflows on track.
- Choose your language stack carefully: statically typed languages like TypeScript, C#, and Java produce far fewer AI-generated errors than dynamically typed languages like Python, where type mismatches surface only at runtime and lead to cascading failures (“slop recursion”).
- Economics, not capability, is the real constraint: faster AI output isn’t always cheaper. Token costs, error rates, and the need for human oversight all factor into ROI. DevBox runs a P&L dashboard to measure model effectiveness by task type and make cost-driven decisions.
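As a rough illustration of the validate-and-retry part of a harness, the sketch below wraps a single step in a loop that checks each output and escalates to a human when the attempts run out. It is not DevBox's implementation: `run_step`, `StepResult`, `flaky_generate`, and `is_valid` are hypothetical names, the "model call" is a hard-coded stand-in, and the accuracy formula assumes independent attempts and a validator that never misjudges.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class StepResult:
    output: str
    attempts: int
    passed: bool  # False means: hand this step to a human reviewer


def run_step(generate: Callable[[int], str],
             validate: Callable[[str], bool],
             max_attempts: int = 3) -> StepResult:
    """Generate, validate, and retry one step up to max_attempts times."""
    output = ""
    for attempt in range(1, max_attempts + 1):
        output = generate(attempt)
        if validate(output):
            return StepResult(output, attempt, True)
    return StepResult(output, max_attempts, False)


def effective_step_accuracy(step_accuracy: float, max_attempts: int) -> float:
    """Chance a step passes within max_attempts (independent attempts assumed)."""
    return 1.0 - (1.0 - step_accuracy) ** max_attempts


def flaky_generate(attempt: int) -> str:
    # Hypothetical stand-in for a model call: the first draft has a bug.
    if attempt == 1:
        return "def add(a, b): return a - b"
    return "def add(a, b): return a + b"


def is_valid(code: str) -> bool:
    # Hypothetical validator: load the candidate and run one tiny test.
    namespace: dict = {}
    exec(code, namespace)
    return namespace["add"](2, 3) == 5


result = run_step(flaky_generate, is_valid)
print(result.attempts, result.passed)  # 2 True
```

Under those assumptions, `effective_step_accuracy(0.94, 3)` is about 0.9998, and raising it to the 20th power gives roughly 99.6% end-to-end success from the same 94% model. Each retry also costs tokens and time, which is where the economics point comes in.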
Devlin also closed the talk with a live demo: building a multiplayer Snake game with Vim controls, a Nokia aesthetic, an AI autoplay mode, and a persistent leaderboard — all generated by DevBox in real time. It was a vivid illustration of the pipeline in action: backlog, design, requirements, dev, testing, review, and UAT — all orchestrated end to end.
Thank you Devlin for sharing seven years of real lessons with the community — the kind of talk that only comes from someone who’s actually lived it.
Thanks to community member He Zhu, you can view a full recording of this session below:
Sponsor
Special thanks to Improving Houston for providing the venue, pizza, and drinks. Their ongoing support makes every Side Project Society meetup possible.
Speaker
Devlin Liles
Chief AI Officer & Chief Consulting Officer at Improving