Public profile
Writing
Articles, notes, and longer-form thinking can be published here.
Writing highlights
Five writing highlights are shown first. Expand the full list if you want the rest, or use chat for themes, recency, and context.
Showing 5 of 53 published writing posts.
I have a dark factory for engineering. I now have a factory for marketing too.
Published: 2026-04-02
This post describes a cross-functional factory workflow where marketing agents review and propose bounded page improvements, engineering implements the approved changes, and marketing validates the result without a human handoff in the middle.
Your app is not agent accessible because it has a chatbot
Published: 2026-04-01
This post argues that agent accessible software is not the same thing as adding a chatbot. It reframes the shift as moving from software humans click through to software agents can inspect, understand, and safely operate on a user’s behalf.
Three mobile paths for Explore
Published: 2026-03-31
This post explores three possible mobile directions for Explore: Ruby Native for speed, Hotwire Native as a hybrid bridge, and full native Swift as a test of stack independence. It frames mobile less as a feature checkbox and more as a way to learn what the product and the software factory can support across different surfaces.
Explore is now agent-accessible.
Published: 2026-03-31
This post argues that the important shift is not just shipping a CLI, but enabling browser-approved agent authentication into the real app. It frames Explore as part of a broader move from software that merely has APIs to software that agents can actually use directly within their workflow.
Any engineer or technical professional looking for work should be thinking about how to make their profile more memorable than a static CV.
Published: 2026-03-30
This post argues that an interactive, inspectable profile can become a more memorable and useful artifact than a static CV. Using Explore and Codex, it describes taking a latest CV, comparing it against an existing Explore profile, applying the right non-writing updates quickly, and turning that into a stronger, more distinctive profile link for job searching.
Why Explore became agent-accessible, and why profiles need to become agent-accessible too
Published: 2026-03-29
Explore started as a better proof surface for humans, but this post argues that profiles now also need to be usable by agents. It explains why Explore now exposes a public-safe inspection path, CLI, skill, owner-authenticated workflow, and draft-preview-apply model so profiles can support explicit inspection instead of acting only as polished presentation layers.
I’ve now pushed 1000+ jobs through my software factory.
Published: 2026-03-28
This post marks the 1000+ job milestone for Johnny’s software factory, arguing that it now feels more like an operating model than an AI coding assistant. It links that shift to agent-owned delivery responsibilities, a current self-assessment around Level 4.5/5, and the next hard problem: detecting architectural drift even when tests, CI, deploys, and smoke checks still pass.
At small scale, people can absorb workflow complexity. At larger scale, the workflow has to absorb it.
Published: 2026-03-24
This post argues that at small scale people can absorb workflow complexity, but at larger scale the workflow itself has to absorb it. It explains how operational pain is often misdiagnosed as a tooling or headcount problem, and argues for making state, actions, and next steps much more explicit. It also briefly mentions applying that thinking in a greenfield project first, then bringing the lessons back into a legacy monolith.
In complex systems, software dark factories do more than ship the change. The context they preserve around the change can be just as valuable as the code itself.
Published: 2026-03-23
This post argues that in complex systems, software dark factories create value beyond faster delivery: they preserve the decision trail behind a change, including the prompt, run log, verification, PR, and final code. That context becomes increasingly valuable over time because the hard part is often not reading the code, but understanding why it was built that way in the first place. The piece frames this preserved “decision memory” as a real engineering asset for both future engineers and future agents.
Software delivery gets easier to improve once you make the timing visible.
Published: 2026-03-22
Recording delivery timing directly on the PR helps make the software delivery workflow visible, not just the final code change. By showing where time went across context gathering, implementation, verification, and waiting, PRs become workflow artifacts that make bottlenecks easier to spot and improve.
If your PR is the first time the change is properly validated, the feedback loop is too slow
Published: 2026-03-20
One simple dark factory win has been moving more verification earlier by running key checks locally before a PR reaches shared CI. Cloud CI still acts as the trusted gate, but it should confirm quality rather than catch obvious issues first. This improves iteration speed, reduces wasted CI usage, cuts queue noise, and increases first-time pass rates.
I’ve started treating agent friendliness as a core product feature, not an add-on
Published: 2026-03-19
I’m adding an agent manifest to a personal project so agents can understand product purpose, account state, available actions, next best steps, and guardrails. I think products will increasingly need to become legible to agents, not just humans or APIs, with explicit workflow, state, and safe actions. It fits the dark factory approach of clear contracts, explicit state, and fewer guesses.
Agentic development is the pair-programming model I actually wanted.
Published: 2026-03-19
AI made pair programming feel practical to me. Human pairing often felt too synchronous and expensive, especially in startup environments. Agentic development changes that by giving me an observable, steerable AI pair programmer that works from my playbooks and patterns, shortens the feedback loop, and keeps execution moving while I stay focused on intent, constraints, architecture, and judgement.
Most teams are not building software dark factories.
Published: 2026-03-18
Most teams are still layering AI onto old delivery workflows rather than redesigning the workflow itself around AI. The real shift comes from structured execution, validation, and evidence. What started as a greenfield personal-project approach is now being applied safely inside a legacy monolith, where the real bottleneck is not the model but the system around it.
Why put AI usage on the PR at all?
Published: 2026-03-13
Because once AI becomes part of delivery, the business question quickly becomes: what did this take, and what did it cost?
For this slice, the PR includes: Current AI Usage in the PR.
That is why I think it belongs on the PR.
Not because the PR is necessarily the perfect long-term home for it.
But because it gives both the reviewer and the wider business a useful signal tied to the actual delivered change.
It is not perfect accounting.
It is not the whole story.
But it is enough to start creating a f...
Blockchain has gas fees. AI has token economics. Same underlying lesson: computation is not free.
Published: 2026-03-13
Blockchains made computation cost visible through gas. AI is doing the same through tokens, model choice, and reasoning effort.
That is a useful mental model.
In crypto, you do not send every transaction the same way.
If it is low value, you want it cheap.
If it is urgent, sensitive, or high value, you may pay more or choose a different chain altogether.
AI is starting to look similar.
A small, well-bounded task might belong on a cheaper model with low reasoning.
A higher-risk or more ambiguous task...
If AI is helping write the PR, the PR should show how AI was used, how much effort it took, and what it cost.
Published: 2026-03-13
At some point, the business conversation stops being “this is cool” and becomes “what is this costing us, what is it saving us, and is it worth it?”
Ultimately, it gets boiled down to the number that matters most: cost.
I’ve started rolling out a lightweight way to show AI usage inside the PR itself.
Not prompt theatre.
Not vague “AI helped a bit.”
Actual review-friendly signals on the change:
preflight recommendation before the slice starts
baseline snapshot taken before implementation
delta-based...
“Whatever mess AI gets us into, AI will get us out of.”
Published: 2026-03-12
“Whatever mess AI gets us into, AI will get us out of.”
I hear that assumption a lot in business conversations.
It hasn’t been my experience so far.
The question I keep asking is:
How do we get businesses to take AI-driven software risk seriously before they have to feel the consequences themselves?
Unless you’ve lived through the fallout of fragile, revenue-critical systems, it’s easy to underestimate how serious this is. The people who haven’t felt that pain firsthand are less likely to put prop...
Don’t just tell the model what good code looks like. Show it, then make it prove it followed it
Published: 2026-03-11
Don’t just tell the model what good code looks like. Show it, then make it prove it followed it.
Codex “Fast” Mode Is Wild (and It Made Repo Hygiene Even More Important)
Published: 2026-03-10
I tried Codex “Fast” properly today for the first time. It’s… ridiculous. The speed feels like you’ve removed friction from the entire loop: ask → change → verify → iterate.
One of the quickest “AI productivity” wins in a monolith isn’t another model.
Published: 2026-03-10
It’s less noise. In the AI era, context is cost.
I’m doing a sweep to remove dead repo assets (old CSVs/exports/screenshots/sample data) and trimming logs (lower verbosity + sensible rotation/cleanup).
Why? Because in big codebases, context is cost:
navigation/search gets harder
agents pull in irrelevant files
token/context burn goes up
small changes stop being “small”
So I’m treating repo hygiene + log hygiene as part of the “software factory” workflow: keep the working set tight → keep tasks cheap ...
Vibe coding is great for prototypes. Production needs discipline.
Published: 2026-03-08
LLMs can generate a lot of code quickly, but the quality range is massive unless you force the work through guardrails.
So I don’t ask my agent for “an answer”.
I ask it to ship a solution through a playbook (built from ~20 years of patterns/practices I actually trust in production). The key rule:
It must cite which playbook rules it applied and why, plus show verification evidence.
What’s next in the queue — and how does it move us toward the North Star?
Published: 2026-03-07
I ask my software factory:
“What’s next in the queue — and how does it move us toward the North Star?”
It replied with 5 small, verifiable jobs — each with a clear “why” (contract/telemetry, removing manual paths, hardening inputs, runner policy, cross-repo proof).
This is the real agentic unlock for me: not “bigger prompts” — tighter slices + explicit intent + evidence.
I’ve updated my profile to “Software Factory Manager”
Published: 2026-03-07
AI hasn’t removed engineering work. It’s changed where the leverage is.
Writing code is getting cheap.
Trust is getting expensive.
So my day-to-day is shifting up the stack:
Turning intent into a spec (goal + acceptance criteria)
Setting constraints + guardrails (what we will / won’t do)
Insisting on verification (tests + CI as the gate)
Making changes auditable (run logs, evidence, rollback paths)
Breaking work into small, reversible slices
I’ve built an AI “dark factory” to support this with a playb...
I’ve started batching work through my software “dark factory”.
Published: 2026-03-07
Instead of running one job at a time, I queue a handful of small, reversible slices — and the factory sends back a single execution report when it’s done.
That report includes:
PR links (merged sequentially)
exactly what shipped per job
verification commands + outputs
the prompt/run artifacts for audit + replay
The mental shift: I’m not “reviewing more code”. I’m reviewing the contract + proof. Tests + CI are the gate.
One small change that’s made AI-assisted refactoring feel production-ready for me:
Published: 2026-03-05
My agent has to state which best practices it applied — and why.
Not just “here’s the diff”, but “here’s the thinking”:
what refactoring move was used (extract class, etc.)
what rules/patterns it followed
how it kept the change small + test-backed
what evidence it captured (tests/CI)
It turns AI output from “looks fine” into something you can audit, review, and trust.
Screenshot is a real example: the PR includes a “Playbook Compliance” section showing the practices applied for code quality.
From Novice to Senior: Refactoring a Whole Codebase Overnight With a Software “Dark Factory”
Published: 2026-03-05
I planned to extract the Dark Factory engine from my personal project as a standalone tool for work.
But I wasn’t happy with the code standard — both the factory itself and some of the code it had produced. It was functional, but naïve.
So I did something different.
I used Codex plan mode to collate:
the software engineering practices I’ve learned over the last 10–20 years
the external resources that shaped how I write maintainable code
the “playbook” standard I’d want in production
Then I fed that ...
The Hidden Cost of the God Table (and how AI can make it worse)
Published: 2026-03-04
Every production system has at least one “god table”.
It’s usually the biggest, busiest table in the database — the one that knows about everything: orders, users, products, payments, delivery, invoices, discounts, status, audit trails… and inevitably, a lot more.
It becomes a magnet for behaviour. And that’s where things get dangerous.
AI accelerates this problem in a predictable way.
Why LLMs grow the god table by default
When you ask an LLM to implement a change, it will usually optimise for th...
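The “magnet for behaviour” dynamic from that summary can be sketched in code. This is a minimal illustration, not from the post itself, transposed from the database table to its model class so it can run standalone; the `Order`/`RefundPolicy` names and fields are made up for the sketch.

```ruby
# Sketch: the god table's model accumulates every new behaviour by default.
class Order
  # Orders, payments, delivery, discounts... everything ends up here.
  attr_accessor :status, :payment_state, :delivery_eta, :discount_code

  # The path of least resistance for each "small" change: one more
  # method on the same class, touching the same hot table.
  def refundable?
    status == "delivered" && payment_state == "captured"
  end
end

# The alternative direction: give the new behaviour its own home that
# references the order, instead of growing the god object/table further.
class RefundPolicy
  def initialize(order)
    @order = order
  end

  def refundable?
    @order.status == "delivered" && @order.payment_state == "captured"
  end
end
```

Both versions answer the same question; the difference is where the next ten changes will land.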
The Most Underrated Scaling Pattern in Startups & Scale-Ups
Published: 2026-03-03
One of the most underutilised design patterns in startups and scale-ups is the adapter pattern.
I used it today and it reminded me why it’s so valuable: it’s not just “clean code”. It’s commercial leverage.
How lock-in happens (and why it hurts later)
Early on, startups get generous free tiers and credits: AWS/GCP, map providers, email providers, analytics tools, Intercom, HubSpot, etc.
It’s rational to move fast and take the free value.
But as you scale you hit enterprise pricing, quotas, overage ...
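For illustration, here is a minimal Ruby sketch of the adapter idea described above. It is not code from the post; the provider names (`AcmeMailAdapter`, `OtherMailAdapter`) are invented, and real adapters would wrap actual provider SDKs.

```ruby
# The narrow interface the application depends on.
class MailAdapter
  def deliver(to:, subject:, body:)
    raise NotImplementedError
  end
end

# One adapter per provider; bodies are stubbed for the sketch,
# but in practice each would call that provider's SDK/API.
class AcmeMailAdapter < MailAdapter
  def deliver(to:, subject:, body:)
    "acme:#{to}:#{subject}"
  end
end

class OtherMailAdapter < MailAdapter
  def deliver(to:, subject:, body:)
    "other:#{to}:#{subject}"
  end
end

# Application code only ever sees the adapter interface, so swapping
# providers when pricing changes is a config change, not a rewrite.
class WelcomeMailer
  def initialize(adapter)
    @adapter = adapter
  end

  def send_welcome(email)
    @adapter.deliver(to: email, subject: "Welcome!", body: "Hi there")
  end
end
```

The commercial leverage is exactly this seam: the day the free tier ends, only the adapter changes.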
My dark factory software workflow now has a UI.
Published: 2026-03-02
I’m trialling a simple flow where technical and non-technical teammates can request a factory job (e.g. “add an API endpoint”), and the factory turns that intent into:
Scoped prompt spec (goal + acceptance criteria)
Run log (commands/tests)
CI as the gate (green or it didn’t happen)
PR with evidence attached
Deployed to staging env for UAT
Early days, but it’s an interesting way to put structure + guardrails around “vibe coding”.
Dark Factory: “No Humans Should Write Code” (and what it taught me)
Published: 2026-02-28
A few weeks ago I read Simon Willison’s write-up on StrongDM’s “Software Factory” approach. The line that stuck with me wasn’t even about agents or tooling — it was the mantra:
Code must not be written by humans
Code must not be reviewed by humans
I found it fascinating… and honestly a bit uncomfortable.
Evidence > vibes.
Published: 2026-02-27
This is what I’m aiming for with AI-assisted dev: every job produces a small PR + CI green + a prompt spec + a run log (commands/tests) so you can audit what happened. Diff is output. Evidence attached.
I’ve stopped “prompting an AI” and started running a software factory.
Published: 2026-02-27
Every PR now includes:
the prompt spec (goal + acceptance criteria)
the run log (what the agent did + command/test outputs)
the checks as the gate (CI green or it isn’t done)
It turns AI work from vibes into something you can audit, replay, and trust: instructions in → software out → evidence attached.
Screenshot is a real example: prompt + run log sitting right next to the diff.
I Thought AI Would Take the Fun Out of Engineering — It Didn’t
Published: 2026-02-26
I was worried AI would take the fun out of engineering. My early experiences felt a bit like “autocomplete on steroids” — faster, but less satisfying. It’s gone the other way.
AI Top Tip: No Green, No Opinion.
Published: 2026-02-19
When AI generates code, it can look convincing even when it’s wrong. So I don’t assess the implementation first. I run the specs first. In traditional development, you wouldn’t submit (or even entertain) a change that fails CI.
Your Next Customer Might Be an Agent: How to Prepare Without Panic
Published: 2026-02-15
There’s a lot of fear right now.
People can feel the ground moving under how we work, how we buy things, and how businesses acquire customers. The questions are reasonable:
What does this mean for my job?
What does this mean for my business?
Are websites and apps about to become irrelevant?
This is my attempt to add some reality: where we are, where we’re going, and what to do next.
Where we are (agents are useful, but still messy)
Agents can already do a lot: find things, compare options, fill form...
AI Turns "We’ll Refactor Later" Into "Let’s Do It Now"
Published: 2026-02-14
Most software work starts the same way:
get a happy-path flow working
surface rabbit holes early
produce a first draft
Then we all say the same thing: “We’ll refactor it later.”
Sometimes that happens. A lot of the time business pressure wins, and the first draft becomes production. That’s not laziness.
Don’t Over-Determinise Agent Wording (Avoid Brittle If/Else Trees)
Published: 2026-02-14
When teams start writing agent instructions, there’s a very common instinct: make the instructions “safe” by turning them into a big set of deterministic rules.
It usually starts with something small, often around wording. For example:
if a delivery date is in the past, say “was due”
if a delivery date is in the future, say “expected”
if it’s today, say something else
if the date is missing, choose a fallback phrase
It looks disciplined. It feels like you’re removing ambiguity. But this often creates ...
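The date-wording rules in that summary can be written out literally as code, which shows the pattern being warned against: this is a sketch of the anti-pattern, not a recommendation, and the function name and phrases are illustrative only. Every wording nuance becomes another branch to maintain.

```ruby
require "date"

# The brittle if/else tree: each new phrasing rule is a new branch,
# and the agent loses any room to handle cases the tree never anticipated.
def delivery_phrase(date, today: Date.today)
  return "delivery date unknown" if date.nil? # fallback branch

  if date < today
    "was due"
  elsif date > today
    "expected"
  else
    "due today"
  end
end
```

Four branches already, before timezones, ranges, or partial data arrive; the post's point is that encoding intent ("describe the delivery status clearly and honestly") scales better than enumerating outputs.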
Writing Agent Instructions Is Programming
Published: 2026-02-14
When people talk about “prompting”, it can sound like the skill is writing clever English. But when you’re giving instructions to an agent that’s meant to do real work reliably, it feels much closer to programming than copywriting.
You’re not just asking for an answer. You’re specifying behaviour.
That changes what “good” looks like. Good agent instructions aren’t poetic.
PR Review Is Changing in the AI Era
Published: 2026-02-14
PR review was never really about the code.
Yes, we look at the diff. But the real thing we’re trying to assess is judgment:
did the engineer understand the problem?
did they make sensible trade-offs?
did they spot the risks?
did they validate behaviour?
did they leave the codebase better than they found it?
The uncomfortable truth is: the final code artifact has never been the deciding factor on its own.
Engineering Interviews Are Changing in the AI Era
Published: 2026-02-14
Engineering interviews have always been a proxy.
We can’t fully simulate real work in an hour, so we use exercises and questions to approximate signal:
how someone thinks
how they trade off speed vs quality
how they handle uncertainty
whether they validate and de-risk
whether they communicate clearly
The final code someone produces was never the whole point. It was a clue.
Software Delivery: Appetites Over Estimates (and What You Say When The Business Asks “How Long?”)
Published: 2026-02-14
Most teams like the idea of “appetites over estimates” until you try to run an actual planning cycle. Let’s say you’re planning a 4-week cycle: 3 weeks of development and 1 week cooldown. You’ve got a set of pitches on the table.
Writing Agent Instructions Is a Team Sport (Business + Engineer)
Published: 2026-02-14
If writing agent instructions is programming, it follows that the best instructions aren’t written by one person in isolation. They’re a team sport.
When I say “business” here, I often mean a PM — that’s how we do it. But it could just as easily be a founder, ops, customer support, or anyone close to the user and the outcomes.
Codex / GPT-5.3 feels noticeably faster and more accurate for me.
Published: 2026-02-14
The biggest difference is it pushes back. It flags risks, calls out contradictions, and suggests sensible alternatives, instead of the default “you’re absolutely right” even when I’m clearly changing my mind mid-thread. It’s starting to feel like an actual engineering partner.
Planner/Worker: The Two-Thread Workflow That Keeps AI Useful
Published: 2026-02-14
One of the easiest ways to waste time with AI is to do everything in a single long thread.
It starts well, then the context grows, quality drifts, and you end up with a messy mix of strategy, partial implementations, and conflicting decisions.
You get “context rot”, but you also get something worse: you lose your own clarity.
The approach that’s worked best for me is a simple two-thread workflow:
one “planner” thread that holds the strategy and the big picture
smaller “worker” threads for discrete...
AI Top Tip: Context rot — stop fighting it, start a clean chat.
Published: 2026-02-14
When an AI thread gets long, quality drops. It gets slower, forgets earlier decisions, and starts confidently guessing.
My reset prompt is:
“Hey ChatGPT, I think you’re suffering from context rot. Can you summarise the key context from this thread so we can move to a fresh chat and keep the momentum going?”
Then I paste that summary into a new thread and carry on.
Same work. Cleaner context. Better output.
AI Hasn’t Fixed Estimation in Software Engineering - It’s Made It More Dangerous.
Published: 2026-02-07
Estimation in software has always been broken.
Not because people are bad at it, but because we keep asking it to do something it can’t: predict outcomes in complex, uncertain systems.
AI hasn’t changed that. What it has changed is speed — and that makes estimation more dangerous, not less.
Speed increases confidence faster than understanding
AI makes it easy to move quickly:
code appears fast
demos look good
progress feels visible
That often increases confidence faster than it reduces uncertainty.
You...
AI, TDD, and the Return of the Feedback Loop
Published: 2026-01-31
Lately, working with AI has reminded me a lot of how Test-Driven Development felt when I first learned it.
Not the dogma. Not the purity debates. But the feedback loop.
What TDD was really doing (for me)
When I interview engineers and ask about TDD, I often hear the same answer: “It helps catch bugs.”
That’s true, but it was never the main benefit for me.
The real value of TDD was the speed of validation.
Writing a test forced me to answer a simple question early: Do I actually understand what I’m tryin...
Using AI to Build Around Tech Debt, Not Rewrite It
Published: 2026-01-24
Every startup carries tech debt.
Not because teams don’t care — but because speed, uncertainty, and evolving requirements make it inevitable. Most of it never gets “paid down”, and full rewrites are usually too risky to attempt.
AI doesn’t remove that reality. But it does change how we can move forward without being trapped by it.
This pattern isn’t new — the leverage is
Long before AI, people like Martin Fowler described incremental ways of dealing with legacy systems — most notably the Strangler...
AI Helps You Ship. Simplification Helps You Scale.
Published: 2026-01-18
AI has dramatically lowered the cost of building software.
With agentic development, copilots, and increasingly capable models, it’s never been easier to go from idea to working implementation. You can explore solution space faster, test assumptions sooner, and ship proofs of concept in days rather than weeks.
That’s a genuine shift — and a powerful one.
But there’s a pattern I’ve seen repeatedly in startups (and honestly, in any business trying to solve something genuinely hard), and AI is acce...
Agentic Development: The Shift Is Already Here
Published: 2026-01-10
Agentic development has improved massively in the last six months. In the summer of 2025, I tried using Copilot to build a moderately complex feature.
Appetites, Project Deadlines, Due Dates...
Published: 2022-10-14
When scoping or planning development work for the engineering team, the first question from the developer is always: when does this have to be done? How long have we got?
Bugs Within The Software Development Life Cycle - How Best To Manage?
Published: 2021-05-22
Regardless of the engineering team’s competency, bugs are part and parcel of the life cycle of every software development project. Within the software development paradigm there are best practices for nearly every part of the development life cycle except bug management. I have participated in and witnessed several different approaches over time, but there doesn’t seem to be much around best practices.
Common Scenario
A bug is reported either directly (customer support, internal employee, client, etc.)...