AI as an Engineering Discipline
AI amplifies rigor or chaos depending on the process. This article covers the operating model I use to ship faster without compromising architecture, quality, or long-term maintainability.

AI dramatically reduces the cost of producing code. It does not reduce the cost of maintaining a system. The engineers who thrive with AI are those who treat it as a force multiplier for discipline — not a replacement for judgment. This article shares the operating model I've built after two years of daily AI usage across production systems.
Key takeaways
- AI amplifies rigor or chaos depending on the process — the differentiator is judgment, structure, and ownership
- Context engineering and guardrails matter more than which model you use
- AI supports the entire engineering lifecycle — not just code generation
Why Most AI Adoption Fails
Most companies treat AI adoption as a tool rollout. They pick a vendor, buy licenses, present generic warnings like "AI can make mistakes — review the output," and assume adoption will happen naturally.
It doesn't. The reason is structural: the core question is framed as "Which tool should we buy?" instead of "How should our way of working evolve?"
When adoption is tool-centric, the discussion devolves into surface-level topics: which model is best, what subscription to choose, or vague advice about reviewing PRs "more carefully." None of that helps anyone understand how to actually work differently.
What's missing is the operating model around the tool — workflows, problem framing, planning before generation, iterative review, and how AI output integrates with existing engineering practices. Without that structure, AI becomes either a novelty or a source of low-quality output that slowly erodes system coherence.
The cognitive shift is the real adoption challenge. AI works best as a thinking partner — you clarify the problem, iterate on approaches, critique solutions, and use it to accelerate both execution and reasoning. That shift from execution-only to reasoning-assisted workflows is almost never addressed.
How I Actually Work with AI
AI changed my role. Less than 2% of my work involves writing code directly. The rest is research, product discussions, conceptualization, prompt engineering, code review, and testing. This doesn't mean I don't touch the code — I review every line, debug manually when needed, and make design adjustments. But the act of writing code from scratch has been almost entirely replaced by directing, reviewing, and refining.
My workflow separates planning from execution deliberately. For any meaningful task, the sequence looks like this:
- Understand — the product need, with stakeholders
- Analyze — the codebase, with an AI coding assistant producing a structured report
- Design — the solution, with an AI chat assistant exploring approaches and challenging assumptions
- Plan — a structured implementation plan, reviewed against the codebase
- Execute — incrementally, with the coding assistant following the plan
- Review — with a separate, fresh-context AI agent plus my own review
- Ship — commits, PR, final review, and merge
The human remains the decision-maker at every transition point. AI handles execution and analysis; the engineer handles judgment and validation.
This separation exists because AI is very good at executing instructions but much less reliable at deciding what the right instructions should be. The most common failure pattern I see is jumping straight to execution — "fix this bug" or "refactor this component" with no context, no constraints, no success criteria. That approach treats AI as a slot machine rather than an engineering assistant.
One underappreciated consequence of AI-accelerated development is that you forget how features work. Before AI, spending weeks implementing something etched the details into your memory. Now the understanding doesn't have the same time to form. This is why I make AI write documentation throughout the process — analysis reports, architecture decisions, implementation notes — all produced as the work happens, not after the fact.
Context Engineering
The quality of AI output is determined by the quality of context you provide. Context engineering is not "writing longer prompts." It is a system for ensuring AI has the right information at the right time.
I think about context in four layers:
- Project-level — persistent context that lives across conversations. A CLAUDE.md file for coding assistants, custom instructions and uploaded files for chat assistants. Every conversation starts with the right baseline.
- Task-level — built during the planning phase. Analysis documents, implementation plans, specific constraints relevant to the current task.
- Prompt-level — the specific request itself, with evidence, constraints, and success criteria the AI cannot infer on its own. This is where prompt engineering matters most.
- Review-level — deliberately fresh context for independent evaluation. A separate agent assesses the work without the conversational bias of the session that produced it.
CLAUDE.md: constraint programming, not documentation
The most important lesson I've learned about project context files is that they are constraint programming, not documentation. The mental model of "write things down so the model knows them" leads toward the wrong kind of content. What actually works is writing rules the model can check against its own output.
"Don't use StyleSheet" works. "Write good, maintainable code" does nothing.
A few principles that compound over time:
- Attention decay — highest-stakes rules go at the top of the file, not buried in a subsection. Models lose focus the same way humans do.
- DO NOT / DO — stating what not to do intercepts default behavior. Stating what to do competes with it. Use both.
- Command references — exact invocations eliminate drift: wrong flags, wrong package manager, wrong workspace.
- Explain the why — rules with rationale get followed at the edges. Rules without rationale get followed literally.
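A hedged sketch of what the top of such a file can look like — the commands, paths, and rules below are invented for illustration, not taken from a real project:

```markdown
## Critical rules (read these first)

- DO NOT use `StyleSheet.create` — use the `styled()` wrapper from
  `src/ui/styled.ts`. Why: StyleSheet bypasses the theme tokens and
  silently breaks dark mode.
- DO run the type-check before proposing any commit.
  DO NOT use `npm` or `yarn` in this workspace — it is pnpm-only.

## Commands

- Type-check: `pnpm typecheck`
- Unit tests (single package): `pnpm --filter app test`
- Lint staged files: `pnpm lint:staged`
```

Note how each rule is checkable against output ("did I import StyleSheet?") rather than aspirational ("is this maintainable?"), and how the highest-stakes rules sit at the top where attention is strongest.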
Prompt archetypes
Good prompting follows the same underlying principle — provide the context the AI cannot infer — but the structure changes with the task type:
- Debugging — symptoms, logs, reproduction conditions, expected behavior. The task is investigation.
- Feature work — constraints, success criteria, component reuse rules, API boundaries. The task is scoped execution.
- Architecture — system boundaries, scaling assumptions, trade-offs to evaluate. The task is analysis.
The goal is never to write longer prompts. It's to give the model the specific kind of information that matters for the problem at hand.
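As a concrete instance, a debugging prompt built on the first archetype might look like the following — every detail here is invented for illustration:

```markdown
Symptom: search intermittently returns zero results in production;
not reproducible locally or by QA.
Evidence: [paste relevant logs and error traces here]
Reproduction: reported only on mobile clients, typically after
a minute or so of continuous scrolling.
Expected: each scroll loads the next page of results.
Constraints: do not change the public API of the search client.
Task: investigate and list candidate root causes ranked by likelihood,
with the supporting evidence for each. Do not propose fixes yet.
```

The structure does the work: symptoms and evidence scope the investigation, constraints prevent collateral changes, and "do not propose fixes yet" keeps the model in analysis mode instead of jumping to plausible-looking patches.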
Guardrails
"Review PRs more carefully" is not a guardrail. It's a hope.
Code review is the very end of the pipeline. If it's the only safeguard against AI-generated mistakes, the system relies on the weakest and most inconsistent control point — human attention under time pressure. Proper guardrails exist earlier and catch entire classes of problems automatically.
I use a layered chain where each layer catches what the previous one missed:
- Project context rules — constrain reasoning before generation
- Hooks — intercept dangerous actions in real time, before they execute
- Skills and subagents — structured review and specialized validation on demand
- Pre-commit hooks — lint, type-check, tests on staged files
- Pre-push hooks — coverage thresholds, audit rules, broader validation
- CI pipeline — parallel matrix of lint, types, test, build, audit, e2e on every PR
- AI review agents — fresh-context architectural and security review
- Human review — final decision, informed by everything above
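As one concrete instance of the pre-commit layer, a common setup pairs a git hook with lint-staged so that only staged files are checked; the tools and globs below are illustrative, not prescriptive:

```json
{
  "lint-staged": {
    "*.{ts,tsx}": "eslint --max-warnings=0",
    "*.{css,md,json}": "prettier --check"
  }
}
```

With a tool like husky, the `.husky/pre-commit` hook then just runs `npx lint-staged`. The point of the layer is not the specific tools — it's that an entire class of mechanical mistakes never reaches human review.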
Progressive adoption
Dumping every guardrail on a team from day one is counterproductive. The right approach is progressive layering:
- Day one — project context file, documented commands, basic prompting discipline: plan first, then generate. Engineer owns all review. No autonomous AI action.
- Week one — pre-commit hooks, CI gates, first reusable skills, internal workflow examples documented.
- Month one — AI review agents, subagents for specialized validation, mutation testing on mature repositories.
Each tier addresses a progressively subtler class of problem. There is no point enforcing mutation testing thresholds on a team that doesn't yet have pre-commit hooks.
Beyond Code
AI is not "a tool that writes code." It supports the entire engineering lifecycle.
Architecture: pressure-testing decisions
AI's value is not designing systems from scratch — it's challenging assumptions. On a monorepo project, I used AI chat to challenge my initial instinct of using Zod schemas as the single source of truth everywhere. The discussion surfaced a real trade-off: runtime validation dependencies leaking across the entire system. The result was a cleaner separation between domain types and transport contracts that became a structural principle of the architecture. AI didn't design it — it surfaced the trade-off early enough to matter.
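A minimal TypeScript sketch of that separation — the names are invented, and the hand-rolled check stands in for whatever schema library actually lives at the boundary:

```typescript
// Domain type: pure TypeScript, zero runtime dependencies.
// Any module in the system can import this without pulling in
// a validation library.
interface User {
  id: string;
  email: string;
}

// Transport boundary: the ONLY place untrusted input is validated.
// In practice this is where a schema library like Zod would live —
// and stay, instead of leaking into every consumer of `User`.
function parseUser(input: unknown): User {
  if (
    typeof input === "object" &&
    input !== null &&
    typeof (input as { id?: unknown }).id === "string" &&
    typeof (input as { email?: unknown }).email === "string"
  ) {
    const { id, email } = input as { id: string; email: string };
    return { id, email };
  }
  throw new Error("Invalid User payload");
}
```

The structural principle is visible in the import graph: domain code depends on types alone, while runtime validation is confined to the edges of the system.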
Debugging: searching beyond assumptions
AI analyzes code without the tunnel vision that human engineers develop during manual investigation.
On a colleague's Vue project, I spent two hours debugging alongside him. We mapped the symptoms and ruled out the parts that behaved correctly, but couldn't locate the root cause. After lunch, he set up an AI coding assistant, fed it our findings, and it found the issue in minutes — in a location neither of us had investigated because we didn't think it could originate there.
On one of my own projects, users reported a search feature intermittently returning zero results, but I couldn't reproduce the bug. Neither could our QA. I gave the AI coding assistant the full context: the specific feature, the randomness of the issue, and my hypotheses. It found a pagination bug where the search API was always querying the first page on an infinite scroll, causing repeated identical requests and eventual rate limiting. The bug had existed for months. That discovery led me to implement a full cross-platform observability system I had been advocating for.
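The underlying bug class is easy to sketch. The code below is a hypothetical reconstruction, not the actual codebase: an infinite-scroll loader whose page cursor never advances, so every "load more" re-requests page 1 — identical requests that eventually trip rate limiting:

```typescript
type SearchPage = { page: number; results: string[] };

// Stand-in for the search API: returns a deterministic page of results.
function fetchSearchPage(query: string, page: number): SearchPage {
  return { page, results: [`${query}-result-${page}`] };
}

// Buggy version: `page` is captured once and never incremented,
// so every scroll event fetches page 1 again.
function makeBuggyLoader(query: string) {
  const page = 1;
  return () => fetchSearchPage(query, page); // always page 1
}

// Fixed version: the loader owns a cursor and advances it per call.
function makeFixedLoader(query: string) {
  let page = 1;
  return () => fetchSearchPage(query, page++);
}
```

The buggy variant is exactly the kind of defect that survives manual review for months: every individual request succeeds, and the failure only shows up as intermittent rate limiting under real usage.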
Developer experience
AI changed what I'm willing to invest in. CI guardrail systems, formalized conventions, onboarding guides, architecture deep-dives — these investments became cost-effective because AI reduced the iteration cost of building and maintaining them. AI didn't just accelerate writing; it shifted the decision toward building a more explicit and maintainable development environment.
Product thinking
AI serves as a reasoning partner — discussing features, exploring trade-offs, drafting specifications, and transforming technical documents into non-technical ones. Not a decision-maker, but an accelerator for the thinking that goes into decisions.
The hard limit
AI optimizes for plausibility, not correctness. It will satisfy the direction you suggest rather than challenge whether the direction itself is flawed. I treat every AI suggestion outside of code as a hypothesis — validated against primary sources, tested through implementation, and discussed with other engineers when the decision is critical.
What I Got Wrong
This framework has blind spots.
It was built and validated by one person. It works for a disciplined engineer on well-structured projects, but team adoption introduces cultural negotiation — people who disagree with conventions, find hooks annoying, or interact with AI very differently. That's a problem I haven't fully solved yet.
It may also be heavier than necessary for smaller teams where speed and experimentation matter more than long-term architectural discipline. And it assumes human judgment remains the central bottleneck — an assumption that holds today but may shift as models improve at context and reasoning.
I don't assume this framework is finished. AI is still in an early phase, and the way we integrate it into engineering practice will continue to evolve. What I've described here is the best operating model I've found so far — one I expect to refine as the tools, the workflows, and the industry itself change.
The core principle won't change: AI is not a shortcut around thinking. It is a multiplier for the quality of your thinking.
I've developed a comprehensive guide and presentation deck on this topic that goes deeper into workflows, guardrails, context engineering, and adoption strategy. If you're interested, feel free to reach out.