The Pitfalls of AI

This is a perfect example of why people who follow Domain-Driven Design principles are finding more success with GenAI than people who don’t.

We never assume we understand the business. We start by modeling it through conversations before we ever write any code. These models are exercised through workflows and scenarios until both the business and we, the technologists, agree on them.

So by the time we write code, or direct it to be written, the logic is already understood and tested. The actual code matches the model and the language used to describe it (and there is usually more than one model).

So when a DDD architect/engineer uses GenAI to build an application, it’s just painting what we’ve already decided.

And yes, when you write the code, you often run into irregularities. On my teams, we stop, bring this to the business, and fix it in the model, then continue with the code once the gap is addressed. With GenAI, these gaps are identified through tests derived from the workflows and scenarios used when working on the model. Good guardrails also help identify gaps.

When you build software from a DDD perspective, you don’t create technical debt OR comprehension debt.

1 Like

I think I get it. Trying to get AI to understand an existing code base is problematic, but generating it from a clean slate with business use cases works much better. It sounds like comprehension of the code base is not required (or even desired) when adhering to DDD best practices.

No that’s not it. Because we’re modeling the behavior of the business, that behavior and its language are in the code.

I can often put code on a projector and the SMEs can see their application language there.

Here’s a simple library behavior from a DDD perspective. It may seem simple, but a lot of code would use database terminology like addBookToCart, which makes sense on the surface but gives rise to plenty of logic issues.

interface Member {
  readonly id: MemberId;
  readonly tier: MembershipTier;
  readonly currentLoans: ReadonlyArray<Loan>;

  // Behaviors named in the SMEs' own language, not storage terms
  checkout(book: Book, today: Date): Loan;
  returnBook(loan: Loan, today: Date): ReturnReceipt;
  renewLoan(loan: Loan, today: Date): Loan;
  payFine(fine: Fine): Receipt;

  // Policy questions the business actually asks
  hasOverdueBooks(): boolean;
  canCheckout(book: Book): boolean;
}

type MembershipTier = "standard" | "faculty" | "senior";
type LoanPeriod = 14 | 21 | 28;
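To make that concrete, here’s a minimal, self-contained sketch of the rule behind canCheckout(). The per-tier loan limits are invented for illustration; in practice they are exactly the kind of number the SMEs would supply.

```typescript
// Hypothetical sketch of the checkout rule. The per-tier loan
// limits below are assumptions, not part of the original model.
type MembershipTier = "standard" | "faculty" | "senior";

interface Loan {
  readonly bookId: string;
  readonly dueDate: Date;
}

// Assumed limits for illustration only
const maxLoans: Record<MembershipTier, number> = {
  standard: 3,
  faculty: 10,
  senior: 5,
};

// Reads the way an SME would phrase it: "a member may check out
// while under their tier's loan limit and holding no overdue books"
function canCheckout(
  tier: MembershipTier,
  currentLoans: ReadonlyArray<Loan>,
  today: Date
): boolean {
  const hasOverdue = currentLoans.some(
    (loan) => loan.dueDate.getTime() < today.getTime()
  );
  return !hasOverdue && currentLoans.length < maxLoans[tier];
}
```

Note the rule mentions no tables or carts; it is the business sentence, executable.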

@DavidC Right, but do you go into the code that’s generated at all?

I verify contracts, and my guardrails check variants and invariants. But the above example is already begging for more questions:

  • what if the member is checking out a DVD or Vinyl album?
  • what if the member loses an item?
  • what if the member is an organization and not a person?

This is where DDD insists we ask all the stupid questions and the SMEs tell us what’s what.

  • there are actually many kinds of items you can checkout
  • we have a replacement cost
  • this happens outside of our membership system, but it is documented

Then I’d ask more questions:

  • List all of the items that can be checked out and their limitations (days, age range)
  • Do you store the replacement cost for every item?
  • How do organization checkouts work?

You can see this is extremely iterative, but eventually you exhaust all of the stupid questions and you have a model everyone agrees is “mostly complete”. Then you implement it, and the code exactly represents the model, including the language the SMEs used.
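To show what one such iteration might produce, here’s a hedged sketch of the refined model. The item kinds, loan periods, and the age-range rule are placeholder assumptions standing in for whatever the SMEs actually listed.

```typescript
// Hypothetical next iteration after the SME answers above. Item kinds,
// loan periods, and the age restriction are illustrative assumptions.
type ItemKind = "book" | "dvd" | "vinyl";

interface CheckoutPolicy {
  readonly loanPeriodDays: number; // per-kind limitation from the SMEs
  readonly minimumAge?: number;    // e.g. some DVDs are age-restricted
}

interface LibraryItem {
  readonly id: string;
  readonly kind: ItemKind;
  readonly replacementCost: number; // "we have a replacement cost"
  readonly policy: CheckoutPolicy;
}

// The SMEs told us a borrower may be a person or an organization
type Borrower =
  | { readonly type: "person"; readonly age: number }
  | { readonly type: "organization" };

function mayBorrow(borrower: Borrower, item: LibraryItem): boolean {
  if (item.policy.minimumAge === undefined) return true;
  // Assumption: organizations are not age-restricted. A real model
  // would take exactly this question back to the SMEs.
  if (borrower.type === "organization") return true;
  return borrower.age >= item.policy.minimumAge;
}
```

Each gap the sketch exposes (can an organization borrow an age-restricted DVD?) becomes the next stupid question for the SMEs.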

And with GenAI we follow the same process, but hand the model definition off to the AI with proper guardrails, and it generates exactly what we modeled and prompted (varying the programming language, data storage, microservice boundaries, or any other implementation criteria).
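As a concrete illustration of such a guardrail, here’s what one scenario-derived check might look like. The renewal rule (no renewing an overdue loan) is an assumed example, not a rule from the original model.

```typescript
// Sketch of a scenario-derived guardrail. The rule "an overdue loan
// must be returned, not renewed" is an assumption for illustration.
interface Loan {
  readonly dueDate: Date;
}

function renewLoan(loan: Loan, today: Date, periodDays: number): Loan {
  if (loan.dueDate.getTime() < today.getTime()) {
    // Invariant from the scenario; generated code must preserve it,
    // and the scenario test fails loudly if it does not.
    throw new Error("Cannot renew an overdue loan");
  }
  const dueDate = new Date(today);
  dueDate.setDate(dueDate.getDate() + periodDays);
  return { dueDate };
}
```

Handing the agent the model plus a battery of checks like this is what keeps the generated code pinned to what the SMEs agreed.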

Non-DDD technical people do this too, but they almost always pre-select a relational database as the default data store and they think in terms of tables and columns. When you do that, you immediately diverge from the business model. DDD folks avoid the data store conversation until implementation time.

There’s a good deal of prior art on this from when machine automation began to spread, starting with “Ironies of Automation” (Bainbridge, 1983), which discusses the challenges of turning operators into supervisors.

I wrote about this for the WONRO (Welcome Our New Robot Overlords) substack that my friend, the philosopher Chris Bateman, and I started a little while ago, in a piece titled Four Problems with Robot Slaves.

A good example is self-driving cars. In certain circumstances, at least, these have reached a level of competence that allows daring couples to film themselves having sex as their car flies along the road. Yet they still go wrong, and when they do, it’s very hard to recover. This is brilliantly outlined by Uber’s former head of self-driving, Raffi Krikorian, talking about his car crashing.

My own view is that the real dangers lie in long-feedback, policy-related automation. We’re not there yet, but I fear we will be soon.

1 Like

@sandbags Great site you’ve started there! Subscribed! Reading further on the site, I found mention of addiction to AI usage. This is a trend I’ve started seeing in AI commentary now.

There’s a certain gambling-like pattern I’ve experienced with AI.

| :bell: | :bell: | :bell: |

It makes me very happy when AI gives me a programming solution that works. It feels rewarding in a way, though I know that’s not what actual reward is. The only saving grace I have is that I don’t work with AI professionally (I’m just a casual hobbyist), but if my employer dictated its use… I could see how AI could become addictive.

Addictions replace the good parts in people. I want to keep my good parts in working order. :wink:

1 Like

Thank you :grinning_face:

Vibe coding addiction is very much a thing. I am following other developers’ experiences closely, and there are very worrying mental health trends. I think it will not be long before we see people needing to be treated for disorders related to supervising coding agents.

I am in two minds about coding agents. Yes, they have some productivity wins. Yes, they democratise the means of producing software (there has probably never been a better time to be a domain expert with an idea), but… I can’t help thinking that we’re generating a lot of technical debt that someone is gambling will never need to be paid off… that the models will improve faster than the debt piles up (that is 100% Anthropic’s bet).

I’m not convinced that, when the dust settles, we’re going to be on the winning side of that bet.

4 Likes

I’m convinced the debt is piling up faster than any model will be able to fix.

Seems to me the bottleneck is people’s awareness of and willingness to fix it, not the models’ ability to fix it. When I notice that Copilot has put a week’s worth of work into a single 10,000-line file, or suspect that the model has written the same helper functions three times already, I skim through the code to see what stands out, and then take some time away from feature work to refactor and clean up, which the model is easily able to do once given the task.

1 Like

I would add “ability” to that list! Some people using Claude are experienced developers who understand concepts like technical debt and refactoring, but a lot more aren’t.

2 Likes

I find that when doing low-level coding, Claude will often add new fields with slightly different meanings rather than derive values with simple computations from existing fields. Over time, without directives to clean up existing code (and sometimes in spite of such directives), this proliferation leads to Claude getting confused and breaking things in weird ways.

Any time features are added or the design shifts, Claude lets cruft build up in ways that I don’t think even moderately skilled human coders would. It’s especially bad about adding superfluous comments that quickly get out of sync with the actual code, and Claude will clearly sometimes base assumptions for future edits on those same stale comments.

3 Likes

I haven’t tried using AI-generated code, but for every coder who knows how to properly manage AI code, I’m sure there are a dozen coders oblivious to the errors accumulating in their projects and a hundred non-coders who think the AI is as good as hiring an experienced programmer. It’s those non-coders we need to worry about, especially since some of them are profit-maximizing C-suite types who don’t fully understand the value of their human developers. And that 1:12:100 ratio is probably overly optimistic.

And if I’m honest, part of why I haven’t tried AI coding, aside from not having the spare funds for any paid models, is a lack of confidence that I could properly double-check the AI’s work.

5 Likes