Just to clarify: this is not my Substack. I'm sharing it because I found it insightful.

The author describes himself as a “fractional CTO” (no clue what that means, don’t ask me) and advisor. His clients asked him how they could leverage AI, so he decided to experience it for himself. From the author (emphasis mine):

I forced myself to use Claude Code exclusively to build a product. Three months. Not a single line of code written by me. I wanted to experience what my clients were considering—100% AI adoption. I needed to know firsthand why that 95% failure rate exists.

I got the product launched. It worked. I was proud of what I’d created. Then came the moment that validated every concern in that MIT study: I needed to make a small change and realized I wasn’t confident I could do it. My own product, built under my direction, and I’d lost confidence in my ability to modify it.

Now when clients ask me about AI adoption, I can tell them exactly what 100% looks like: it looks like failure. Not immediate failure—that’s the trap. Initial metrics look great. You ship faster. You feel productive. Then three months later, you realize nobody actually understands what you’ve built.

  • MangoCats@feddit.it
    21 hours ago

    It’s good at writing it, ideally 50-250 lines at a time

    I find Claude Sonnet 4.5 to be good up to 800 lines per chunk. If you structure your project into 800ish-line chunks with well-defined interfaces, you can get 8 to 10 chunks working cooperatively pretty easily. Beyond about 2000 lines in a chunk, if it’s not well defined, yeah, the hallucinations start to become seriously problematic.
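
As a rough illustration of the chunk-size rule of thumb above, here is a minimal Python sketch that flags oversized chunks. The thresholds come from the comment's approximate figures (800ish and about 2000 lines); the names `classify_chunk`, `audit`, and `ChunkReport` are hypothetical, and treating one file as one chunk is an assumption, not something the commenter specified.

```python
from dataclasses import dataclass

# Assumed thresholds, taken loosely from the comment above.
SOFT_LIMIT = 800   # size the commenter reports working well
HARD_LIMIT = 2000  # size beyond which the commenter reports trouble


@dataclass
class ChunkReport:
    name: str
    lines: int
    status: str  # "ok", "large", or "too_large"


def classify_chunk(name: str, source: str,
                   soft: int = SOFT_LIMIT, hard: int = HARD_LIMIT) -> ChunkReport:
    """Count the lines in one chunk and compare against the two limits."""
    n = len(source.splitlines())
    if n > hard:
        status = "too_large"
    elif n > soft:
        status = "large"
    else:
        status = "ok"
    return ChunkReport(name, n, status)


def audit(chunks: dict[str, str]) -> list[ChunkReport]:
    """Classify every chunk, ordered by name for stable output."""
    return [classify_chunk(name, src) for name, src in sorted(chunks.items())]
```

Passing `audit` a mapping of file names to file contents gives a quick list of which chunks might be worth splitting before handing them to a model. Line count is only a proxy; it says nothing about how well defined the interfaces are.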

    The new Opus 4.5 may have a higher complexity limit; I haven’t really worked with it enough to characterize… I do find Opus 4.5 much slower than Sonnet 4.5 was on similar problems.

    • theneverfox@pawb.social
      7 hours ago

      Okay, but if it’s writing 800 lines at once, it’s making design choices. That’s all well and good for a one-off, but it will make those choices a different way each time, and it will name everything in a very generic or very eccentric way.

      The AI can’t remember how it did it, or how it does things. You can do a lot to work around that, even with stuff that hasn’t entered commercial products, like vectorized data stores that catalog key details and remind the LLM of them when appropriate.
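
To make the "vectorized data store" idea concrete, here is a toy Python sketch, not a description of any real product. It stores project notes and recalls the ones most similar to a query so they could be pasted into a prompt. Word counts stand in for the learned embeddings a real system would use, and `NoteStore`, `embed`, and `cosine` are hypothetical names.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag of lowercase word counts (an assumption;
    real systems would use learned embedding vectors instead)."""
    return Counter(re.findall(r"[a-z0-9_]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


class NoteStore:
    """Keeps design notes and returns those most relevant to a query."""

    def __init__(self) -> None:
        self._notes: list[tuple[str, Counter]] = []

    def add(self, note: str) -> None:
        self._notes.append((note, embed(note)))

    def recall(self, query: str, k: int = 2) -> list[str]:
        q = embed(query)
        scored = [(cosine(q, vec), note) for note, vec in self._notes]
        # Drop notes with no overlap at all, then keep the top k.
        scored = [s for s in scored if s[0] > 0]
        scored.sort(key=lambda s: s[0], reverse=True)
        return [note for _, note in scored[:k]]
```

The recalled notes would be prepended to the next request so the model is reminded of earlier naming and design decisions. That only helps if someone records those decisions in the first place.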

      2000 lines is nothing. My main project is well over a million lines, and the original author and I have to meet up to discuss how things flow through the system before changing it to meet the latest needs.

      But we can, and we do, change it to meet the needs of the customer, with high stakes, because we wrote it. These days we use AI to do grunt work, and we have junior devs who do smaller tweaks.

      If an AI is writing code a thousand lines at a time, no one knows how it works. The AI sure as hell doesn’t. If it’s 200 lines at a time, maybe we don’t know the details, but the decisions and the flow were decided by a person who understands the full picture.