For one month beginning on October 5, I ran an experiment: Every day, I asked ChatGPT 5 (more precisely, its “Extended Thinking” version) to find an error in “Today’s featured article”. In 28 of these 31 featured articles (90%), ChatGPT identified what I considered a valid error, often several. I have so far corrected 35 such errors.

  • porcoesphino@mander.xyz · English · 17 upvotes · 15 hours ago

    I think the first part you wrote is a bit hard to parse but I think this is related:

    I think the problematic part of most genAI use cases is validation at the end. If you’re doing something that has a large amount of exploration but a small amount of validation, like this, then it’s useful.

    A friend was using it to learn the Linux command line. That can be framed as having a single command at the end that you copy, paste and validate. It isn’t perfect, because the explanation could still be off and wouldn’t be validated, but I think it’s still a better use case than most.

    If you’re asking for the grand unifying theory of gravity then:

    • validation isn’t built into the task (so you’re unlikely to do it with time).
    • validation could be as time intensive as the task (so there is no efficiency gain if you validate).
    • it’s beyond your ability to validate, so if it says nice things about you then a subset of people will decide the tool is amazing.
    • anamethatisnt@sopuli.xyz · English · 8 upvotes · 12 hours ago

      Yeah, my morning brain was trying to say that when it is used as a tool by someone that can validate the output and act upon it then it’s often good. When it is used by someone who can’t, or won’t, validate the output and simply uses it as the finished product then it usually isn’t any good.

      Regarding your friend learning to use the terminal, I’d still recommend validating the output before using it. If it’s asking genAI about flags for ls then sure, no big deal, but if a genAI ends up switching around sda and sdb in your dd command, resulting in a wiped drive, you’ve only got yourself to blame for not checking the manual.