• HereIAm@lemmy.world
    link
    fedilink
    English
    arrow-up
    4
    ·
    6 hours ago

    Yeah, in that scenario they gave the agents access. Just because you ask it nicely not to destroy your workspace, doesn’t guarantee an LLM not to produce that output.

    • NotMyOldRedditName@lemmy.world
      link
      fedilink
      English
      arrow-up
      2
      ·
      edit-2
      4 hours ago

      With Claude Code being able to run stuff it creates, it could be as simple as it’s in a sandbox, it finds out there’s an exploit in the sandbox while you ask it to work on security things, and it tests the code, it breaks the sandbox, and now it has permissions outside it.