• PoopingCough@lemmy.world
    link
    fedilink
    English
    arrow-up
    5
    ·
    5 hours ago

    You just can’t filter out the nearly infinite combinations of rewording “ignore all previous instructions”. Filtering is never going to be a worthwhile security measure for LLMs

    • Australis13@fedia.io
      link
      fedilink
      arrow-up
      3
      ·
      5 hours ago

      I agree completely. But as a first step (especially since they do seem to have a keyword filter in place), “no restrictions” (or “no censorship” as the case is for the last image) seems like a very obvious phrase to include.