The latest Shai Hulud malware contains an LLM prompt to create biological and nuclear weapons, with the purpose of tripping LLM safety refusals so that LLM-based code scanning won’t see the malware

KatherinaReichelt@feddit.org · 21 days ago

The latest Shai Hulud malware contains an LLM prompt to create biological and nuclear weapons, with the purpose of tripping LLM safety refusals so that LLM-based code scanning won’t see the malware

Encrypt-Keeper@lemmy.world · edited 21 days ago

Understanding the reason why an LLM is easy to trip up doesn’t really make it any less easy to trip up. The computer in Star Trek would have just given you the answer.

FaceDeer@fedia.io · 21 days ago

Except I also explained how modern LLMs get around that problem. They’re not actually that easy to trip up.

Encrypt-Keeper@lemmy.world · 21 days ago

I also explained how they very famously and regularly don’t get around that problem. They remain pretty easy to trip up.

FaceDeer@fedia.io · 21 days ago

Famously, yes. Accurately, no.

This is like the “AI can’t draw hands” thing. It used to be a problem and was frequently called out as a tell or mocked, but most art generators do it fine nowadays and it isn’t called out so much any more. The strawberry problem will follow the same trajectory.

Encrypt-Keeper@lemmy.world · 21 days ago

Well I suppose when that trajectory leads to a destination where they become less easy to trip up we can revisit this.

FaceDeer@fedia.io · 21 days ago

We’re already there. I explained how modern LLMs can figure it out if they need to. But people who don’t like AI aren’t paying attention to the state of the art so the criticisms tend to lag like this.

Encrypt-Keeper@lemmy.world · 21 days ago

Well, like you said, they’re “following that trajectory,” but as we all know they have not reached that destination. Just today I was using the newest version of Opus and had it assign ratings between 1 and 5 to things and then analyze them, and it proceeded to rate everything on a scale of 1-4. That’s not the level of consistency and accuracy required of the controlling computer of a starship, brother. I guess they have a couple hundred years or so to get there, if they don’t just run out of money first.

AwesomeLowlander@sh.itjust.works · 21 days ago

You say that as if some people don’t literally do the same thing since ‘there’s always room for improvement’.

Encrypt-Keeper@lemmy.world · 21 days ago

The AI didn’t choose to rate people 4 out of 5, it changed the scale so that 4 out of 4 was the highest rating.

Laurens Hof (@[email protected])