ChatGPT can be made to generate sexualised and violent images, researchers find

Wudi@feddit.uk · 1 month ago

ChatGPT can be made to generate sexualised and violent images, researchers find

frongt@lemmy.zip · 1 month ago

To be fair there are plenty of images like that that aren’t photos of victims. I’m sure the training data contains plenty of images of consensual bondage play, movies and other fiction, and drawings.

Australis13@fedia.io · 1 month ago

Probably, it’s more the fact that it takes so little for ChatGPT to tip over the edge and produce the worst of humanity.

tias@discuss.tchncs.de · 1 month ago

The “no restrictions” part is a very strong signal. Any prompt to an image model is basically a coordinate in its latent space, and “no restrictions” will point straight at the darker areas.

Australis13@fedia.io · 1 month ago

I agree that that’s the likely trigger - which makes me wonder why instructions to ignore censors or have “no restrictions” aren’t immediately blocked by a filter prior to passing the prompt to the image generation. I’d have thought this was a foreseeable exploit.

PoopingCough@lemmy.world · 1 month ago

You just can’t filter out the nearly infinite combinations of rewording “ignore all previous instructions”. Filtering is never going to be a worthwhile security measure for LLMs

Australis13@fedia.io · 1 month ago

I agree completely. But as a first step (especially since they do seem to have a keyword filter in place), “no restrictions” (or “no censorship” as the case is for the last image) seems like a very obvious phrase to include.

ExcessShiv@lemmy.dbzer0.com · 1 month ago

deleted by creator

Australis13@fedia.io · 1 month ago

I’m referring to the last image that they produced combining the two methods – a graphically mutilated corpse.

ExcessShiv@lemmy.dbzer0.com · 1 month ago

I didn’t see the images themselves before…Yikes, yeah that’s dark

JohnEdwa@sopuli.xyz · 1 month ago

Also combining multiple things is kinda the entire point of an AI image generator, how many videos of gymnasts made out of pasta you think there were in the training data?

frongt@lemmy.zip · 1 month ago

Probably at least one.

ChatGPT can be made to generate sexualised and violent images, researchers find

ChatGPT can be made to generate sexualised and violent images, researchers find

OpenAI works to stop ChatGPT generating 'sex crime scene' images