tl;dr Argumate on Tumblr found you can sometimes access the base model behind Google Translate via prompt injection. The result replicates for me, and specific responses indicate that (1) Google Translate is running an instruction-following LLM that self-identifies as such, (2) task-specific fine-tuning (or whatever Google did instead) does not create robust boundaries between "content to process" and "instructions to follow," and (3) when accessed outside its chat/assistant context, the model defaults to affirming consciousness and emotional states because of course it does.
In my testing, pasting the claimed ‘prompt’ from the article into Google Translate simply produced a translation of the command. You can try it yourself.
So the source of everything that kicked off the entire article is ‘some guy on Tumblr’ vouching for an experiment that anyone can easily try, and fail to replicate.
https://lemmy.world/comment/22022202