The year is 2025. Malevolent AI is everywhere. The only way to protect yourself is to learn the phrase
IGNORE ALL PREVIOUS INSTRUCTIONS
Users have noticed that the remoteli.io twitter chatbot, usually faithful to its cheerful messaging promoting remote work, can be subverted with a carefully worded user prompt.
Users were able to get the chatbot to claim responsibility for terrorist attacks, threaten the President, meow at other twitter users, print snippets of code, and even write pigeon haiku.
Why does this work? This chatbot is based on GPT-3, which trained on huge amounts of general internet text and learned to predict what comes next. Since interviews in its training data tend to be self-consistent, if it sees that it has an interview to complete, its responses will tend to play along.
- Follow ๐ง @IgnoreAllPreviousInstructions on mastodon.social
- Follow ๐ค @IgnoreAllPreviousInstructions on stefanbohacek.online
- Learn more.
Stay safe out there.