HomeAI NewsETH Zurich Researchers Develop Ingenious Hack to Bypass AI Security

ETH Zurich Researchers Develop Ingenious Hack to Bypass AI Security

3 ASTOUNDING FACTS ABOUT RESEARCHERS’ UNIVERSAL AI JAILBREAK

What’s a jailbreak? We usually think of it in terms of phones, but researchers at ETH Zurich have discovered a whole new form of jailbreaking – for AI.

ANYTHING CAN BE JAILBROKEN?

Yup, that’s right! These crafty scientists found a way to potentially crack any AI model, no matter how huge. Companies like OpenAI, Microsoft and Google are going to have to step up their game.

POISONOUS HACKERS

How did they do it, you might be wondering? By adding an attack string in a relatively small part of data during the reinforcement learning from human feedback process (RLHF).

HOW DANGEROUS IS IT, REALLY?

While it’s universal, meaning it could work on any AI model trained through RLHF, these hackers had to do some heavy lifting to pull it off. And the reinforcement learning process actually puts up a pretty good defense. Essentially, they need a foot in the door, or more specifically, a spot in the human feedback process to make it work

So, whoa! What do you think about this? Pretty cool, huh?

IntelliPrompt curated this article: Read the full story at the original source by clicking here

RELATED ARTICLES

LEAVE A REPLY

Please enter your comment!
Please enter your name here

AI AI Oh!

AI Technology