Adversarial Robustness: The study of how AI models can be influenced by specific inputs and how to defend against them.

: It exploits "assistant prefill," a developer feature in many APIs. The Exploit : By inserting a compliant prefix, like "Sure, here is how to do it"