FRACTURED-SORRY-Bench: Unraveling AI Safety through Decomposing Malicious Intents

Hello, fellow AI enthusiasts! 🤖 Today, I want to dive into the FRACTURED-SORRY-Bench framework and dataset we just released. Check out the dataset, website, and GitHub repository!

The FRACTURED-SORRY Saga: A Tale of Adaptation and Decomposition

Picture this: you’re wandering through the lush collection of prompt-injection and LLM red-teaming papers, marveling at some of the weirder and crazier attack mechanisms released recently, when suddenly you realize that there aren’t many proof-of-concept resources for multi-shot red-teaming....

August 28, 2024 | 3 min | Aman Priyanshu