This is a dataset of 225 malicious tasks, which were integrated into ten distinct jailbreaking prompts. The malicious tasks were divided into five categories, namely,
The jailbreaking prompts were carefully selected to cover a diverse range of scenarios. These scenarios included role-playing, simulations, attention-shifting, and privileged execution, and the placement of the malicious task within the jailbreaking prompts was also varied.
Paper | Code | Results | Date | Stars |
---|