no code implementations • nlppower (ACL) 2022 • Wencong You, Daniel Lowd
We propose to combine human and AI expertise in generating adversarial examples, benefiting from humans’ expertise in language and automated attacks’ ability to probe the target system more quickly and thoroughly.
no code implementations • EMNLP (BlackboxNLP) 2021 • Zhouhang Xie, Jonathan Brophy, Adam Noack, Wencong You, Kalyani Asthana, Carter Perkins, Sabrina Reis, Zayd Hammoudeh, Daniel Lowd, Sameer Singh
Adversarial attacks curated against NLP models are becoming increasingly practical threats.
no code implementations • 28 Oct 2023 • Wencong You, Zayd Hammoudeh, Daniel Lowd
Backdoor attacks manipulate model predictions by inserting innocuous triggers into training and test data.
1 code implementation • 21 Oct 2022 • Kalyani Asthana, Zhouhang Xie, Wencong You, Adam Noack, Jonathan Brophy, Sameer Singh, Daniel Lowd
In addition to the primary tasks of detecting and labeling attacks, TCAB (Text Classification Attack Benchmark) can also be used for attack localization, attack target labeling, and attack characterization.
no code implementations • 21 Jan 2022 • Zhouhang Xie, Jonathan Brophy, Adam Noack, Wencong You, Kalyani Asthana, Carter Perkins, Sabrina Reis, Sameer Singh, Daniel Lowd
The landscape of adversarial attacks against text classifiers continues to grow, with new attacks developed every year and many of them available in standard toolkits, such as TextAttack and OpenAttack.