Search Results for author: Wencong You

Found 5 papers, 1 paper with code

Towards Stronger Adversarial Baselines Through Human-AI Collaboration

no code implementations • nlppower (ACL) 2022 • Wencong You, Daniel Lowd

We propose to combine human and AI expertise in generating adversarial examples, benefiting from humans’ expertise in language and automated attacks’ ability to probe the target system more quickly and thoroughly.

What Models Know About Their Attackers: Deriving Attacker Information From Latent Representations

no code implementations • EMNLP (BlackboxNLP) 2021 • Zhouhang Xie, Jonathan Brophy, Adam Noack, Wencong You, Kalyani Asthana, Carter Perkins, Sabrina Reis, Zayd Hammoudeh, Daniel Lowd, Sameer Singh

Adversarial attacks curated against NLP models are increasingly becoming practical threats.

Tasks: Abuse Detection, Adversarial Text, +3 more

Large Language Models Are Better Adversaries: Exploring Generative Clean-Label Backdoor Attacks Against Text Classifiers

no code implementations • 28 Oct 2023 • Wencong You, Zayd Hammoudeh, Daniel Lowd

Backdoor attacks manipulate model predictions by inserting innocuous triggers into training and test data.

TCAB: A Large-Scale Text Classification Attack Benchmark

1 code implementation • 21 Oct 2022 • Kalyani Asthana, Zhouhang Xie, Wencong You, Adam Noack, Jonathan Brophy, Sameer Singh, Daniel Lowd

In addition to the primary tasks of detecting and labeling attacks, TCAB can also be used for attack localization, attack target labeling, and attack characterization.

Tasks: Abuse Detection, Sentiment Analysis, +2 more

Identifying Adversarial Attacks on Text Classifiers

no code implementations • 21 Jan 2022 • Zhouhang Xie, Jonathan Brophy, Adam Noack, Wencong You, Kalyani Asthana, Carter Perkins, Sabrina Reis, Sameer Singh, Daniel Lowd

The landscape of adversarial attacks against text classifiers continues to grow, with new attacks developed every year and many of them available in standard toolkits, such as TextAttack and OpenAttack.

Tasks: Abuse Detection, Adversarial Text, +2 more
