no code implementations • 19 Feb 2024 • Leo Hyun Park, JaeUk Kim, Myung Gyo Oh, Jaewoo Park, Taekyoung Kwon
Deep learning models continue to advance in accuracy, yet they remain vulnerable to adversarial attacks, which often lead to the misclassification of adversarial examples.
1 code implementation • 19 Feb 2024 • Myung Gyo Oh, Hong Eun Ahn, Leo Hyun Park, Taekyoung Kwon
To address this, we propose the use of pseudo-labels for these generated texts, leveraging membership approximations indicated by machine-generated probabilities from the target LM.