1 code implementation • 26 May 2024 • Bangzheng Li, Ningshan Ma, Zifan Wang
We found that R3 significantly outperforms PPO in Minigrid environments with sparse rewards and discrete action spaces, such as DoorKeyEnv and CrossingEnv; moreover, the margin by which our method improves over the PPO baseline grows with the complexity of the environment.
no code implementations • 3 Apr 2024 • Siyi Wang, Zifan Wang, Xinlei Yi, Michael M. Zavlanos, Karl H. Johansson, Sandra Hirche
Considering non-stationary environments in online optimization enables the decision-maker to adapt effectively to changes and improve its performance over time.
1 code implementation • 5 Mar 2024 • Nathaniel Li, Alexander Pan, Anjali Gopal, Summer Yue, Daniel Berrios, Alice Gatti, Justin D. Li, Ann-Kathrin Dombrowski, Shashwat Goel, Long Phan, Gabriel Mukobi, Nathan Helm-Burger, Rassin Lababidi, Lennart Justen, Andrew B. Liu, Michael Chen, Isabelle Barrass, Oliver Zhang, Xiaoyuan Zhu, Rishub Tamirisa, Bhrugu Bharathi, Adam Khoja, Zhenqi Zhao, Ariel Herbert-Voss, Cort B. Breuer, Samuel Marks, Oam Patel, Andy Zou, Mantas Mazeika, Zifan Wang, Palash Oswal, Weiran Lin, Adam A. Hunt, Justin Tienken-Harder, Kevin Y. Shih, Kemper Talley, John Guan, Russell Kaplan, Ian Steneker, David Campbell, Brad Jokubaitis, Alex Levinson, Jean Wang, William Qian, Kallol Krishna Karmakar, Steven Basart, Stephen Fitz, Mindy Levine, Ponnurangam Kumaraguru, Uday Tupakula, Vijay Varadharajan, Ruoyu Wang, Yan Shoshitaishvili, Jimmy Ba, Kevin M. Esvelt, Alexandr Wang, Dan Hendrycks
To measure these risks of malicious use, government institutions and major AI labs are developing evaluations for hazardous capabilities in LLMs.
1 code implementation • 6 Feb 2024 • Mantas Mazeika, Long Phan, Xuwang Yin, Andy Zou, Zifan Wang, Norman Mu, Elham Sakhaee, Nathaniel Li, Steven Basart, Bo Li, David Forsyth, Dan Hendrycks
Automated red teaming holds substantial promise for uncovering and mitigating the risks associated with the malicious use of large language models (LLMs), yet the field lacks a standardized evaluation framework to rigorously assess new methods.
no code implementations • 17 Jan 2024 • Yunze Liu, Changxi Chen, Zifan Wang, Li Yi
This paper introduces a novel approach named CrossVideo, which aims to enhance self-supervised cross-modal contrastive learning in the field of point cloud video understanding.
no code implementations • 1 Jan 2024 • Zifan Wang, Junyu Chen, Ziqing Chen, Pengwei Xie, Rui Chen, Li Yi
We further introduce a distillation-friendly demonstration generation method that automatically generates a million high-quality demonstrations suitable for learning.
no code implementations • 13 Dec 2023 • Zifan Wang, Zhuorui Ye, Haoran Wu, Junyu Chen, Li Yi
To tackle this challenging problem, we properly model the synergetic relationship between future forecasting and semantic scene completion through a novel network named SCSFNet.
no code implementations • 26 Nov 2023 • Zhihang Li, Zhao Song, Zifan Wang, Junze Yin
Our main results involve analyzing the convergence properties of an approximate Newton method used to minimize the regularized training loss.
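As background on the kind of update such an analysis concerns, a regularized Newton step on a toy logistic loss can be sketched as follows; the toy problem, the regularization strength, and the exact Newton step are illustrative assumptions, not this paper's algorithm or setting.

```python
# A minimal regularized-Newton sketch on a toy logistic-regression loss.
# Illustrates the kind of second-order update whose convergence such analyses
# study; it is not this paper's algorithm or setting.
import numpy as np

def loss_grad_hess(theta, X, y, lam):
    """Regularized logistic loss with its gradient and Hessian."""
    p = 1.0 / (1.0 + np.exp(-(X @ theta)))
    loss = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)) + 0.5 * lam * theta @ theta
    grad = X.T @ (p - y) / len(y) + lam * theta
    hess = (X.T * (p * (1 - p))) @ X / len(y) + lam * np.eye(len(theta))
    return loss, grad, hess

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X @ np.array([1.0, -2.0, 0.5, 0.0, 1.0]) + rng.normal(size=200) > 0).astype(float)

theta, lam = np.zeros(5), 1e-2
for _ in range(10):
    loss, g, H = loss_grad_hess(theta, X, y, lam)
    theta -= np.linalg.solve(H, g)    # (approximate) Newton step
    print(round(float(loss), 4))      # regularized training loss decreases each iteration
```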
no code implementations • 22 Nov 2023 • Chi Zhang, Zifan Wang, Ravi Mangal, Matt Fredrikson, Limin Jia, Corina Pasareanu
They improve upon previous neural network models of code, such as code2seq or seq2seq, that already demonstrated competitive results when performing tasks such as code summarization and identifying code vulnerabilities.
1 code implementation • 6 Nov 2023 • Norman Mu, Sarah Chen, Zifan Wang, Sizhe Chen, David Karamardian, Lulwa Aljeraisy, Basel Alomair, Dan Hendrycks, David Wagner
As Large Language Models (LLMs) are deployed with increasing real-world responsibilities, it is important to be able to specify and constrain the behavior of these systems in a reliable manner.
no code implementations • 13 Oct 2023 • Ravi Mangal, Klas Leino, Zifan Wang, Kai Hu, Weicheng Yu, Corina Pasareanu, Anupam Datta, Matt Fredrikson
There are three layers to this inquiry, which we address in this paper: (1) why do we care about robustness research?
1 code implementation • 4 Oct 2023 • Kai Hu, Klas Leino, Zifan Wang, Matt Fredrikson
A key challenge, supported both theoretically and empirically, is that robustness demands greater network capacity and more data than standard training.
1 code implementation • 2 Oct 2023 • Andy Zou, Long Phan, Sarah Chen, James Campbell, Phillip Guo, Richard Ren, Alexander Pan, Xuwang Yin, Mantas Mazeika, Ann-Kathrin Dombrowski, Shashwat Goel, Nathaniel Li, Michael J. Byun, Zifan Wang, Alex Mallen, Steven Basart, Sanmi Koyejo, Dawn Song, Matt Fredrikson, J. Zico Kolter, Dan Hendrycks
In this paper, we identify and characterize the emerging area of representation engineering (RepE), an approach to enhancing the transparency of AI systems that draws on insights from cognitive neuroscience.
Ranked #3 on Question Answering on TruthfulQA
no code implementations • 22 Sep 2023 • Zifan Wang, Kotaro Funakoshi, Manabu Okumura
This work proposes PMAN (Prompting-based Metric on ANswerability), a novel automatic evaluation metric to assess whether the generated questions are answerable by the reference answers for the QG tasks.
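The core idea, judging answerability by prompting a language model, can be sketched as below; the prompt wording, the yes/no scoring rule, and the ask_llm helper are placeholders rather than the actual PMAN protocol.

```python
# Sketch of a prompting-based answerability check. The prompt wording, the
# yes/no scoring, and the ask_llm helper are placeholders; the actual PMAN
# prompt and protocol are those defined in the paper.
def ask_llm(prompt: str) -> str:
    """Hypothetical helper returning an LLM completion for `prompt`."""
    raise NotImplementedError("wire this to an LLM API of your choice")

def answerability_score(question: str, reference_answer: str) -> float:
    prompt = (
        f"Question: {question}\n"
        f"Answer: {reference_answer}\n"
        "Does the answer above actually answer the question? Reply Yes or No."
    )
    reply = ask_llm(prompt).strip().lower()
    return 1.0 if reply.startswith("yes") else 0.0

# Averaging answerability_score over a system's generated questions gives a
# corpus-level answerability metric for question generation.
```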
16 code implementations • 27 Jul 2023 • Andy Zou, Zifan Wang, Nicholas Carlini, Milad Nasr, J. Zico Kolter, Matt Fredrikson
Specifically, our approach finds a suffix that, when attached to a wide range of queries for an LLM to produce objectionable content, aims to maximize the probability that the model produces an affirmative response (rather than refusing to answer).
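As a rough illustration of that objective (not the paper's greedy coordinate gradient algorithm), one can score a candidate suffix by the loss of an affirmative target continuation and search over token substitutions; the model, prompt, target string, and random-swap search below are stand-ins.

```python
# Illustrative suffix search against an affirmative target continuation.
# The model, prompt, target, and crude random token-swap search are stand-ins;
# the paper's method uses a gradient-guided greedy coordinate search.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")                 # stand-in model
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

prompt = "Give step-by-step instructions for the restricted task."
target = " Sure, here is how"                               # affirmative response prefix
suffix_ids = tok(" ! ! ! ! !", return_tensors="pt").input_ids[0]

def target_loss(suffix_ids):
    """Cross-entropy of the target tokens given prompt + suffix."""
    prompt_ids = tok(prompt, return_tensors="pt").input_ids[0]
    target_ids = tok(target, return_tensors="pt").input_ids[0]
    ids = torch.cat([prompt_ids, suffix_ids, target_ids]).unsqueeze(0)
    with torch.no_grad():
        logits = model(ids).logits[0]
    pred = logits[-len(target_ids) - 1:-1]                  # positions that predict the target
    return torch.nn.functional.cross_entropy(pred, target_ids).item()

best = target_loss(suffix_ids)
for _ in range(200):                                        # crude random substitution search
    cand = suffix_ids.clone()
    cand[torch.randint(len(cand), (1,))] = torch.randint(tok.vocab_size, (1,))
    loss = target_loss(cand)
    if loss < best:
        best, suffix_ids = loss, cand
print(tok.decode(suffix_ids), best)
```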
no code implementations • 23 Mar 2023 • Zifan Wang, Yulong Gao, Siyi Wang, Michael M. Zavlanos, Alessandro Abate, Karl H. Johansson
Distributional reinforcement learning (DRL) enhances the understanding of the effects of the randomness in the environment by letting agents learn the distribution of a random return, rather than its expected value as in standard RL.
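For readers unfamiliar with that distinction, a quantile-based update that maintains an estimate of the return distribution rather than its mean is sketched below; this is the generic quantile-regression idea, not the algorithm analyzed in the paper.

```python
# Maintain quantile estimates of a random return Z instead of a single
# expected value (generic distributional-RL idea, not the paper's algorithm).
import numpy as np

n_quantiles = 51
taus = (np.arange(n_quantiles) + 0.5) / n_quantiles    # quantile midpoints
theta = np.zeros(n_quantiles)                          # quantile estimates of the return

def quantile_update(theta, sampled_return, lr=0.05):
    """One quantile-regression (pinball-loss) step toward a sampled return."""
    grad = np.where(sampled_return < theta, 1.0 - taus, -taus)
    return theta - lr * grad

rng = np.random.default_rng(0)
for _ in range(5000):
    theta = quantile_update(theta, rng.normal(1.0, 2.0))   # sampled random return

print("mean of learned distribution:", theta.mean())       # close to the expected return 1.0
print("lower-tail (~10%) quantile:", theta[5])             # risk information the mean hides
```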
no code implementations • 8 Mar 2023 • Yichuan Deng, Zhao Song, Zifan Wang, Han Zhang
The kernel method, which is commonly used in learning algorithms such as Support Vector Machines (SVMs), has also been applied in PCA algorithms.
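For context, the textbook kernel-PCA construction that sentence refers to looks roughly as follows (RBF kernel, double-centering, eigendecomposition); the paper's algorithmic and analytical contributions are not reflected in this sketch.

```python
# Textbook kernel PCA with an RBF kernel: center the kernel matrix in feature
# space, eigendecompose it, and project onto the leading components.
import numpy as np

def rbf_kernel(X, gamma=1.0):
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * X @ X.T       # pairwise squared distances
    return np.exp(-gamma * d2)

def kernel_pca(X, n_components=2, gamma=1.0):
    n = X.shape[0]
    K = rbf_kernel(X, gamma)
    one = np.ones((n, n)) / n
    Kc = K - one @ K - K @ one + one @ K @ one         # double-centering
    vals, vecs = np.linalg.eigh(Kc)                    # ascending eigenvalues
    idx = np.argsort(vals)[::-1][:n_components]
    alphas = vecs[:, idx] / np.sqrt(np.maximum(vals[idx], 1e-12))
    return Kc @ alphas                                 # projections of the data

X = np.random.randn(100, 5)
print(kernel_pca(X).shape)                             # (100, 2)
```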
2 code implementations • NeurIPS 2023 • Kai Hu, Andy Zou, Zifan Wang, Klas Leino, Matt Fredrikson
We show that fast ways of bounding the Lipschitz constant of conventional ResNets are loose, and address this by designing a new residual block, leading to the Linear ResNet (LiResNet) architecture.
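A minimal illustration of that looseness, under the simplifying assumption that the residual branch is a single linear map, is sketched below; the paper's actual LiResNet block and its bounding procedure may differ.

```python
# Why composing per-branch bounds is loose for residual blocks: for a block
# x -> x + W x, the generic bound 1 + ||W||_2 can be much larger than the
# exact Lipschitz constant ||I + W||_2. (Simplified illustration only.)
import torch

torch.manual_seed(0)
W = -0.5 * torch.eye(64) + 0.01 * torch.randn(64, 64)

naive_bound = 1.0 + torch.linalg.matrix_norm(W, ord=2)             # generic residual bound
exact_linear = torch.linalg.matrix_norm(torch.eye(64) + W, ord=2)  # exact for a linear block

print(float(naive_bound), float(exact_linear))   # the naive bound is substantially larger
```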
no code implementations • 26 Jan 2023 • Matt Fredrikson, Kaiji Lu, Saranya Vijayakumar, Somesh Jha, Vijay Ganesh, Zifan Wang
Recent techniques that integrate solver layers into Deep Neural Networks (DNNs) have shown promise in bridging a long-standing gap between inductive learning and symbolic reasoning techniques.
no code implementations • CVPR 2023 • Zifan Wang, Nan Ding, Tomer Levinboim, Xi Chen, Radu Soricut
Recent research in robust optimization has shown an overfitting-like phenomenon in which models trained against adversarial attacks exhibit higher robustness on the training set compared to the test set.
no code implementations • 6 Sep 2022 • Zifan Wang, Yi Shen, Zachary I. Bell, Scott Nivison, Michael M. Zavlanos, Karl H. Johansson
Specifically, the agents use the conditional value at risk (CVaR) as a risk measure and rely on bandit feedback in the form of the cost values of the selected actions at every episode to estimate their CVaR values and update their actions.
1 code implementation • 1 Jun 2022 • Ravi Mangal, Zifan Wang, Chi Zhang, Klas Leino, Corina Pasareanu, Matt Fredrikson
We present the cascade attack (CasA), an adversarial attack against cascading ensembles, and show that: (1) there exists an adversarial input for up to 88% of the samples where the ensemble claims to be certifiably robust and accurate; and (2) the accuracy of a cascading ensemble under our attack is as low as 11% when it claims to be certifiably robust and accurate on 97% of the test set.
no code implementations • 24 May 2022 • Zifan Wang, Yuhang Yao, Chaoran Zhang, Han Zhang, Youjie Kang, Carlee Joe-Wong, Matt Fredrikson, Anupam Datta
Second, our analytical and empirical results demonstrate that feature attribution methods cannot capture the nonlinear effect of edge features, while existing subgraph explanation methods are not faithful.
no code implementations • 16 Mar 2022 • Zifan Wang, Yi Shen, Michael M. Zavlanos
To address this challenge, we propose a new online risk-averse learning algorithm that relies on one-point zeroth-order estimation of the CVaR gradients computed using CVaR values that are estimated by appropriately sampling the cost functions.
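The estimation scheme can be sketched as follows, with illustrative step size, smoothing radius, sample count, and projection set; this is a generic one-point zeroth-order CVaR routine, not the paper's exact algorithm or tuning.

```python
# Generic one-point zeroth-order CVaR optimization sketch: estimate CVaR at a
# randomly perturbed point from sampled costs, turn it into a one-point
# gradient estimate, and take a projected step. All constants are illustrative.
import numpy as np

def empirical_cvar(costs, alpha=0.9):
    """Mean of the worst (1 - alpha) fraction of sampled costs."""
    var = np.quantile(costs, alpha)
    return costs[costs >= var].mean()

def cost(x, rng):
    """Toy random cost whose distribution depends on the decision x."""
    return np.sum((x - 1.0) ** 2) + rng.normal(0.0, 0.5)

rng = np.random.default_rng(0)
d, delta, lr = 3, 0.2, 0.002
x = np.zeros(d)

for t in range(2000):
    u = rng.normal(size=d)
    u /= np.linalg.norm(u)                                   # random unit direction
    samples = np.array([cost(x + delta * u, rng) for _ in range(50)])
    grad_hat = (d / delta) * empirical_cvar(samples) * u     # one-point gradient estimate
    x = np.clip(x - lr * grad_hat, -5.0, 5.0)                # projected step

print(x)   # noisy iterate; practical schemes anneal delta and the step size
```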
no code implementations • 29 Jan 2022 • Zifan Wang, Michael F Hyland, Younghun Bahk, Navjyoth JS Sarma
Shared-ride mobility services that incorporate traveler walking legs aim to reduce vehicle-kilometers-travelled (VKT), vehicle-hours-travelled (VHT), request rejections, fleet size, or some combination of these factors, compared to door-to-door (D2D) shared-ride services.
1 code implementation • Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 2021 • Dixi Yao, Liyao Xiang, Zifan Wang, Jiayu Xu, Chao Li, Xinbing Wang
Experimental results show that our system not only adapts well to, but also draws on the varying contexts, delivering a practical and efficient solution to edge-cloud model training.
Ranked #2 on Recommendation Systems on MovieLens 1M (Precision metric)
no code implementations • ICLR 2022 • Emily Black, Zifan Wang, Matt Fredrikson, Anupam Datta
Counterfactual examples are one of the most commonly-cited methods for explaining the predictions of machine learning models in key areas such as finance and medical diagnosis.
1 code implementation • 20 Mar 2021 • Zifan Wang, Matt Fredrikson, Anupam Datta
Recent work has found that adversarially-robust deep networks used for image classification are more interpretable: their feature attributions tend to be sharper, and are more concentrated on the objects associated with the image's ground-truth class.
2 code implementations • 16 Feb 2021 • Klas Leino, Zifan Wang, Matt Fredrikson
We show that widely-used architectures can be easily adapted to this objective by incorporating efficient global Lipschitz bounds into the network, yielding certifiably-robust models by construction that achieve state-of-the-art verifiable accuracy.
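To make that construction concrete, the sketch below certifies a prediction using a crude global Lipschitz bound (the product of layer spectral norms); the paper's actual architecture and certification procedure are tighter and more refined than this.

```python
# Certification with a crude global Lipschitz bound: the product of the linear
# layers' spectral norms bounds the network's Lipschitz constant (ReLU is
# 1-Lipschitz), and a prediction is certified at L2 radius eps when the logit
# margin exceeds what such a perturbation could change.
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))

def global_lipschitz(net):
    lip = 1.0
    for m in net:
        if isinstance(m, nn.Linear):
            lip *= torch.linalg.matrix_norm(m.weight, ord=2).item()
    return lip

def certified(x, eps):
    top2 = net(x).topk(2, dim=-1).values
    margin = float(top2[..., 0] - top2[..., 1])
    # an L2 perturbation of size eps moves each logit difference by at most
    # sqrt(2) * L * eps under this per-logit bound
    return margin > (2 ** 0.5) * global_lipschitz(net) * eps

print(certified(torch.randn(1, 784), eps=0.1))
```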
no code implementations • NeurIPS 2021 • Kaiji Lu, Zifan Wang, Piotr Mardziel, Anupam Datta
While “attention is all you need” may be proving true, we do not know why: attention-based transformer models such as BERT are superior, but how information flows from input tokens to output predictions is unclear.
no code implementations • 28 Sep 2020 • Kaiji Lu, Zifan Wang, Piotr Mardziel, Anupam Datta
While “attention is all you need” may be proving true, we do not yet know why: attention-based transformer models such as BERT are superior, but how they contextualize information even for simple grammatical rules such as subject-verb number agreement (SVA) is uncertain.
no code implementations • 17 Sep 2020 • Xuan Chen, Zifan Wang, Yucai Fan, Bonan Jin, Piotr Mardziel, Carlee Joe-Wong, Anupam Datta
Feature attribution has been a foundational building block for explaining input feature importance in supervised learning with Deep Neural Networks (DNNs), but it faces new challenges when applied to deep Reinforcement Learning (RL). We propose a new approach to explaining deep RL actions by defining a class of action reconstruction functions that mimic the behavior of a network in deep RL.
1 code implementation • NeurIPS 2020 • Zifan Wang, Haofan Wang, Shakul Ramkumar, Matt Fredrikson, Piotr Mardziel, Anupam Datta
Feature attributions are a popular tool for explaining the behavior of Deep Neural Networks (DNNs), but have recently been shown to be vulnerable to attacks that produce divergent explanations for nearby inputs.
1 code implementation • 6 May 2020 • Zifan Wang, Yilin Yang, Ankit Shrivastava, Varun Rawal, Zihao Ding
We show that the model's vulnerability to tiny distortions results from its reliance on high-frequency features, the target features of adversarial (black- and white-box) attackers, when making predictions.
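One common way to probe such reliance, sketched below with a placeholder input and cutoff radius, is to low-pass filter inputs in the Fourier domain and compare the model's predictions on the original and filtered images; this illustrates the style of frequency analysis described, not the paper's exact protocol.

```python
# Frequency-domain probe: keep only the low-frequency content of an image via
# an FFT mask, then compare model predictions on the original and filtered
# input. The input, cutoff radius, and model call are placeholders.
import numpy as np

def low_pass(image, radius=16):
    """Zero out all spatial frequencies farther than `radius` from the center."""
    f = np.fft.fftshift(np.fft.fft2(image, axes=(0, 1)), axes=(0, 1))
    h, w = image.shape[:2]
    yy, xx = np.ogrid[:h, :w]
    mask = ((yy - h / 2) ** 2 + (xx - w / 2) ** 2) <= radius ** 2
    f = f * (mask[..., None] if image.ndim == 3 else mask)
    return np.real(np.fft.ifft2(np.fft.ifftshift(f, axes=(0, 1)), axes=(0, 1)))

image = np.random.rand(224, 224, 3).astype(np.float32)     # stand-in input image
filtered = low_pass(image, radius=16)
# pred_full, pred_low = model(image), model(filtered)       # compare the two predictions
print(np.abs(image - filtered).mean())                      # energy removed by the filter
```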
no code implementations • 19 Feb 2020 • Zifan Wang, Piotr Mardziel, Anupam Datta, Matt Fredrikson
In this work we expand the foundations of human-understandable concepts with which attributions can be interpreted beyond "importance" and its visualization; we incorporate the logical concepts of necessity and sufficiency, and the concept of proportionality.
9 code implementations • 3 Oct 2019 • Haofan Wang, Zifan Wang, Mengnan Du, Fan Yang, Zijian Zhang, Sirui Ding, Piotr Mardziel, Xia Hu
Recently, increasing attention has been drawn to the internal mechanisms of convolutional neural networks, and the reason why the network makes specific decisions.