Search Results for author: Zhengfu He

Found 6 papers, 4 papers with code

Dictionary Learning Improves Patch-Free Circuit Discovery in Mechanistic Interpretability: A Case Study on Othello-GPT

no code implementations • 19 Feb 2024 • Zhengfu He, Xuyang Ge, Qiong Tang, Tianxiang Sun, Qinyuan Cheng, Xipeng Qiu

Sparse dictionary learning has been a rapidly growing technique in mechanistic interpretability to attack superposition and extract more human-understandable features from model activations.

Dictionary Learning

Can AI Assistants Know What They Don't Know?

1 code implementation • 24 Jan 2024 • Qinyuan Cheng, Tianxiang Sun, Xiangyang Liu, Wenwei Zhang, Zhangyue Yin, ShiMin Li, Linyang Li, Zhengfu He, Kai Chen, Xipeng Qiu

To answer this question, we construct a model-specific "I don't know" (Idk) dataset for an assistant, which contains its known and unknown questions, based on existing open-domain question answering datasets.

Math, Open-Domain Question Answering, +1

DiffusionBERT: Improving Generative Masked Language Models with Diffusion Models

1 code implementation • 28 Nov 2022 • Zhengfu He, Tianxiang Sun, Kuanning Wang, Xuanjing Huang, Xipeng Qiu

We present DiffusionBERT, a new generative masked language model based on discrete diffusion models.

Denoising, Language Modelling, +1

Multitask Pre-training of Modular Prompt for Chinese Few-Shot Learning

1 code implementation • 14 Oct 2022 • Tianxiang Sun, Zhengfu He, Qin Zhu, Xipeng Qiu, Xuanjing Huang

MP2 is a set of combinable prompts pre-trained on 38 Chinese tasks.

Few-Shot Learning, Machine Reading Comprehension

BBTv2: Towards a Gradient-Free Future with Large Language Models

1 code implementation • 23 May 2022 • Tianxiang Sun, Zhengfu He, Hong Qian, Yunhua Zhou, Xuanjing Huang, Xipeng Qiu

By contrast, gradient-free methods only require the forward computation of the PTM to tune the prompt, retaining the benefits of efficient tuning and deployment.

Few-Shot Learning, Language Modelling

Generate Point Clouds with Multiscale Details from Graph-Represented Structures

no code implementations • 13 Dec 2021 • Ximing Yang, Zhibo Zhang, Zhengfu He, Cheng Jin

As details are missing in most representations of structures, the lack of controllability to more information is one of the major weaknesses in structure-based controllable point cloud generation.

Miscellaneous, Point Cloud Generation
