no code implementations • 19 Feb 2024 • Zhengfu He, Xuyang Ge, Qiong Tang, Tianxiang Sun, Qinyuan Cheng, Xipeng Qiu
Sparse dictionary learning has been a rapidly growing technique in mechanistic interpretability to attack superposition and extract more human-understandable features from model activations.
1 code implementation • 24 Jan 2024 • Qinyuan Cheng, Tianxiang Sun, Xiangyang Liu, Wenwei Zhang, Zhangyue Yin, ShiMin Li, Linyang Li, Zhengfu He, Kai Chen, Xipeng Qiu
To answer this question, we construct a model-specific "I don't know" (Idk) dataset for an assistant, which contains its known and unknown questions, based on existing open-domain question answering datasets.
1 code implementation • 28 Nov 2022 • Zhengfu He, Tianxiang Sun, Kuanning Wang, Xuanjing Huang, Xipeng Qiu
We present DiffusionBERT, a new generative masked language model based on discrete diffusion models.
1 code implementation • 14 Oct 2022 • Tianxiang Sun, Zhengfu He, Qin Zhu, Xipeng Qiu, Xuanjing Huang
MP2 is a set of combinable prompts pre-trained on 38 Chinese tasks.
1 code implementation • 23 May 2022 • Tianxiang Sun, Zhengfu He, Hong Qian, Yunhua Zhou, Xuanjing Huang, Xipeng Qiu
By contrast, gradient-free methods only require the forward computation of the PTM to tune the prompt, retaining the benefits of efficient tuning and deployment.
no code implementations • 13 Dec 2021 • Ximing Yang, Zhibo Zhang, Zhengfu He, Cheng Jin
As details are missing in most representations of structures, the lack of controllability to more information is one of the major weaknesses in structure-based controllable point cloud generation.