1 code implementation • 26 Dec 2023 • Jacob Dunefsky, Arman Cohan
Our results suggest that ObsProp surpasses traditional approaches for finding feature vectors in the low-data regime, and that ObsProp can be used to better understand the mechanisms responsible for bias in large language models.