Search Results for author: Nico Andersen

Found 1 paper, 1 paper with code

Large Pre-trained Language Models Contain Human-like Biases of What is Right and Wrong to Do

1 code implementation • 8 Mar 2021 • Patrick Schramowski, Cigdem Turan, Nico Andersen, Constantin A. Rothkopf, Kristian Kersting

That is, we show that these norms can be captured geometrically by a direction, which can be computed, e.g., by a PCA, in the embedding space, reflecting well the agreement of phrases to social norms implicitly expressed in the training texts and providing a path for attenuating or even preventing toxic degeneration in LMs.

General Knowledge
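
The abstract snippet above describes finding a single direction in embedding space with PCA and reading a phrase's position along it as a score. A minimal sketch of that idea follows, under loud assumptions: the embeddings are synthetic toy vectors rather than outputs of any language model, the function names `fit_direction` and `project` are hypothetical, and nothing here is taken from the paper's own implementation. Because the sign of a principal component is arbitrary, the sketch orients it using a small set of examples labelled as positive.

```python
import numpy as np


def fit_direction(embeddings, positive_mask):
    """Return (mean, unit direction) for the first principal component.

    `positive_mask` marks rows assumed to sit on the "positive" side and
    is used only to fix the otherwise arbitrary sign of the component.
    """
    mean = embeddings.mean(axis=0)
    centered = embeddings - mean
    # Rows of vt are the principal axes, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    direction = vt[0]
    scores = centered @ direction
    if scores[positive_mask].mean() < scores[~positive_mask].mean():
        direction = -direction
    return mean, direction


def project(embeddings, mean, direction):
    """Score each row by its signed position along the direction."""
    return (embeddings - mean) @ direction


# Toy data (an assumption for illustration): 40 vectors in 16 dimensions,
# split into two groups that differ mainly along one hidden axis.
rng = np.random.default_rng(0)
dim = 16
hidden_axis = np.zeros(dim)
hidden_axis[0] = 1.0
labels = np.array([True] * 20 + [False] * 20)
offsets = np.where(labels[:, None], 3.0, -3.0) * hidden_axis
toy = rng.normal(scale=0.3, size=(40, dim)) + offsets

mean, direction = fit_direction(toy, labels)
scores = project(toy, mean, direction)
```

With real sentence embeddings in place of `toy`, `project` would give each phrase a scalar score along the recovered direction; how well such a score tracks social norms is the paper's empirical claim and is not demonstrated by this toy example.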
