9 Nov 2023 • Florian E. Dorner, Tom Sühr, Samira Samadi, Augustin Kelava
As large language models (LLMs) appear increasingly human-like in text-based interactions, it has become popular to evaluate various properties of these models using tests originally designed for humans.
23 Dec 2020 • Meike Zehlike, Tom Sühr, Carlos Castillo
In this report we improve the significance adjustment of the FA*IR algorithm of Zehlike et al., which failed for very short rankings combined with a low minimum proportion $p$ for the protected group.
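For context, the per-prefix test that this significance adjustment corrects can be sketched as follows: at each prefix of length k, FA*IR runs a one-sided binomial test asking whether the observed number of protected candidates is plausible under a fair Bernoulli(p) placement at level α. A minimal Python sketch, using the unadjusted per-prefix α (the report's contribution is precisely the multiple-testing correction, which is omitted here; function names are illustrative):

```python
from math import comb


def binom_cdf(t: int, k: int, p: float) -> float:
    """P(X <= t) for X ~ Binomial(k, p)."""
    return sum(comb(k, i) * p**i * (1 - p) ** (k - i) for i in range(t + 1))


def min_protected(k: int, p: float, alpha: float) -> int:
    """Smallest protected count in a prefix of length k that a one-sided
    binomial test at level alpha does NOT reject.
    Illustrative only: FA*IR additionally adjusts alpha across prefixes."""
    t = 0
    while binom_cdf(t, k, p) < alpha:
        t += 1
    return t


def is_fair_prefixwise(protected_flags, p, alpha):
    """Check every prefix of a ranking (True = candidate is protected)."""
    count = 0
    for k, flag in enumerate(protected_flags, start=1):
        count += flag
        if count < min_protected(k, p, alpha):
            return False
    return True
```

For example, with p = 0.5 and α = 0.1 a top-10 prefix needs at least 3 protected candidates, since the binomial CDF at 2 successes out of 10 is about 0.055 < 0.1. One can also see why short rankings are delicate: for very small k the required minimum is 0, so the test carries little power there.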
1 Dec 2020 • Tom Sühr, Sophie Hilgard, Himabindu Lakkaraju
In this work, we analyze several sources of gender bias on online hiring platforms, including the job context and employers' inherent biases, and establish how these factors interact with ranking algorithms to affect hiring decisions.
27 May 2019 • Meike Zehlike, Tom Sühr, Carlos Castillo, Ivan Kitanovski
We implement two algorithms from the fair ranking literature, namely FA*IR (Zehlike et al., 2017) and DELTR (Zehlike and Castillo, 2018), and provide them as stand-alone libraries in Python and Java.