1 code implementation • 27 Apr 2024 • Manuel Tonneau, Diyi Liu, Samuel Fraiberger, Ralph Schroeder, Scott A. Hale, Paul Röttger
We find that hate speech (HS) datasets for these languages exhibit a strong geo-cultural bias, largely overrepresenting a handful of countries (e.g., US and UK for English) relative to their prominence in both the broader social media population and the general population speaking these languages.