no code implementations • 6 Jun 2023 • Maksim Eremeev, Ilya Valmianski, Xavier Amatriain, Anitha Kannan
For high-stakes domains that are also knowledge-rich, we show how to use knowledge to (a) identify which rare tokens that appear in both source and reference are important and (b) uplift their conditional probability.
1 code implementation • 16 Dec 2021 • Ilia Kulikov, Maksim Eremeev, Kyunghyun Cho
From these observations, we conclude that the high degree of oversmoothing is the main reason behind the degenerate case of overly probable short sequences in a neural autoregressive model.
no code implementations • RANLP 2019 • Maksim Eremeev, Konstantin Vorontsov
We use a reference corpus of texts and a quantile approach to determine which words are rare and which frequencies are abnormal.
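As a rough illustration of the quantile idea in the excerpt above, the sketch below marks a word as rare when its relative frequency in a reference corpus falls at or below a chosen quantile of all word frequencies. This is not the authors' implementation: the function names, the nearest-rank quantile rule, and the default `q=0.25` are assumptions made for this example, and the detection of abnormal frequencies is not covered.

```python
import math
from collections import Counter


def relative_frequencies(tokens):
    """Map each word to its share of all tokens in the corpus."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def rare_words(reference_tokens, q=0.25):
    """Return words whose reference frequency is at or below the q-quantile.

    Hypothetical helper: the quantile is taken over the per-word relative
    frequencies using the nearest-rank rule (an assumption of this sketch).
    """
    freqs = relative_frequencies(reference_tokens)
    ordered = sorted(freqs.values())
    # Nearest-rank quantile: the smallest value with at least q of the data at or below it.
    rank = max(0, math.ceil(q * len(ordered)) - 1)
    threshold = ordered[rank]
    return {word for word, freq in freqs.items() if freq <= threshold}


reference = "the the the the cat cat sat mat".split()
print(sorted(rare_words(reference, q=0.25)))
```

Because real corpora contain many words that occur exactly once, a low quantile will often select all of them at once; the choice of `q` controls how far up the frequency ranking the "rare" label extends.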