1 code implementation • 20 Feb 2024 • Arka Pal, Deep Karkhanis, Samuel Dooley, Manley Roberts, Siddartha Naidu, Colin White
In this work, first we show theoretically that the standard DPO loss can lead to a reduction of the model's likelihood of the preferred examples, as long as the relative probability between the preferred and dispreferred classes increases.
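A small numeric sketch may make this claim concrete. It uses the standard DPO loss for a single preference pair, written in terms of sequence-level log-probabilities under the policy and a frozen reference model. The function name `dpo_loss`, the value of `beta`, and all log-probability values are assumptions chosen for illustration; they are not taken from the paper.

```python
import math

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Standard DPO loss for one (preferred, dispreferred) pair.

    logp_w / logp_l: policy log-probs of the preferred / dispreferred response.
    ref_logp_w / ref_logp_l: the same quantities under the reference model.
    """
    margin = (logp_w - ref_logp_w) - (logp_l - ref_logp_l)
    # -log(sigmoid(beta * margin)) == log(1 + exp(-beta * margin))
    return math.log1p(math.exp(-beta * margin))

# Hypothetical reference log-probabilities (illustrative only).
ref_w, ref_l = -10.0, -12.0

# Before training: the policy equals the reference, so the margin is 0.
before = dpo_loss(-10.0, -12.0, ref_w, ref_l)

# After training: the preferred log-prob FELL by 1, but the dispreferred
# log-prob fell by 4, so the relative margin grew and the loss went down.
after = dpo_loss(-11.0, -16.0, ref_w, ref_l)

print(f"loss before: {before:.4f}, loss after: {after:.4f}")
```

In this toy setting the loss decreases even though the preferred response became less likely, because the loss depends only on the difference between the two log-ratio terms.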
1 code implementation • 21 Aug 2023 • Arka Pal, Deep Karkhanis, Manley Roberts, Samuel Dooley, Arvind Sundararajan, Siddartha Naidu
To use these models on sequences longer than the train-time context length, one might employ techniques from the growing family of context length extrapolation methods -- most of which focus on modifying the system of positional encodings used in the attention mechanism to indicate where tokens or activations are located in the input sequence.
no code implementations • 20 Jan 2022 • Bhanu Prakash Reddy Guda, Mashrin Srivastava, Deep Karkhanis
In this work, we predict the sentiment of restaurant reviews based on a subset of the Yelp Open Dataset.