Search Results for author: Ryan Park

Found 4 papers, 1 paper with code

From $r$ to $Q^*$: Your Language Model is Secretly a Q-Function

no code implementations • 18 Apr 2024 • Rafael Rafailov, Joey Hejna, Ryan Park, Chelsea Finn

Standard RLHF deploys reinforcement learning in a specific token-level MDP, while DPO is derived as a bandit problem in which the whole response of the model is treated as a single arm.

Language Modelling • Q-Learning • +1
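For context on the contrast drawn in the abstract: the standard DPO objective (Rafailov et al., 2023) scores each whole response as a single bandit arm, with the implicit reward given by the policy-to-reference log-ratio:

$$
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta; \pi_{\mathrm{ref}}) = -\,\mathbb{E}_{(x,\, y_w,\, y_l) \sim \mathcal{D}} \left[ \log \sigma\!\left( \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)} - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)} \right) \right]
$$

The paper argues that this sequence-level loss can nonetheless be interpreted in the token-level MDP, with the language model's logits playing the role of a Q-function.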

Disentangling Length from Quality in Direct Preference Optimization

no code implementations • 28 Mar 2024 • Ryan Park, Rafael Rafailov, Stefano Ermon, Chelsea Finn

A number of approaches have been developed to control length biases in the classical RLHF literature, but the problem remains relatively under-explored for Direct Alignment Algorithms such as Direct Preference Optimization (DPO).

Reinforcement Learning
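One illustrative way to control length bias in a direct alignment loss is to penalize the preference margin by the length difference between the chosen and rejected responses. The sketch below is an assumption-laden restatement of that idea (the `alpha` hyperparameter, the function name, and the exact penalty form are illustrative, not necessarily the paper's method):

```python
import torch
import torch.nn.functional as F

def length_penalized_dpo_loss(policy_chosen_logps, policy_rejected_logps,
                              ref_chosen_logps, ref_rejected_logps,
                              chosen_lengths, rejected_lengths,
                              beta=0.1, alpha=0.01):
    """Hypothetical length-regularized DPO loss; all inputs are shape (batch,)."""
    # Standard DPO implicit rewards: beta-scaled policy/reference log-ratios.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Subtract a length-difference penalty so the optimizer gains nothing
    # from preferring longer responses per se (alpha is assumed; tune it).
    margin = (chosen_rewards - rejected_rewards
              - alpha * (chosen_lengths - rejected_lengths))
    return -F.logsigmoid(margin).mean()
```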

Preference Optimization for Molecular Language Models

1 code implementation • 18 Oct 2023 • Ryan Park, Ryan Theisen, Navriti Sahni, Marcel Patek, Anna Cichońska, Rayees Rahman

Molecular language modeling is an effective approach to generating novel chemical structures.

Language Modelling
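As a minimal sketch of what molecular language modeling means in practice: a causal LM pretrained on SMILES strings can sample novel candidate structures directly. The checkpoint name below is a placeholder, not the paper's model:

```python
# Sampling candidate SMILES from a SMILES-pretrained causal LM.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "your-org/smiles-gpt"  # hypothetical checkpoint, not from the paper
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("C", return_tensors="pt")  # seed generation with a carbon atom
outputs = model.generate(**inputs, do_sample=True, top_p=0.9,
                         max_new_tokens=64, num_return_sequences=8)
candidates = [tokenizer.decode(o, skip_special_tokens=True) for o in outputs]
print(candidates)  # novel SMILES strings to validate and score downstream
```

Preference optimization then ranks pairs of such generations (e.g., by a desired property score) and fine-tunes the model on those preferences.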
