no code implementations • 29 May 2024 • Angelica Chen, Sadhika Malladi, Lily H. Zhang, Xinyi Chen, Qiuyi Zhang, Rajesh Ranganath, Kyunghyun Cho
Preference learning algorithms (e. g., RLHF and DPO) are frequently used to steer LLMs to produce generations that are more preferred by humans, but our understanding of their inner workings is still limited.
no code implementations • 17 Jan 2024 • Zhou Lu, Qiuyi Zhang, Xinyi Chen, Fred Zhang, David Woodruff, Elad Hazan
In this paper, we give query and regret optimal bandit algorithms under the strict notion of strongly adaptive regret, which measures the maximum regret over any contiguous interval $I$.
no code implementations • 18 Oct 2023 • Ilia Sucholutsky, Lukas Muttenthaler, Adrian Weller, Andi Peng, Andreea Bobu, Been Kim, Bradley C. Love, Erin Grant, Iris Groen, Jascha Achterberg, Joshua B. Tenenbaum, Katherine M. Collins, Katherine L. Hermann, Kerem Oktar, Klaus Greff, Martin N. Hebart, Nori Jacoby, Qiuyi Zhang, Raja Marjieh, Robert Geirhos, Sherol Chen, Simon Kornblith, Sunayana Rane, Talia Konkle, Thomas P. O'Connell, Thomas Unterthiner, Andrew K. Lampinen, Klaus-Robert Müller, Mariya Toneva, Thomas L. Griffiths
Finally, we lay out open problems in representational alignment where progress can benefit all three of these fields.
no code implementations • 6 Jul 2023 • Qiuyi Zhang
Scalarization is a general technique that can be deployed in any multiobjective setting to reduce multiple objectives into one, such as recently in RLHF for training reward models that align human preferences.
1 code implementation • 5 Jul 2023 • Lukas Muttenthaler, Robert A. Vandermeulen, Qiuyi Zhang, Thomas Unterthiner, Klaus-Robert Müller
Model overconfidence and poor calibration are common in machine learning and difficult to account for when applying standard empirical risk minimization.
no code implementations • 30 Sep 2022 • David P. Woodruff, Fred Zhang, Qiuyi Zhang
Specifically, for any $m$ matrices $A_1,..., A_m$ with consecutive differences bounded in Schatten-$1$ norm by $\alpha$, we provide a novel binary tree summation procedure that simultaneously estimates all $m$ traces up to $\epsilon$ error with $\delta$ failure probability with an optimal query complexity of $\widetilde{O}\left(m \alpha\sqrt{\log(1/\delta)}/\epsilon + m\log(1/\delta)\right)$, improving the dependence on both $\alpha$ and $\delta$ from Dharangutte and Musco (NeurIPS, 2021).
1 code implementation • 26 May 2022 • Yutian Chen, Xingyou Song, Chansoo Lee, Zi Wang, Qiuyi Zhang, David Dohan, Kazuya Kawakami, Greg Kochanski, Arnaud Doucet, Marc'Aurelio Ranzato, Sagi Perel, Nando de Freitas
Meta-learning hyperparameter optimization (HPO) algorithms from prior experiments is a promising approach to improve optimization efficiency over objective functions from a similar distribution.
no code implementations • ICLR 2021 • Atish Agarwala, Abhimanyu Das, Brendan Juba, Rina Panigrahy, Vatsal Sharan, Xin Wang, Qiuyi Zhang
Can deep learning solve multiple tasks simultaneously, even when they are unrelated and very different?
2 code implementations • 19 Jan 2021 • Xingyou Song, Krzysztof Choromanski, Jack Parker-Holder, Yunhao Tang, Qiuyi Zhang, Daiyi Peng, Deepali Jain, Wenbo Gao, Aldo Pacchiano, Tamas Sarlos, Yuxiang Yang
In this paper, we approach the problem of optimizing blackbox functions over large hybrid search spaces consisting of both combinatorial and continuous parameters.
no code implementations • 1 Jan 2021 • Qiuyi Zhang
Typically in machine learning, training and tuning are done in an alternating manner: for a fixed set of hyperparameters $y$, we apply gradient descent to our objective $f(x, y)$ over trainable variables $x$ until convergence; then, we apply a tuning step over $y$ to find another promising setting of hyperparameters.
no code implementations • ICML 2020 • Daniel Golovin, Qiuyi Zhang
Single-objective black box optimization (also known as zeroth-order optimization) is the process of minimizing a scalar objective $f(x)$, given evaluations at adaptively chosen inputs $x$.
no code implementations • 15 May 2020 • Atish Agarwala, Abhimanyu Das, Rina Panigrahy, Qiuyi Zhang
We present experimental evidence that the many-body gravitational force function is easier to learn with ReLU networks as compared to networks with exponential activations.
no code implementations • NeurIPS 2019 • Frank Ban, David Woodruff, Qiuyi Zhang
The classical low rank approximation problem is to find a rank $k$ matrix $UV$ (where $U$ has $k$ columns and $V$ has $k$ rows) that minimizes the Frobenius norm of $A - UV$.
no code implementations • ICLR 2020 • Daniel Golovin, John Karro, Greg Kochanski, Chansoo Lee, Xingyou Song, Qiuyi Zhang
Zeroth-order optimization is the process of minimizing an objective $f(x)$, given oracle access to evaluations at adaptively chosen inputs $x$.
no code implementations • 11 May 2019 • Yin Tat Lee, Zhao Song, Qiuyi Zhang
Our result generalizes the very recent result of solving linear programs in the current matrix multiplication time [Cohen, Lee, Song'19] to a more broad class of problems.
no code implementations • 1 Feb 2017 • Rina Panigrahy, Sushant Sachdeva, Qiuyi Zhang
Iterating, we show that gradient descent can be used to learn the entire network one node at a time.