1 code implementation • 22 Apr 2024 • Adrian de Wynter, Ishaan Watts, Nektar Ege Altıntoprak, Tua Wongsangaroonsri, Minghui Zhang, Noura Farra, Lena Baur, Samantha Claudet, Pavel Gajdusek, Can Gören, Qilong Gu, Anna Kaminska, Tomasz Kaminski, Ruby Kuo, Akiko Kyuba, Jongho Lee, Kartik Mathur, Petter Merok, Ivana Milovanović, Nani Paananen, Vesa-Matti Paananen, Anna Pavlenko, Bruno Pereira Vidal, Luciano Strika, Yueh Tsao, Davide Turcato, Oleksandr Vakhno, Judit Velcsov, Anna Vickers, Stéphanie Visser, Herdyan Widarmanto, Andrey Zaikin, Si-Qing Chen
Large language models (LLMs) and small language models (SLMs) are being adopted at remarkable speed, yet their safety remains a serious concern.
2 code implementations • 11 Dec 2023 • Adrian de Wynter, Xun Wang, Qilong Gu, Si-Qing Chen
We call these approaches meta-prompting, or prompting to obtain prompts.
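As a minimal sketch of the idea (not the paper's exact method; the model name, prompts, and helper function here are illustrative assumptions), meta-prompting can be as simple as asking the model to write a prompt and then running that generated prompt on the real input:

# Minimal meta-prompting sketch: prompt the model for a prompt,
# then apply the generated prompt to the actual task input.
from openai import OpenAI  # assumes the openai Python client is installed

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
MODEL = "gpt-4o-mini"  # placeholder model name, not specified by the paper

def complete(prompt: str) -> str:
    """One chat-completion round trip."""
    response = client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Step 1 (the "meta" step): obtain a task-specific prompt from the model.
task = "classify a customer review as positive or negative"
meta_prompt = (f"Write a concise, effective prompt that instructs an LLM to "
               f"{task}. Return only the prompt.")
generated_prompt = complete(meta_prompt)

# Step 2: use the generated prompt on real input.
review = "The battery died after two days."
print(complete(f"{generated_prompt}\n\nReview: {review}"))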
no code implementations • 17 Apr 2023 • Adrian de Wynter, Xun Wang, Alex Sokolov, Qilong Gu, Si-Qing Chen
We present an empirical evaluation of various outputs generated by nine of the most widely available large language models (LLMs).
no code implementations • 13 Feb 2022 • Ruixue Lian, Che-Wei Huang, Yuqing Tang, Qilong Gu, Chengyuan Ma, Chenlei Guo
Individual user profiles and interaction histories play a significant role in providing customized experiences in real-world applications such as chatbots, social media, retail, and education.
no code implementations • NeurIPS 2019 • Arindam Banerjee, Qilong Gu, Vidyashankar Sivakumar, Zhiwei Steven Wu
We also discuss stochastic-process-based forms of the Johnson-Lindenstrauss (J-L) lemma, the restricted isometry property (RIP), and sketching, to illustrate the generality of the results.
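For reference, the classical J-L property that such results generalize can be stated as follows (standard formulation; the map $\Phi$ and constants are generic notation, not necessarily the paper's):

For a suitable random linear map $\Phi : \mathbb{R}^n \to \mathbb{R}^m$ and a finite set $S \subset \mathbb{R}^n$, with probability at least $1-\delta$,
\[
(1-\varepsilon)\|x\|_2^2 \;\le\; \|\Phi x\|_2^2 \;\le\; (1+\varepsilon)\|x\|_2^2 \quad \text{for all } x \in S,
\]
provided $m = O\!\left(\varepsilon^{-2}\log(|S|/\delta)\right)$.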
no code implementations • 24 Jul 2019 • Xinyan Li, Qilong Gu, Yingxue Zhou, Tiancong Chen, Arindam Banerjee
(2) how can we characterize the stochastic optimization dynamics of SGD with fixed and adaptive step sizes and diagonal pre-conditioning based on the first and second moments of the stochastic gradients (SGs)?
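To make the setting concrete, here is a sketch of SGD with a diagonal pre-conditioner built from a running second-moment estimate of the stochastic gradients (RMSProp-style; function names, hyperparameters, and the toy problem are illustrative assumptions, not the paper's experiments):

# SGD with diagonal pre-conditioning from the SGs' second moment.
import numpy as np

def preconditioned_sgd(grad_fn, w0, steps=1000, lr=0.1, beta=0.99, eps=1e-8):
    """grad_fn(w) returns a stochastic gradient at w."""
    w = w0.copy()
    v = np.zeros_like(w)  # running second-moment estimate of the SGs
    for _ in range(steps):
        g = grad_fn(w)
        v = beta * v + (1 - beta) * g * g    # update second moment
        w -= lr * g / (np.sqrt(v) + eps)     # diagonal pre-conditioning
    return w

# Toy usage: noisy gradients of f(w) = 0.5 * ||A w||^2 (ill-conditioned).
rng = np.random.default_rng(0)
A = np.diag([10.0, 1.0])
grad = lambda w: A.T @ (A @ w) + 0.01 * rng.standard_normal(2)
print(preconditioned_sgd(grad, np.array([1.0, 1.0])))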
no code implementations • NeurIPS 2016 • Qilong Gu, Arindam Banerjee
High-dimensional superposition models characterize observations using parameters that can be written as a sum of multiple component parameters, each with its own structure, e.g., a sum of low-rank and sparse matrices, or a sum of sparse and rotated sparse vectors.
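In symbols (a standard formulation of such models; the notation here, including the structure-inducing norms $R_i$, is generic and may differ from the paper's exact estimator): the target parameter decomposes as
\[
\theta^{*} = \sum_{i=1}^{k} \theta_i^{*},
\]
and, given observations $y = X\theta^{*} + \text{noise}$, one estimates the components via
\[
\hat{\theta} \in \arg\min_{\theta_1,\dots,\theta_k} \frac{1}{2n}\Big\| y - X\sum_{i=1}^{k}\theta_i \Big\|_2^2 \quad \text{s.t.} \quad R_i(\theta_i) \le r_i, \; i = 1,\dots,k,
\]
where each $R_i$ encodes the structure of one component, e.g., the nuclear norm for a low-rank matrix or the $\ell_1$ norm for a sparse vector.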