no code implementations • 29 Apr 2024 • Aman Saini, Artem Chernodub, Vipul Raheja, Vivek Kulkarni
We introduce Spivavtor, a dataset and a set of instruction-tuned models for text editing focused on the Ukrainian language.
no code implementations • 21 Mar 2024 • Mina Lee, Katy Ilonka Gero, John Joon Young Chung, Simon Buckingham Shum, Vipul Raheja, Hua Shen, Subhashini Venugopalan, Thiemo Wambsganss, David Zhou, Emad A. Alghamdi, Tal August, Avinash Bhat, Madiha Zahrah Choksi, Senjuti Dutta, Jin L. C. Guo, Md Naimul Hoque, Yewon Kim, Simon Knight, Seyed Parsa Neshaei, Agnia Sergeyuk, Antonette Shibani, Disha Shrivastava, Lila Shroff, Jessi Stark, Sarah Sterman, Sitong Wang, Antoine Bosselut, Daniel Buschek, Joseph Chee Chang, Sherol Chen, Max Kreminski, Joonsuk Park, Roy Pea, Eugenia H. Rho, Shannon Zejiang Shen, Pao Siangliulue
In our era of rapid technological advancement, the research landscape for writing assistants has become increasingly fragmented across various research communities.
1 code implementation • 26 Feb 2024 • Vipul Raheja, Dimitris Alikaniotis, Vivek Kulkarni, Bashar Alhafni, Dhruv Kumar
We introduce mEdIT, a multi-lingual extension to CoEdIT, the recent state-of-the-art text editing models for writing assistance.
no code implementations • 16 Feb 2024 • Zae Myung Kim, Kwang Hee Lee, Preston Zhu, Vipul Raheja, Dongyeop Kang
With the advent of large language models (LLMs), the line between human-crafted and machine-generated texts has become increasingly blurred.
1 code implementation • 7 Feb 2024 • Bashar Alhafni, Vivek Kulkarni, Dhruv Kumar, Vipul Raheja
As the text generation capabilities of large language models become increasingly prominent, recent studies have focused on controlling particular aspects of the generated text to make it more personalized.
1 code implementation • 15 Nov 2023 • Jierui Li, Vipul Raheja, Dhruv Kumar
In recent times, large language models (LLMs) have shown impressive performance on various document-level tasks such as document classification, summarization, and question-answering.
no code implementations • 24 Oct 2023 • Dhruv Kumar, Vipul Raheja, Alice Kaiser-Schatzlein, Robyn Perry, Apurva Joshi, Justin Hugues-Nuger, Samuel Lou, Navid Chowdhury
We present Speakerly, a new real-time voice-based writing assistance system that helps users with text composition across various use cases such as emails, instant messages, and notes.
1 code implementation • 29 Sep 2023 • Ryan Koo, Minhwa Lee, Vipul Raheja, Jong Inn Park, Zae Myung Kim, Dongyeop Kang
According to our findings, LLMs may not yet be usable for automatic annotation aligned with human preferences.
no code implementations • 8 Jun 2023 • Oleksandr Yermilov, Vipul Raheja, Artem Chernodub
Our work provides crucial insights into the gaps between original and anonymized data (focusing on the pseudonymization technique) and model quality, and fosters future research into higher-quality anonymization techniques to better balance the trade-offs between data protection and utility preservation.
1 code implementation • 17 May 2023 • Vipul Raheja, Dhruv Kumar, Ryan Koo, Dongyeop Kang
We present a large language model fine-tuned on a diverse collection of task-specific instructions for text editing (a total of 82K instructions).
no code implementations • 28 Mar 2023 • Vivek Kulkarni, Vipul Raheja
Intelligent writing assistants powered by large language models (LLMs) are more popular today than ever before, but their further widespread adoption is precluded by sub-optimal performance.
1 code implementation • 2 Dec 2022 • Zae Myung Kim, Wanyu Du, Vipul Raheja, Dhruv Kumar, Dongyeop Kang
Leveraging datasets from other related text editing NLP tasks, combined with the specification of editable spans, leads our system to more accurately model the process of iterative text refinement, as evidenced by empirical results and human evaluations.
no code implementations • 22 Jun 2022 • Sebastian Gehrmann, Abhik Bhattacharjee, Abinaya Mahendiran, Alex Wang, Alexandros Papangelis, Aman Madaan, Angelina McMillan-Major, Anna Shvets, Ashish Upadhyay, Bingsheng Yao, Bryan Wilie, Chandra Bhagavatula, Chaobin You, Craig Thomson, Cristina Garbacea, Dakuo Wang, Daniel Deutsch, Deyi Xiong, Di Jin, Dimitra Gkatzia, Dragomir Radev, Elizabeth Clark, Esin Durmus, Faisal Ladhak, Filip Ginter, Genta Indra Winata, Hendrik Strobelt, Hiroaki Hayashi, Jekaterina Novikova, Jenna Kanerva, Jenny Chim, Jiawei Zhou, Jordan Clive, Joshua Maynez, João Sedoc, Juraj Juraska, Kaustubh Dhole, Khyathi Raghavi Chandu, Laura Perez-Beltrachini, Leonardo F. R. Ribeiro, Lewis Tunstall, Li Zhang, Mahima Pushkarna, Mathias Creutz, Michael White, Mihir Sanjay Kale, Moussa Kamal Eddine, Nico Daheim, Nishant Subramani, Ondrej Dusek, Paul Pu Liang, Pawan Sasanka Ammanamanchi, Qi Zhu, Ratish Puduppully, Reno Kriz, Rifat Shahriyar, Ronald Cardenas, Saad Mahamood, Salomey Osei, Samuel Cahyawijaya, Sanja Štajner, Sebastien Montella, Shailza, Shailza Jolly, Simon Mille, Tahmid Hasan, Tianhao Shen, Tosin Adewumi, Vikas Raunak, Vipul Raheja, Vitaly Nikolaev, Vivian Tsai, Yacine Jernite, Ying Xu, Yisi Sang, Yixin Liu, Yufang Hou
This problem is especially pertinent in natural language generation which requires ever-improving suites of datasets, metrics, and human evaluation to make definitive claims.
1 code implementation • In2Writing (ACL) 2022 • Wanyu Du, Zae Myung Kim, Vipul Raheja, Dhruv Kumar, Dongyeop Kang
Examining and evaluating the capability of large language models for making continuous revisions and collaborating with human writers is a critical step towards building effective writing assistants.
1 code implementation • ACL 2022 • Wanyu Du, Vipul Raheja, Dhruv Kumar, Zae Myung Kim, Melissa Lopez, Dongyeop Kang
Writing is, by nature, a strategic, adaptive, and, more importantly, iterative process.
1 code implementation • EACL (BEA) 2021 • Kostiantyn Omelianchuk, Vipul Raheja, Oleksandr Skurzhanskyi
Edit-based approaches have recently shown promising results on multiple monolingual sequence transduction tasks.
Ranked #1 on Text Simplification on PWKP / WikiSmall (SARI metric)
no code implementations • Findings of the Association for Computational Linguistics 2020 • Vipul Raheja, Dimitrios Alikaniotis
The discriminator is a sentence-pair classification model, trained to judge a given pair of grammatically incorrect-correct sentences on the quality of grammatical correction.
2 code implementations • WS 2019 • Dimitrios Alikaniotis, Vipul Raheja
Recent work on Grammatical Error Correction (GEC) has highlighted the importance of language modeling, showing that good performance can be achieved by comparing the probabilities of the proposed edits.
1 code implementation • NAACL 2019 • Vipul Raheja, Joel Tetreault
Recent work in Dialogue Act classification has treated the task as a sequence labeling problem using hierarchical deep neural networks.
Ranked #3 on Dialogue Act Classification on Switchboard corpus