no code implementations • EMNLP 2020 • Stefan Larson, Anthony Zheng, Anish Mahendran, Rishi Tekriwal, Adrian Cheung, Eric Guldan, Kevin Leach, Jonathan K. Kummerfeld
Diverse data is crucial for training robust models, but crowdsourced text often lacks diversity as workers tend to write simple variations from prompts.
1 code implementation • 8 Mar 2024 • Zhijian Li, Stefan Larson, Kevin Leach
Intent classifiers must be able to distinguish when a user's utterance does not belong to any supported intent to avoid producing incorrect and unrelated system responses.
no code implementations • 22 Feb 2024 • Jiliang Li, Yifan Zhang, Zachary Karas, Collin McMillan, Kevin Leach, Yu Huang
Furthermore, alignment between model and human foci in this setting does not seem to dictate the quality of the LLM-generated summaries.
1 code implementation • 21 Feb 2024 • Yifan Zhang, Jiliang Li, Zachary Karas, Aakash Bansal, Toby Jia-Jun Li, Collin McMillan, Kevin Leach, Yu Huang
Neural code summarization leverages deep learning models to automatically generate brief natural language summaries of code snippets.
no code implementations • 21 Jun 2023 • Stefan Larson, Gordon Lim, Kevin Leach
The RVL-CDIP benchmark is widely used for measuring performance on the task of document classification.
1 code implementation • 6 May 2023 • Jason Kim, Daniel Genkin, Kevin Leach
In this paper, we extend previous work with a shallow-learning model that efficiently and accurately recovers compiler configuration properties for ARM binaries.
1 code implementation • 7 Apr 2023 • Kevin Cao, Kevin Leach
Unfortunately, the lack of semantic information like variable types makes comprehending binaries difficult.
no code implementations • 24 Oct 2022 • Andrew Lee, Zhenguo Chen, Kevin Leach, Jonathan K. Kummerfeld
The standard task-oriented dialogue pipeline uses intent classification and slot-filling to interpret user utterances.
1 code implementation • 14 Oct 2022 • Stefan Larson, Gordon Lim, Yutong Ai, David Kuang, Kevin Leach
Our new out-of-distribution benchmark consists of two types of documents: those that are not part of any of the 16 in-domain RVL-CDIP categories (RVL-CDIP-O), and those that are one of the 16 in-domain categories yet are drawn from a distribution different from that of the original RVL-CDIP dataset (RVL-CDIP-N).
no code implementations • 11 Oct 2022 • Yifan Zhang, Chen Huang, Yueke Zhang, Kevin Cao, Scott Thomas Andersen, Huajie Shao, Kevin Leach, Yu Huang
To the best of our knowledge, COMBO is the first language representation model that incorporates source code, binary code, and comments into contrastive code representation learning and unifies multiple tasks for binary code analysis.
no code implementations • 26 Jul 2022 • Stefan Larson, Kevin Leach
By extension, so too has interest in developing and improving intent classification and slot-filling models, which are two components that are commonly used in task-oriented dialog systems.
1 code implementation • SIGDIAL (ACL) 2022 • Stefan Larson, Kevin Leach
Similarly, developers of such ML-driven systems need to be able to add new training data to an already-existing dataset to support these new skills.
1 code implementation • Findings (ACL) 2022 • Christopher Clarke, Joseph Joshua Peper, Karthik Krishnamurthy, Walter Talamonti, Kevin Leach, Walter Lasecki, Yiping Kang, Lingjia Tang, Jason Mars
To address these problems, we introduce a new task, BBAI (Black-Box Agent Integration), which focuses on combining the capabilities of multiple black-box CAs at scale.
Ranked #1 on Multi-agent Integration on BBAI Dataset
no code implementations • COLING 2020 • Stefan Larson, Adrian Cheung, Anish Mahendran, Kevin Leach, Jonathan K. Kummerfeld
Using three new noisy crowd-annotated datasets, we show that a wide range of inconsistencies occur and can impact system performance if not addressed.
no code implementations • LREC 2020 • Stefan Larson, Eric Guldan, Kevin Leach
Typical machine learning approaches to developing task-oriented dialog systems require the collection and management of large amounts of training data, especially for the tasks of intent classification and slot-filling.
5 code implementations • IJCNLP 2019 • Stefan Larson, Anish Mahendran, Joseph J. Peper, Christopher Clarke, Andrew Lee, Parker Hill, Jonathan K. Kummerfeld, Kevin Leach, Michael A. Laurenzano, Lingjia Tang, Jason Mars
We find that while the classifiers perform well on in-scope intent classification, they struggle to identify out-of-scope queries.
no code implementations • 27 Jun 2019 • William B. Langdon, Westley Weimer, Christopher Timperley, Oliver Krauss, Zhen Yu Ding, Yiwei Lyu, Nicolas Chausseau, Eric Schulte, Shin Hwei Tan, Kevin Leach, Yu Huang, Gabin An
We report on the discussion session at the sixth international Genetic Improvement workshop, GI-2019 @ ICSE, which was held as part of the 41st ACM/IEEE International Conference on Software Engineering on Tuesday 28th May 2019.