no code implementations • 5 May 2024 • Aftab Hussain, Md Rafiqul Islam Rabin, Toufique Ahmed, Bowen Xu, Premkumar Devanbu, Mohammad Amin Alipour
Large language models (LLMs) have introduced many exciting new capabilities into software development.
no code implementations • 30 Apr 2024 • Yuvraj Virk, Premkumar Devanbu, Toufique Ahmed
There has been a considerable body of research into automated AI-based methods, using Large Language Models (LLMs), to generate summaries of code; there has also been quite a bit of work on ways to measure the performance of such summarization methods, with special attention paid to how closely these AI-generated summaries resemble a summary a human might have produced.
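One common way to measure how closely a generated summary resembles a human reference is lexical overlap. Below is an illustrative sketch of a simple token-overlap F1 score, a stand-in for standard metrics like BLEU or ROUGE; the function name and example summaries are assumptions for illustration, not the paper's exact method.

```python
# Hypothetical sketch: token-overlap F1 between an AI-generated summary
# and a human-written reference (illustrative, not the paper's metric).
from collections import Counter

def token_overlap_f1(candidate: str, reference: str) -> float:
    """F1 over multiset token overlap between candidate and reference."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # multiset intersection size
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

# Toy comparison of a generated summary against a human one.
score = token_overlap_f1(
    "returns the maximum value in the list",
    "return the largest value in a list",
)
print(round(score, 3))
```

Lexical metrics like this are cheap to compute but only approximate semantic similarity, which is part of why measuring summary quality remains an active research question.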
no code implementations • 23 Feb 2024 • Toufique Ahmed, Christian Bird, Premkumar Devanbu, Saikat Chakraborty
We find that performance for C# changes little from OSS to proprietary code, but degrades significantly for C++; we find that this difference is attributable to differences in identifiers.
no code implementations • 3 Feb 2024 • Claudio Spiess, David Gros, Kunal Suresh Pai, Michael Pradel, Md Rafiqul Islam Rabin, Amin Alipour, Susmit Jha, Prem Devanbu, Toufique Ahmed
Our contributions will lead to better-calibrated decision-making in the current use of code generated by language models, and offer a framework for future research to further improve calibration methods for generative models in Software Engineering.
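Calibration is typically quantified by comparing a model's stated confidence against its empirical correctness. The sketch below computes expected calibration error (ECE) over equal-width confidence bins; the bin count and toy data are illustrative assumptions, not taken from the paper.

```python
# Hypothetical sketch: expected calibration error (ECE) for confidence
# scores attached to generated code (data and bin count are illustrative).

def expected_calibration_error(confidences, correct, n_bins=5):
    """Weighted average of |accuracy - mean confidence| per confidence bin."""
    total = len(confidences)
    ece = 0.0
    for b in range(n_bins):
        lo, hi = b / n_bins, (b + 1) / n_bins
        # Half-open bins; the last bin also includes confidence == 1.0.
        idx = [i for i, c in enumerate(confidences)
               if lo <= c < hi or (b == n_bins - 1 and c == hi)]
        if not idx:
            continue
        acc = sum(correct[i] for i in idx) / len(idx)
        conf = sum(confidences[i] for i in idx) / len(idx)
        ece += (len(idx) / total) * abs(acc - conf)
    return ece

# Toy example: per-sample confidence vs. whether a test suite judged
# the generated code correct (1) or incorrect (0).
confs = [0.9, 0.8, 0.7, 0.3, 0.2]
labels = [1, 1, 0, 0, 0]
print(round(expected_calibration_error(confs, labels), 3))
```

A well-calibrated model has a low ECE: when it says "90% confident," it should be right about 90% of the time.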
no code implementations • 20 Jun 2023 • Toufique Ahmed, Dian Yu, Chengxuan Huang, Cathy Wang, Prem Devanbu, Kenji Sagae
To understand the extent to which language models can learn some form of meaning, we investigate their ability to capture semantics of code beyond superficial frequency and co-occurrence.
no code implementations • 31 May 2023 • Toufique Ahmed, Premkumar Devanbu
Large Language models (LLMs) can be induced to solve non-trivial problems with "few-shot" prompts including illustrative problem-solution examples.
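Few-shot prompting amounts to concatenating a handful of problem-solution demonstrations before the new problem. The sketch below shows a minimal version of such prompt construction; the function name, delimiters, and examples are assumptions for illustration, not the paper's actual prompt format.

```python
# Hypothetical sketch of few-shot prompt construction: a few
# problem-solution demonstrations followed by the query problem.

def build_few_shot_prompt(examples, query):
    """Concatenate (problem, solution) demonstrations, then the new problem."""
    parts = []
    for problem, solution in examples:
        parts.append(f"Problem: {problem}\nSolution: {solution}\n")
    # The trailing "Solution:" invites the LLM to complete the answer.
    parts.append(f"Problem: {query}\nSolution:")
    return "\n".join(parts)

examples = [
    ("Reverse the string 'abc'.", "'abc'[::-1]"),
    ("Sum the list [1, 2, 3].", "sum([1, 2, 3])"),
]
prompt = build_few_shot_prompt(examples, "Find the max of [4, 7, 1].")
print(prompt)
```

The resulting string would then be sent to an LLM; the demonstrations steer the model toward the desired problem-solving format without any parameter updates.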
no code implementations • 13 Apr 2023 • Toufique Ahmed, Kunal Suresh Pai, Premkumar Devanbu, Earl T. Barr
This approach improves performance in several different settings suggested by prior work, including for two different Large Language Models.
no code implementations • 20 Mar 2023 • Kevin Jesse, Toufique Ahmed, Premkumar T. Devanbu, Emily Morgan
We explore the consequences of Codex-generated SStuBs and propose avoidance strategies, suggesting the possibility of reducing the production of known, verbatim SStuBs and increasing the possibility of producing known, verbatim fixes.
no code implementations • 10 Jan 2023 • Toufique Ahmed, Supriyo Ghosh, Chetan Bansal, Thomas Zimmermann, Xuchao Zhang, Saravan Rajmohan
In this work, we conduct the first large-scale study evaluating the effectiveness of these models in helping engineers root-cause and mitigate production incidents.
1 code implementation • 4 Jan 2023 • Ali Al-Kaswan, Toufique Ahmed, Maliheh Izadi, Anand Ashok Sawant, Premkumar Devanbu, Arie van Deursen
While the automated summarisation of decompiled code can help Reverse Engineers understand and analyse binaries, current work mainly focuses on summarising source code, and no suitable dataset exists for this task.
no code implementations • 9 Jul 2022 • Toufique Ahmed, Premkumar Devanbu
Very large language models (LLMs), such as GPT-3 and Codex, have achieved state-of-the-art performance on several natural-language tasks, and also show great promise for code.
1 code implementation • 15 Jun 2022 • Saikat Chakraborty, Toufique Ahmed, Yangruibo Ding, Premkumar Devanbu, Baishakhi Ray
Pre-trained generative language models (e.g., PLBART, CodeT5, SPT-Code) for source code have yielded strong results on several tasks in the past few years, including code generation and translation.
no code implementations • 2 Jun 2022 • Toufique Ahmed, Premkumar Devanbu
We compare several models and training approaches: same-project training; cross-project training; training a model specifically designed to be sample-efficient (and thus prima facie well suited to learning in a limited-sample, same-project setting); and a maximalist hybrid approach that first fine-tunes on many projects in many languages and then trains on the same project.
no code implementations • 3 Dec 2021 • Toufique Ahmed, Premkumar Devanbu
As a way around such data bottlenecks, we present evidence suggesting that human-written code in different languages (which performs the same function) is rather similar, and in particular preserves identifier naming patterns; we further present evidence suggesting that identifiers are a very important element of training data for software engineering tasks.
Ranked #5 on Type prediction on ManyTypes4TypeScript
1 code implementation • ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering 2021 • Kevin Jesse, Premkumar T. Devanbu, Toufique Ahmed
ML approaches use different inductive biases, ranging from simple token sequences to complex graph neural network (GNN) models capturing syntax and semantic relations.
no code implementations • 29 Apr 2021 • Toufique Ahmed, Noah Rose Ledesma, Premkumar Devanbu
Beginning programmers struggle with the complex grammar of modern programming languages like Java, and make many syntax errors.
1 code implementation • 7 Dec 2019 • Md. Khairul Islam, Toufique Ahmed, Rifat Shahriyar, Anindya Iqbal, Gias Uddin
In our empirical study of 146,612 code changes from three software projects, we find that (1) the new features introduced in PredCR, such as reviewer dimensions, are the most informative.