no code implementations • 23 May 2024 • Aral de Moor, Arie van Deursen, Maliheh Izadi
Transformer-based language models are highly effective for code completion, with much research dedicated to enhancing the content of these completions.
no code implementations • 22 Mar 2024 • Tim van Dam, Frank van der Heijden, Philippe de Bekker, Berend Nieuwschepen, Marc Otten, Maliheh Izadi
However, research on code completion models typically focuses on imperative languages such as Python and JavaScript, which results in a lack of representation for functional programming languages.
1 code implementation • 22 Mar 2024 • Jonathan Katzy, Răzvan-Mihai Popescu, Arie van Deursen, Maliheh Izadi
Based on the findings of our study, which highlights the pervasive issue of license inconsistencies in large language models trained on code, our recommendation for both researchers and the community is to prioritize the development and adoption of best practices for dataset creation and management.
1 code implementation • 25 Feb 2024 • Maliheh Izadi, Jonathan Katzy, Tim van Dam, Marc Otten, Razvan Mihai Popescu, Arie van Deursen
InCoder outperformed the other models across all programming languages, highlighting the significance of training data and objectives.
1 code implementation • 18 Dec 2023 • Ali Al-Kaswan, Maliheh Izadi, Arie van Deursen
We find that large language models for code are vulnerable to data extraction attacks, like their natural language counterparts.
no code implementations • 25 Aug 2023 • Jonathan Katzy, Maliheh Izadi, Arie van Deursen
The recent advancements in Transformer-based Language Models have demonstrated significant potential in enhancing the multilingual capabilities of these models.
1 code implementation • 24 Apr 2023 • Tim van Dam, Maliheh Izadi, Arie van Deursen
For comments, we find that the models perform better in the presence of multi-line comments (again with small effect sizes).
1 code implementation • 27 Feb 2023 • Ali Al-Kaswan, Maliheh Izadi
In recent years, Large Language Models (LLMs) have gained significant popularity due to their ability to generate human-like text and their potential applications in various fields, such as Software Engineering.
1 code implementation • 25 Feb 2023 • Ali Al-Kaswan, Maliheh Izadi, Arie van Deursen
Code comments are a key resource for information about software artefacts.
no code implementations • 13 Feb 2023 • Ali Al-Kaswan, Maliheh Izadi, Arie van Deursen
In this work, we apply a targeted data extraction attack to the SATML2023 Language Model Training Data Extraction Challenge.
1 code implementation • 4 Jan 2023 • Ali Al-Kaswan, Toufique Ahmed, Maliheh Izadi, Anand Ashok Sawant, Premkumar Devanbu, Arie van Deursen
While the automated summarisation of decompiled code can help Reverse Engineers understand and analyse binaries, current work mainly focuses on summarising source code, and no suitable dataset exists for this task.
no code implementations • 1 Nov 2022 • Maliheh Izadi, Pooya Rostami Mazrae, Tom Mens, Arie van Deursen
However, these approaches primarily focused on improving prediction accuracy on randomly-split datasets, with limited attention given to the impact of data leakage and the generalizability of the predictive models.
1 code implementation • 31 May 2022 • Maliheh Izadi, Mahtab Nejati, Abbas Heydarnoori
Then, (2) we build two recommender systems. The first one operates based only on the list of original topics assigned to a repository and the relationships specified in our knowledge graph.
1 code implementation • 31 Mar 2022 • Maliheh Izadi, Matin Nili Ahmadabadi
NLP-based models have been increasingly incorporated to address SE problems.
1 code implementation • 31 Mar 2022 • Maliheh Izadi
Then, the pre-trained RoBERTa model is fine-tuned on the preprocessed dataset.
1 code implementation • 14 Feb 2022 • Maliheh Izadi, Roberta Gismondi, Georgios Gousios
Both approaches have significant drawbacks: grammar-based autocompletion is restricted in dynamically-typed language environments, whereas NLP-based autocompleters struggle to understand the semantics of the programming language and the developer's code context.
1 code implementation • 5 Jul 2021 • Pooya Rostami Mazrae, Maliheh Izadi, Abbas Heydarnoori
The low performance gets even more severe when there is a lack of textual information in either commits or issues.
1 code implementation • 30 May 2020 • Mohammadreza Tavakoli, Maliheh Izadi, Abbas Heydarnoori
Then, we developed an Eclipse plugin named SOPI and integrated the prediction model in the plugin to link these deficient posts to related developers and help them improve the answer set.