Code Summarization
68 papers with code • 1 benchmark • 7 datasets
Most implemented papers
A Transformer-based Approach for Source Code Summarization
Generating a readable summary that describes the functionality of a program is known as source code summarization.
Recommendations for Datasets for Source Code Summarization
The main use for these descriptions is in software documentation, e.g., the one-sentence Java method descriptions in JavaDocs.
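As an illustration of the one-sentence descriptions mentioned above, the sketch below pulls the first sentence out of a Javadoc comment, which is one common way to obtain a reference summary for a method. The function name `javadoc_summary` and the sample comment are hypothetical, and the "period followed by whitespace" rule is a simplification that will cut early on abbreviations such as "e.g. ".

```python
import re


def javadoc_summary(comment: str) -> str:
    """Return the first sentence of a Javadoc comment as a one-line summary."""
    body = comment.strip()
    # Drop the opening /** and closing */ delimiters.
    body = re.sub(r"^/\*\*", "", body)
    body = re.sub(r"\*/$", "", body)
    # Remove the leading '*' decoration on each line, then join the lines.
    lines = [re.sub(r"^\s*\*\s?", "", line) for line in body.splitlines()]
    text = " ".join(line.strip() for line in lines if line.strip())
    # Ignore everything from the first block tag (@param, @return, ...) onward.
    text = re.split(r"\s@\w+", " " + text, maxsplit=1)[0].strip()
    # Simplified rule: the sentence ends at the first period followed by
    # whitespace or by the end of the text.
    match = re.search(r"\.(\s|$)", text)
    return text[: match.start() + 1] if match else text


# Hypothetical example comment, not taken from any dataset.
example = """/**
 * Returns the index of the first occurrence of the given element. Returns -1
 * if the element is absent.
 *
 * @param element the element to search for
 * @return the index, or -1
 */"""

print(javadoc_summary(example))
```

Pairing each method body with a sentence extracted this way gives (code, summary) examples of the kind the abstract describes.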
code2seq: Generating Sequences from Structured Representations of Code
The ability to generate natural language sequences from source code snippets has a variety of applications such as code summarization, documentation, and retrieval.
CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation
We present CodeT5, a unified pre-trained encoder-decoder Transformer model that better leverages the code semantics conveyed from the developer-assigned identifiers.
CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation
Benchmark datasets have a significant impact on accelerating research in programming language tasks.
Improving Automatic Source Code Summarization via Deep Reinforcement Learning
To the best of our knowledge, most state-of-the-art approaches follow an encoder-decoder framework which encodes the code into a hidden space and then decodes it into natural language space, suffering from two major drawbacks: a) Their encoders only consider the sequential content of code, ignoring the tree structure which is also critical for the task of code summarization; b) Their decoders are typically trained to predict the next word by maximizing the likelihood of the next ground-truth word given the previous ground-truth word.
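The decoder training described in drawback (b) is commonly written as a maximum-likelihood objective with teacher forcing. The following is a sketch in notation assumed here rather than taken from the paper: $x$ is the source code, $y_1, \dots, y_T$ are the ground-truth summary words, and $\theta$ denotes the model parameters.

```latex
\mathcal{L}(\theta) = -\sum_{t=1}^{T} \log p_\theta\left(y_t \mid y_1, \dots, y_{t-1}, x\right)
```

During training every term conditions on the ground-truth prefix $y_1, \dots, y_{t-1}$, whereas at test time the decoder must condition on its own earlier predictions. That mismatch is the motivation for replacing pure likelihood training with a reinforcement learning objective.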
Code Generation as a Dual Task of Code Summarization
Code summarization (CS) and code generation (CG) are two crucial tasks in the field of automatic software development.
Improved Code Summarization via a Graph Neural Network
The first approaches to use structural information flattened the AST into a sequence.
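To make "flattening the AST into a sequence" concrete, here is a minimal sketch that uses Python's standard `ast` module and a plain pre-order traversal. It is not the traversal scheme of any particular paper (published approaches typically add brackets or other markers so that the tree can be recovered), and `flatten_ast` is a hypothetical helper name.

```python
import ast


def flatten_ast(source: str) -> list[str]:
    """Return the AST node type names of `source` in pre-order."""
    tokens: list[str] = []

    def visit(node: ast.AST) -> None:
        # Emit the node type first, then recurse into children left to right.
        tokens.append(type(node).__name__)
        for child in ast.iter_child_nodes(node):
            visit(child)

    visit(ast.parse(source))
    return tokens


snippet = "def add(a, b):\n    return a + b\n"
sequence = flatten_ast(snippet)
print(" ".join(sequence))
```

The resulting sequence can be fed to an ordinary sequence encoder, but the parent-child relations are only implicit in the ordering, which is the limitation graph-based encoders aim to address.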
HAConvGNN: Hierarchical Attention Based Convolutional Graph Neural Network for Code Documentation Generation in Jupyter Notebooks
Jupyter notebook allows data scientists to write machine learning code together with its documentation in cells.
CodeT5+: Open Code Large Language Models for Code Understanding and Generation
To address these limitations, we propose "CodeT5+", a family of encoder-decoder LLMs for code in which component modules can be flexibly combined to suit a wide range of downstream code tasks.