TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Code Generation	CoNaLa	MarianCG	BLEU	34.43	# 3
Code Generation	CoNaLa	MarianCG	Exact Match Accuracy	10.2	# 3
Code Generation	Django	MarianCG	Accuracy	81.83	# 1
Code Generation	Django	MarianCG	BLEU Score	90.41	# 1

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/mariancg-a-code-generation-transformer-model/code-generation-on-django)](https://paperswithcode.com/sota/code-generation-on-django?p=mariancg-a-code-generation-transformer-model)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/mariancg-a-code-generation-transformer-model/code-generation-on-conala)](https://paperswithcode.com/sota/code-generation-on-conala?p=mariancg-a-code-generation-transformer-model)`

MarianCG: a code generation transformer model inspired by machine translation

Journal of Engineering and Applied Science 2022 · Ahmed S. Soliman, Mayada M. Hadhoud, Samir I. Shaheen

The idea that computers can write their own programs is highly significant, and many researchers are working on this challenge. Code generation is the process of producing executable code that runs directly on a computer and fulfills requirements stated in natural language. It is an intriguing topic that could help developers learn a new software technology or programming language, or simply make coding easier by letting the developer describe the desired code in natural language. In this paper, we present MarianCG, a Transformer model that tackles the challenge of generating Python code from natural language descriptions. Marian neural machine translation (NMT), the core model of Microsoft Translator, is the basis for our NL-to-Code translation engine and is the heart of the teacher model. MarianMT, one of the most successful machine translation transformers, is the teacher language model in our study. In our approach, we use sinusoidal positional embeddings to represent the position of each token in the text and apply no layer normalization to the embeddings. MarianCG is obtained by fine-tuning a pre-trained machine translation language model, which demonstrates that a pre-trained translation model can also work as a code generation model. The proposed model outperforms recent state-of-the-art models on code generation when trained on the CoNaLa and DJANGO datasets. MarianCG achieves a BLEU score of 34.43 and an exact match accuracy of 10.2% on the CoNaLa dataset, and a BLEU score of 90.41 and an exact match accuracy of 81.83% on the DJANGO dataset. The implementation of MarianCG and related resources are available at https://www.github.com/AhmedSSoliman/MarianCG-NL-to-Code.
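The abstract mentions sinusoidal positional embeddings. As a rough illustration only, the sketch below builds the fixed sine/cosine table in the standard formulation from the original Transformer, using plain Python. The function name `sinusoidal_position_encoding` and the interleaved sine/cosine layout are assumptions made for this example; the exact layout used inside MarianCG may differ.

```python
import math


def sinusoidal_position_encoding(num_positions, dim):
    """Build a num_positions x dim table of fixed sinusoidal encodings.

    Hypothetical helper for illustration: even columns hold sines and
    odd columns hold cosines, each pair sharing one wavelength.
    """
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    table = []
    for pos in range(num_positions):
        row = [0.0] * dim
        for i in range(0, dim, 2):
            # Wavelengths grow geometrically with the column index.
            angle = pos / (10000 ** (i / dim))
            row[i] = math.sin(angle)
            row[i + 1] = math.cos(angle)
        table.append(row)
    return table


# Small table: 4 positions, 8-dimensional embeddings.
table = sinusoidal_position_encoding(num_positions=4, dim=8)
```

Because the table is a fixed function of position, it adds no trainable parameters; each row is simply added to the token embedding at that position.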

Code

AhmedSSoliman/MarianCG-NL-to-Code official

Tasks

Code Generation

Code Translation

Language Modelling

Machine Translation

NMT

Translation

Datasets

CoNaLa

Django

Results from the Paper

Ranked #1 on Code Generation on Django
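
The accuracy figures reported for CoNaLa and Django are exact match scores: the share of generated snippets that equal the reference snippet. A minimal sketch of such a metric follows. The function name `exact_match_accuracy` and the whitespace normalization step are assumptions made for this example, not the paper's evaluation script, and the sample snippets are invented for illustration.

```python
def exact_match_accuracy(predictions, references):
    """Return the percentage of predictions identical to their reference.

    Assumption: runs of whitespace are collapsed before comparison, so
    differences in spacing alone do not count as mismatches.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not references:
        return 0.0

    def normalize(code):
        return " ".join(code.split())

    matches = sum(
        normalize(pred) == normalize(ref)
        for pred, ref in zip(predictions, references)
    )
    return 100.0 * matches / len(references)


# Invented example data: the second prediction differs from its reference.
predictions = ["x = 1", "print(x)", "y = x + 1"]
references = ["x = 1", "print(y)", "y  =  x + 1"]
score = exact_match_accuracy(predictions, references)
```

Exact match is a strict metric: a prediction that is functionally equivalent but written differently still counts as a miss, which is one reason it is usually reported alongside BLEU.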

Methods

Absolute Position Encodings • Adam • BPE • Dense Connections • Dropout • Label Smoothing • Layer Normalization • Linear Layer • Multi-Head Attention • Position-Wise Feed-Forward Layer • Residual Connection • Scaled Dot-Product Attention • Softmax • Transformer
