TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Semantic Segmentation	ADE20K	NAT-Base	Validation mIoU	49.7	# 119
Semantic Segmentation	ADE20K	NAT-Base	Params (M)	123	# 24
Semantic Segmentation	ADE20K	NAT-Base	GFLOPs (512 x 512)	1137	# 18
Semantic Segmentation	ADE20K	NAT-Small	Validation mIoU	49.5	# 124
Semantic Segmentation	ADE20K	NAT-Small	Params (M)	82	# 33
Semantic Segmentation	ADE20K	NAT-Small	GFLOPs (512 x 512)	1010	# 15
Semantic Segmentation	ADE20K	NAT-Tiny	Validation mIoU	48.4	# 137
Semantic Segmentation	ADE20K	NAT-Tiny	Params (M)	58	# 44
Semantic Segmentation	ADE20K	NAT-Tiny	GFLOPs (512 x 512)	934	# 11
Semantic Segmentation	ADE20K	NAT-Mini	Validation mIoU	46.4	# 168
Semantic Segmentation	ADE20K	NAT-Mini	Params (M)	50	# 48
Semantic Segmentation	ADE20K	NAT-Mini	GFLOPs (512 x 512)	900	# 10
Image Classification	ImageNet	NAT-Mini	Top 1 Accuracy	81.8%	# 553
Image Classification	ImageNet	NAT-Mini	Number of params	20M	# 536
Image Classification	ImageNet	NAT-Mini	GFLOPs	2.7	# 167
Image Classification	ImageNet	NAT-Tiny	Top 1 Accuracy	83.2%	# 413
Image Classification	ImageNet	NAT-Tiny	Number of params	28M	# 629
Image Classification	ImageNet	NAT-Tiny	GFLOPs	4.3	# 202
Image Classification	ImageNet	NAT-Base	Top 1 Accuracy	84.3%	# 305
Image Classification	ImageNet	NAT-Base	Number of params	90M	# 847
Image Classification	ImageNet	NAT-Base	GFLOPs	13.7	# 330
Image Classification	ImageNet	NAT-Small	Top 1 Accuracy	83.7%	# 365
Image Classification	ImageNet	NAT-Small	Number of params	51M	# 729
Image Classification	ImageNet	NAT-Small	GFLOPs	7.8	# 261

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/neighborhood-attention-transformer/semantic-segmentation-on-ade20k)](https://paperswithcode.com/sota/semantic-segmentation-on-ade20k?p=neighborhood-attention-transformer)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/neighborhood-attention-transformer/image-classification-on-imagenet)](https://paperswithcode.com/sota/image-classification-on-imagenet?p=neighborhood-attention-transformer)`

Neighborhood Attention Transformer

CVPR 2023 · Ali Hassani, Steven Walton, Jiachen Li, Shen Li, Humphrey Shi

We present Neighborhood Attention (NA), the first efficient and scalable sliding-window attention mechanism for vision. NA is a pixel-wise operation that localizes self-attention (SA) to the nearest neighboring pixels, and therefore has linear time and space complexity, compared to the quadratic complexity of SA. The sliding-window pattern allows NA's receptive field to grow without needing extra pixel shifts, and preserves translational equivariance, unlike Swin Transformer's Window Self Attention (WSA). We develop NATTEN (Neighborhood Attention Extension), a Python package with efficient C++ and CUDA kernels, which allows NA to run up to 40% faster than Swin's WSA while using up to 25% less memory. We further present Neighborhood Attention Transformer (NAT), a new hierarchical transformer design based on NA that boosts image classification and downstream vision performance. Experimental results on NAT are competitive: NAT-Tiny reaches 83.2% top-1 accuracy on ImageNet, 51.4% mAP on MS-COCO, and 48.4% mIoU on ADE20K, improvements of 1.9% in ImageNet accuracy, 1.0% in COCO mAP, and 2.6% in ADE20K mIoU over a Swin model of similar size. To support more research based on sliding-window attention, we open source our project and release our checkpoints (https://github.com/SHI-Labs/Neighborhood-Attention-Transformer).
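To make the idea of localizing self-attention to nearest neighbors concrete, here is a minimal pure-Python sketch. It is a 1D toy version written for illustration, not the NATTEN kernels or the authors' implementation. The function names `neighborhood_attention_1d` and `attention_pair_counts` are hypothetical, and the border handling (shifting the window inward so that every query still sees exactly `kernel_size` keys) is an assumption of this sketch.

```python
import math


def neighborhood_attention_1d(q, k, v, kernel_size):
    """Naive 1D neighborhood attention (illustrative sketch only).

    q, k, v: lists of n tokens, each token a list of d floats.
    Each query attends only to its kernel_size nearest keys.
    """
    n, d = len(q), len(q[0])
    if kernel_size % 2 == 0 or kernel_size > n:
        raise ValueError("kernel_size must be odd and at most the sequence length")
    half = kernel_size // 2
    scale = 1.0 / math.sqrt(d)
    out = []
    for i in range(n):
        # Assumption: clamp the window at the borders so that every query
        # attends to exactly kernel_size keys.
        start = min(max(i - half, 0), n - kernel_size)
        window = range(start, start + kernel_size)
        # Scaled dot-product scores against the local keys only.
        scores = [scale * sum(a * b for a, b in zip(q[i], k[j])) for j in window]
        # Numerically stable softmax over the window.
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append(
            [sum(w * v[j][c] for w, j in zip(weights, window)) / total for c in range(d)]
        )
    return out


def attention_pair_counts(height, width, kernel_size):
    """Query-key pairs scored by full SA vs. 2D NA on a height x width feature map."""
    n = height * width
    return n * n, n * kernel_size * kernel_size
```

For a fixed kernel size, the number of scored query-key pairs grows linearly with the number of pixels, whereas full self-attention grows quadratically. As an illustrative case (the sizes are chosen here, not taken from the paper), `attention_pair_counts(56, 56, 7)` compares the two counts for a 56 x 56 feature map with a 7 x 7 neighborhood.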

Code

Repository	Official	Stars
SHI-Labs/Neighborhood-Attention-Tra…	official	1,000
shi-labs/natten	official	283
huggingface/transformers		125,725
leondgarse/keras_cv_attention_models		560
qwopqwop200/Neighborhood-Attention-…		

Tasks

Image Classification

Object Detection

Semantic Segmentation

Datasets

ImageNet

MS COCO

ADE20K

Results from the Paper

Ranked #119 on Semantic Segmentation on ADE20K

Methods

Neighborhood Attention
