TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Image Classification	ImageNet	RedNet-152	Number of params	34M	# 1
Image Classification	ImageNet	RedNet-152	GFLOPs	6.8	# 5
Image Classification	ImageNet	RedNet-152	Top 1 Accuracy	79.3	# 1
Image Classification	ImageNet	RedNet-152	Number of parameters (M)	34M	# 5
Image Classification	ImageNet	RedNet-26	Number of params	9.2M	# 5
Image Classification	ImageNet	RedNet-26	GFLOPs	1.7	# 1
Image Classification	ImageNet	RedNet-26	Top 1 Accuracy	75.9	# 5
Image Classification	ImageNet	RedNet-101	Number of params	25.6M	# 2
Image Classification	ImageNet	RedNet-101	GFLOPs	4.7	# 4
Image Classification	ImageNet	RedNet-101	Top 1 Accuracy	79.1	# 2
Image Classification	ImageNet	RedNet-38	Number of params	12.4M	# 4
Image Classification	ImageNet	RedNet-38	GFLOPs	2.2	# 2
Image Classification	ImageNet	RedNet-38	Top 1 Accuracy	77.6	# 4
Image Classification	ImageNet	RedNet-50	Number of params	15.5M	# 3
Image Classification	ImageNet	RedNet-50	GFLOPs	2.7	# 3
Image Classification	ImageNet	RedNet-50	Top 1 Accuracy	78.4	# 3
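
One way to read the table above is to ask how much top-1 accuracy each additional GFLOP buys as the RedNet family scales up. The sketch below copies the figures from the table; it assumes the RedNet-26 parameter count of 9.2 is in millions like the other rows, and `marginal_gains` is a hypothetical helper name, not part of any released code.

```python
# (model, params in millions, GFLOPs, top-1 accuracy %), copied from the table above.
# Assumption: the RedNet-26 value "9.2" is in millions, like the other rows.
REDNET_RESULTS = [
    ("RedNet-26", 9.2, 1.7, 75.9),
    ("RedNet-38", 12.4, 2.2, 77.6),
    ("RedNet-50", 15.5, 2.7, 78.4),
    ("RedNet-101", 25.6, 4.7, 79.1),
    ("RedNet-152", 34.0, 6.8, 79.3),
]


def marginal_gains(results):
    """Top-1 points gained per extra GFLOP between consecutive models."""
    ordered = sorted(results, key=lambda row: row[2])  # sort by GFLOPs
    gains = []
    for prev, cur in zip(ordered, ordered[1:]):
        delta_acc = cur[3] - prev[3]
        delta_flops = cur[2] - prev[2]
        gains.append((prev[0], cur[0], round(delta_acc / delta_flops, 2)))
    return gains


if __name__ == "__main__":
    for smaller, larger, gain in marginal_gains(REDNET_RESULTS):
        print(f"{smaller} -> {larger}: {gain} top-1 points per extra GFLOP")
```

On these numbers the gain per extra GFLOP shrinks at every step, so the deeper variants add accuracy at a steadily higher compute cost.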

Involution: Inverting the Inherence of Convolution for Visual Recognition

CVPR 2021 · Duo Li, Jie Hu, Changhu Wang, Xiangtai Li, Qi She, Lei Zhu, Tong Zhang, Qifeng Chen

Convolution has been the core ingredient of modern neural networks, triggering the surge of deep learning in vision. In this work, we rethink the inherent principles of standard convolution for vision tasks, specifically spatial-agnostic and channel-specific. Instead, we present a novel atomic operation for deep neural networks by inverting the aforementioned design principles of convolution, coined as involution. We additionally demystify the recent popular self-attention operator and subsume it into our involution family as an over-complicated instantiation. The proposed involution operator could be leveraged as fundamental bricks to build the new generation of neural networks for visual recognition, powering different deep learning models on several prevalent benchmarks, including ImageNet classification, COCO detection and segmentation, together with Cityscapes segmentation. Our involution-based models improve the performance of convolutional baselines using ResNet-50 by up to 1.6% top-1 accuracy, 2.5% and 2.4% bounding box AP, and 4.7% mean IoU absolutely while compressing the computational cost to 66%, 65%, 72%, and 57% on the above benchmarks, respectively. Code and pre-trained models for all the tasks are available at https://github.com/d-li14/involution.
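
To make the inversion concrete, here is a minimal, deliberately naive sketch of an involution-style operator in plain Python. The abstract does not spell out how kernels are produced, so this assumes the kernel at each location is generated from the pixel at that location and then shared by all channels (a single group). `involution` and `mean_scaled_box_kernel` are illustrative names, and the kernel generator is a hand-written stand-in for what would be a small learned function; this is not the authors' implementation.

```python
from typing import Callable, List

Feature = List[List[List[float]]]  # indexed as [channel][row][col]
Kernel = List[List[float]]         # k x k weights


def involution(x: Feature, kernel_fn: Callable[[List[float]], Kernel], k: int = 3) -> Feature:
    """Naive single-group involution with zero padding (illustrative only)."""
    channels, height, width = len(x), len(x[0]), len(x[0][0])
    pad = k // 2
    out = [[[0.0] * width for _ in range(height)] for _ in range(channels)]
    for i in range(height):
        for j in range(width):
            # Spatial-specific: the kernel is generated from the pixel at (i, j).
            pixel = [x[c][i][j] for c in range(channels)]
            kernel = kernel_fn(pixel)
            for c in range(channels):
                # Channel-agnostic: every channel reuses the same kernel.
                acc = 0.0
                for di in range(k):
                    for dj in range(k):
                        r, s = i + di - pad, j + dj - pad
                        if 0 <= r < height and 0 <= s < width:
                            acc += kernel[di][dj] * x[c][r][s]
                out[c][i][j] = acc
    return out


def mean_scaled_box_kernel(pixel: List[float], k: int = 3) -> Kernel:
    """Hypothetical stand-in for a learned kernel generator."""
    scale = sum(pixel) / len(pixel)
    return [[scale / (k * k)] * k for _ in range(k)]


if __name__ == "__main__":
    # Two channels on a 3x3 grid: all ones and all twos.
    x = [[[1.0] * 3 for _ in range(3)], [[2.0] * 3 for _ in range(3)]]
    y = involution(x, mean_scaled_box_kernel)
    print(y[0][1][1], y[1][1][1])
```

A standard convolution does the opposite: one kernel is shared across all spatial positions, while each output channel has its own weights.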

Code

d-li14/involution (official)	1,307
PaddlePaddle/PaddleClas	5,294
xmu-xiaoma666/External-Attention-py…	1,552
ChristophReich1996/Involution	104
shikishima-TasakiLab/Involution-PyT…

See all 13 implementations

Tasks

Image Classification

Datasets

MS COCO

Cityscapes

Results from the Paper

Ranked #1 on Image Classification on ImageNet

Methods

Convolution • Involution
