HVision-NKU/Conv2Former

Conv2Former

Our code is based on timm and ConvNeXt.

More code will be released soon.
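Until the code is released, the paper's core operation, convolutional modulation, can be sketched as follows: the input is projected into two branches A and V, A passes through a large-kernel depthwise convolution, and the result gates V via a Hadamard product. This is a minimal NumPy sketch under our own naming and shape conventions, not this repository's actual implementation:

```python
import numpy as np

def depthwise_conv2d(x, w):
    """Per-channel (depthwise) 2-D convolution with zero padding.

    x: (C, H, W) feature map; w: (C, k, k), one kernel per channel.
    """
    C, H, W = x.shape
    k = w.shape[1]
    p = k // 2
    xp = np.pad(x, ((0, 0), (p, p), (p, p)))
    out = np.empty_like(x)
    for c in range(C):
        for i in range(H):
            for j in range(W):
                out[c, i, j] = np.sum(xp[c, i:i + k, j:j + k] * w[c])
    return out

def conv_modulation(x, w_a, w_v, w_dw, w_out):
    """Convolutional modulation: Z = W_out (DWConv(W_a x) * (W_v x)).

    x: (C, H, W); w_a, w_v, w_out: (C, C) 1x1 projections; w_dw: (C, k, k).
    """
    C, H, W = x.shape
    flat = x.reshape(C, -1)                 # (C, H*W)
    a = (w_a @ flat).reshape(C, H, W)       # branch A
    v = (w_v @ flat).reshape(C, H, W)       # branch V
    z = depthwise_conv2d(a, w_dw) * v       # Hadamard-product modulation
    return (w_out @ z.reshape(C, -1)).reshape(C, H, W)
```

The paper uses large depthwise kernels; a real implementation would use framework-native grouped convolutions rather than the explicit loops above.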

Results

Training on ImageNet-1k

| Model | Parameters | FLOPs | Image resolution | Top-1 Acc. | Model file |
| --- | --- | --- | --- | --- | --- |
| Conv2Former-N | 15M | 2.2G | 224 | 81.5% | Coming soon |
| Swin-T | 28M | 4.5G | 224 | 81.5% | - |
| ConvNeXt-T | 29M | 4.5G | 224 | 82.1% | - |
| Conv2Former-T | 27M | 4.4G | 224 | 83.2% | Coming soon |
| Swin-S | 50M | 8.7G | 224 | 83.0% | - |
| ConvNeXt-S | 50M | 8.7G | 224 | 83.1% | - |
| Conv2Former-S | 50M | 8.7G | 224 | 84.1% | Coming soon |
| RepLKNet-31B | 79M | 15.3G | 224 | 83.5% | - |
| Swin-B | 88M | 15.4G | 224 | 83.5% | - |
| ConvNeXt-B | 89M | 15.4G | 224 | 83.8% | - |
| FocalNet-B | 89M | 15.4G | 224 | 83.9% | - |
| Conv2Former-B | 90M | 15.9G | 224 | 84.4% | Coming soon |

Pre-Training on ImageNet-22k and Fine-tuning on ImageNet-1k

| Model | Parameters | FLOPs | Image resolution | Top-1 Acc. | Model file |
| --- | --- | --- | --- | --- | --- |
| ConvNeXt-S | 50M | 8.7G | 224 | 84.6% | - |
| Conv2Former-S | 50M | 8.7G | 224 | 84.9% | Coming soon |
| Swin-B | 88M | 15.4G | 224 | 85.2% | - |
| ConvNeXt-B | 89M | 15.4G | 224 | 85.8% | - |
| Conv2Former-B | 90M | 15.9G | 224 | 86.2% | Coming soon |
| Swin-B | 88M | 47.0G | 384 | 86.4% | - |
| ConvNeXt-B | 89M | 45.1G | 384 | 86.8% | - |
| Conv2Former-B | 90M | 46.7G | 384 | 87.0% | Coming soon |
| Swin-L | 197M | 34.5G | 224 | 86.3% | - |
| ConvNeXt-L | 198M | 34.4G | 224 | 86.6% | - |
| Conv2Former-L | 199M | 36.0G | 224 | 87.0% | Coming soon |
| EffNet-V2-XL | 208M | 94G | 480 | 87.3% | - |
| Swin-L | 197M | 104G | 384 | 87.3% | - |
| ConvNeXt-L | 198M | 101G | 384 | 87.5% | - |
| CoAtNet-3 | 168M | 107G | 384 | 87.6% | - |
| Conv2Former-L | 199M | 106G | 384 | 87.7% | Coming soon |

References

You may want to cite:

@article{hou2022conv2former,
  title={Conv2Former: A Simple Transformer-Style ConvNet for Visual Recognition},
  author={Hou, Qibin and Lu, Cheng-Ze and Cheng, Ming-Ming and Feng, Jiashi},
  journal={arXiv preprint arXiv:2211.11943},
  year={2022}
}

@inproceedings{liu2022convnet,
  title={A ConvNet for the 2020s},
  author={Liu, Zhuang and Mao, Hanzi and Wu, Chao-Yuan and Feichtenhofer, Christoph and Darrell, Trevor and Xie, Saining},
  booktitle={CVPR},
  year={2022}
}

@inproceedings{liu2021swin,
  title={Swin transformer: Hierarchical vision transformer using shifted windows},
  author={Liu, Ze and Lin, Yutong and Cao, Yue and Hu, Han and Wei, Yixuan and Zhang, Zheng and Lin, Stephen and Guo, Baining},
  booktitle={ICCV},
  year={2021}
}

@inproceedings{tan2021efficientnetv2,
  title={{EfficientNetV2}: Smaller models and faster training},
  author={Tan, Mingxing and Le, Quoc},
  booktitle={ICML},
  pages={10096--10106},
  year={2021},
  organization={PMLR}
}

@misc{focalnet,
  title={Focal Modulation Networks},
  author={Yang, Jianwei and Li, Chunyuan and Gao, Jianfeng},
  publisher={arXiv},
  year={2022}
}

@article{dai2021coatnet,
  title={{CoAtNet}: Marrying convolution and attention for all data sizes},
  author={Dai, Zihang and Liu, Hanxiao and Le, Quoc and Tan, Mingxing},
  journal={NeurIPS},
  volume={34},
  year={2021}
}

@inproceedings{replknet,
  title={Scaling Up Your Kernels to 31x31: Revisiting Large Kernel Design in {CNNs}},
  author={Ding, Xiaohan and Zhang, Xiangyu and Zhou, Yizhuang and Han, Jungong and Ding, Guiguang and Sun, Jian},
  booktitle={CVPR},
  year={2022}
}
