ViT Explained: Applying the Transformer to Image Classification

Paper: An image is worth 16x16 words: Transformers for image recognition at scale

Authors: Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly,