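Topic: An Introduction to and Study of MAE, with background from "An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale"
Speaker: Du Qimeng (杜启蒙)
Venue: Tencent Meeting
Time: Monday, February 14, 2022, 8:30 PM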
Outline: 1) Background and method of the Vision Transformer
2) MAE's autoencoding results on the ImageNet dataset
3) Code progress
References:
[1] Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, Neil Houlsby. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. arxiv.org/abs/2010.11929
[2] Kaiming He, Xinlei Chen, et al. (Facebook AI Research, FAIR). Masked Autoencoders Are Scalable Vision Learners. arxiv.org/pdf/2111.06377.pdf
[3] Video by Li Mu (李沐): https://www.bilibili.com/video/BV1sq4y1q77t?spm_id_from=333.999.0.0