[1] MLP-Mixer: An all-MLP Architecture for Vision - Google Research
[2] Beyond Self-attention: External Attention using Two Linear Layers for Visual Tasks - Tsinghua University
[3] Do You Even Need Attention? A Stack of Feed-Forward Layers Does Surprisingly Well on …