Paper Notes - "MegaBlocks: Efficient Sparse Training with Mixture-of-Experts"

MegaBlocks: Efficient Sparse MoE Training

1 Stanford University, Stanford, California, USA; 2 Microsoft Research, Redmond, Washington, USA; 3 Google Research, Mountain View, California, USA. Correspondence to: Trevor [email protected].

1. The Potential and Challenges of MoE