GShard: Scaling Giant Neural Networks with Conditional Computation

research
advanced
Author: Krishnatheja Vanka

Published: July 15, 2025