Michael Gschwind
Born: Vienna, Austria
Nationality: USA
Alma mater: Technische Universität Wien

Michael Karl Gschwind is an American computer scientist who is currently a director and principal engineer at Meta in Menlo Park, CA. He is recognized for his seminal contributions to the design and exploitation of general-purpose programmable accelerators, as an early advocate of sustainability in computer design, and as a prolific inventor.

Accelerators

Gschwind is best known for his contributions to general-purpose programmable accelerators and heterogeneous computing as an architect of the Cell Broadband Engine processor, which was used in the Sony PlayStation 3[1][2] and in Roadrunner, the first supercomputer to reach sustained petaflop operation. As Chief Architect for IBM Architecture, he led the integration of Nvidia GPUs and IBM CPUs to create the Summit and Sierra supercomputers.

AI Acceleration

Gschwind was an early advocate of AI hardware acceleration with GPUs and programmable accelerators. As IBM's Chief Engineer for AI, he led the development of IBM's first AI products and initiated the PowerAI project, which brought to market AI-optimized hardware (originally known as "Minsky" systems) and the first prebuilt hardware-optimized AI frameworks. At Facebook, Gschwind led the company-wide adoption of ASIC[3] and GPU inference, as well as AI accelerator enablement for PyTorch, leading the development of Accelerated Transformers[4] (formerly "Better Transformer"[5]) to establish PyTorch as the standard ecosystem for large language models and generative AI. Gschwind is one of the architects of MultiRay, an accelerator-based platform for serving foundation models and the first production system in the industry to serve large language models at scale, handling over 800 billion queries per day in 2022.[6] Gschwind is a pioneer and advocate of sustainable AI.[7]

Supercomputer Design

Gschwind was a chief architect for hardware design and software architecture for several supercomputers, including three top-ranked systems: Roadrunner (June 2008 – November 2009), Sequoia (June 2012 – November 2012), and Summit (June 2018 – June 2020).

Roadrunner was a supercomputer built by IBM for the Los Alamos National Laboratory in New Mexico, USA. The US$100-million Roadrunner was designed for a peak performance of 1.7 petaflops. It achieved 1.026 petaflops on May 25, 2008, becoming the world's first system to sustain 1.0 petaflops on the TOP500 LINPACK benchmark.[8][9] It was also the fourth most energy-efficient supercomputer in the world on the Green500 list, with an operational rate of 444.94 megaflops per watt of power used.

Sequoia was a petascale Blue Gene/Q supercomputer constructed by IBM for the National Nuclear Security Administration as part of the Advanced Simulation and Computing Program (ASC). It was delivered to the Lawrence Livermore National Laboratory (LLNL) in 2011 and was fully deployed in June 2012.[10] Sequoia was dismantled in 2020; its last position on the TOP500 list was #22, in November 2019.

Summit is a supercomputer developed by IBM for use at the Oak Ridge Leadership Computing Facility (OLCF), a facility at the Oak Ridge National Laboratory. It held the number 1 position on the TOP500 list from November 2018 to June 2020.[11][12] Its LINPACK benchmark performance is 148.6 petaFLOPS.[13]

Background

Gschwind was born in Vienna and obtained his doctorate in computer engineering at the Technische Universität Wien in 1996. He joined the IBM Thomas J. Watson Research Center in Yorktown Heights, NY, and also held positions in the IBM Systems product group and at IBM's corporate headquarters in Armonk, NY. At Huawei, Gschwind served as Vice President of Artificial Intelligence and Accelerated Systems. Gschwind is currently a director at Meta Platforms, where he has been responsible for AI acceleration and AI infrastructure.


  1. ^ David Becker (December 3, 2004). "PlayStation 3 chip goes easy on developers". CNET. Retrieved January 13, 2019.
  2. ^ Scarpino, M. (2008). Programming the cell processor: for games, graphics, and computation. Pearson Education.
  3. ^ First-Generation Inference Accelerator Deployment at Facebook, https://arxiv.org/pdf/2107.04140.pdf
  4. ^ Michael Gschwind, Driss Guessous, Christian Puhrsch, Accelerated PyTorch 2 Transformers, https://pytorch.org/blog/accelerated-pytorch-2/
  5. ^ Michael Gschwind, Eric Han, Scott Wolchok, Rui Zhu, Christian Puhrsch, A BetterTransformer for Fast Transformer Inference, https://pytorch.org/blog/a-better-transformer-for-fast-transformer-encoder-inference/
  6. ^ MultiRay: Optimizing efficiency for large-scale AI models, https://ai.meta.com/blog/multiray-large-scale-AI-models/
  7. ^ Sustainable AI: Environmental Implications, Challenges and Opportunities, Conference on Machine Learning and Systems (MLSys), https://research.facebook.com/publications/sustainable-ai-environmental-implications-challenges-and-opportunities/
  8. ^ Gaudin, Sharon (2008-06-09). "IBM's Roadrunner smashes 4-minute mile of supercomputing". Computerworld. Archived from the original on 2008-12-24. Retrieved 2008-06-10.
  9. ^ Fildes, Jonathan (2008-06-09). "Supercomputer sets petaflop pace". BBC News. Retrieved 2008-06-09.
  10. ^ NNSA awards IBM contract to build next generation supercomputer, February 3, 2009
  11. ^ Lohr, Steve (8 June 2018). "Move Over, China: U.S. Is Again Home to World's Speediest Supercomputer". The New York Times. Retrieved 19 July 2018.
  12. ^ "Top 500 List - November 2022". TOP500. November 2022. Retrieved 13 April 2022.
  13. ^ "November 2022 | TOP500 Supercomputer Sites". TOP500. Retrieved 13 April 2022.