MXNet is a modern open-source deep learning framework used to train and deploy deep neural networks. It is scalable, allowing for fast model training, and supports a flexible programming model and multiple languages (C++, Python, Julia, MATLAB, JavaScript, Go, R, Scala, Perl, and Wolfram Language).

Developer(s): Distributed (Deep) Machine Learning Community
Written in: C++, Python, R, Julia, JavaScript, Scala, Go, Perl
Operating system: Windows, Linux
Type: Library for machine learning and deep learning
License: Apache 2.0

The MXNet library is portable and can scale to multiple GPUs[1] and multiple machines. MXNet is supported by major public cloud providers, including AWS[2] and Azure.[3] Amazon has chosen MXNet as its deep learning framework of choice at AWS.[4][5] MXNet is currently supported by Intel, Dato, Baidu, Microsoft, Wolfram Research, and research institutions such as Carnegie Mellon, MIT, the University of Washington, and the Hong Kong University of Science and Technology.[6]



Apache MXNet is a lean, flexible, and ultra-scalable deep learning framework that supports state-of-the-art deep learning models, including convolutional neural networks (CNNs) and long short-term memory networks (LSTMs).


MXNet is designed to be distributed on dynamic cloud infrastructure, using a distributed parameter server (based on [7]), and can achieve almost linear scaling across multiple GPUs or CPUs.
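The parameter-server idea mentioned above can be illustrated with a toy, single-process sketch in plain Python. It is not MXNet code: the names `ParameterServer`, `worker_gradient`, and `train`, the linear model, and the learning rate are all assumptions chosen for illustration. Each "worker" computes a gradient on its own data shard, and the server averages the gradients and updates the shared weights.

```python
# Toy illustration of the parameter-server pattern (hypothetical names,
# not the MXNet API). Everything runs in one process for simplicity.

class ParameterServer:
    """Holds the shared weights and applies averaged worker gradients."""

    def __init__(self, weights, lr):
        self.weights = list(weights)
        self.lr = lr

    def pull(self):
        # Workers fetch a copy of the current weights.
        return list(self.weights)

    def push(self, gradients):
        # gradients: one gradient vector per worker; average, then update.
        n = len(gradients)
        for i in range(len(self.weights)):
            avg = sum(g[i] for g in gradients) / n
            self.weights[i] -= self.lr * avg


def worker_gradient(weights, shard):
    """Gradient of the mean squared error of y = w*x + b on one shard."""
    w, b = weights
    gw = gb = 0.0
    for x, y in shard:
        err = (w * x + b) - y
        gw += 2 * err * x
        gb += 2 * err
    return [gw / len(shard), gb / len(shard)]


def train(shards, steps=2000, lr=0.1):
    server = ParameterServer([0.0, 0.0], lr)
    for _ in range(steps):
        weights = server.pull()
        grads = [worker_gradient(weights, shard) for shard in shards]
        server.push(grads)
    return server.pull()


# Synthetic data drawn from y = 2x + 1, split across two "workers".
data = [(i / 10.0, 2 * (i / 10.0) + 1) for i in range(10)]
shards = [data[0::2], data[1::2]]
w, b = train(shards)
```

In a real deployment the workers would run on separate devices or machines and communicate with the server over a network; the sketch only shows the pull, compute, push cycle.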


MXNet supports both imperative and symbolic programming, which makes it easier for developers who are used to imperative programming to get started with deep learning. It also makes it easier to track and debug training, save checkpoints, modify hyperparameters such as the learning rate, and perform early stopping.
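The difference between the two styles can be sketched in plain Python. The `Symbol` class and the helpers `variable` and `as_symbol` below are hypothetical and are not the MXNet API; they only show that imperative code computes values immediately, while symbolic code first builds a graph that is evaluated later with bound inputs.

```python
# Hypothetical mini-example contrasting the two styles (not MXNet code).

def imperative(a, b):
    """Imperative style: every statement executes immediately."""
    c = a * b  # c holds a concrete value and can be printed or inspected
    return c + 1


class Symbol:
    """Symbolic style: operators build a graph node instead of computing."""

    def __init__(self, op, args=(), value=None):
        self.op = op
        self.args = args
        self.value = value  # variable name for "var", number for "const"

    def __add__(self, other):
        return Symbol("add", (self, as_symbol(other)))

    def __mul__(self, other):
        return Symbol("mul", (self, as_symbol(other)))

    def evaluate(self, **bindings):
        # Walk the graph recursively once inputs are bound to values.
        if self.op == "var":
            return bindings[self.value]
        if self.op == "const":
            return self.value
        left, right = (arg.evaluate(**bindings) for arg in self.args)
        return left + right if self.op == "add" else left * right


def as_symbol(x):
    """Wrap plain numbers as constant nodes."""
    return x if isinstance(x, Symbol) else Symbol("const", value=x)


def variable(name):
    """Create a placeholder node that is bound at evaluation time."""
    return Symbol("var", value=name)


a, b = variable("a"), variable("b")
graph = a * b + 1  # nothing is computed yet; this only builds the graph

imperative_result = imperative(3, 4)
symbolic_result = graph.evaluate(a=3, b=4)
```

Both paths produce the same number; the symbolic version trades immediate inspectability for a whole-graph view that a framework could optimize before running.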

Multiple Languages

MXNet uses C++ for its optimized backend, to get the most out of the available GPU or CPU, and offers Python, R, Scala, Julia, Perl, MATLAB, and JavaScript as simple-to-use frontends for developers.


MXNet supports efficient deployment of a trained model to low-end devices for inference, such as mobile devices (using Amalgamation [[1]]), IoT devices (using AWS Greengrass), serverless platforms (using AWS Lambda), or containers. These low-end environments may have only a weaker CPU or limited memory (RAM), yet they should be able to use models that were trained in a higher-end environment (a GPU-based cluster, for example).

See also