This document lists the release notes for the MXNet-Neuron framework.
- Issue: MXNet Model Server is not able to clean up Neuron RTD states after a model is unloaded (deleted) from the model server.
- Workaround: run "/opt/aws/neuron/bin/neuron-cli reset" to clear Neuron RTD states after all models are unloaded and the server is shut down.
Date 09/22/2020
Various minor improvements.
- Issue: When MXNet is first imported into a Python process and a subprocess call is invoked, the user may get an OSError exception "OSError: [Errno 14] Bad address" during the subprocess call (see apache/mxnet#13875 for more details). This issue is fixed with a mitigation patch from MXNet for Open-MP fork race conditions.
- Workaround for earlier versions: export KMP_INIT_AT_FORK=false before running the Python process.
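On affected versions, the variable can be exported in the shell before launching the Python process, e.g.:

```shell
# Mitigate the Open-MP fork race on affected MXNet-Neuron versions
export KMP_INIT_AT_FORK=false
# then launch the Python process as usual, e.g.:
# python inference_script.py   (script name is a placeholder)
```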
Date 08/08/2020
Various minor improvements.
Date 08/05/2020
Various minor improvements.
Date 07/16/2020
This release contains a few bug fixes and user experience improvements.
- User can specify NEURONCORE_GROUP_SIZES without brackets (for example, "1,1,1,1"), as can be done in TensorFlow-Neuron and PyTorch-Neuron.
- Fixed a memory leak when inferring neuron subgraph properties
- Fixed a bug dealing with multi-input subgraphs
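As a sketch, both spellings of NEURONCORE_GROUP_SIZES are now accepted (the group sizes shown are illustrative):

```shell
# Either form now works; values are illustrative
export NEURONCORE_GROUP_SIZES="1,1,1,1"     # new: without brackets
export NEURONCORE_GROUP_SIZES="[1,1,1,1]"   # still accepted: with brackets
```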
Date 6/11/2020
- Added support for profiling during inference
- Profiling can now be enabled by specifying the profiling work directory via the NEURON_PROFILE environment variable during inference. For an example of using profiling, see Getting Started. (Note that the graph view of the MXNet graph is not available via TensorBoard.)
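A minimal sketch of enabling profiling (the directory name is arbitrary):

```shell
# Point NEURON_PROFILE at a work directory before running inference
mkdir -p ./neuron_profile
export NEURON_PROFILE=./neuron_profile
# run inference as usual; profile output is written under $NEURON_PROFILE
```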
Date 5/11/2020
Improved support for shared-memory communication with Neuron-Runtime.
- Added support for the BERT-Base model (base: L-12 H-768 A-12), with a maximum sequence length of 64 and a batch size of 8.
- Improved security for usage of shared-memory for data transfer between framework and Neuron-Runtime
- Improved allocation and cleanup of shared-memory resource
- Improved container support by automatic falling back to GRPC data transfer if shared-memory cannot be allocated by Neuron-Runtime
- Resolved: the user was unable to allocate the Neuron-Runtime shared-memory resource when using MXNet-Neuron in a container to communicate with Neuron-Runtime in another container. MXNet-Neuron now falls back automatically to GRPC data transfer if shared memory cannot be allocated by Neuron-Runtime.
- Fixed an issue where some large models could not be loaded on Inferentia.
Date 3/26/2020
No major changes or fixes
Date 2/27/2020
No major changes or fixes.
The following issues are resolved:
- Latest pip version 20.0.1 breaks installation of the MXNet-Neuron pip wheel, which has py2.py3 in the wheel name.
- User is unable to allocate the Neuron-Runtime shared-memory resource when using MXNet-Neuron in a container to communicate with Neuron-Runtime in another container. To work around this, set the environment variable NEURON_RTD_USE_SHM to 0.
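The historical container workaround amounted to disabling shared-memory transfer before starting the framework process, e.g.:

```shell
# Force GRPC data transfer instead of shared memory (historical workaround)
export NEURON_RTD_USE_SHM=0
```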
Date 1/27/2020
No major changes or fixes.
- The following issue is resolved when multi-model-server version >= 1.1.0 is used with MXNet-Neuron. You still need to run "/opt/aws/neuron/bin/neuron-cli reset" to clear all Neuron RTD states after multi-model-server exits:
- Issue: MXNet Model Server is not able to clean up Neuron RTD states after a model is unloaded (deleted) from the model server, and the previous workaround "/opt/aws/neuron/bin/neuron-cli reset" is unable to clear all Neuron RTD states.
- Latest pip version 20.0.1 breaks installation of the MXNet-Neuron pip wheel, which has py2.py3 in the wheel name. This breaks all existing released versions. The error looks like:
Looking in indexes: https://pypi.org/simple, https://pip.repos.neuron.amazonaws.com
ERROR: Could not find a version that satisfies the requirement mxnet-neuron (from versions: none)
ERROR: No matching distribution found for mxnet-neuron
- Workaround: install an older version of pip using "pip install pip==19.3.1".
Date 12/20/2019
No major changes or fixes. Released with other Neuron packages.
Date 12/1/2019
- Issue: Compiler flags cannot be passed to the compiler during the compile call. Fix: compiler flags can now be passed during the compile call using the "flags" option followed by a list of flags.
- Issue: The advanced CPU fallback option attempts to increase the number of operators running on Inferentia. It was previously on by default, which could cause failures. Fix: this option is now off by default.
- Issue: MXNet Model Server is not able to clean up Neuron RTD states after a model is unloaded (deleted) from the model server, and the previous workaround "/opt/aws/neuron/bin/neuron-cli reset" is unable to clear all Neuron RTD states.
- Workaround: run "sudo systemctl restart neuron-rtd" to clear Neuron RTD states after all models are unloaded and the server is shut down.
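As a sketch, the "flags" option can be passed as a compile argument in the same dict form used elsewhere in these notes; the flag string below is a placeholder, not a documented neuron-cc flag:

```python
# Hypothetical compile-argument dict using the "flags" option from these notes;
# the flag value is illustrative only. The surrounding compile call is assumed.
compile_args = {'flags': ['--example-flag']}
print(compile_args['flags'])
```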
Date: 11/25/2019
This version is available only in released DLAMI v26.0 and is based on MXNet version 1.5.1. Please update to the latest version.
- Issue: Compiler flags cannot be passed to compiler during compile call.
- Issue: The advanced CPU fallback option attempts to increase the number of operators running on Inferentia. It is currently on by default, which may cause failures.
- Workaround: explicitly turn it off by setting the compile option op_by_op_compiler_retry to 0.
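A minimal sketch of that workaround, using the compile-argument dict form shown under the excl_node_names option in these notes (the surrounding compile call is assumed):

```python
# Hypothetical compile-argument dict; only the option name comes from these
# notes, the compile call that consumes it is assumed.
compile_args = {'op_by_op_compiler_retry': 0}  # explicitly disable the fallback
print(compile_args['op_by_op_compiler_retry'])
```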
- Issue: Temporary files are put in current directory when debug is enabled.
- Workaround: create a separate work directory and run the process from within it.
- Issue: MXNet Model Server is not able to clean up Neuron RTD states after a model is unloaded (deleted) from the model server.
- Workaround: run "/opt/aws/neuron/bin/neuron-cli reset" to clear Neuron RTD states after all models are unloaded and the server is shut down.
- Issue: MXNet 1.5.1 may return inconsistent node names for some operators when they are the primary outputs of a Neuron subgraph. This causes failures during inference.
- Workaround: use the excl_node_names compilation option to change the partitioning of the graph during compile so that these nodes are not the primary output of a Neuron subgraph. See the MXNet-Neuron Compilation API:
compile_args = { 'excl_node_names': ["node_name_to_exclude"] }
The following models have been run successfully on Neuron-Inferentia systems:
- Resnet50 V1/V2
- Inception-V2/V3/V4
- Parallel-WaveNet
- Tacotron 2
- WaveRNN
- Python versions supported:
- 3.5, 3.6, 3.7
- Linux distribution supported:
- Ubuntu 16, Ubuntu 18, Amazon Linux 2