Nvidia is boasting of a breakthrough in conversation natural language processing (NLP) training and inference, enabling more complex interchanges between customers and chatbots with immediate responses.
The need for such technology is expected to grow, as digital voice assistants alone are expected to climb from 2.5 billion to 8 billion within the next four years, according to Juniper Research, while Gartner predicts that by 2021, 15% of all customer service interactions will be completely handled by AI, an increase of 400% from 2017.
The company said its DGX-2 AI platform trained the BERT-Large AI language model in less than an hour and performed AI inference in 2+ milliseconds, making it possible “for developers to use state-of-the-art language understanding for large-scale applications.”
BERT, or Bidirectional Encoder Representations from Transformers, is a Google-powered AI language model that many developers say has better accuracy than humans in some performance evaluations. It’s all discussed here.
Nvidia sets natural language processing records
All told, Nvidia is claiming three NLP records:
1. Training: Running the largest version of the BERT language model, a Nvidia DGX SuperPOD with 92 Nvidia DGX-2H systems running 1,472 V100 GPUs cut training from several days to 53 minutes. A single DGX-2 system, which is about the size of a tower PC, trained BERT-Large in 2.8 days.
“The quicker we can train a model, the more models we can train, the more we learn about the problem, and the better the results get,” said Bryan Catanzaro, vice president of applied deep learning research, in a statement.
2. Inference: Using Nvidia T4 GPUs on its TensorRT deep learning inference platform, Nvidia performed inference on the BERT-Base SQuAD dataset in 2.2 milliseconds, well under the 10 millisecond processing threshold for many real-time applications, and far ahead of the 40 milliseconds measured with highly optimized CPU code.
3. Model: Nvidia said its new custom model, called Megatron, has 8.3 billion parameters, making it 24 times larger than the BERT-Large and the world’s largest language model based on Transformers, the building block used for BERT and other natural language AI models.
In a move sure to make FOSS advocates happy, Nvidia is also making a ton of source code available via GitHub.
- NVIDIA GitHub BERT training code with PyTorch
- NGC model scripts and check-points for TensorFlow
- TensorRT optimized BERT Sample on GitHub
- Faster Transformer: C++ API, TensorRT plugin, and TensorFlow OP
- MXNet Gluon-NLP with AMP support for BERT (training and inference)
- TensorRT optimized BERT Jupyter notebook on AI Hub
- Megatron-LM: PyTorch code for training massive Transformer models
Not that any of this is easily consumed. We’re talking very advanced AI code. Very few people will be able to make heads or tails of it. But the gesture is a positive one.