Page 36 - SAMENA Trends - May 2020
REGIONAL & MEMBERS UPDATES
that are hard to even imagine right now,” he said.

A new class of multitasking AI models

Machine learning experts have historically built separate, smaller AI models that use many labeled examples to learn a single task, such as translating between languages, recognizing objects, reading text to identify key points in an email or recognizing speech well enough to deliver today’s weather report when asked. A new class of models developed by the AI research community has proven that some of those tasks can be performed better by a single massive model — one that learns from examining billions of pages of publicly available text, for example. This type of model can so deeply absorb the nuances of language, grammar, knowledge, concepts and context that it can excel at multiple tasks: summarizing a lengthy speech, moderating content in live gaming chats, finding relevant passages across thousands of legal files or even generating code by scouring GitHub.

As part of a companywide AI at Scale initiative, Microsoft has developed its own family of large AI models, the Microsoft Turing models, which it has used to improve many different language understanding tasks across Bing, Office, Dynamics and other productivity products. Earlier this year, it also released to researchers the largest publicly available AI language model in the world, the Microsoft Turing model for natural language generation. The goal, Microsoft says, is to make its large AI models, training optimization tools and supercomputing resources available through Azure AI services and GitHub so developers, data scientists and business customers can easily leverage the power of AI at Scale.

“By now most people intuitively understand how personal computers are a platform — you buy one and it’s not like everything the computer is ever going to do is built into the device when you pull it out of the box,” Scott said. “That’s exactly what we mean when we say AI is becoming a platform. This is about taking a very broad set of data and training a model that learns to do a general set of things and making that model available for millions of developers to go figure out how to do interesting and creative things with.”

Training massive AI models requires advanced supercomputing infrastructure, or clusters of state-of-the-art hardware connected by high-bandwidth networks. It also needs tools to train the models across these interconnected computers. The supercomputer developed for OpenAI is a single system with more than 285,000 CPU cores, 10,000 GPUs and 400 gigabits per second of network connectivity for each GPU server. Compared with other machines listed on the TOP500 supercomputers in the world, it ranks in the top five, Microsoft says. Hosted in Azure, the supercomputer also benefits from all the capabilities of a robust modern cloud infrastructure, including rapid deployment, sustainable datacenters and access to Azure services.

“As we’ve learned more and more about what we need and the different limits of all the components that make up a supercomputer, we were really able to say, ‘If we could design our dream system, what would it look like?’” said OpenAI CEO Sam Altman. “And then Microsoft was able to build it.” OpenAI’s goal is not just to pursue research breakthroughs but also to engineer and develop powerful AI technologies that other people can use, Altman said. The supercomputer developed in partnership with Microsoft was designed to accelerate that cycle. “We are seeing that larger-scale systems are an important component in training more powerful models,” Altman said.

For customers who want to push their AI ambitions but who don’t require a dedicated supercomputer, Azure AI provides access to powerful compute with the same set of AI accelerators and networks that also power the supercomputer. Microsoft is also making available the tools to train large AI models on these clusters in a distributed and optimized way.

At its Build conference, Microsoft announced that it would soon begin open sourcing its Microsoft Turing models, as well as recipes for training them in Azure Machine Learning. This will give developers access to the same family of powerful language models that the company has used to improve language understanding across its products. It also unveiled a new version of DeepSpeed, an open source deep learning library for PyTorch that reduces the amount of computing power needed for large distributed model training. The update is significantly more efficient than the version released just three months ago and now allows people to train models more than 15 times larger and 10 times faster than they could without DeepSpeed on the same infrastructure.

Along with the DeepSpeed announcement, Microsoft announced it has added support for distributed training to the ONNX Runtime, an open source library designed to enable models to be portable across hardware and operating systems. To date, the ONNX Runtime has focused on high-performance inferencing; today’s update adds support for model training, as well as the optimizations from the DeepSpeed library, which enable performance improvements of up to 17 times over the current ONNX Runtime.

“We want to be able to build these very advanced AI technologies that ultimately can be easily used by people to help them get their work done and accomplish their goals more quickly,” said Microsoft Principal Program Manager Phil Waymouth. “These large models are going to be an enormous accelerant.”
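DeepSpeed itself is driven by a JSON configuration supplied at launch. A sketch of such a config is below — the keys (batch size, mixed precision, and the ZeRO memory optimizations behind the "15 times larger" claim) follow DeepSpeed's documented options, but the values are illustrative, not a recommended setup:

```json
{
  "train_batch_size": 512,
  "fp16": { "enabled": true },
  "zero_optimization": { "stage": 2 },
  "optimizer": {
    "type": "Adam",
    "params": { "lr": 1e-4 }
  }
}
```

ZeRO partitions optimizer state and gradients across workers instead of replicating them, which is how the library fits much larger models into the same GPU memory.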
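The distributed training the article describes splits each batch across many interconnected GPU servers, with each worker computing gradients on its own shard and the results averaged before every weight update. A minimal pure-Python sketch of that data-parallel pattern, using a toy one-parameter linear model — all function names here are illustrative, not part of Microsoft's or DeepSpeed's actual APIs:

```python
# Toy data-parallel training: each "worker" computes a gradient on its
# shard of the data; gradients are averaged (an all-reduce) before the
# shared update. Real systems do this across GPU servers connected by
# high-bandwidth networks; here the workers run sequentially.

def local_gradient(w, shard):
    # Mean-squared-error gradient for the model y = w * x on one shard.
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

def all_reduce_mean(grads):
    # Stand-in for the collective op that averages gradients across workers.
    return sum(grads) / len(grads)

def train(data, num_workers=4, lr=0.01, steps=200):
    w = 0.0
    # Round-robin sharding: worker i sees data[i], data[i+4], ...
    shards = [data[i::num_workers] for i in range(num_workers)]
    for _ in range(steps):
        grads = [local_gradient(w, s) for s in shards]  # parallel in reality
        w -= lr * all_reduce_mean(grads)                # synchronized update
    return w

# Fit y = 3x from noiseless samples; every worker trains on a different shard.
data = [(x, 3.0 * x) for x in range(1, 9)]
print(round(train(data), 3))  # converges to 3.0
```

Because the shards are equal-sized, averaging the per-worker gradients reproduces the full-batch gradient exactly, which is why the distributed run matches single-machine training.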