
that are hard to even imagine right now,” he said.

A new class of multitasking AI models
Machine learning experts have historically built separate, smaller AI models that use many labeled examples to learn a single task such as translating between languages, recognizing objects, reading text to identify key points in an email or recognizing speech well enough to deliver today’s weather report when asked. A new class of models developed by the AI research community has proven that some of those tasks can be performed better by a single massive model — one that learns from examining billions of pages of publicly available text, for example. This type of model can so deeply absorb the nuances of language, grammar, knowledge, concepts and context that it can excel at multiple tasks: summarizing a lengthy speech, moderating content in live gaming chats, finding relevant passages across thousands of legal files or even generating code from scouring GitHub.
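To make the idea of one pretrained model serving many tasks concrete, here is a minimal sketch using the open source Hugging Face transformers library; the library, models and sample inputs are illustrative assumptions and are not the Microsoft Turing models described in the article.

# Illustrative only: reuses pretrained language models for several of the
# tasks listed above via the Hugging Face "transformers" library (an
# assumption for demonstration, not the article's Microsoft Turing models).
from transformers import pipeline

summarizer = pipeline("summarization")       # condense a lengthy speech
moderator = pipeline("sentiment-analysis")   # rough stand-in for chat moderation
qa = pipeline("question-answering")          # surface relevant passages in documents

speech = (
    "Thank you all for joining today. Over the past year our team has grown, "
    "our platform has scaled to millions of users, and we have invested heavily "
    "in research that we believe will define the next decade of our industry."
)
print(summarizer(speech, max_length=40, min_length=10)[0]["summary_text"])

chat_line = "this game is rigged and everyone here knows it"
print(moderator(chat_line)[0])               # label plus confidence score

contract = "Either party may terminate this agreement with thirty days written notice."
print(qa(question="How can the agreement be terminated?", context=contract)["answer"])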
As part of a companywide AI at Scale initiative, Microsoft has developed its own family of large AI models, the Microsoft Turing models, which it has used to improve many different language understanding tasks across Bing, Office, Dynamics and other productivity products. Earlier this year, it also released to researchers the largest publicly available AI language model in the world, the Microsoft Turing model for natural language generation. The goal, Microsoft says, is to make its large AI models, training optimization tools and supercomputing resources available through Azure AI services and GitHub so developers, data scientists and business customers can easily leverage the power of AI at Scale.
“By now most people intuitively understand how personal computers are a platform — you buy one and it’s not like everything the computer is ever going to do is built into the device when you pull it out of the box,” Scott said. “That’s exactly what we mean when we say AI is becoming a platform,” he said. “This is about taking a very broad set of data and training a model that learns to do a general set of things and making that model available for millions of developers to go figure out how to do interesting and creative things with.”

Training massive AI models requires advanced supercomputing infrastructure, or clusters of state-of-the-art hardware connected by high-bandwidth networks. It also needs tools to train the models across these interconnected computers.
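The article notes that tooling is needed to train a single model across many interconnected machines but does not show what that looks like in code. The sketch below is a minimal illustration using PyTorch’s built-in torch.distributed and DistributedDataParallel; the model, launch setup and hyperparameters are assumptions, not the tooling Microsoft describes.

# Minimal multi-machine training sketch using PyTorch DistributedDataParallel.
# Each process is started by a launcher (e.g. torchrun) that sets RANK,
# WORLD_SIZE and LOCAL_RANK; NCCL moves gradients over the cluster network.
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)
    device = f"cuda:{local_rank}"

    model = torch.nn.Linear(1024, 1024).to(device)   # stand-in for a large model
    model = DDP(model, device_ids=[local_rank])      # syncs gradients across ranks
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

    for step in range(10):
        x = torch.randn(32, 1024, device=device)     # stand-in for a data batch
        loss = model(x).pow(2).mean()
        loss.backward()          # gradient all-reduce happens during backward
        optimizer.step()
        optimizer.zero_grad()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()

Launched with, for example, torchrun --nnodes=2 --nproc_per_node=8 train.py, the same script runs unchanged on one machine or on many.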
The supercomputer developed for OpenAI is a single system with more than 285,000 CPU cores, 10,000 GPUs and 400 gigabits per second of network connectivity for each GPU server. Compared with other machines listed on the TOP500 supercomputers in the world, it ranks in the top five, Microsoft says. Hosted in Azure, the supercomputer also benefits from all the capabilities of a robust modern cloud infrastructure, including rapid deployment, sustainable datacenters and access to Azure services. “As we’ve learned more and more about what we need and the different limits of all the components that make up a supercomputer, we were really able to say, ‘If we could design our dream system, what would it look like?’” said OpenAI CEO Sam Altman. “And then Microsoft was able to build it.” OpenAI’s goal is not just to pursue research breakthroughs but also to engineer and develop powerful AI technologies that other people can use, Altman said. The supercomputer developed in partnership with Microsoft was designed to accelerate that cycle. “We are seeing that larger-scale systems are an important component in training more powerful models,” Altman said.

For customers who want to push their AI ambitions but who don’t require a dedicated supercomputer, Azure AI provides access to powerful compute with the same set of AI accelerators and networks that also power the supercomputer. Microsoft is also making available the tools to train large AI models on these clusters in a distributed and optimized way. At its Build conference, Microsoft announced that it would soon begin open sourcing its Microsoft Turing models, as well as recipes for training them in Azure Machine Learning. This will give developers access to the same family of powerful language models that the company has used to improve language understanding across its products. It also unveiled a new version of DeepSpeed, an open source deep learning library for PyTorch that reduces the amount of computing power needed for large distributed model training. The update is significantly more efficient than the version released just three months ago and now allows people to train models more than 15 times larger and 10 times faster than they could without DeepSpeed on the same infrastructure.
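As a rough illustration of how DeepSpeed is typically dropped into an existing PyTorch training loop, here is a minimal sketch; the model, configuration values and batch sizes are assumptions for demonstration and are not taken from the article.

# Minimal DeepSpeed sketch (illustrative; config values are assumed).
# Run with the DeepSpeed launcher, e.g.: deepspeed train.py
import torch
import deepspeed

model = torch.nn.Sequential(
    torch.nn.Linear(1024, 4096),
    torch.nn.ReLU(),
    torch.nn.Linear(4096, 1024),
)

ds_config = {
    "train_batch_size": 4,
    "fp16": {"enabled": True},            # mixed precision cuts memory use
    "zero_optimization": {"stage": 1},    # partition optimizer state across GPUs
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
}

# deepspeed.initialize wraps the model in a distributed training engine that
# handles data parallelism, mixed precision and optimizer state partitioning.
engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)

for step in range(10):
    x = torch.randn(4, 1024, device=engine.device).half()   # stand-in batch
    loss = engine(x).pow(2).mean()
    engine.backward(loss)   # DeepSpeed manages loss scaling and communication
    engine.step()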
Along with the DeepSpeed announcement, Microsoft announced it has added support for distributed training to the ONNX Runtime. The ONNX Runtime is an open source library designed to enable models to be portable across hardware and operating systems. To date, the ONNX Runtime has focused on high-performance inferencing; today’s update adds support for model training, as well as adding the optimizations from the DeepSpeed library, which enable performance improvements of up to 17 times over the current ONNX Runtime.
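For context on the high-performance inferencing that ONNX Runtime has focused on to date, the following is a minimal sketch of loading and running an exported model; the file name and tensor shapes are assumptions for illustration.

# Illustrative ONNX Runtime inferencing sketch (file name and shapes assumed).
import numpy as np
import onnxruntime as ort

# "model.onnx" stands in for any model exported to the ONNX format,
# e.g. from PyTorch via torch.onnx.export(...).
session = ort.InferenceSession("model.onnx")

input_name = session.get_inputs()[0].name
x = np.random.randn(1, 1024).astype(np.float32)

# Run the graph; ONNX Runtime selects an execution provider for the
# available hardware (CPU by default, GPU with a CUDA build).
outputs = session.run(None, {input_name: x})
print(outputs[0].shape)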
“We want to be able to build these very advanced AI technologies that ultimately can be easily used by people to help them get their work done and accomplish their goals more quickly,” said Microsoft Principal Program Manager Phil Waymouth. “These large models are going to be an enormous accelerant.”
