Microsoft's New Supercomputer will Train Large AI Models By CIOReviewIndia Team

Microsoft's New Supercomputer will Train Large AI Models

CIOReviewIndia Team | Wednesday, 20 May 2020, 09:40 IST

Microsoft’s New Supercomputer will Train Large AI Models

Microsoft has a new supercomputer that is the fifth-fastest in the world. The company claims that this will be the first supercomputer integrated with the cloud.

During the annual Microsoft Build 2020 conference, the organization announced that this supercomputer will be linked to Microsoft’s Azure infrastructure to train large AI models. It is a single system with more than 285000 CPU cores and 10000 GPUs, where each GPU server has 400 gigabits/second of network connectivity.

Microsoft’s supercomputer has been built-in collaboration with the US-based OpenAI, whose Machine Learning efficiency has been getting more efficient day by day, doubling every 16 months. This has outpaced even Moore’s law – according to a paper by Danny Hernandez and Tom Brown.

“We are seeing that larger-scale systems are an important component in training more powerful models,” says Sam Altman CEO of OpenAI.

Multi-tasking AI model to meet more than one end goal

Historically, data scientists have built smaller AI models, which use labelled examples to learn a single task such as translating language, recognizing objects or to deliver the day’s weather report.

In a multitasking AI model, one model is capable of meeting more than one end goal. For instance, if a model can understand the nuances of language such as the human intent behind a statement, then the same algorithm can also be used to analyse billions of pages of text, moderate chat content and generate code as well.

“This is about being able to do a hundred exciting things in natural language processing at once and a hundred exciting things in a computer vision, and when you start to see combinations of these perceptual domains, you’re going to have new applications that are hard to even imagine right now,” said Microsoft’s Chief Technical Officer, Kevin Scott, explaining how the application of the supercomputer of AI will give the birth of a new class of multi-tasking AI models.

Open-Sourcing Turing Models

The Turing Model of Microsoft, for natural language generation of Microsoft, is the largest publicly available AI language model in the world, with over 17 billion parameters and was released earlier this year. Microsoft today, announced that it will begin open-sourcing its Turing models along with how-to guides on how to train them using Azure’s Machine Learning.

“This has enabled things that were seemingly impossible with smaller models,” said Luis Vargas, a Microsoft partner technical advisor who is spearheading the company’s AI at Scale initiative.

And through this, not just Microsoft, but also other developers will be able to implement the models to improve language understanding across its products.