Microsoft and Nvidia create 105-layer, 530 billion parameter language model that needs 280 A100 GPUs, but it's still biased


Nvidia and Microsoft have teamed up to create the Megatron-Turing Natural Language Generation model, which the duo claims is the "most powerful monolithic transformer language model trained to date".

The AI model has 105 layers and 530 billion parameters, and runs on large-scale supercomputer hardware such as Selene.

By comparison, the vaunted GPT-3 has 175 billion parameters.

"Each model replica spans 280 NVIDIA A100 GPUs, with 8-way tensor-slicing within a node, and 35-way pipeline parallelism across nodes," the pair said in a blog post.
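The figures in that quote fit together arithmetically: one model replica occupies tensor-parallel × pipeline-parallel GPUs. A minimal sketch of that breakdown, using only the numbers from the article (the constant and function names here are illustrative, not from Microsoft or Nvidia code):

```python
# Parallelism arithmetic for one MT-NLG model replica, per the quoted blog post.
# Names below are illustrative assumptions, not from the actual training code.

TENSOR_PARALLEL = 8      # 8-way tensor-slicing within a single node
PIPELINE_PARALLEL = 35   # 35-way pipeline parallelism across nodes

# Each replica spans tensor-parallel * pipeline-parallel GPUs.
gpus_per_replica = TENSOR_PARALLEL * PIPELINE_PARALLEL
print(gpus_per_replica)  # 280, matching the figure quoted above

def total_gpus(data_parallel_replicas: int) -> int:
    """Total GPUs when scaling out with data-parallel replicas (hypothetical helper)."""
    return gpus_per_replica * data_parallel_replicas
```

Data parallelism then multiplies this further: running several such 280-GPU replicas side by side is how training scales beyond a single copy of the model.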

The model was trained on 15 datasets containing 339 billion tokens, and demonstrated how larger models need less training to perform well.

However, the need to work with languages and samples from the real world meant an old problem with AI reappeared: bias.

"While giant language models are advancing the state of the art on language generation, they also suffer from issues such as bias and toxicity," the duo said.

"Our observations with MT-NLG are that the model picks up stereotypes and biases from the data on which it is trained. Microsoft and Nvidia are committed to working on addressing this problem."

It wasn't so long ago that Microsoft's chatbot Tay turned full Nazi in a matter of hours through interacting with the internet.
