Meta debuts next-generation Llama 3 LLM series and new chatbot features

admin 20th April 2024

Meta Platforms Inc. today debuted Llama 3, a new series of open-source large language models that the company says can outperform the competition across several task categories.

The first two LLMs in the lineup feature 8 billion and 70 billion parameters. Down the road, Meta plans to expand the series with additional models that will feature more than 400 billion parameters. In conjunction, the company is rolling out a new chatbot to its social media platforms that uses Llama 3 to answer user questions.

Upgraded LLM architecture

Llama 3 is the third iteration of a language model series that Meta first introduced in February 2023. The company trained the first-generation models on 1.4 trillion tokens, units of data that each contain a few letters or numbers. According to Meta, Llama 3 was trained on a dataset more than ten times that size, which included a significant amount of code and multilingual text.

Meta also made upgrades to the LLMs’ architecture. According to the company, many of the enhancements are designed to improve Llama 3’s hardware efficiency during inference.

Language models process text in phases. First, they break down the sentences in the user’s prompt into tokens, word fragments that each contain a few characters. Those tokens are then turned into mathematical structures called embeddings, on which the LLM carries out calculations. From there, the model translates the calculation results back into tokens and assembles those tokens into a response to the prompt.
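The phases described above can be illustrated with a deliberately tiny sketch. Everything in it is hypothetical: the five-entry vocabulary, the two-number embeddings and the function names are made up for illustration and have nothing to do with Llama 3's actual tokenizer or weights. The model's own calculations are skipped entirely, so the example simply turns text into tokens, looks up a vector for each token, and turns the tokens back into text.

```python
# Toy vocabulary of word fragments (hypothetical, not Llama 3's real vocabulary).
VOCAB = ["hel", "lo", " wor", "ld", "!"]
TOKEN_TO_ID = {token: index for index, token in enumerate(VOCAB)}

# One small made-up vector per token; real embeddings have thousands of dimensions.
EMBEDDINGS = [[float(i), float(i) * 0.5] for i in range(len(VOCAB))]


def tokenize(text):
    """Split text into token ids using greedy longest-match against VOCAB."""
    ids = []
    pos = 0
    while pos < len(text):
        candidates = (tok for tok in VOCAB if text.startswith(tok, pos))
        match = max(candidates, key=len, default=None)
        if match is None:
            raise ValueError(f"no token covers position {pos}")
        ids.append(TOKEN_TO_ID[match])
        pos += len(match)
    return ids


def embed(ids):
    """Look up the embedding vector for each token id."""
    return [EMBEDDINGS[i] for i in ids]


def detokenize(ids):
    """Join the token strings back into text."""
    return "".join(VOCAB[i] for i in ids)


ids = tokenize("hello world!")
vectors = embed(ids)          # what the model would run its calculations on
text = detokenize(ids)        # a real model would decode newly predicted ids here
print(ids, text)
```

A tokenizer that covers the same text with fewer tokens leaves the model with fewer vectors to process, which is one way a tokenizer can be "more efficient".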

Llama 3 features an improved tokenizer, the LLM component that translates user input into tokens. Meta says the mechanism is more efficient than its predecessor and thus helps improve the new models’ performance.

Llama 3 also implements grouped query attention, an improved version of the so-called attention mechanism that LLMs use to process text. A model’s attention mechanism allows it to take into account the context in which a word is used when interpreting its meaning. This context enables LLMs to understand text more accurately than earlier types of neural networks.
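In grouped query attention, several query heads share a single key/value head instead of each having its own, which shrinks the key/value cache a model has to keep in memory during inference. The sketch below shows the head-to-group mapping and the resulting cache arithmetic. The layer count, head counts, head size and sequence length are illustrative assumptions chosen for round numbers; they are not figures from Meta's announcement, and the function names are hypothetical.

```python
def kv_head_for_query_head(query_head, n_query_heads, n_kv_heads):
    """Return the index of the key/value head that a query head shares."""
    if n_query_heads % n_kv_heads != 0:
        raise ValueError("n_query_heads must be a multiple of n_kv_heads")
    group_size = n_query_heads // n_kv_heads
    return query_head // group_size


def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Size of the key/value cache for one sequence.

    Both keys and values are cached, hence the leading factor of 2.
    bytes_per_value=2 assumes 16-bit numbers.
    """
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_value


# Assumed, illustrative sizes (not taken from the article).
N_LAYERS = 32
N_QUERY_HEADS = 32
N_KV_HEADS = 8
HEAD_DIM = 128
SEQ_LEN = 8192

mapping = [
    kv_head_for_query_head(h, N_QUERY_HEADS, N_KV_HEADS)
    for h in range(N_QUERY_HEADS)
]

# Standard multi-head attention keeps one key/value head per query head.
full = kv_cache_bytes(N_LAYERS, N_QUERY_HEADS, HEAD_DIM, SEQ_LEN)
# Grouped query attention keeps only N_KV_HEADS key/value heads.
grouped = kv_cache_bytes(N_LAYERS, N_KV_HEADS, HEAD_DIM, SEQ_LEN)

print(mapping[:8])        # first eight query heads and the KV head each one uses
print(full // grouped)    # how many times smaller the grouped cache is
```

Under these assumed sizes, every four query heads share one key/value head, so the cache is a quarter of the size it would be with one key/value head per query head.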

Meta researchers evaluated the impact of the architectural upgrades on Llama 3’s performance using an internally developed benchmark test. They also tested the LLM series across five existing AI benchmarks. According to Meta, both editions of Llama 3 generated more accurate results than competing models with similar parameter counts.

“We developed a new high-quality human evaluation set,” the researchers detailed. “This evaluation set contains 1,800 prompts that cover 12 key use cases: asking for advice, brainstorming, classification, closed question answering, coding, creative writing, extraction, inhabiting a character/persona, open question answering, reasoning, rewriting, and summarization.”

Custom training infrastructure

Meta trained Llama 3 on two server clusters that each feature 24,000 graphics processing units. According to the company, its engineers developed a custom software platform that can detect technical issues in GPUs and automatically fix them. The software helped Meta cut downtime and thereby reduce the amount of GPU computing capacity that goes unused.

“Our most efficient implementation achieves a compute utilization of over 400 TFLOPS per GPU when trained on 16K GPUs simultaneously,” Meta detailed. “Combined, these improvements increased the efficiency of Llama 3 training by ~three times compared to Llama 2.”

Meta also developed a new data storage system to support the project. According to the company, the system is designed to reduce the amount of hardware resources necessary to perform rollbacks. In AI development, a rollback is the task of reverting an LLM that was trained incorrectly to an earlier version.

Meta is currently using the infrastructure with which it developed the first two Llama 3 models to train a set of larger, more advanced LLMs. Those models feature more than 400 billion parameters. They will be capable of processing longer prompts and generating higher-quality responses.

The code for the initial two Llama 3 models is available on GitHub. Additionally, Meta has made the LLMs accessible through Amazon Web Services Inc.’s Amazon SageMaker service, Google Cloud and Microsoft Azure. Meta Chief Executive Officer Mark Zuckerberg has indicated that the company will likely also open-source the Llama 3 versions with over 400 billion parameters once development is complete.

Upgraded chatbot features

Meta is using Llama 3 to power a new version of Meta AI, a chatbot that it rolled out to its Messenger chat tool last year. The new version of the chatbot will also be accessible through the search bars of Instagram, Facebook and WhatsApp. Additionally, Meta is integrating it into the Facebook Feed to let users ask questions about posts.

The most significant upgrade is rolling out to the chatbot’s image generation feature. Meta AI now generates sharper images and provides the option to turn those images into animated GIFs. Moreover, the feature works faster than before: Meta AI can update generated images in real time while users are typing a new prompt or rewriting an existing one.

The chatbot’s text processing capabilities have been enhanced as well. According to Meta, the AI can now fetch “real-time information” such as up-to-date product prices from around the web. The chatbot also lends itself to several other tasks including rewriting text and solving math problems.
