Large AI models and applications, such as ChatGPT and GPT-4, have become increasingly popular worldwide, with many experts from academia…
Databricks presents Dolly, a low-cost LLM that demonstrates surprisingly high levels of the instruction-following ability seen in ChatGPT. This work shows that anyone with access to high-quality training data and a dated open-source large language model (LLM) can train it to exhibit ChatGPT-like behavior in under 30 minutes on a single machine. Dolly lightly fine-tunes an existing open-source 6-billion-parameter model from EleutherAI on data from Alpaca to elicit instruction-following capabilities such as brainstorming and text generation.
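The Alpaca data Dolly trains on consists of (instruction, optional input, output) records that are rendered into flat text prompts before fine-tuning. As a minimal sketch, the helper below renders one such record using the publicly documented Alpaca-style prompt template; the exact wording Dolly uses may differ slightly, so treat the template strings as illustrative assumptions.

```python
# Illustrative Alpaca-style prompt templates (assumed wording; the actual
# template used by Dolly's training code may vary).
PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n"
)

PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)


def build_training_text(example: dict) -> str:
    """Render one instruction record into the flat text string that a
    causal LM would be fine-tuned on (prompt followed by target output)."""
    if example.get("input"):
        prompt = PROMPT_WITH_INPUT.format(**example)
    else:
        prompt = PROMPT_NO_INPUT.format(instruction=example["instruction"])
    return prompt + example["output"]
```

Each rendered string can then be tokenized and fed to a standard causal-language-modeling fine-tuning loop; no architectural changes to the base model are needed.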
Many factors make it preferable for a business to build its own LLM rather than send data to a centralized LLM provider that serves a proprietary model hidden behind an API. For instance, many businesses may be hesitant to hand over their most valuable intellectual property, the problems and datasets that stand to benefit most from AI, to a third party. Companies may also have differing priorities regarding model quality, cost, and desired behavior. The team believes that owning one's models is the best long-term strategy for most ML users.
This work finds that even years-old open-source models with much earlier architectures exhibit striking behaviors when fine-tuned on a small corpus of instruction-training data.
Dolly’s success is all the more remarkable because the two-year-old model behind it has only 6 billion parameters, compared with 175 billion in GPT-3. This suggests that targeted corpora of instruction-following training data, rather than larger or better-tuned base models, may be responsible for much of the qualitative gains in state-of-the-art models like ChatGPT.
In evaluating Dolly’s instruction-following skills, the researchers found that it exhibits many of the qualitative capabilities described in the InstructGPT paper on which ChatGPT is based, including text generation, brainstorming, and open-ended QA. Rather than emphasizing the quality of the generated text, these examples highlight the significant gain in instruction-following capability that can be achieved by fine-tuning a years-old open-source model on a small, high-quality dataset.
The team has published Dolly’s source code to demonstrate how to recreate it on Databricks. With models like Dolly, they anticipate that LLMs will become more accessible, going from a luxury that only a select few businesses can afford to a standard tool that every business can use and tailor to improve its products.
Check out the GitHub and Reference Article. All credit for this research goes to the researchers on this project.
Tanushree Shenwai is a consulting intern at MarktechPost. She is currently pursuing her B.Tech from the Indian Institute of Technology (IIT), Bhubaneswar. She is a Data Science enthusiast and has a keen interest in the scope of application of artificial intelligence in various fields. She is passionate about exploring new advancements in technology and their real-life applications.