Orca, developed by Microsoft Research, is a 13-billion-parameter model that learns to imitate the reasoning process of large foundation models.


The remarkable zero-shot learning capabilities demonstrated by large foundation models (LFMs) like ChatGPT and GPT-4 have sparked a question: can these models supervise their own behavior, or that of other models, with minimal human intervention? To explore this, a team of Microsoft researchers introduces Orca, a 13-billion-parameter model that learns complex explanation traces and step-by-step thought processes from GPT-4. This approach yields significant gains over existing state-of-the-art instruction-tuned models while addressing challenges related to task diversity, query complexity, and data scaling.

The researchers acknowledge that query and response pairs from GPT-4 can provide valuable guidance for student models. They therefore augment these pairs with detailed explanations that expose the reasoning the teacher model follows when generating its responses. By training on these explanation traces, Orca, the student model, gains stronger reasoning and comprehension skills, narrowing the gap between teacher and student.
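As a rough illustration of what an augmented training example might look like, here is a minimal Python sketch. It is not the researchers' code. The `ExplanationExample` class, the `build_example` and `to_training_text` helpers, the section markers, and the system instruction that asks the teacher to explain its reasoning are all assumptions made for illustration, and the teacher is a stub standing in for a real call to a model such as GPT-4.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ExplanationExample:
    """One training example: a query plus the teacher's explained answer."""
    system_instruction: str  # hypothetical instruction asking for step-by-step reasoning
    query: str
    teacher_response: str


def build_example(
    query: str,
    system_instruction: str,
    teacher: Callable[[str, str], str],
) -> ExplanationExample:
    """Ask the teacher for an explained answer and package it for training."""
    response = teacher(system_instruction, query)
    return ExplanationExample(system_instruction, query, response)


def to_training_text(example: ExplanationExample) -> str:
    """Flatten an example into one string (the markers are an assumed format)."""
    return (
        f"### System:\n{example.system_instruction}\n"
        f"### User:\n{example.query}\n"
        f"### Assistant:\n{example.teacher_response}"
    )


def stub_teacher(system_instruction: str, query: str) -> str:
    """Stand-in for a real teacher model; returns a canned explanation trace."""
    return "Step 1: restate the question. Step 2: reason it through. Answer: 4."


example = build_example(
    query="What is 2 + 2?",
    system_instruction="Think step by step and justify your answer.",
    teacher=stub_teacher,
)
print(to_training_text(example))
```

The point of the sketch is that the student sees the teacher's intermediate reasoning as part of the target text, not just the final answer.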

To further enrich Orca’s training data, the research team draws on the Flan 2022 Collection. The team samples tasks from this extensive collection to ensure a diverse mix of challenges. These tasks are then sub-sampled to generate complex prompts, which serve as queries for LFMs. The result is a diverse and rich training set that supports robust learning and helps Orca tackle a wide range of tasks.
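The two-stage sampling described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `sample_queries` function, its parameters, and the toy collection are hypothetical, uniform random sampling is assumed because the article does not specify the actual sampling scheme, and no real Flan 2022 data is loaded.

```python
import random
from typing import Dict, List, Tuple


def sample_queries(
    collection: Dict[str, List[str]],
    num_tasks: int,
    queries_per_task: int,
    seed: int = 0,
) -> List[Tuple[str, str]]:
    """Pick a subset of tasks, then sub-sample queries within each chosen task."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    # Stage 1: choose which tasks to include (sorted for a stable order).
    task_names = rng.sample(sorted(collection), k=min(num_tasks, len(collection)))
    sampled: List[Tuple[str, str]] = []
    for name in task_names:
        pool = collection[name]
        # Stage 2: cap the number of queries drawn from each task.
        for query in rng.sample(pool, k=min(queries_per_task, len(pool))):
            sampled.append((name, query))
    return sampled


# Toy stand-in for a task collection; the task names and queries are made up.
toy_collection = {
    "arithmetic": ["What is 7 * 8?", "What is 12 - 5?", "What is 9 + 6?"],
    "summarization": ["Summarize: the cat sat on the mat."],
    "translation": ["Translate 'hello' to French.", "Translate 'cat' to Spanish."],
}

for task, query in sample_queries(toy_collection, num_tasks=2, queries_per_task=2):
    print(task, "->", query)
```

Each sampled query would then be sent to the teacher model, and the returned explanation becomes the training target.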

The researchers conduct comprehensive evaluations to assess Orca’s capabilities, focusing on generative, reasoning, and comprehension abilities. They compare Orca’s performance against strong baselines such as Text-Davinci-003, ChatGPT, GPT-4, and Vicuna. The results show Orca outperforming state-of-the-art instruction-tuned models like Vicuna-13B, with a relative improvement of over 100% on Big-Bench Hard (BBH), meaning it more than doubles Vicuna-13B's score. Furthermore, Orca exhibits competitive performance on academic exams in zero-shot settings, indicating its potential for real-world applications.

The research findings support the potential of learning from step-by-step explanations to improve model performance. By incorporating detailed explanation traces and scaling tasks with complex prompts, Orca achieves significant gains over earlier instruction-tuned models. This approach not only strengthens a student model's reasoning and comprehension abilities but also enables it to surpass existing baselines on established benchmarks.

The introduction of Orca and its successful application in improving instruction-tuned models present exciting prospects for future research. As LFMs continue to evolve, self-supervised learning mechanisms and the ability to supervise other models with minimal human intervention could revolutionize the field of artificial intelligence. By refining the learning process from complex explanation traces, researchers can continue enhancing model performance across various tasks, driving advancements in natural language processing.

In conclusion, the introduction of Orca, a 13-billion-parameter model that learns explanation traces from GPT-4, represents a significant step forward for instruction-tuned models. Through explanation tuning and the scaling of tasks and instructions, Orca surpasses existing instruction-tuned models in the researchers' evaluations. Incorporating step-by-step explanations in training processes holds promise for unlocking more of the potential of large foundation models and driving progress in natural language processing.