Microsoft Open-Sources LMOps: An AI Prompt Optimization Toolkit for Generative AI Models

Thanks to recent technological advances and their strong performance, large language models (LLMs) have been widely adopted by the general public. For instance, tools like ChatGPT can compose lengthy responses to user-provided questions and can help authors and other writers improve their writing style. Text-to-image generative models such as Stable Diffusion, on the other hand, can generate striking images directly from user input. Although these models produce very different kinds of output, they share one feature: they all take text prompts as input.

Researchers have found that writing high-quality prompts is one of the quickest and most effective ways to get better outputs. This is where prompt engineering comes into play. In natural language processing, prompt engineering refers to the process of finding inputs that yield preferable or useful outcomes. For text-to-image models, this often means extending the text input with cues for an artistic style or design elements, such as lighting. For example, a better prompt may be “a deserted city with empty buildings, vegetation, high definition, high quality, photo-realistic, ultra-realistic, 4k” rather than “a deserted city with empty buildings.”
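
The manual version of this idea can be sketched in a few lines of Python. This is only an illustration of appending style cues to a base prompt by hand; the function name `enrich_prompt` and the modifier list are assumptions made for this example and are not part of LMOps.

```python
def enrich_prompt(base_prompt, modifiers):
    """Append style modifiers to a base prompt, skipping blanks and duplicates.

    Hypothetical helper for illustration only; not an LMOps API.
    """
    parts = [base_prompt.strip().rstrip(".")]
    seen = {parts[0].lower()}
    for modifier in modifiers:
        modifier = modifier.strip()
        if modifier and modifier.lower() not in seen:
            parts.append(modifier)
            seen.add(modifier.lower())
    return ", ".join(parts)


base = "a deserted city with empty buildings"
modifiers = [
    "vegetation",
    "high definition",
    "high quality",
    "photo-realistic",
    "ultra-realistic",
    "4k",
]
print(enrich_prompt(base, modifiers))
```

Hand-written rules like this are exactly what automatic prompt optimization tries to replace with a learned model.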

Considerable time and effort is currently being spent on building prompt engineering tools that help create better text prompts as model inputs. As a major step in this direction, Microsoft Research recently released LMOps, a set of tools for enhancing the text prompts used as input for generative AI models. LMOps is a research initiative on the fundamental research and technology for building AI products with foundation models, focusing on the underlying technology for enabling AI capabilities with LLMs and generative AI models. The toolkit includes Promptist, a prompt interface that optimizes user text input for text-to-image generation, and Structured Prompting, a method for including more examples in a few-shot learning prompt for text generation.

For Promptist, the Microsoft researchers developed a language model that automatically optimizes text prompts for text-to-image generation. The model is trained in two stages. First, the team fine-tuned a pretrained language model with supervised learning on a set of manually optimized prompts. Next, reinforcement learning was used to train the model further. Because reinforcement learning depends on a reward function, the team fed the optimized prompts to the text-to-image generator and scored the generated images on “relevance and aesthetics” using CLIP. The final model was assessed manually by a team of researchers who, in most cases, preferred the images produced by the optimized prompt to those produced by the original prompt.
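
A reward of this kind might be sketched as follows. This is a minimal illustration, not the researchers' implementation: the function `prompt_reward`, the simple weighted sum, and the assumption that relevance is measured against the user's original prompt are all choices made for this example. The `toy_*` functions are word-counting stand-ins for a real text-to-image generator and CLIP-based scorers, included only so the sketch runs on its own.

```python
def prompt_reward(original_prompt, optimized_prompt, generate_image,
                  relevance_score, aesthetic_score, aesthetic_weight=1.0):
    """Score an optimized prompt by the image it produces (hypothetical sketch).

    generate_image, relevance_score and aesthetic_score are caller-supplied
    callables; in practice they would wrap an image generator and scorers.
    """
    image = generate_image(optimized_prompt)
    relevance = relevance_score(image, original_prompt)  # stays on topic?
    aesthetics = aesthetic_score(image)                   # looks good?
    return relevance + aesthetic_weight * aesthetics


def _words(text):
    return set(text.lower().replace(",", " ").split())


def toy_generate(prompt):
    # Stand-in "image": just the set of words in the prompt.
    return _words(prompt)


def toy_relevance(image, prompt):
    # Stand-in relevance: fraction of the original words found in the "image".
    words = _words(prompt)
    return len(words & image) / len(words) if words else 0.0


def toy_aesthetic(image):
    # Stand-in aesthetics: fraction of a fixed style vocabulary that is present.
    style_words = {"4k", "photo-realistic", "detailed"}
    return len(style_words & image) / len(style_words)


original = "a deserted city"
optimized = "a deserted city, photo-realistic, 4k"
baseline = prompt_reward(original, original, toy_generate,
                         toy_relevance, toy_aesthetic)
improved = prompt_reward(original, optimized, toy_generate,
                         toy_relevance, toy_aesthetic)
print(baseline, improved)
```

In a reinforcement learning loop, a score like `improved` would be the signal used to update the prompt-rewriting model, so that rewrites which keep the user's intent and improve image quality become more likely.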

The Microsoft researchers also addressed one of the primary limitations of LLM input sequences: the longest input that an LLM can handle is often limited to a few thousand tokens. Microsoft’s Structured Prompting overcomes this restriction and supports hundreds of examples. To achieve this, the examples are first concatenated into groups, and each group is fed into the model separately. The hidden key and value vectors of the model’s attention modules are cached. When the user’s unaltered input prompt is then passed to the model, its attention modules attend to these cached vectors. According to the researchers, this approach outperforms conventional few-shot prompting on several NLP tasks.
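
The grouping and caching steps can be illustrated with a toy, single-head attention in plain Python. This is a rough sketch under heavy assumptions, not the released implementation: `toy_key_value` is a made-up stand-in for a transformer layer, and positional information, multiple heads and layers, and any rescaling of attention weights are all left out.

```python
import math


def chunk(examples, group_size):
    """Split the demonstration examples into fixed-size groups."""
    return [examples[i:i + group_size]
            for i in range(0, len(examples), group_size)]


def toy_key_value(token):
    """Hypothetical encoder: derive a 2-d key and a 2-d value from a token."""
    vowels = sum(ch in "aeiou" for ch in token.lower())
    key = [len(token) / 10.0, vowels / 10.0]
    value = [float(len(token)), 1.0]
    return key, value


def encode_group(group):
    """Encode one group independently and return its key/value pairs."""
    return [toy_key_value(tok) for example in group for tok in example.split()]


def attend(query, cached_groups):
    """Let the query attend over the cached keys/values of every group."""
    pairs = [kv for group in cached_groups for kv in group]
    scores = [sum(q * k for q, k in zip(query, key)) for key, _ in pairs]
    peak = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    dim = len(pairs[0][1])
    return [sum(w * value[d] for w, (_, value) in zip(weights, pairs)) / total
            for d in range(dim)]


examples = [
    "great movie => positive",
    "dull plot => negative",
    "loved it => positive",
    "waste of time => negative",
    "solid acting => positive",
]
groups = chunk(examples, group_size=2)       # each group fits the context limit
cache = [encode_group(g) for g in groups]    # computed once, then reused
query, _ = toy_key_value("wonderful")        # stand-in for the user's prompt
context = attend(query, cache)
print(len(groups), sum(len(g) for g in cache), context)
```

The point of the design is that no single forward pass ever sees more than one group, so the number of examples is bounded by how many cached key/value pairs the query can attend to rather than by the model's input length.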

The toolkit is currently under active development, with more prompt optimization features to be added.


Khushboo Gupta is a consulting intern at MarktechPost. She is currently pursuing her B.Tech at the Indian Institute of Technology (IIT), Goa. She is passionate about machine learning, natural language processing, and web development, and enjoys learning more about the technical field by participating in challenges.