Researchers Explore Foundation Models For Generalist Medical Artificial Intelligence


Foundation models are trained on large, varied datasets and can then be applied to a wide variety of downstream tasks. Individual models can now achieve state-of-the-art performance on tasks ranging from answering textual questions to describing images and playing games. Growing datasets, larger models, and improved model architectures have given rise to new possibilities for foundation models.

Because medicine is complex, large and diverse medical datasets are difficult to collect, and these advances are recent, such models have not yet been widely adopted in medical AI. Most medical AI models are still built with a task-specific approach. For example, to train a model that analyzes chest X-rays to detect pneumonia, the images must be manually labeled. When that model detects pneumonia, a human must still write the radiology report. This narrowly focused, label-driven methodology produces inflexible models that can only perform the tasks defined by their training dataset. To adapt to a new task, or to a new data distribution for the same task, such models often require retraining on a new dataset.

Advances such as multimodal architectures, self-supervised learning techniques, and in-context learning capabilities have made possible a new class of sophisticated medical foundation models called generalist medical AI (GMAI). The “generalist” label signals that these models are expected to replace many of the more specialized models built for specific medical tasks.

Researchers from Stanford University, Harvard University, University of Toronto, Yale University School of Medicine, and Scripps Research Translational Institute identify three essential qualities that set GMAI models apart from traditional medical AI models. 

  1. A GMAI model can be adapted to a new task simply by describing the task in plain English (or another language). Models can address previously unseen problems once the task is explained to them (dynamic task specification), without requiring retraining.
  2. GMAI models can take in data from various sources and generate results in various formats. Together with dynamically specified tasks, this support for varying combinations of data modalities lets GMAI models engage with users in many different ways. Compared with existing medical AI models, GMAI models have the potential to tackle a wider variety of tasks with few or no labels.
  3. GMAI models will explicitly represent medical domain knowledge and use it for sophisticated medical reasoning, enabling them to reason through novel challenges and communicate their results in terms medical professionals understand.

GMAI offers remarkable adaptability across tasks and settings by allowing users to interact with models through custom queries, making AI insights accessible to a wider range of users. For example, a user might ask: “Explain the mass appearing on this head MRI scan. Is it more likely to be a tumor or an abscess?”

User-defined queries make two crucial capabilities possible: dynamic task specification and multimodal inputs and outputs.

  1. Dynamic task specification: Custom queries can teach a model to address new challenges on the fly, without retraining it. When asked, “Given this ultrasound, how thick is the gallbladder wall in millimeters?” a GMAI model can answer even though it has never seen that question before. Thanks to in-context learning, a GMAI model can be taught a new concept from just a few examples.
  2. Multimodal inputs and outputs: Custom queries let users combine modalities arbitrarily to pose complex medical questions. When asking for a diagnosis, a doctor can attach several images and lab reports to the query. If the user requests a textual response with an accompanying visualization, a GMAI model can accommodate both requests.
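
To make the two capabilities above more concrete, here is a minimal Python sketch of how a custom query might be represented and turned into a prompt. It is an illustration only and does not describe any real GMAI system: the names `Attachment`, `CustomQuery`, and `build_prompt`, the modality labels, the file name, and the placeholder example answer are all assumptions introduced here. The few-shot examples stand in for in-context learning, and no model is actually called.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Attachment:
    """A piece of non-text input (hypothetical structure)."""
    modality: str   # e.g. "image" or "lab_report"; labels are assumptions
    reference: str  # a file name or record ID; the data itself is not loaded


@dataclass
class CustomQuery:
    """A user-defined query: the task is stated in plain language."""
    instruction: str
    attachments: List[Attachment] = field(default_factory=list)
    # (question, answer) pairs the model could learn from in context
    examples: List[Tuple[str, str]] = field(default_factory=list)
    output_modalities: List[str] = field(default_factory=lambda: ["text"])


def build_prompt(query: CustomQuery) -> str:
    """Assemble examples, attachments, and the instruction into one prompt."""
    lines = []
    for question, answer in query.examples:
        lines.append(f"Example question: {question}")
        lines.append(f"Example answer: {answer}")
    for index, attachment in enumerate(query.attachments, start=1):
        lines.append(
            f"[attachment {index}: {attachment.modality} -> {attachment.reference}]"
        )
    lines.append(f"Requested outputs: {', '.join(query.output_modalities)}")
    lines.append(f"Question: {query.instruction}")
    return "\n".join(lines)


query = CustomQuery(
    instruction="Given this ultrasound, how thick is the gallbladder wall in millimeters?",
    attachments=[Attachment("image", "ultrasound_001.png")],  # hypothetical file
    examples=[
        # Placeholder example; a clinician would supply a real worked answer.
        ("Given this ultrasound, how long is the gallbladder in millimeters?",
         "<answer supplied by the clinician>"),
    ],
    output_modalities=["text", "visualization"],
)
prompt = build_prompt(query)
print(prompt)
```

The point of the sketch is that the task lives in the query rather than in the model's training labels: changing `instruction`, `attachments`, or `output_modalities` specifies a different task without touching any model weights.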

Some of GMAI’s use cases are mentioned below:

  • Credible radiological findings: GMAI paves the way for a new class of flexible digital radiology assistants that may aid radiologists at any stage of their workflows and significantly lessen their workloads. GMAI models can automatically draft radiology reports that include both abnormal findings and pertinent normal findings and that take the patient’s history into account. When combined with text reports, interactive visualizations from these models can greatly help doctors by, for example, highlighting the region described by each phrase.
  • Enhanced surgical methods: With a GMAI model, surgical teams are expected to perform procedures more easily. GMAI models might carry out visualization tasks, such as annotating live video feeds of an operation. They may also convey information verbally, sounding an alert when a step of the procedure is skipped or reading pertinent literature aloud when surgeons encounter unusual anatomical findings.
  • Help to make tough calls right at the bedside. More in-depth explanations and recommendations for future care are made possible by GMAI-enabled bedside clinical decision support tools, which build on existing AI-based early warning systems.
  • Making proteins from text: GMAI could generate protein amino acid sequences and their three-dimensional structures from textual prompts. Like existing generative models of protein sequences, such a model might be conditioned to produce sequences with desirable functional properties.
  • Collaborative note-taking. GMAI models will automatically draft documents like electronic notes and discharge reports; physicians will only need to examine, update, and approve them.
  • Medical chatbots. New patient assistance apps could be powered by GMAI, allowing for high-quality care to be provided even outside of clinical settings.


Tanushree Shenwai is a consulting intern at MarktechPost. She is currently pursuing her B.Tech at the Indian Institute of Technology (IIT), Bhubaneswar. She is a data science enthusiast with a keen interest in the applications of artificial intelligence in various fields. She is passionate about exploring new advancements in technology and their real-life applications.