{"id":22120,"date":"2023-05-20T12:18:03","date_gmt":"2023-05-20T11:18:03","guid":{"rendered":"https:\/\/www.marktechpost.com\/?p=36381"},"modified":"2023-05-20T12:18:03","modified_gmt":"2023-05-20T11:18:03","slug":"salesforce-ai-introduces-codet5-a-new-family-of-open-code-large-language-models-with-an-encoder-decoder-architecture","status":"publish","type":"post","link":"https:\/\/healthmedicinet.com\/business\/salesforce-ai-introduces-codet5-a-new-family-of-open-code-large-language-models-with-an-encoder-decoder-architecture\/","title":{"rendered":"Salesforce AI Introduces CodeT5+: A New Family of Open Code Large Language Models with an Encoder-Decoder Architecture"},"content":{"rendered":"<p><img width=\"696\" height=\"390\" src=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-20-at-4.46.14-PM-1024x574.png\" class=\"attachment-large size-large wp-post-image\" alt=\"\" decoding=\"async\" loading=\"lazy\" style=\"float:left; margin:0 15px 15px 0;\" srcset=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-20-at-4.46.14-PM-1024x574.png 1024w, https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-20-at-4.46.14-PM-300x168.png 300w, https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-20-at-4.46.14-PM-768x430.png 768w, https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-20-at-4.46.14-PM-150x84.png 150w, https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-20-at-4.46.14-PM-696x390.png 696w, https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-20-at-4.46.14-PM-1068x598.png 1068w, https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-20-at-4.46.14-PM-750x420.png 750w, https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-20-at-4.46.14-PM.png 1446w\" sizes=\"auto, (max-width: 696px) 100vw, 696px\" 
data-attachment-id=\"36385\" data-permalink=\"https:\/\/www.marktechpost.com\/2023\/05\/20\/salesforce-ai-introduces-codet5-a-new-family-of-open-code-large-language-models-with-an-encoder-decoder-architecture\/screenshot-2023-05-20-at-4-46-14-pm\/\" data-orig-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-20-at-4.46.14-PM.png\" data-orig-size=\"1446,810\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Screenshot 2023-05-20 at 4.46.14 PM\" data-image-description=\"\" data-image-caption=\"&lt;p&gt;https:\/\/arxiv.org\/abs\/2305.07922&lt;\/p&gt;n\" data-medium-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-20-at-4.46.14-PM-300x168.png\" data-large-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-20-at-4.46.14-PM-1024x574.png\" \/><img width=\"150\" height=\"150\" src=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-20-at-4.46.14-PM-150x150.png\" class=\"attachment-thumbnail size-thumbnail wp-post-image\" alt=\"\" decoding=\"async\" loading=\"lazy\" srcset=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-20-at-4.46.14-PM-150x150.png 150w, https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-20-at-4.46.14-PM-80x80.png 80w, https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-20-at-4.46.14-PM-70x70.png 70w, https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-20-at-4.46.14-PM-24x24.png 24w, 
https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-20-at-4.46.14-PM-48x48.png 48w, https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-20-at-4.46.14-PM-96x96.png 96w, https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-20-at-4.46.14-PM-300x300.png 300w\" sizes=\"auto, (max-width: 150px) 100vw, 150px\" data-attachment-id=\"36385\" data-permalink=\"https:\/\/www.marktechpost.com\/2023\/05\/20\/salesforce-ai-introduces-codet5-a-new-family-of-open-code-large-language-models-with-an-encoder-decoder-architecture\/screenshot-2023-05-20-at-4-46-14-pm\/\" data-orig-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-20-at-4.46.14-PM.png\" data-orig-size=\"1446,810\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Screenshot 2023-05-20 at 4.46.14 PM\" data-image-description=\"\" data-image-caption=\"&lt;p&gt;https:\/\/arxiv.org\/abs\/2305.07922&lt;\/p&gt;n\" data-medium-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-20-at-4.46.14-PM-300x168.png\" data-large-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-20-at-4.46.14-PM-1024x574.png\" \/><\/p>\n<p>Modern large language models (LLMs) perform well on code understanding and generation tasks, allowing more people to enter the once-mysterious field of computer programming. Architecturally, existing code LLMs are mostly encoder-only or decoder-only models, which excel at only some understanding or generation tasks. 
Code-focused LLMs also typically have a limited set of pretraining objectives, which degrades performance on downstream tasks that are less relevant to those objectives, and their encoder-only or decoder-only architecture can restrict strong performance to specific tasks.<\/p>\n<p>The AI Research team at Salesforce presents CodeT5+, a family of encoder-decoder code foundation LLMs that can be easily customized to perform well on a wide range of code understanding and generation tasks. To do this, the team trains CodeT5+ with a diverse mixture of pretraining objectives on unimodal and bimodal data, yielding a code LLM that can be easily adapted to various downstream tasks.<\/p>\n<p><strong>What is CodeT5+?<\/strong><\/p>\n<p>CodeT5+ is a set of large-scale language models for understanding and generating code. The framework incorporates a wide range of unimodal and bimodal pretraining objectives. CodeT5+&#8217;s modules can be separated and recombined flexibly to meet the needs of a wide variety of zero-shot, finetuning, and instruction-tuning applications.<\/p>\n<p>The encoder learns to encode contextual representations from code\/text sequences (entire, partial, or span-masked sequences), while the decoder is trained to generate different types of outputs depending on the pretraining task.<\/p>\n<ul>\n<li>CodeT5+ is first pretrained on large-scale unimodal code data from public platforms like GitHub. To teach the model how to recover code contexts in code spans, partial programs, and entire programs, this stage employs a mixture of objectives, including span denoising, decoder-only causal LM, and seq2seq causal LM tasks.<\/li>\n<\/ul>\n<ul>\n<li>The second stage of pretraining uses text-code bimodal data, that is, pairs of text and code in which the text describes the semantics of a code function. 
To enhance its cross-modal understanding and generation capabilities, CodeT5+ is pretrained in this stage on cross-modal contrastive learning, matching, and causal LM tasks.<\/li>\n<\/ul>\n<p>Thanks to this two-stage pretraining procedure, CodeT5+ can adapt to different kinds of tasks, including seq2seq generation tasks, decoder-only tasks, and understanding-based tasks.<\/p>\n<p>In their empirical investigation, the team evaluated CodeT5+ on 20 benchmark datasets against state-of-the-art code LLMs, including LaMDA, GPT, StarCoder, etc., in zero-shot, finetuning, and instruction-tuning settings. On the zero-shot HumanEval code generation task, CodeT5+ achieved state-of-the-art (SOTA) results, even when compared against OpenAI&#8217;s strong code-cushman-001 model.<\/p>\n<p><strong>To sum it up<\/strong><\/p>\n<p>CodeT5+ is a new family of open-source large language models with an encoder-decoder architecture that can operate in several modes (encoder-only, decoder-only, and encoder-decoder) to serve a variety of code understanding and generation tasks. CodeT5+ is trained using a variety of pretraining tasks, including span denoising, causal language modeling, contrastive learning, and text-code matching, to acquire a comprehensive understanding of both unimodal and bimodal code-text data.<\/p>\n<p>This work indicates that the proposed CodeT5+ open code LLMs can support a wide range of downstream code tasks, and even reach SOTA performance, by operating flexibly in encoder-only, decoder-only, and encoder-decoder modes. 
The team is open-sourcing all CodeT5+ models to encourage further study, and they believe CodeT5+ can be deployed as a unified retrieval-augmented generation system.<\/p>\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n","protected":false},"excerpt":{"rendered":"<p>Modern large language models (LLMs) have excellent performance on code reading and generation tasks, allowing more people to enter the once-mysterious field of computer programming. Architecturally, existing code LLMs use encoder- or decoder-only models, which excel at just some comprehension and generating tasks. Code-focused LLMs typically have a limited set of pretraining objectives, which [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-22120","post","type-post","status-publish","format-standard","hentry","category-news"],"_links":{"self":[{"href":"https:\/\/healthmedicinet.com\/business\/wp-json\/wp\/v2\/posts\/22120","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/healthmedicinet.com\/business\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/healthmedicinet.com\/business\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/healthmedicinet.com\/business\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/healthmedicinet.com\/business\/wp-json\/wp\/v2\/comments?post=22120"}],"version-history":[{"count":0,"href":"https:\/\/healthmedicinet.com\/business\/wp-json\/wp\/v2\/posts\/22120\/revisions"}],"wp:attachment":[{"href":"https:\/\/healthmedicinet.com\/business\/wp-json\/wp\/v2\/media?parent=22120"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/healthmedicinet.com\/business\/wp-json\/wp\/v2\/categories?post=22120"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/h
ealthmedicinet.com\/business\/wp-json\/wp\/v2\/tags?post=22120"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}