{"id":22143,"date":"2023-05-22T09:44:41","date_gmt":"2023-05-22T08:44:41","guid":{"rendered":"https:\/\/www.marktechpost.com\/?p=36483"},"modified":"2023-05-22T18:19:32","modified_gmt":"2023-05-22T17:19:32","slug":"a-new-ai-model-can-segment-anything-in-three-dimensions-when-sam-meets-nerf","status":"publish","type":"post","link":"https:\/\/healthmedicinet.com\/business\/a-new-ai-model-can-segment-anything-in-three-dimensions-when-sam-meets-nerf\/","title":{"rendered":"A new\u00a0AI model can segment anything in three dimensions when SAM meets NeRF."},"content":{"rendered":"<p><img loading=\"lazy\" decoding=\"async\" class=\"attachment-large size-large wp-post-image\" style=\"float: left; margin: 0 15px 15px 0;\" src=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-22-at-2.10.54-PM-1024x670.png\" sizes=\"auto, (max-width: 696px) 100vw, 696px\" srcset=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-22-at-2.10.54-PM-1024x670.png 1024w, https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-22-at-2.10.54-PM-300x196.png 300w, https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-22-at-2.10.54-PM-768x503.png 768w, https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-22-at-2.10.54-PM-150x98.png 150w, https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-22-at-2.10.54-PM-696x456.png 696w, https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-22-at-2.10.54-PM-1068x699.png 1068w, https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-22-at-2.10.54-PM-642x420.png 642w, https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-22-at-2.10.54-PM-741x486.png 741w, https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-22-at-2.10.54-PM.png 1210w\" alt=\"\" width=\"696\" height=\"455\" 
data-attachment-id=\"36488\" data-permalink=\"https:\/\/www.marktechpost.com\/2023\/05\/22\/when-sam-meets-nerf-this-ai-model-can-segment-anything-in-3d\/screenshot-2023-05-22-at-2-10-54-pm\/\" data-orig-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-22-at-2.10.54-PM.png\" data-orig-size=\"1210,792\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Screenshot 2023-05-22 at 2.10.54 PM\" data-image-description=\"\" data-image-caption=\"&lt;p&gt;https:\/\/arxiv.org\/abs\/2304.12308&lt;\/p&gt;n\" data-medium-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-22-at-2.10.54-PM-300x196.png\" data-large-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-22-at-2.10.54-PM-1024x670.png\" \/><img loading=\"lazy\" decoding=\"async\" class=\"attachment-thumbnail size-thumbnail wp-post-image\" src=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-22-at-2.10.54-PM-150x150.png\" sizes=\"auto, (max-width: 150px) 100vw, 150px\" srcset=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-22-at-2.10.54-PM-150x150.png 150w, https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-22-at-2.10.54-PM-80x80.png 80w, https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-22-at-2.10.54-PM-70x70.png 70w, https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-22-at-2.10.54-PM-24x24.png 24w, 
https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-22-at-2.10.54-PM-48x48.png 48w, https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-22-at-2.10.54-PM-96x96.png 96w, https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-22-at-2.10.54-PM-300x300.png 300w\" alt=\"\" width=\"150\" height=\"150\" data-attachment-id=\"36488\" data-permalink=\"https:\/\/www.marktechpost.com\/2023\/05\/22\/when-sam-meets-nerf-this-ai-model-can-segment-anything-in-3d\/screenshot-2023-05-22-at-2-10-54-pm\/\" data-orig-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-22-at-2.10.54-PM.png\" data-orig-size=\"1210,792\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Screenshot 2023-05-22 at 2.10.54 PM\" data-image-description=\"\" data-image-caption=\"&lt;p&gt;https:\/\/arxiv.org\/abs\/2304.12308&lt;\/p&gt;n\" data-medium-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-22-at-2.10.54-PM-300x196.png\" data-large-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-22-at-2.10.54-PM-1024x670.png\" \/><\/p>\n<p>We are all amazed by the generative AI advancements recently, but that does not mean we do not get any significant breakthroughs in other applications. For example, the computer vision domain has been seeing relatively rapid advancements recently as well. 
The <a href=\"https:\/\/www.marktechpost.com\/2023\/04\/09\/meta-ai-releases-the-segment-anything-model-sam-a-new-ai-model-that-can-cut-out-any-object-in-a-image-video-with-a-single-click\/\">Segment Anything Model (SAM)<\/a> release by Meta was a huge success and changed the game in 2D image segmentation entirely.<\/p>\n<p>In image segmentation, the goal is to detect and, in effect, \u201cpaint\u201d every object in the scene. Usually, this is done by training a model on a dataset of the objects we want to segment. Then, we can use the model to segment those same objects in new images. However, the main problem is that the model is limited to the objects it saw during training; it cannot segment unseen objects.<\/p>\n<p>&nbsp;<\/p>\n<p>SAM changes this. SAM is the first model that can segment <strong><em>anything<\/em><\/strong>, literally. This is achieved by training SAM on large-scale data and giving it the ability to perform zero-shot segmentation across various styles of image data. It is designed to automatically segment objects of interest in images, regardless of their shape, size, or appearance. SAM has demonstrated remarkable performance in segmenting objects in 2D images, revolutionizing the field of computer vision.<\/p>\n<p>&nbsp;<\/p>\n<p>Of course, people did not simply stop there. They started working on ways to extend SAM\u2019s capabilities beyond 2D. However, a key question has remained unanswered: Can SAM\u2019s segmentation ability be extended to 3D, thereby bridging the gap between 2D and 3D perception caused by data scarcity? The answer appears to be yes, and it is time to meet <strong>SA3D<\/strong>.<\/p>\n<p>&nbsp;<\/p>\n<p><strong>SA3D <\/strong>leverages advancements in Neural Radiance Fields (NeRF) and SAM to revolutionize 3D segmentation. NeRF has emerged as one of the most popular 3D representations in recent years. 
NeRF builds connections between sparse 2D images and real 3D points through differentiable volume rendering. It has seen numerous improvements, making it a powerful tool for tackling the challenges of 3D perception.<\/p>\n<p>&nbsp;<\/p>\n<p>There have been some attempts to extend NeRF-based techniques to 3D segmentation. These approaches train an additional feature field aligned with a pre-trained 2D visual backbone. While effective, these methods suffer from limitations such as a high memory footprint, artifacts in the radiance field that carry over into the feature field, and inefficiency, since an additional feature field must be trained for every scene.<\/p>\n<p>&nbsp;<\/p>\n<p>This is where <strong>SA3D <\/strong>comes into play. Unlike previous methods, <strong>SA3D <\/strong>does not require training an additional feature field. Instead, it leverages the power of SAM and NeRF to segment desired objects from all views automatically.<\/p>\n<p>&nbsp;<\/p>\n<div class=\"wp-block-image\">\n<p>&nbsp;<\/p>\n<figure class=\"aligncenter size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-36485\" src=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/image-32-1024x404.png\" alt=\"\" width=\"897\" height=\"372\" data-attachment-id=\"36485\" data-permalink=\"https:\/\/www.marktechpost.com\/2023\/05\/22\/when-sam-meets-nerf-this-ai-model-can-segment-anything-in-3d\/image-32-7\/\" data-orig-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/image-32.png\" data-orig-size=\"1566,618\" data-comments-opened=\"1\" 
data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"image-32\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/image-32-300x118.png\" data-large-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/image-32-1024x404.png\" \/><figcaption class=\"wp-element-caption\"><em>Overview of SA3D. Source: <\/em><a href=\"https:\/\/arxiv.org\/abs\/2304.12308\"><em>https:\/\/arxiv.org\/abs\/2304.12308<\/em><\/a><\/figcaption><\/figure>\n<\/div>\n<p>&nbsp;<\/p>\n<p><strong>SA3D <\/strong>works by taking user-specified prompts from a single rendered view to initiate the segmentation process. The segmentation maps generated by SAM are then projected onto 3D mask grids using density-guided inverse rendering, providing initial 3D segmentation results. To refine the segmentation, incomplete 2D masks from other views are rendered and used as cross-view self-prompts. These masks are fed into SAM to generate refined masks, which are then projected onto the 3D mask grids. 
This iterative process allows for the generation of complete 3D segmentation results.<\/p>\n<p>&nbsp;<\/p>\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-36484\" src=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/image-31-1024x547.png\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" srcset=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/image-31-1024x547.png 1024w, https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/image-31-300x160.png 300w, https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/image-31-768x410.png 768w, https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/image-31-1536x821.png 1536w, https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/image-31-150x80.png 150w, https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/image-31-696x372.png 696w, https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/image-31-1068x571.png 1068w, https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/image-31-786x420.png 786w, https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/image-31.png 1600w\" alt=\"\" width=\"1024\" height=\"547\" data-attachment-id=\"36484\" data-permalink=\"https:\/\/www.marktechpost.com\/2023\/05\/22\/when-sam-meets-nerf-this-ai-model-can-segment-anything-in-3d\/image-31-5\/\" data-orig-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/image-31.png\" data-orig-size=\"1600,855\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"image-31\" 
data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/image-31-300x160.png\" data-large-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2023\/05\/image-31-1024x547.png\" \/><\/figure>\n<p>&nbsp;<\/p>\n<p><em>Overview of how SA3D works. Source: <\/em><a href=\"https:\/\/arxiv.org\/abs\/2304.12308\"><em>https:\/\/arxiv.org\/abs\/2304.12308<\/em><\/a><\/p>\n<p>&nbsp;<\/p>\n<p><strong>SA3D <\/strong>offers several advantages over previous approaches. It can easily adapt to any pre-trained NeRF model without the need for changes or re-training, making it highly compatible and adaptable. The entire segmentation process with <strong>SA3D <\/strong>is efficient, taking approximately two minutes without requiring engineering optimization. This speed makes <strong>SA3D <\/strong>a practical solution for real-world applications. Moreover, experimental results have demonstrated that <strong>SA3D <\/strong>can generate fine-grained segmentation results for various types of 3D objects, opening up new possibilities for applications such as robotics, augmented reality, and virtual reality.<\/p>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>We are all amazed by the generative AI advancements recently, but that does not mean we do not get any significant breakthroughs in other applications. For example, the computer vision domain has been seeing relatively rapid advancements recently as well. 
The Segment Anything Model (SAM) release by Meta was a huge success and changed the [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-22143","post","type-post","status-publish","format-standard","hentry","category-news"],"_links":{"self":[{"href":"https:\/\/healthmedicinet.com\/business\/wp-json\/wp\/v2\/posts\/22143","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/healthmedicinet.com\/business\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/healthmedicinet.com\/business\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/healthmedicinet.com\/business\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/healthmedicinet.com\/business\/wp-json\/wp\/v2\/comments?post=22143"}],"version-history":[{"count":0,"href":"https:\/\/healthmedicinet.com\/business\/wp-json\/wp\/v2\/posts\/22143\/revisions"}],"wp:attachment":[{"href":"https:\/\/healthmedicinet.com\/business\/wp-json\/wp\/v2\/media?parent=22143"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/healthmedicinet.com\/business\/wp-json\/wp\/v2\/categories?post=22143"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/healthmedicinet.com\/business\/wp-json\/wp\/v2\/tags?post=22143"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}