{"id":25788,"date":"2024-08-11T15:03:20","date_gmt":"2024-08-11T14:03:20","guid":{"rendered":"https:\/\/healthmedicinet.com\/business\/metadata-management-tools-and-the-data-platform-shift-business\/"},"modified":"2024-08-11T15:03:20","modified_gmt":"2024-08-11T14:03:20","slug":"metadata-management-tools-and-the-data-platform-shift-business","status":"publish","type":"post","link":"https:\/\/healthmedicinet.com\/business\/metadata-management-tools-and-the-data-platform-shift-business\/","title":{"rendered":"Metadata management tools and the data platform shift &#8211; Business"},"content":{"rendered":"<p>\n<br \/><img decoding=\"async\" src=\"https:\/\/d15shllkswkct0.cloudfront.net\/wp-content\/blogs.dir\/1\/files\/2024\/07\/Bob-Muglia-entrepreneur-and-builder-Supercloud-7.jpg\" \/><\/p>\n<div>\n<p>The source of truth is changing amid the data platform shift. It\u2019s a rapidly evolving situation, as companies must consider open table formats and metadata management tools.<\/p>\n<p>The open table format landscape includes Delta Lake, Iceberg and Apache Hudi. Much has happened in recent months, including <a href=\"https:\/\/www.databricks.com\/blog\/databricks-tabular\">Databricks Inc. purchasing Tabular Inc.<\/a>, according to <a href=\"https:\/\/www.linkedin.com\/in\/bob-muglia\/\">Bob Muglia<\/a> (pictured), entrepreneur and builder.<\/p>\n<p>\u201cIn a way, right now, Databricks has controlling capability of both the Iceberg and the Delta formats, but this is important to all the other vendors, and we\u2019ll just watch what happens over the coming months,\u201d Muglia said. \u201cI do think that we\u2019ll continue to see coexistence of these two things. 
In the last year, fortunately, there have been tools that have been developed to allow for both to be used simultaneously.\u201d<\/p>\n<p>Muglia spoke with theCUBE Research\u2019s <a href=\"https:\/\/www.linkedin.com\/in\/george-gilbert-tech-version\/\">George Gilbert<\/a> at the <a href=\"https:\/\/events.cube365.net\/supercloud\/supercloud-7\">Supercloud 7: Get Ready for the Next Data Platform<\/a> event, during an exclusive broadcast on theCUBE, SiliconANGLE Media\u2019s livestreaming studio. They discussed the evolution of data platform standards and the importance of metadata management tools.<\/p>\n<h3>Metadata management tools and universal open-source capabilities<\/h3>\n<p>Metadata management tools exist today, including <a href=\"https:\/\/xtable.apache.org\/\">XTable<\/a>, which copies metadata between table formats. Fortunately, the table formats all use the same on-disk format for data, according to Muglia.<\/p>\n<p>\u201cIt\u2019s really just the metadata we\u2019re talking about. But I do think we\u2019ll see those things converging, and I expect to see an open-source capability coming out, an open-source environment coming out that will be adopted pretty much universally across the vendors,\u201d he said. \u201cThat\u2019s what I hope to see anyway.\u201d<\/p>\n<p>The second development is that catalogs are being built on top of open data lake formats, and together a catalog and its underlying data format form one\u2019s source of truth, according to Muglia. These catalogs are still being developed, and they\u2019re not very compatible with each other.<\/p>\n<p>\u201cOnce again, my guess is that\u2019s just early stages of things, and we\u2019ll start to see something emerging that could be compatible and used across multiple vendors, but that\u2019s certainly not where we are at the moment,\u201d he said. 
\u201cWe\u2019re early stages of this transition from where we have proprietary formats to an open format, but the industry hasn\u2019t quite settled on it yet.\u201d<\/p>\n<p>It\u2019s clear that the source of truth isn\u2019t just the data. Metadata has to start with the technical and operational data because the data warehouses and tools that run in the data environment have to be able to work with the data in a cohesive and secure way, according to Muglia.<\/p>\n<p>\u201cI think over time it\u2019ll include the higher levels of semantics as well. This is one of those open questions. Nobody really knows how that\u2019s going to develop,\u201d he said. \u201cAs you go up the stack and try to do more and more, you may want to have more and more capabilities, which could be an opportunity for vendor differentiation as well. So we\u2019ll see.\u201d<\/p>\n<h3>The challenge of unifying technical, operational and semantic data<\/h3>\n<p>It all poses a question: Is there a way to separate the technical metadata from the operational metadata and from the richer semantics? Or, if one wants a coherent source of data, do they all need to have one underlying unifying owner?<\/p>\n<p>\u201cI don\u2019t think you need one engineer for it,\u201d Muglia said. \u201cI think you need to have a way of accessing the data coherently across multiple engines, potentially.\u201d<\/p>\n<p>For instance, knowledge graph database processors would want to work with the same information a SQL database works with, according to Muglia. That means some of the same metadata is required.<\/p>\n<p>\u201cBut then there\u2019s a lot more information that one could put in the higher level semantic layer. And in fact, if you look at that, there\u2019s a lot of operations that you want to perform on that data,\u201d he said. 
\u201cThey\u2019re graphs, and they\u2019re complicated graphs, and there are relational operators that can be applied across the graphs.\u201d<\/p>\n<p>Today\u2019s databases and catalogs don\u2019t do that. But change is happening fast.<\/p>\n<p>\u201cYou need something different, which I believe is a relational knowledge graph, which we\u2019re starting to see emerge now,\u201d Muglia said.<\/p>\n<p>Ultimately, companies will need to have a vision across all of their underlying metadata to get a consistent source of truth, according to Muglia. These changes are still far off.<\/p>\n<p>\u201cWe\u2019re really just beginning to see the emergence of this metadata in this semantic layer as a real thing,\u201d he said.<\/p>\n<p>Here\u2019s the complete video interview, part of SiliconANGLE\u2019s and theCUBE Research\u2019s coverage of the <a href=\"https:\/\/events.cube365.net\/supercloud\/supercloud-7\">Supercloud 7: Get Ready for the Next Data Platform<\/a> event:<\/p>\n<p><iframe loading=\"lazy\" title=\"Bob Muglia | Supercloud 7\" width=\"500\" height=\"281\" src=\"https:\/\/www.youtube.com\/embed\/gjX2Yky-YXQ?feature=oembed\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen><\/iframe><\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>The source of truth is changing amid the data platform shift. 
It\u2019s a rapidly evolving situation, as companies must consider open table formats and metadata management tools. The open table format landscape includes Delta Lake, Iceberg and Apache Hudi. Much has happened in recent months, including Databricks Inc. purchasing Tabular Inc., according to Bob [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-25788","post","type-post","status-publish","format-standard","hentry","category-news"],"_links":{"self":[{"href":"https:\/\/healthmedicinet.com\/business\/wp-json\/wp\/v2\/posts\/25788","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/healthmedicinet.com\/business\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/healthmedicinet.com\/business\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/healthmedicinet.com\/business\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/healthmedicinet.com\/business\/wp-json\/wp\/v2\/comments?post=25788"}],"version-history":[{"count":0,"href":"https:\/\/healthmedicinet.com\/business\/wp-json\/wp\/v2\/posts\/25788\/revisions"}],"wp:attachment":[{"href":"https:\/\/healthmedicinet.com\/business\/wp-json\/wp\/v2\/media?parent=25788"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/healthmedicinet.com\/business\/wp-json\/wp\/v2\/categories?post=25788"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/healthmedicinet.com\/business\/wp-json\/wp\/v2\/tags?post=25788"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}