Neo4j CTO says new Graph Query Language standard will have ‘massive ripple effects’


The International Organization for Standardization’s publication of a standard for the Graph Query Language earlier this month generated relatively little media interest, but executives at graph database makers were turning virtual cartwheels in the halls.

GQL is the first new database query language to be standardized by the ISO since Structured Query Language in 1987. The 600-page document defines the rules for “creating, accessing, querying, maintaining, and controlling property graphs and the data they comprise.”

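A property graph is a type of data structure used primarily in graph databases and graph processing frameworks. It consists of nodes, which represent entities, and edges, which represent the relationships between them. Both nodes and edges can carry properties in the form of key-value pairs, which makes it possible to model relationships that can’t easily be expressed in conventional relational tables.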
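For readers who want something concrete, a property graph can be sketched in a few lines of Python. This is only an illustration of the data structure, not how Neo4j or any other product stores data. The class names, the `PLACED` relationship and the sample customer and order are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    labels: set = field(default_factory=set)        # e.g. {"Customer"}
    properties: dict = field(default_factory=dict)  # key-value pairs


@dataclass
class Edge:
    source: str      # node_id of the start node
    target: str      # node_id of the end node
    rel_type: str    # relationship type, e.g. "PLACED"
    properties: dict = field(default_factory=dict)


@dataclass
class PropertyGraph:
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add_node(self, node_id, labels=(), **properties):
        self.nodes[node_id] = Node(node_id, set(labels), dict(properties))

    def add_edge(self, source, target, rel_type, **properties):
        # Both endpoints must already exist in the graph.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be added before the edge")
        self.edges.append(Edge(source, target, rel_type, dict(properties)))

    def neighbors(self, node_id, rel_type=None):
        # Follow outgoing edges, optionally filtered by relationship type.
        return [
            e.target
            for e in self.edges
            if e.source == node_id and (rel_type is None or e.rel_type == rel_type)
        ]


# Hypothetical sample data: one customer who placed one order.
graph = PropertyGraph()
graph.add_node("alice", labels=["Customer"], name="Alice")
graph.add_node("order-1", labels=["Order"], status="shipped")
graph.add_edge("alice", "order-1", "PLACED", channel="web")

print(graph.neighbors("alice", "PLACED"))
```

Note that the relationship itself carries data (here, a made-up `channel` property), which a relational design would typically push into a separate join table.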

Property graphs are widely regarded as a better way than relational tables to represent relationships between data elements in a loosely organized structure such as a data lake. Although SQL will continue to be king of the hill for querying structured data, GQL is expected to be a better fit for queries combining data of different types and from different sources.

“This will have a multigenerational impact on the database landscape,” said Philip Rathle (pictured), chief technology officer at graph database market leader Neo4j Inc. “A lot of the people who are going to be impacted by this over the months and years to come don’t know it yet, but it’s going to have massive ripple effects.”

The graph database industry has been hampered by the lack of a single standard. “A lot of organizations will either not touch or only minimally invest in a technology if there isn’t the weight of a formal standard,” Rathle said.

Kevin Bacon effect

Graph structures aren’t an alternative to relational tables but a completely different way of representing relationships between data elements. For some query types, they can be orders of magnitude faster and more productive.

Writing in VentureBeat, StepZen Inc. founder Anant Jhingran gave the example of a query to find the tracking number and expected delivery date for two different orders shipping from two different companies. GQL can return the result with a single query, whereas SQL would require at least three queries and a lot of manual sorting.

Rathle drew an analogy to the “Six Degrees of Kevin Bacon” game, in which the object is to start with a movie in which the “Footloose” star doesn’t appear and trace the relationships among its cast and crew until a link to Bacon is found. Following a chain of connections like that is the kind of query graph databases are designed to handle.
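The game amounts to a shortest-path search over a graph of films and people. A minimal sketch in Python, using breadth-first search, gives a feel for it. The film and actor names below are placeholders rather than real credits, and the function names are invented for this illustration. A graph database would run this kind of traversal natively rather than in application code.

```python
from collections import deque


def build_adjacency(casts):
    """Link each film to its people and each person to their films."""
    adjacency = {}
    for film, people in casts.items():
        for person in people:
            adjacency.setdefault(film, []).append(person)
            adjacency.setdefault(person, []).append(film)
    return adjacency


def shortest_path(adjacency, start, goal):
    """Breadth-first search; returns the shortest chain or None."""
    if start == goal:
        return [start]
    previous = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbor in adjacency.get(current, ()):
            if neighbor in previous:
                continue
            previous[neighbor] = current
            if neighbor == goal:
                # Walk the back-pointers to rebuild the chain.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None


# Placeholder data, not real filmographies.
casts = {
    "Film 1": ["Actor A", "Actor B"],
    "Film 2": ["Actor B", "Actor C"],
    "Film 3": ["Actor C", "Target Actor"],
}
adjacency = build_adjacency(casts)
path = shortest_path(adjacency, "Film 1", "Target Actor")
print(" -> ".join(path))
```

In a relational database, each hop in that chain would generally mean another self-join on a credits table, which is why the number of hops matters so much for performance.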

Rathle said the GQL and SQL standards were developed by the same ISO committee and are not intended to compete with each other. GQL doesn’t require a graph back-end and can work with document databases, key-value stores and even unstructured data.

“At the end of the day, these query languages are both about having a language to access data and a data model,” he said. “Any data you can put in a relational database, you can also put in a graph database. The relational model is just one way to choose how to structure the data, and graphs are an alternative.” SQL will continue to be the best option for querying structured data in tables.

One of the advantages of graph structures is that they don’t require a fixed schema, which is a blueprint that defines the organization and structure of a database. “I can bring data in as it exists in the real world without trying to put it into a box,” he said. “As parts of that structure become known, you can lock those down.”

How people think

Graph schemas also more closely resemble the way business people think about relationships between data elements, he said. “The data model is understandable by nondevelopers and nondatabase people, so that’s a pretty big deal in terms of agile development and business transformation, all of which need to move fast.”

The graph model is also relevant to generative artificial intelligence, which excels at finding relationships between data elements that aren’t explicitly defined. Vector databases are currently the most common way to do that, but Rathle said graphs have some advantages.

Relationships in a vector database “are determined by an algorithm which isn’t overseen by a human,” he said. “The concept comes from the proximity and frequency of words, but it’s opaque, so it doesn’t have clarity on what that concept actually is. A graph uses a structure that is curated and understood by humans and that already exists in most enterprises, such as a product catalog or an employee hierarchy.”

Large language models can use that explicit structure to understand the relationship between entities better and deliver more precise responses. “They’re both useful, but they operate in a very different way,” he said.

Rivals collaborate

Like most industry standards, GQL was the product of a collaboration among enterprises, academics and vendors, some of which are fierce rivals. However, they find common ground on this topic. Neo4j competitor TigerGraph Inc. posted a celebration on its blog late last week. Rathle and Brad Bebee, who leads product engineering for Amazon Web Services Inc.’s competing graph database Neptune, collaborated on a post on the AWS Database Blog Thursday.

“We’re collaborating from that standpoint, and from this point on, it’s on each of us to earn our business,” Rathle said.

GQL is expected to unite multiple versions of graph query languages that are already in the market. Neo4j’s Cypher is the most widely used. Others include TigerGraph’s GSQL, Oracle Corp.’s PGQL, the Linked Data Benchmark Council’s G-CORE and the open-source Gremlin. Most vendors are expected to support GQL.

For enterprises that have been hedging on graph database support, “it’s time to go all in,” Rathle said. “It’s a night-and-day difference when you can suddenly understand the context and causality of everything going on in your system through the power of connections.”

Photo: Neo4j
