I’ve seen a few talks and blog posts about GraphRAG and graph databases (mostly from companies building graph databases), and I’m curious to hear about your experiences with them in the context of agentic applications.
I get how they could model relationships between entities more naturally and help agents pull relevant context faster, but does it actually make a big difference in your project?
I’m considering trying it out, but I’m sceptical that the benefits are that much bigger than just sticking with good-old Postgres.
Where have you found graph databases really made a difference? And were there cases where you wouldn’t use them again?
I memorably had a job interview which consisted almost entirely of their senior architect going over exactly why he regretted introducing Neo4J several years earlier and how all the work is really about getting away from it. That was just the most extreme example.
The truth that people here don't like is that the Couch/Mongo style document DB is far more compelling as an intermediate point between structured and unstructured. There was even a MongoDB compatibility layer for FoundationDB, but it doesn't seem to be maintained, sadly. https://github.com/FoundationDB/fdb-document-layer
In my opinion graph DBs should only be used for highly structured data, which after all is what a graph is: in practice, anything you would represent in SQL but that requires too many joins for the queries you commonly have to run.
If you're just pulling up a tree of comments on an article using parent-child relations, SQL will be fine, though for query latency you might be better off with a "flat list" article-comment relation instead and recovering the tree structure after fetching the comments.
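To make the "flat list" alternative concrete, here's a minimal sketch in Python with sqlite3. The table and column names (comments, article_id, parent_id) are illustrative assumptions, not anyone's actual schema: one flat query keyed on the article, then the tree is rebuilt in application code.

```python
# Sketch: fetch an article's comments as one flat list, then recover
# the tree in memory. Schema names are invented for illustration.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE comments (id INTEGER, article_id INTEGER, parent_id INTEGER, body TEXT)"
)
conn.executemany(
    "INSERT INTO comments VALUES (?, ?, ?, ?)",
    [
        (1, 42, None, "root comment"),
        (2, 42, 1, "reply to root"),
        (3, 42, 1, "another reply"),
        (4, 42, 2, "nested reply"),
    ],
)

# One flat query keyed on the article; no recursion in the database.
rows = conn.execute(
    "SELECT id, parent_id, body FROM comments WHERE article_id = ?", (42,)
).fetchall()

# Recover the tree structure after fetching.
children = {}
for cid, parent, body in rows:
    children.setdefault(parent, []).append((cid, body))

def render(parent=None, depth=0):
    out = []
    for cid, body in children.get(parent, []):
        out.append("  " * depth + body)
        out.extend(render(cid, depth + 1))
    return out

tree = render()
```

The trade is one cheap indexed lookup plus O(n) in-memory work, instead of a recursive query the database has to plan.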
If you dive into the query plans of a graph DB you quickly see that there is nothing special about them. In the end it boils down to the same physical joins on node and edge tables. The only thing graph DBs offer over your typical RDBMS is nicer syntax (while having worse operational maturity), and with the advent of SQL/PGQ even that advantage is going away.
Even for the use cases graph DBs knock out of the park, Neo4j (historically; I haven't used it in about 10 years) didn't work very reliably compared to modern competitors.
But as always it's about picking the right tool for the job - I tried to build a "social network" in mysql and neo4j, and (reliability aside) neo4j worked way better.
Many will go "you need a proper ontology", at which point just use an RDBMS. Ontologies are an absolute tarpit, as the semantic web showed. The graph illusion is similar to that academic delusion "It is better to have 100 functions operate on one data structure than 10 functions on 10 data structures", which is one of those quips that makes you appreciate just how theoretical some theoreticians really are.
GraphQL implementations have consistently felt like hobby projects, where implementing it in an API becomes a thing to put on a resume rather than making a functionally useful API.
[1] https://graphql-docs-v2.opencollective.com/queries/account (navigate to the variables tab for a real mindfuck)
It depends on the shape of your data. In my domain (cloud security), there are many many entities and it's very valuable to map out how they relate to each other.
For example, we often want to answer a question like: “Which publicly exposed EC2 instances are used by IAM roles that have administrative privileges in my AWS account?”
To answer the question, you need to:

1. Join EC2 instances to security groups to IP rules to IP ranges to find network exposure paths to the open internet.

2. Join the instances to their instance profiles, then to their roles.

3. Join the IAM roles to their role policies to determine which have admin policies.

4. Chain all of those joins together, possibly with recursive queries if there are indirect relationships (e.g., role assumption chains).
That’s a lot of joins, and the SQL query would get both heavy and hard to maintain.
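To make that weight concrete, here's a rough sketch of the relational version, runnable against sqlite3. All table and column names here are invented stand-ins; a real cloud asset inventory would be much richer.

```python
# Sketch of the equivalent relational query. Schema is an illustrative
# simplification, not an actual AWS inventory model.
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE instances (id TEXT, sg_id TEXT, profile_id TEXT);
CREATE TABLE ip_rules (sg_id TEXT, cidr TEXT, action TEXT);
CREATE TABLE profiles (id TEXT, role_id TEXT);
CREATE TABLE roles (id TEXT, name TEXT);
CREATE TABLE role_policies (role_id TEXT, policy_id TEXT);
CREATE TABLE policies (id TEXT, effect TEXT, resource TEXT);

INSERT INTO instances VALUES ('i-1', 'sg-1', 'prof-1'), ('i-2', 'sg-2', 'prof-2');
INSERT INTO ip_rules VALUES ('sg-1', '0.0.0.0/0', 'Allow'), ('sg-2', '10.0.0.0/8', 'Allow');
INSERT INTO profiles VALUES ('prof-1', 'role-admin'), ('prof-2', 'role-ro');
INSERT INTO roles VALUES ('role-admin', 'AdminRole'), ('role-ro', 'ReadOnlyRole');
INSERT INTO role_policies VALUES ('role-admin', 'pol-1'), ('role-ro', 'pol-2');
INSERT INTO policies VALUES ('pol-1', 'Allow', '*'), ('pol-2', 'Allow', 's3:*');
""")

# Publicly exposed instances whose role carries an "Allow *" policy.
rows = db.execute("""
SELECT i.id, r.name
FROM instances i
JOIN ip_rules ir      ON ir.sg_id = i.sg_id
                     AND ir.action = 'Allow' AND ir.cidr = '0.0.0.0/0'
JOIN profiles p       ON p.id = i.profile_id
JOIN roles r          ON r.id = p.role_id
JOIN role_policies rp ON rp.role_id = r.id
JOIN policies pol     ON pol.id = rp.policy_id
                     AND pol.effect = 'Allow' AND pol.resource = '*'
""").fetchall()
```

Five joins for a deliberately flattened toy schema; the production version, with indirect exposure paths and role-assumption chains, only grows from here.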
In graph this query looks something like
  match (i:EC2Instance)--(sg:EC2SecurityGroup)--(perm:IPPermissionInbound {action:"Allow"})--(rng:IPRange {id:"0.0.0.0/0"})
  match (i)--(role:AWSRole)--(p:AWSPolicy)--(stmt:AWSPolicyStatement {effect:"Allow", resource:"*"})
  return i.id as instance_id, role.name as role_name
To answer the question of which internet-open compute instances can act as admins in our environment, we needed to traverse multiple objects, but the shape of the answer is pretty simple: just a list of ids and names.
Graph databases have quirks and add complexity of their own. If your domain isn't this edge heavy, you're probably better off with Postgres, but for our use-case it's been worth the trade-off imo.
I don't think every graph needs a graph database. For 99% of use-cases a relational database is the preferred solution for storing a graph: provided we have objects and ways to link objects, we're good to go. The advantage of graph DBs is in running more complex graph algorithms when required (traversal, etc.), which they do more efficiently than "hacking it" with recursive queries in a relational DB.
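For scale, the "hacking it" version is usually a recursive CTE. A minimal sketch with sqlite3, assuming a plain edges table (an illustrative schema, not anything specific):

```python
# Sketch: graph reachability via a recursive CTE in an ordinary
# relational store. The edges table is an invented example.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE edges (src TEXT, dst TEXT)")
db.executemany(
    "INSERT INTO edges VALUES (?, ?)",
    [("a", "b"), ("b", "c"), ("c", "d"), ("x", "y")],
)

# All nodes reachable from 'a'; UNION deduplicates, so cycles terminate.
reachable = [row[0] for row in db.execute("""
WITH RECURSIVE reach(node) AS (
    SELECT 'a'
    UNION
    SELECT e.dst FROM edges e JOIN reach r ON e.src = r.node
)
SELECT node FROM reach
""")]
```

This works fine for moderate graphs; the pain starts when you need weighted shortest paths, many-hop patterns, or the planner gives up, which is where dedicated engines earn their keep.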
For us, I've yet to find the need for a dedicated graph db with few exceptions, and in those exceptions https://kuzudb.com/ was the perfect solution.
It seems to me that the way recursive CTEs were originally defined is the biggest reason that relational databases haven't been more successful with users who need to run serious graph workloads - in Frank McSherry's words:
> As it turns out, WITH RECURSIVE has a bevy of limitations and mysterious semantics (four pages of limitations in the version of the standard I have, and I still haven't found the semantics yet). I certainly cannot enumerate, or even understand the full list [...] There are so many things I don't understand here.
https://github.com/frankmcsherry/blog/blob/master/posts/2022...
After enough years you realize this is the case for every single problem.
If you are considering a graph database for AI-based search and you are not already familiar with graph database technology, be advised that graph databases are not relational databases. If you cognitively model nodes = tables and edges = joins, you will be in for some nasty surprises. Expect some learning, and some unlearning, before proceeding with that choice.
The use case was to build a knowledge graph to drive recommendations for the next best thing the user should learn.
After a few weeks of getting frustrated I went back to good old Postgres and writing a few tools for agentic retrieval.
It seems the agents are smart enough to traverse a database in a graph-like manner if you provide them with the right tooling and context.
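A minimal sketch of what "the right tooling" might look like, using sqlite3 as a stand-in for Postgres. The tool names, schema, and knowledge-graph contents are all invented for illustration; the point is that two tiny lookup functions are enough for an agent to hop edges itself.

```python
# Sketch: traversal tools an agent can call instead of a graph DB.
# Schema and data are illustrative (a toy "what to learn next" graph).
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE nodes (id TEXT PRIMARY KEY, label TEXT);
CREATE TABLE edges (src TEXT, dst TEXT, rel TEXT);
INSERT INTO nodes VALUES ('algebra', 'topic'), ('limits', 'topic'), ('calculus', 'topic');
INSERT INTO edges VALUES ('algebra', 'calculus', 'prerequisite_of'),
                         ('limits', 'calculus', 'prerequisite_of');
""")

def get_node(node_id):
    """Tool: look up one node by id."""
    row = db.execute("SELECT id, label FROM nodes WHERE id = ?", (node_id,)).fetchone()
    return {"id": row[0], "label": row[1]} if row else None

def get_neighbors(node_id, rel=None):
    """Tool: list edges touching a node, optionally filtered by relation."""
    q = "SELECT src, dst, rel FROM edges WHERE (src = ? OR dst = ?)"
    args = [node_id, node_id]
    if rel:
        q += " AND rel = ?"
        args.append(rel)
    return [{"src": s, "dst": d, "rel": r} for s, d, r in db.execute(q, args)]

# The agent hops: what are the prerequisites of calculus?
prereqs = [e["src"] for e in get_neighbors("calculus", "prerequisite_of")
           if e["dst"] == "calculus"]
```

The agent chains calls like these the same way it would chain MATCH hops, only against a database your team already operates.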
2. Writing Cypher queries is a job I would never like to have as a human. But LLMs love it, so an agent can do ad hoc data science for every single problem, especially while being aware of which criteria were used for graph construction. It is worth ditching things like MCP in favor of graph-like tool solutions. For this purpose I developed my own DSL which only the LLM speaks internally. The effects are mind-blowing.
If the goal is to maintain views over graphs and performance/scale matters, consider Feldera. We see folks use it for its ability to incrementally maintain recursive SQL views (disclaimer: I work there).
How are you evaluating your current retrieval? Can you get to the point where you can compare your current solution with a Graph based one?
A lot of the time I've seen people reach for a graph DB, what they actually wanted or needed was re-ranking of results.
An aside but the Director of ML at a company I worked for kept telling us "We need a Graph! We need a Graph!" and when questioned about _why_ said because we could find fastest routes between train stations (it was for a big train ticket retailer) - no matter how many times we told him we don't set the routes and timetables, it's set by National Rail.
I (and many others) left the company shortly after his arrival.
It depends. Maybe they knew something that the team didn't but couldn't articulate it. Maybe it would have been great. Alternatively (and this seems to be a common tactic, unfortunately), they don't really know what they are doing but use the strategy of introducing a large, time-consuming change and promising incredible things once the change is complete. The longer the change takes, the better in this situation, as they can just chill while the change is taking place and polish the resume for the next gig if it doesn't work out. If they jump to a new job before failure is obvious, they can claim that they effected some large change at the previous company and repeat the process. The other strategy is to performatively claim success in the face of failure and move on to the next big thing.
It's the lack of a fully developed storage engine that avoids vendor lock-in.
Apache GraphAr (incubating) is a step in this direction, but it's an import/export format, not primary storage.
Unaware of this effort (which has roots in Chinese graph DBs), I wrote a competing proposal more aimed at graph DBs looking to disaggregate compute and storage.
https://adsharma.github.io/beating-the-CAP-theorem-for-graph...
I think for the particular use case, something like filtering the vector search based on tags for each document and then (maybe) a relatively inexpensive and fast reranking LLM step could have worked as well or better. But the reranker is not necessarily important with a strong model and including enough results.
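A sketch of that filter-then-rerank shape, with everything stubbed: the documents, the tags, and the similarity function are toy assumptions, and a real system would use actual embeddings plus a reranking model at step 3.

```python
# Sketch: tag-filtered "vector" search with a rerank hook.
# All data and the scoring function are illustrative stand-ins.
docs = [
    {"id": 1, "tags": {"billing"},  "text": "How to update a credit card"},
    {"id": 2, "tags": {"billing"},  "text": "Refund policy for annual plans"},
    {"id": 3, "tags": {"security"}, "text": "Rotating API keys"},
]

def vector_search(query, candidates):
    # Stand-in for embedding similarity: crude token overlap.
    q = set(query.lower().split())
    scored = [(len(q & set(d["text"].lower().split())), d) for d in candidates]
    return [d for _, d in sorted(scored, key=lambda pair: -pair[0])]

def retrieve(query, required_tag, top_k=2):
    # 1. Cheap metadata filter first, shrinking the candidate pool.
    pool = [d for d in docs if required_tag in d["tags"]]
    # 2. Similarity search over the filtered pool.
    ranked = vector_search(query, pool)
    # 3. A fast reranking LLM could reorder ranked[:top_k] here.
    return [d["id"] for d in ranked[:top_k]]

result = retrieve("refund for annual plan", "billing")
```

The filter does most of the work; whether the rerank step pays for itself depends on how strong your base retrieval already is, as noted above.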
If it is just to query one single dataset that is already in one tool it is less compelling.
This is not agentic, but it gave pretty good results when I did a PoC.
So far, I'm using postgres and helping the agent develop MCP functions for it. As we find some optimizations (or at least, reliably performant daily routines), I might make the choice to represent some relationships in a graph database.
One thing I'm building that seems plausibly likely to eventually call for a graph DB is "The Oracle of Bluegrass Bacon" (like the Oracle of Kevin Bacon, but for string band pickers). And it's nice for the agent to have fairly optimal access to this data as we build other adjacent projects.
But yeah, so far just postgres.
Don't remember if their licensing is annoying, but it was rather neat as graph storage when I tried it out last Christmas or thereabouts. If you actually have a fitting need, it's probably a decent option; there are graphical interfaces and so on that people who aren't technical specialists can use.
https://terminusdb.org/
As a human user, consider file-system navigation or code search. You navigate to a seed file, and then hop through the children and dependencies until you find what you're looking for. The value is in explicit and direct edges. Perfect search is hard. Landing in the right neighborhood is less so. (It's like golf if you think about it)
Agentic systems loop through the following steps: Plan -> Inquire -> Retrieve -> Observe -> Act -> Repeat. The agent interacts with your search system during the inquire and retrieve phases. In these phases, there are two semantic problems that a simple embedding-based search or a simple DB alone can't solve: seeding and completeness. Seeding: how do you ask a good question when you don't know what you don't know? Completeness: once you know a little bit, how do you know that you have obtained everything you need to answer the question?
A solid embedding based search allows under-defined free-form inquiry, and puts the user near the data they're looking for. From there, an explicit graph allows the agent to navigate through the edges until it hits gold or gives the agent enough signal to retry with better informed free-form inquiry. Together, they solve the seeding problem. Now, once you have found a few seed nodes to work off of, the agent can keep exploring the neighbors, until they become sufficiently irrelevant. At that threshold, the retrieval system can return the explored nodes with a measurable metric of confidence in completeness. This makes completeness a measure that you can optimize, helping solve the 2nd problem.
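The expansion half of that can be sketched in a few lines. Everything here is a toy assumption: the graph is a made-up file-dependency map, and the relevance scores stand in for whatever edge/node relevance model you actually train.

```python
# Sketch: seed-then-expand retrieval. Embedding search supplies the
# seeds; we walk explicit edges until relevance drops below a threshold.
graph = {  # invented dependency edges
    "auth.py":    ["session.py", "tokens.py"],
    "session.py": ["store.py"],
    "tokens.py":  ["crypto.py"],
    "store.py":   [],
    "crypto.py":  [],
}
relevance = {  # stand-in for a learned relevance detector
    "auth.py": 0.9, "session.py": 0.7, "tokens.py": 0.6,
    "store.py": 0.2, "crypto.py": 0.5,
}

def seed_and_expand(seeds, threshold=0.4):
    found, frontier = [], list(seeds)
    seen = set(frontier)
    while frontier:
        node = frontier.pop(0)
        if relevance[node] < threshold:
            continue  # neighborhood became irrelevant; stop expanding here
        found.append(node)
        for nb in graph[node]:
            if nb not in seen:
                seen.add(nb)
                frontier.append(nb)
    return found

# Suppose the embedding search landed us at auth.py.
hits = seed_and_expand(["auth.py"])
```

The threshold is the knob that makes "completeness" measurable: raise it and you trade recall for precision, and you can tune it against an eval set.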
You'll notice that there is no magic here. The quality of your search will depend on the quality of your edges, entities, exploration strategy and relevance detectors. This requires a ton of hand-engineering and subject specific domain knowledge, neither of which are systems bottlenecks. The data-store itself will do very little to help get you a better answer.
Which brings me to your question, the datastore. The datastore only matters at sufficient scale. You CAN implement Graph RAG in a standard database. Get a column to track your edges, a column to track entities and some way to search over embeddings and you're good. You can get it done in an afternoon (until permissions become an issue, but I digress).
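The afternoon version really is about this small. A sketch with sqlite3 and JSON-encoded vectors, all names invented; in production you'd reach for Postgres with pgvector rather than brute-forcing similarity in Python.

```python
# Sketch: Graph RAG in an ordinary relational DB. One table for
# entities (with embeddings), one for edges. Schema is illustrative.
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE entities (
    id TEXT PRIMARY KEY,
    kind TEXT,
    embedding TEXT   -- JSON-encoded vector; pgvector in a real setup
);
CREATE TABLE edges (src TEXT, dst TEXT, rel TEXT);
""")
db.execute("INSERT INTO entities VALUES (?, ?, ?)",
           ("doc-1", "document", json.dumps([0.1, 0.9])))
db.execute("INSERT INTO entities VALUES (?, ?, ?)",
           ("doc-2", "document", json.dumps([0.8, 0.2])))
db.execute("INSERT INTO edges VALUES ('doc-1', 'doc-2', 'cites')")

def nearest(query_vec):
    """Brute-force dot-product nearest neighbor; fine at small scale."""
    best, best_score = None, float("-inf")
    for eid, _, emb in db.execute("SELECT * FROM entities"):
        score = sum(a * b for a, b in zip(query_vec, json.loads(emb)))
        if score > best_score:
            best, best_score = eid, score
    return best

def neighbors(eid):
    """Follow explicit graph edges out of an entity."""
    return [d for (d,) in db.execute("SELECT dst FROM edges WHERE src = ?", (eid,))]

seed = nearest([0.0, 1.0])  # embedding search gives a seed entity
hop = neighbors(seed)       # then the "graph" part is one more query
```

That's the whole data structure; everything past this point is the hand-engineering of embeddings, entities, and edges mentioned above.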
We know that a spotlight style file-system search works just fine on 100k+ documents, while your mac's fan barely even turns on. If you're asking this question, then your company probably doesn't scale past that point. In fact, I'd argue that few companies will ever cross that threshold for agentic operations. At this scale, your postgres instance won't be the bottleneck.
Comparing postgres to graph-rag-startups, the real value of using a native graph-RAG solution is their defaults. The companies know that their user's need is agentic semantic search, and the products come preloaded with defaults that give you embeddings, entities and graph-edges that aren't completely useless. From a practical standpoint, those extras might push you over the edge. But be aware that your performance gains are coming from outsourcing the hand-engineering of features and not the data structure itself.
My personal opinion is to keep the data structure as simple as possible. MLEs and data scientists are mediocre systems engineers, and it is okay to accept that. You want your ML and product team to be able to iterate on the search logic and quality as fast as possible; that's where the real gains will come from. Speaking from experience, premature optimization in a new field will slow your team down to a crawl. I.e., go with Postgres if that's what's simple for everyone to work with.
tldr: It's not about the scalability of the datastructure, it's about how you use it.