101 points | 73 comments | github.com
SnapQL is an open-source desktop app (built with Electron) that lets you query your Postgres database using natural language. It’s schema-aware, so you don’t need to copy-paste your schema or write complex SQL by hand.
Everything runs locally — your OpenAI API key, your data, and your queries — so it's secure and private. Just connect your DB, describe what you want, and SnapQL writes and runs the SQL for you.
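For the curious, here is a minimal sketch of what a schema-aware text-to-SQL call can look like with the Vercel AI SDK that the repo links to further down. The function name, prompt wording, and model choice are illustrative assumptions, not SnapQL's actual code:

    // Illustrative only: pull the live schema, ask the model for a single SELECT, run it.
    import { generateText } from 'ai'
    import { openai } from '@ai-sdk/openai'
    import { Client } from 'pg'

    async function naturalLanguageQuery(connectionString: string, request: string) {
      const db = new Client({ connectionString })
      await db.connect()

      // Give the model real table/column names so it stays schema-aware.
      const schema = await db.query(
        `SELECT table_name, column_name, data_type
           FROM information_schema.columns
          WHERE table_schema = 'public'
          ORDER BY table_name, ordinal_position`
      )

      const { text: sql } = await generateText({
        model: openai('gpt-4o'),
        system:
          'Translate the request into one read-only Postgres SELECT statement. ' +
          'Use only the tables and columns in the schema. Return SQL only, no prose.',
        prompt: `Schema:\n${JSON.stringify(schema.rows)}\n\nRequest: ${request}`,
      })

      const result = await db.query(sql) // a real app should validate this before running it
      await db.end()
      return { sql, rows: result.rows }
    }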
I could see this being incredible if it had a set of performance-related queries, or if it ran EXPLAIN ANALYZE and offered some interpreted results.
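That could plausibly be bolted on with a second round trip: run the generated statement under EXPLAIN ANALYZE and hand the plan back to the model. A hedged sketch (hypothetical helper, not a SnapQL feature; note that ANALYZE actually executes the statement, so only do this for SELECTs):

    import { generateText } from 'ai'
    import { openai } from '@ai-sdk/openai'
    import { Client } from 'pg'

    async function explainAndInterpret(db: Client, sql: string) {
      // FORMAT JSON gives the model a structured plan instead of the ASCII tree.
      const plan = await db.query(`EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON) ${sql}`)

      const { text } = await generateText({
        model: openai('gpt-4o'),
        system:
          'You are a Postgres performance assistant. Call out sequential scans on large ' +
          'tables, bad row estimates, and likely missing indexes. Be concise.',
        prompt: JSON.stringify(plan.rows[0]),
      })
      return text
    }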
Can this be run fully locally with a local LLM?
For analytical purposes, this text-to-SQL is the future; it's already huge with Snowflake (https://www.snowflake.com/en/engineering-blog/cortex-analyst...).
Question: how are you testing this? Doing it on dummy data is a bit too easy. These models, even 4o, falter when it comes to anything really specific to a domain. For example, I work with supply chain data and column names specific to my team's work, names that only make sense to us and wouldn't mean anything to an LLM unless it somehow knows what those columns are.
And thank you for offering to contribute. I'll be very active on GitHub!
Granted, I was using 3.5 at the time, but even with heavy prompting and trying to explain what certain tables/columns are used for, feeding it the schema, and feeding it sample rows, more often than not it produced garbage. Maybe 4o/o3/Claude4/etc can do better now, but I’m still skeptical.
This is not a lot of text, maybe 5 pages. I then skim it myself in about 2-3 minutes and I write down what I would consider the key points. I compare the results and I find the AI usually (over 50% of the time) misses 1 or more points that I would consider key.
I encourage everyone to reproduce this test just to see how well current AI works for this use case.
For me, AI can't adequately do one of the first things that people claim it does really well (summarization). I'll keep testing, maybe someday it will be satisfactory in this, but I think this is a basic flaw in the attention mechanism that will not be solved by throwing more data and more GPUs at the problem.
I do this regularly and find it very enlightening. After I’ve read a news article or done my own research on a topic I’ll ask ChatGPT to do the same.
You have to be careful when reading its response not to grade on a curve: read it as if you hadn't done the research and didn't know the background. I find myself saying, “I can see why it might be confused into thinking X, but it doesn't change the fact that it was wrong/misleading.”
I do like it when LLMs cite their sources, mostly because I find out they're wrong. Many times I've read a summary, followed it to the source, read the entire source, and realized it says nothing of the sort. But almost always, I can see where it glued together pieces of the source, incorrectly.
A great micro-example of this is Apple's Siri notification summaries. Every time they mess up hilariously, I can see exactly how they got there. But it's also a mistake that no human would ever make.
All it would take is a form where the user supplies details about each column and relation. For some reason, most LLM-based apps don't add this simple feature.
If by 3.5 you mean ChatGPT 3.5, you should absolutely try it with newer models; there is a huge difference in capabilities.
Example/clean/demo datasets it does very well on. Incredibly impressive even. But on real world schema/data for an app developed over many years, it struggled. Even when I could finally prompt my way into getting it to work for 1 type of query, my others would randomly break.
It would have been easier to just provide tools for hard-coded queries if I wanted to expose a chat interface to the data.
And I could potentially see LLMs being useful to generate the “bones” of a query for me but I’d never expose it to end-users (which was what I was playing with). So instead of letting my users do something like “What were my sales for last month?” I could use LLMs to help build queries that were hardcoded for various reports.
The problem is that I know SQL, I'm pretty good at it, and I have a perfect understanding of my company's schema. I might ask an LLM a generic SQL question, but trying to feed it my schema just leads to (or rather "led to", in my trials before) prompt hell. I spent hours tweaking the prompts, feeding it more context, and begging it to ignore the "cash" column that has been deprecated for 4+ years. After all of that it would still make simple mistakes that I had specifically warned against.
For example, if a column contains 0 or 1 encoding the absence or presence of something, the LLM needs to know what 0 and 1 stand for. The same goes for column names, which can be cryptic in production databases.
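A sketch of that "describe your columns" idea; the table and column names below are made up, and this is not a feature SnapQL advertises today:

    // User-supplied notes get merged into the schema text the model sees.
    const columnNotes: Record<string, string> = {
      'orders.is_expedited': '1 = expedited shipping, 0 = standard',
      'orders.cash': 'DEPRECATED since 2021; amounts live in orders.total_cents',
      'sku_master.po_type': 'purchase order type code; meaningful only to the supply chain team',
    }

    function schemaPrompt(schemaRows: { table: string; column: string; type: string }[]) {
      return schemaRows
        .map(({ table, column, type }) => {
          const note = columnNotes[`${table}.${column}`]
          return `${table}.${column} ${type}${note ? `  -- ${note}` : ''}`
        })
        .join('\n')
    }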
> you don’t need to understand the normalizations
You definitely should. Normalization isn't that difficult a concept; Wikipedia has terrific descriptions of each normal form.
As to the rest, maybe read docs? This is my primary frustration with LLMs in general: people seem to believe that they’re just as good of developers as someone who has read the source documentation, because a robot told them the answer. If you don’t understand what you’re doing, you cannot possibly understand the implications and trade-offs.
"Give me all the back office account postings for payment transfers of CCP cleared IRD trades which settled yesterday with a payment amount over 1M having a value date in two days"
That's what I'd like to be able to say and get an accurate response.
A 100% accurate AI-powered solution would have many customers.
But can this generation of LLMs produce 100% accuracy?
It's all about friction. Why spend minutes writing a query when you can spend 5 seconds saying what you want and get 90-100% of the way there?
But honestly, most queries I’ve ever seen are just simple joins, which shouldn’t take you 5 minutes to write.
This is the fundamental problem when attempting to use "GenAI" to make program code, SQL or otherwise. All one would have to do is substitute SQL with language/library of choice above and it would be just as applicable.
Good for you. Some of us deal with more complex queries, even if it may not seem so from the outside: for example, getting hierarchical data based on parent_id with non-trivial conditions on both the parents and the children, or product search queries that need trigram functions with ranking that depends on product availability across stores and user preferences.
I agree knowing SQL is still useful, but more for double checking the queries from LLMs than for trying to build queries yourself.
So, an adjacency list (probably, though there are many alternatives, which are usually better). That’s not complex, that’s a self-join.
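For concreteness, the single-level case with conditions on both sides really is just a self-join, while anything of unknown depth would use WITH RECURSIVE. A sketch against a made-up categories(id, parent_id, active) table:

    import { Client } from 'pg'

    async function activeChildrenOfActiveParents(db: Client) {
      const { rows } = await db.query(`
        SELECT child.*
          FROM categories AS child
          JOIN categories AS parent ON parent.id = child.parent_id
         WHERE parent.active AND child.active
      `)
      return rows
    }
    // Trees of unknown depth need a recursive CTE (WITH RECURSIVE) instead of a plain join.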
> trigram functions
That’s an indexing decision, not a query. It’s also usually a waste: if you’re doing something like looking up a user by email or name, and you don’t want case sensitivity to wreck your plan, then use a case-insensitive collation for that column.
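For reference, the collation approach on Postgres 12+ with ICU looks roughly like this (illustrative table and column names):

    import { Client } from 'pg'

    async function makeEmailCaseInsensitive(db: Client) {
      // A nondeterministic ICU collation that compares case-insensitively.
      await db.query(`
        CREATE COLLATION IF NOT EXISTS case_insensitive
          (provider = icu, locale = 'und-u-ks-level2', deterministic = false)
      `)
      await db.query(`ALTER TABLE users ALTER COLUMN email TYPE text COLLATE case_insensitive`)
      // Plain equality (and an ordinary btree index) now ignores case; no trigram index needed.
    }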
> I agree knowing SQL is still useful, but more for double checking the queries from LLMs
“I agree knowing Python / TypeScript / Golang is still useful, but more for double checking the queries from LLMs.” This sounds utterly absurd, because it is. Why SQL is seen as a nice-to-have instead of its reality - the beating heart of every company - is beyond me.
What is the current support for OpenAI proxies or non-GPT models?
For example, using locally deployed Qwen or LLaMA models.
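Since the repo appears to use the Vercel AI SDK (linked below), any OpenAI-compatible endpoint could in principle be swapped in. A sketch assuming Ollama's local server; not necessarily something SnapQL supports out of the box today:

    import { createOpenAI } from '@ai-sdk/openai'
    import { generateText } from 'ai'

    // Point the OpenAI-compatible client at a local server instead of api.openai.com.
    const local = createOpenAI({
      baseURL: 'http://localhost:11434/v1', // Ollama's OpenAI-compatible endpoint
      apiKey: 'ollama',                     // ignored by Ollama but required by the client
    })

    const { text } = await generateText({
      model: local('qwen2.5-coder'),        // or a locally pulled llama3 variant
      prompt: 'Count the rows in public.users',
    })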
Pardon my technical ignorance, but what exactly is OpenAI's API being used for in this?
https://github.com/NickTikhonov/snap-ql/blob/main/src/main/l...
[0]: https://github.com/NickTikhonov/snap-ql/blob/409e937fa330deb...
[1]: https://github.com/vercel/ai
Side note: I don't see a license anywhere, so technically it isn't open source.
I wish you luck in refining your differentiation.
> I wish you luck in refining your differentiation.

Couldn't agree more. It's about distribution (which Snowflake/Databricks/... have) or differentiation.
Still, chatting with your data already works and is useful for plenty of people.