Twitter: LMeyerov
We're 100X'ing investigations with the first GPU visual graph AI platform and now Louie.AI, the genAI-first rethink of analyst notebooks. We enjoy working with data teams all the way from tech companies and startups to scientists and government agencies, and on problems from threat hunting, fraud, & misinformation to supply chain, user journey, and genomics. Our partners include Amazon, Nvidia, and more.
* Louie.AI: L.O.U.I.E. connects to your data systems so analysts can use natural language to ask questions and get back data, analyses, AI models, interactive visualizations, and everything else the Graphistry platform can do
* Graph visual analytics: Gartner measured graph as a Top 5 growing data technology for the next 5 years and named us a 2021 Cool Vendor in Graph
* End-to-end GPU computing: Apache Arrow & RAPIDS.ai were both voted top data projects of 2020 & 2021
* Point-and-click workflow automation: Process automation is the fastest growing enterprise industry in history
* Graph AI (GNNs) to make it easy for operational teams to take the next step of their graph journey by automating their graph insights for tasks like detection: Winner of the 2022 US Cyber Command AI alert data challenge!
... And we're hiring! LLM app/backend engineering, AI (cyber, GNNs, ...), sales engineering, platform engineering, UI, and more: https://www.graphistry.com/careers
- Interesting, mind sharing the context here?
My experience has been that as workloads get heavier, it's "cheaper" to push to an accelerated & dedicated inferencing server. This doesn't always work though, eg, there's a world of difference between realtime video on phones vs an interactive chat app.
Re:edge embedding, I've been curious about the push by a few toward 'foundation GNNs', and it may be fun to compare UMAP on property-rich edges to those. So far we focus on custom models, but the success of neural graph drawing NNs & newer tabular NNs suggests something pretrained could replace UMAP as a generic hammer here too...
- A few things
Table stakes for our bigger users:
- parity or improvement on perf, for both CPU & GPU mode
- better support for learning (fit->transform) so we can embed billion+ scale data
- expose inferred similarity edges so we can do interactive and human-optimized graph viz, vs overplotted scatterplots
New frontiers:
- alignment tooling is fascinating, as we increasingly want to re-fit->embed over time as our envs change and compare, eg, day-over-day analysis. This area is not well-defined yet, but it's common for anyone operational, so it seems ripe for innovation
- maybe better support for mixing input embeddings. This seems increasingly common in practice, and worth examining as special cases
Always happy to pair with folks on getting new plugins into the pygraphistry / graphistry community, so if/when ready, happy to help push a PR & demo through!
- We generally run UMAP on regular semi-structured data like database query results. We automatically feature encode that for dates, bools, low-cardinality vals, etc. If there is text, and the right libs available, we may also use text embeddings for those columns. (cucat is our GPU port of dirtycat/skrub, and pygraphistry's .featurize() wraps around that).
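For flavor, a minimal sketch of that flow (toy column names; assumes the graphistry[ai] extras are installed, so treat as illustrative rather than exact):

```python
import pandas as pd
import graphistry  # pip install graphistry[ai] for featurize/umap support

# Hypothetical semi-structured query results: dates, bools, low-cardinality, text
df = pd.DataFrame({
    "user": ["a", "b", "c"],
    "created": pd.to_datetime(["2024-01-01", "2024-02-01", "2024-03-01"]),
    "is_admin": [True, False, False],
    "note": ["password reset", "login failure", "login failure"],
})

# .featurize() auto-picks encoders per column (dates, bools, categoricals),
# optionally using text embeddings for free-text columns when libs are present
g = graphistry.nodes(df).featurize()

# or go straight to a 2D embedding plus inferred similarity edges:
g2 = graphistry.nodes(df).umap()
```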
My last sentence was on more valuable problems: we are finding it makes sense to go straight to GNNs, LLMs, etc and embed multidimensional data that way vs via UMAP dim reductions. We can still use UMAP as a generic hammer to control further dimensionality reductions, but the 'hard' part would be handled by the model. With neural graph layouts, we can potentially even skip UMAP for that too.
Re:pacmap, we have been eyeing several new tools here, but so far haven't felt the need internally to go from UMAP to them. We'd need to see significant improvements, given the quality engineering in UMAP has set the bar high. In theory I can imagine some tools doing better in the future, but the creators haven't made the engineering investment, so internally we'd rather stay with UMAP. We make our API pluggable, so you can pass in results from other tools, but we haven't heard much from others going that path.
- Fwiw, we are heavy UMAP users (pygraphistry), and find CPU UMAP fine for interactive use at up to 30K rows and GPU UMAP at 100K rows, then generally switch to a trained mode when > 100K rows (sketched below). Our use case is often highly visual - see correlations, and link together similar entities into explorable & interactive network diagrams. For headless uses, like daily anomaly detection, we do this at much larger scales.
We see a lot of wide social, log, and cyber data where this works, anywhere from 5-200 dim. Our bio users are trickier, as we can have 1K+ dimensions pretty fast. We find success there too, and mostly get into preconditioning tricks for those.
At the same time, I'm increasingly thinking of learning neural embeddings in general for these instead of traditional clustering algorithms. As scales go up, the performance argument here goes up too.
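A minimal sketch of that trained-mode pattern with standalone umap-learn (pygraphistry wires this differently under the hood, but the idea is the same: fit on an interactive-scale sample, transform the rest):

```python
import numpy as np
import umap  # pip install umap-learn

X = np.random.rand(1_000_000, 50)  # stand-in for featurized rows

# Fit on a tractable sample...
reducer = umap.UMAP(n_components=2).fit(X[:100_000])

# ...then project the remaining rows through the learned embedding,
# which scales far better than refitting on everything
emb_rest = reducer.transform(X[100_000:])
```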
- I don't see news spread, eg, direct lineage graphs showing viral attribution & rewrites as a narrative propagates...
Afaict, it is the usual topic trending over time, or maybe it is showing direct syndication?
Computing actual derivation flow would be neato, esp doing it precisely at scale vs just the usual embeddings
- I can believe it, so a different question, as the attribution is unclear:
For context: a bunch of whitehat teams have been using agents to automate both red and blue team cat-and-mouse flows, and quite well, for a while now. The attack sounded like normal pre-AI methods orchestrated by AI, which is what many commercial red team services already do. Ex: Xbow is #1 on HackerOne bug bounties, meaning live attempts, and works like how the article describes. Ex: we do louie.ai on the AI investigation agent side, 2+ years now, and are able to speed-run professional analyst competitions. The field is pretty busy & advanced.
So what I was more curious about is how they knew it wasn't one of the many pentest attack-as-a-service offerings? Xbow is one of many, and their devs would presumably use VPNs. Like, did Anthropic confirm the attacks with the impacted parties, were there behavioral tells pointing to a specific APT vs the usual, and are they characterizing white-hat tester workloads to separate those out?
- There is a big push to limit what kind of models can be OSS'd, which in turn means yes, a limit to what AI you are allowed to run.
The California laws the article references make OSS AI model makers liable for whatever developers & users do. That chills the enthusiasm for someone like Facebook or a university to release a better Llama. So I'm curious if this law removes that liability...
- (have been a big fan of this work for years now)
From the nearby perspective of building GFQL, an embeddable OSS GPU graph dataframe query language somewhere between Cypher and duckdb/pandas/spark, sitting at an even higher level on top of pandas, cudf, etc:
It's nice using higher-level languages with rich libraries underneath so we can focus on the foundational algorithm & data ecosystem problems while still achieving crazy numbers
cudf gives us optimized GPU joins, so jumping from cheap personal CPU or GPU boxes to 80GB server GPUs, with deep 2B-edge whole-graph queries running in a second, has been nice and basically free :) We want our focus on getting regular graph operations fully data-parallel in the way we want while staying easy for users, and on figuring out areas like bigger-than-memory and data lakes, so we defer lower-level efforts to when a Rust etc rewrite is more merited. I do see value in starting low when the target value and workload are obvious (eg, for building vector indexes / DBs), but when breaking new ground at every point, there's value in going where you can roll & extend faster.
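For flavor, a minimal GFQL sketch (toy data and column names; per the current pygraphistry API, and the same chain runs whether the dataframes are pandas or cudf):

```python
import pandas as pd
import graphistry
from graphistry import n, e_forward

nodes = pd.DataFrame({"id": ["a", "b", "c"],
                      "kind": ["acct", "acct", "merchant"]})
edges = pd.DataFrame({"src": ["a", "a", "b"],
                      "dst": ["b", "c", "c"],
                      "rel": ["owns", "owns", "pays"]})

g = graphistry.edges(edges, "src", "dst").nodes(nodes, "id")

# Hop from node "a" across "owns" edges to its neighbors;
# swap in cudf dataframes and the same query runs on GPU
g2 = g.chain([n({"id": "a"}), e_forward({"rel": "owns"}), n()])
print(g2._nodes)  # plain dataframe out
```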
- Is this embeddable, eg, a react component that can be hooked into?
The lack of this has been a sticking point making us lean toward dropping mermaid, so very cool to see!
- Share can go up and down if consumption keeps going up crazily. We now spend more per dev on their personal use inferencing providers than their home devices, so inferencing chips are effectively their new personal computers...
- Right, the 'knowing' is where I think the interesting part is today for their evolution
More mature claude.md files already typically index into other files, including guidance on which to preload vs lazy-load. However, in practice, Claude forgets quite easily, so that pattern is janky. A structured mechanism helps Claude guarantee less forgetting.
Looking forward, from an autonomous-learning automation perspective, this also makes it more accessible to talk about GEPA-for-everyone to maintain & generate these. We've been playing with similar flows in louie.ai, and came to a similar "just make it folders full of markdown with some learning automation options."
I was guessing that was what was going on here, but the writeup felt like maybe more was being said :) (And thank you for continuing to write!)
- I'm a bit unclear what's different here from how vibe coders already work?
Pretty early on, folks recognized that most MCPs can just be CLI commands, and a markdown file is fine for describing them. So Claude Code users have markdown files of CLI calls and mini tutorials on how to do things. The 'how to do things' part seems to be what we're now calling skills... which we're still writing in markdown and using from Claude.
Is the new thing that Claude will match & add them to your context automatically vs you call them manually? And that's a breakthrough because there's some emergent behavior?
- Senior engineers know process, including for what you described, and that maps to plan-driven AI engineering well:
1. Note the discussion of plan-driven development in the Claude Code sections (think: plan = granular task list, including goals & validation criteria, that the agent loops over and self-modifies). Plans are typically AI generated: I ask it to do initial steps of researching current patterns for x+y+z and include those in the steps and validations, and even have it re-audit the plan. Codex internally works the same way, and multiple people report it automates more of this plan flow.
2. Working with databases for tasks like migrations is normal and even better. My two UIs are now the agent CLI (basically streaming AI chat for task-list monitoring & editing) and the GitHub PR viewer: if it wasn't smart enough to add and test migrations and you didn't put that into the plan, you see it in the PR review and tell it to fix that. Writing migrations is easy, but testing them is annoying, and I've found AI help writing mocks, integration tests, etc to be wonderful.
- Posted below: GFQL is also OSS and architecturally similar, though slightly different goals and features: https://www.hackerneue.com/item?id=45560036#45561807
- Reposting:
--
Rough news on kuzu being archived - startups are hard and Semih + Prashanth did so much in ways I value!
For those left in the lurch for compute-tier Apache Arrow-native graph queries for modern OSS ecosystems, GFQL [1] should be pretty fascinating, and hopefully less stressful thanks to a sustainable governance model. Likewise, as an OSS deeptech community, we add interesting new bits like the optional record-breaking GPU mode with NVIDIA RAPIDS [3].
GFQL, the graph dataframe-native query language, is increasingly how Graphistry, Inc. and our community work with graphs at the compute tier. Whether the data comes from a tabular ETL pipeline, a file, SQL, NoSQL, or a graph storage DB, GFQL makes it easy to do on-the-fly graph transforms and queries at sub-second speeds for graphs anywhere from 100 edges to 1,000,000,000 [3]. Currently, we support Arrow/pandas and Arrow/NVIDIA RAPIDS as the main engine modes.
While we're not marketing it much yet, GFQL is already used daily by every single Graphistry user behind the scenes, and directly by analysts & developers at banks, startups, etc around the world. We built it because we needed an OSS compute-tier graph solution for working with modern data systems that separate storage from compute. Likewise, data is a team sport, so it is used by folks on teams who have to rapidly wrangle graphs, whether for analysis, data science, ETL, visualization, or AI. Imagine an ETL pipeline or notebook flow or web app where data comes from files, Elasticsearch, Databricks, and Neo4j, and you need to do more on-the-fly graph work with it.
We started [4] building what became GFQL before Kuzu because it solves real architectural & graph productivity problems that have been challenging our team, our users, and the broader graph community for years now. Likewise, by going dataframe-native & GPU-mode from day 1, it's now a large part of how we approach GPU graph deep tech investments throughout our stack, which means it's a sustainably funded system. We are looking at bigger R&D and commercial support contracts with organizations needing subsecond billion+ scale with us so we can build even more, faster (hit me up if that's you!), but overall, most of our users are just like ourselves, and the day-to-day is wanting an easy OSS way to wrangle graphs in our apps & notebooks. As we continue to smooth it out (ex: we'll be adding a familiar Cypher syntax), we'll be writing about it a lot more.
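To make the Cypher comparison concrete, a rough sketch (hypothetical column names; GFQL filters match on node/edge attribute columns rather than Cypher labels):

```python
# Cypher:  MATCH (a {type: 'account'})-[e {rel: 'pays'}]->(b) RETURN b
# GFQL over an already-bound graph g (pandas or cudf underneath):
from graphistry import n, e_forward

b = g.chain([n({"type": "account"}), e_forward({"rel": "pays"}), n()])._nodes
```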
Links:
[1] ReadTheDocs: SQL <> Cypher <> GFQL - https://pygraphistry.readthedocs.io/en/latest/gfql/translate...
[2] pip install: https://pypi.org/project/graphistry/
[3] 2025 keynote - OSS interactive billion-edge GFQL analytics on 1 GPU: https://www.linkedin.com/posts/graphistry_at-graph-the-plane...
[4] 2022 blogpost w/ Ben Lorica first painting the vision: https://thedataexchange.media/the-graph-intelligence-stack/
- That's a cool concept - would be curious about a more common setup for agentic data analysis (ex: for use in Claude Code) like:
* Multiple tasks vs 1
* o3/o3-mini + 4o/4o-mini instead of nano
* Extra credit: Inside a fixed cost/length reasoning loop
Ex: does the md-kv benefit disappear with smarter models that you'd typically use, and thus just become a 2-3x cost?
- This is a big problem for many enterprise startups. Most (90%+) are customer funded, not VC funded, and it is common to have a few big design-partner customers while they figure things out. Imagine three 3-yr $1.5M contracts ($500K/yr each): roughly $1.5M/yr, funding a team of 10-15.
Companies have whimsical annual budget cycles, with new innovation initiatives always big, easy targets since they're not cemented yet. This change makes it very easy to yank one of the 3-year commitments... which would cause layoffs of a third of the startup.
- We have been on a fork for louie.ai:
* Small data - talk to your PDF on-the-fly etc: Getting bigger & faster via cloud APIs
* Big data - for RAG: Getting smaller, bc we don't want to pay crazy fees for vector DB hosting, and doable bc it's now easier to get higher-quality small embeddings that still do the job
- I like growth-oriented startups in that it is more of a team sport. Promotion is pretty directly tied to:
* Doing your part to make the revenue grow. For management, going from, say, $1M annual revenue at seed to $3M at A means the company can support 3X the staff. Joining a startup is basically a bet that you can outperform when unleashed.
* Surfing that wave. Show it makes sense to put new hires below you vs above, or to give you increasingly big responsibilities, etc, bc you managed the past ones well. Startups run at their limit so can feel like pressure cookers, and they're relatively small, so your demonstrated ability should be pretty apparent to hands-on leadership.
* Compensation comes out of that. Stock becomes worth more, you get bigger refreshers, more experience, a new title, etc
- For backend/application code, I find it's instead about focusing on the planning experience, managing multiple agents, and reviewing generated artifacts+PRs. File browsers, source viewers, REPLs, etc don't matter here (verbose, too zoomed-in, not reflecting agent activity, etc); at best, I'll glance at them occasionally while the agents do their thing.