Something sounds fishy here. Have these bugs really been found by AI? (I don't think they were.)

If you read Corgea's (one of the products used) "whitepaper", it seems that AI is not the main show:

> BLAST addresses this problem by using its AI engine to filter out irrelevant findings based on the context of the application.

It seems that AI is being used to post-process the findings of traditional analyzers. It reduces the number of false positives, improving the quality of the output from the more traditional analyzers that were actually used in the scan.

ZeroPath seems to use similar wording, like "AI-Enabled Triage", and expressions like "combining Large Language Models with AST analysis". It also highlights that it produces fewer false positives.

I would expect someone who developed this kind of thing to set up a feedback loop in which the AI output is somehow used to improve the static analysis tool (writing new rules, tweaking existing ones, ...). It seems like the logical next step. This might be going on in these products as well (lots of in-house rule extensions for more traditional static analysis tools, written or discovered with the help of AI, hence the "build with AI" headline in some of them).

Don't get me wrong, this is cool. Getting an AI to triage a verbose static analysis report makes sense. However, it does not mean that AI found the bugs. In this model, the capabilities of finding relevant stuff are still capped at the static analyzer tools.
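A minimal sketch of the post-processing model described above, assuming the analyzer emits a list of findings and the AI only decides which ones to keep. The `Finding` type, the sample findings, and `toy_judge` are hypothetical; the judge is a plain function standing in for an LLM call, and none of this is taken from Corgea or ZeroPath.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Finding:
    rule_id: str
    file: str
    line: int
    message: str


def triage(findings: Iterable[Finding],
           is_relevant: Callable[[Finding], bool]) -> List[Finding]:
    """Keep only the findings the judge considers relevant."""
    return [f for f in findings if is_relevant(f)]


# Hypothetical analyzer output; a real run would parse a tool's report.
raw_findings = [
    Finding("unchecked-return", "src/a.c", 10, "return value ignored"),
    Finding("unused-variable", "tests/t.c", 3, "variable never read"),
    Finding("buffer-overflow", "src/b.c", 42, "possible out-of-bounds write"),
]


def toy_judge(f: Finding) -> bool:
    # Placeholder for an LLM call: here, just drop findings in test code.
    return not f.file.startswith("tests/")


kept = triage(raw_findings, toy_judge)
```

Whatever the judge does, `kept` can only be a subset of `raw_findings`, which is the "capped at the static analyzer" point.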

I wonder if we need to pay for it. I mean, now that I know it is possible (at least in my head), it seems tempting to get open source tools, set them to max verbosity, and work out which prompts these products are using on (likely vanilla) coding models to get them to triage the stuff.


asadeddin
Hi there, I'm Ahmad, CEO at Corgea, and the author of the white paper. We do actually use LLMs to find the vulnerabilities AND triage findings. For the majority of our scanning, we don't use traditional static analysis. At the core of our engine is the LLM reading the lines of code to find CWEs in them.

etlun
Hi, I'm Etienne, one of the cofounders @ ZeroPath.

We do not use traditional static analyzers; our engine was built from the ground up to use LLMs as a primitive. The issues ZeroPath identified in Joshua's post were indeed surfaced and triaged by AI.

If you're interested in how it works under the hood, some of the techniques are outlined here: https://zeropath.com/blog/how-zeropath-works

alganet OP
Hi! Thanks for the reply.

Joshua describes it as follows: "ZeroPath takes these rules, and applies (or at least the debug output indicates as such) the rules to every .. function in the codebase. It then uses LLM’s ability to reason about whether the issue is real or not."

Would you say that is a fair assessment of the LLM role in the solution?
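The flow in Joshua's description (rules applied to every function, with a reasoning step deciding whether each issue is real) can be sketched as follows. This is only an illustration under assumptions: `scan`, the sample functions, the rule text, and `toy_judge` are hypothetical, and the judge is a crude substring check standing in for an LLM, not ZeroPath's actual engine.

```python
from typing import Callable, Dict, List, Tuple

# (function name, rule) pairs the judge flagged as real issues
Report = List[Tuple[str, str]]


def scan(functions: Dict[str, str],
         rules: List[str],
         judge: Callable[[str, str], bool]) -> Report:
    """Ask the judge about every (function, rule) pair."""
    return [(name, rule)
            for name, body in functions.items()
            for rule in rules
            if judge(body, rule)]


# Hypothetical inputs for illustration only.
functions = {
    "copy_name": "strcpy(dst, src);",
    "add": "return a + b;",
}
rules = ["unbounded copy into a fixed-size buffer"]


def toy_judge(body: str, rule: str) -> bool:
    # Placeholder for the LLM's reasoning step: a crude substring check.
    return "unbounded copy" in rule and "strcpy(" in body


report = scan(functions, rules, toy_judge)
```

In this shape the judge sees the code itself rather than a pre-filtered report, so what gets found depends on the rules and the judge, not on a separate analyzer's output.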

simonw
Looks like you're reacting to the Hacker News title here, which is currently "Daniel Stenberg on 22 curl bugs found by AI and fixed"

That's an editorialized headline (so it may get fixed by dang and co) - if you click through to what Daniel Stenberg said, he was clearer:

> Joshua Rogers sent us a massive list of potential issues in #curl that he found using his set of AI assisted tools.

AI-assisted tools seems right to me here.

alganet OP
If the title changes, it is still a valid critique of the tools, how they might work, and a possible way of getting them for free.

Also, think about it: of course I read Joshua's report. Otherwise, how could I have known the names of the products he used?

bgwalter
I don't think many people here are interested in how something works. They want to see the headline "Curl developer finally convinced by AI!" and otherwise drop anecdotes about Claude Code etc.

All comments that want to know more are at the bottom.

robhlam
It’s clear my attempt to keep the gist of what Daniel said while keeping under the title character count didn’t hit the mark.

How would you have worded it?

simonw
Always tricky! In this case maybe the following:

Daniel Stenberg on 22 curl bugs reported using AI-assisted security scanners

robhlam
That doesn’t really convey that these bug reports were for real issues and were greatly appreciated, unlike the slop that Daniel is known for complaining about, which I think is the real story here.

I will spend longer considering my title next time.

Cheers!

bgwalter
I suppose the downvoters all have subscriptions to the tools and know exactly how the tools work while leaving the rest of us in the dark.

Even Joshua's blog post does not clearly state which parts are "AI" or how much of the work it does. Neither does the pdf.
