alganet
Do you believe AI is at the core of these security analyzers? If so, why the personal story blogpost? You can just explain to me, in technical terms, why that is so.

Claiming to work for Google does not work as an authority card for me, you still have to deliver a solid argument.

Look, AI is great for many things, but to me these products sound like chocolate that is actually just 1% real chocolate. Delicious, but 99% not chocolate.

tptacek
I had a conversation in a chat room yesterday about AI-assisted math tutoring where a skeptic said that the ability of GPT5 to effortlessly solve quotient differentials or partial fraction decomposition or rational inequalities wasn't indicative of LLM improvements, but rather just represented the LLMs driving CAS tools and thus didn't count.

As a math student, I can't possibly care less about that distinction; either way, I paste in a worked problem solution and ask for a critique, and either way I get a valid output like "no dummy multiply cos into the tan before differentiating rather than using the product rule". Prior to LLMs, there was no tool that had that UX.

In the same way: LLMs are probably mostly not off the top of their "heads" (giant stacks of weight matrices) axiomatically deriving vulnerabilities, but rather just doing a very thorough job of applying existing program analysis tools, assembling and parallel-evaluating large numbers of hypotheses, and then filtering them out. My interlocutor in the math discussion would say that's just tool calls, and doesn't count. But if you're a vulnerability researcher, it doesn't matter: that's a DX that didn't exist last year.

As anyone who has ever been staffed on a project triaging SAST tool outputs before would attest: it extremely didn't exist.
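
To make "just tool calls" concrete, here's a toy sketch of that hypothesize-and-filter shape; the stubs are hypothetical stand-ins, not anyone's actual product:

    import concurrent.futures

    def run_analyzers(repo: str) -> list[dict]:
        # Stand-in for shelling out to existing program analysis tools
        # and collecting their raw findings.
        return [
            {"file": "auth.php", "line": 42, "rule": "sql-injection"},
            {"file": "views.php", "line": 7, "rule": "style-nit"},
        ]

    def evaluate_hypothesis(finding: dict) -> dict | None:
        # Stand-in for an LLM call: "given this finding and the surrounding
        # code, is it actually exploitable?" Reduced here to a toy filter.
        return finding if finding["rule"] != "style-nit" else None

    def triage(repo: str) -> list[dict]:
        findings = run_analyzers(repo)
        # Evaluate many hypotheses in parallel, keep only the survivors.
        with concurrent.futures.ThreadPoolExecutor() as pool:
            survivors = pool.map(evaluate_hypothesis, findings)
        return [f for f in survivors if f is not None]

    print(triage("."))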

alganet
I don't care if it counts as true LLM brilliance or not.

If it doesn't matter if it's AI or not, just that they're good tools, why even advertise the AI keyword all over it? Just say "best in class security analysis toolset". It's proprietary anyway, you can't know how much of it is actually AI (unless you reproduce its results, which is the core argument you missed here).

tptacek
Because that's not accurate. The underlying program analysis tooling already existed, but the LLM glue logic is what makes it effective. You could as a human replicate it with those preexisting tools, but you won't, in the same way you wouldn't model your whole project in a prover and solve it with formal methods; it was possible, but that possibility isn't meaningful.
alganet
> the LLM glue logic is what makes it effective

Allegedly (it's proprietary, we don't know). Maybe it's the triage approach, and there are undiscovered non-LLM triage techniques that would surpass it.

If I were to guess, the AI naming was for marketing purposes (to ride on the hype train), not because it accurately describes the product (even though it might accurately describe the product).

Most importantly, how is it that it's so effective? I want to know. Perhaps you and some others just want to celebrate an LLM win. That's fine, but I want to know how it works.

I'd say my guess is fair, and it's a viable approach for someone trying to create a similar tool. If I were to try and replicate this, I would definitely start with an existing static analyzer. For example, I would do it with phpstan (just because I know it a little bit better).

I would extend it so it becomes more verbose than it currently is (something humans don't want, but machines might benefit from). Perhaps I would introduce some rules that make it report things that aren't even issues, but just information I can gauge from the AST (like: does this controller have a middleware? If so, emit something in the report). Then I would attempt to use that enriched report as the input for a coding model, and experiment with different prompts and different granularity units on the input.
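
Roughly, the skeleton would be something like this (phpstan really does have an --error-format=json mode; ask_coding_model is a made-up stand-in for whatever model you'd use):

    import json
    import subprocess

    def enriched_report(path: str) -> list[dict]:
        # phpstan already supports machine-readable output.
        raw = subprocess.run(
            ["phpstan", "analyse", path, "--error-format=json"],
            capture_output=True, text=True,
        ).stdout
        findings = []
        for file, data in json.loads(raw).get("files", {}).items():
            for msg in data["messages"]:
                # This is where the extra verbosity would go: AST-derived
                # facts like "this controller has no middleware", which
                # stock phpstan doesn't report.
                findings.append(
                    {"file": file, "line": msg["line"], "message": msg["message"]}
                )
        return findings

    def ask_coding_model(prompt: str) -> str:
        # Made-up stand-in; plug in whatever coding model you like.
        return "(model output here)"

    for f in enriched_report("src/"):
        prompt = (
            f"{f['file']}:{f['line']}: the analyzer reports: {f['message']}\n"
            "Is this a real security issue? Explain, considering the code around it."
        )
        print(ask_coding_model(prompt))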

It sounds reasonable, doesn't it? I could describe that approach as "LLM right in the core of the solution", but I know full well that in that arrangement, the quality of the final product is still capped by the static analyzer and what it can detect and describe. It doesn't matter that the LLM is what makes it better. My wheat farm is still about wheat, not the fancy sieve I recently bought to separate it from the chaff.

I don't understand why this sounds so offensive to some of the readers here. I was just thinking "how would I use AI in such a product", and the only way I can come up with is this one, in which the AI is not the main show.

I mean, my experience with LLMs also confirms that. Prompting "find me bugs" or stuff like that almost never works. It works better if I get an error and ask it to explain it to me, giving the application context. The static analyzer is there to give this initial kick, to create these nucleation sites upon which the LLM will crystallize answers.
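
Concretely, the contrast I mean (the finding here is made up):

    # Almost never works: no nucleation site for the model to build on.
    vague_prompt = "Find me bugs in this codebase."

    # Works much better: a concrete finding plus application context.
    finding = {
        "file": "UserController.php",
        "line": 88,
        "message": "parameter $id is interpolated into a query without binding",
    }
    code_context = "...the surrounding controller source you'd paste here..."
    seeded_prompt = (
        f"A static analyzer reports at {finding['file']}:{finding['line']}:\n"
        f"{finding['message']}\n\n"
        f"Relevant code:\n{code_context}\n\n"
        "Is this exploitable? If so, explain how and suggest a fix."
    )
    print(seeded_prompt)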

This sounds like the most viable, easiest-to-build product that can find bugs with LLMs. It would only be offensive if that's actually what these products are doing, it's not supposed to be known, and I struck a nerve or something.

refulgentis OP
I don't mean to aggravate you. I do mean to offer some insight into the mindset of the people who puzzled the person I was replying to. I'm calmed by the fact that if we're both here, we both value one of the HN sayings I'm very fond of: come with curiosity.

> Do you believe AI is at the core of these security analyzers?

Yes.

> If so, why the personal story blogpost?

When I am feeling intensely, and people respond to me as I'm about to respond to you, I usually get very frustrated. Apologies in advance if you suffer from that same part of being human, I don't mean anything about you or your positions by this:

I don't know what you mean.

Thus, I may be answering the wrong thing with the following: the person I replied to indicated all downvoters must know every detail, and the, well, let's use your phrasing, personal story blogpost (I assume you mean my comment) leads with: "I believe there's a little more going on than everyone knowing every detail already, or presumably, being wrong to downvote. Full case study of a downvoter at work:"

> Claiming to work for Google

I claimed the opposite! I'm a jobless hack :) (quit in 2023)

> does not work as an authority card for me,

Looking at it, the thing isn't "I worked at Google therefore AI good", it's "I worked at Google, and a specific well-known project I worked on, the company's design language, used AI pre-ChatGPT to great effect. It's unclear to me why this use case would be unbelievable years later."

> you still have to deliver a solid argument.

What are we arguing? :) (I'm serious! Apologies, again, if it comes off as flippant. If you mean I need to deliver a solid argument that the tools must have AI: I assume that if such details were available you would have found them; you seem well-considered and curious. I meant to explain, to the person I replied to, the mind of a downvoter who cannot recite details as yet unavailable to the public, not to verify the workflow step by step.)

alganet
The argument is that these high-quality security analyzers seem to use AI as a triage mechanism, and the quality of the analysis is still capped by the quality of the static analysis tool.

One of the tools provides a whitepaper, which you can read here:

https://corgea.com/blog/whitepaper-blast-ai-powered-sast-sca...

It seems to explicitly put AI in this coadjuvant role, contradicting the HN title "found by AI".

Neither I nor the other commenter actually dismissed AI as useless. I can't speak for him, but to me, it seems actually useful in this arrangement. However, not "I'll pay for a subscription" levels of useful.

Since it's just triage, it seems that trying to reproduce the idea using free tools might be worth a shot (and that's the point of finding out where the AI component lies in the system). What I said is very doable (plug the output of traditional tools into vanilla coding LLM prompts). It also looks a lot like this Corgea schematic:

https://framerusercontent.com/images/EtFkxLjT1Ou2UTPACObJbR2...

I mean, it's very brave to explain a downvote, but in this case, it seems that you missed the opportunity to make sense.
