
It is logically impossible for an LLM to, for example, know that fooExecute() takes two int arguments if the documentation is blocked by robots.txt and there are no examples of fooExecute() usage in the wild, don't you agree?

diggan
I agree, but I also think it's less important. I don't want a big fat LLM that has memorized every API out there, where the weights have to be updated as soon as an API changes. I like the current approach of Codex (and similar tools), where they can look up the APIs they need as they do the work, so the same weights will continue to work no matter how much the APIs change.
tharant
Sure, the model would not “know” about your example, but that’s not the point; the penultimate[0] goal is for the model to figure out the method signature on its own, just as a human dev might leverage her own knowledge and experience to infer it. Intelligence isn’t just rote memorization.

[0] the ultimate, of course, being profit.

jowea OP
I don't think a human dev can divine a method signature and its effects in the general case either. Sure, the add() function probably takes two numbers, but maybe it takes a list? Or a two-tuple? How would we or the LLM know without the documentation? And yeah, sure, the LLM can look up the documentation at use time instead of it being part of the training dataset, but that's strictly inferior for practical uses, no?

I'm not sure we're thinking of the same field of AI development. I think I'm talking about super-autocomplete with an integrated copy of all digitized human knowledge, while you're talking about trying to do (proto-)AGI. Is that it?

heavenlyblue
> Sure the add() function probably takes 2 numbers, but maybe it takes a list? Or a two-tuple? How would we or the LLM know without having the documentation?

You just listed the possible options in order of their relative probability. A human would attempt them in exactly that order.
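
To make that concrete, here is a minimal Python sketch of the three signature shapes mentioned above (the add() variants are hypothetical, named only for illustration), listed in roughly the order a dev without documentation would try them:

    # Three plausible signatures for a hypothetical add(), in roughly
    # descending order of how often each shape shows up in real APIs.

    def add_two(a: int, b: int) -> int:
        """Most common guess: two positional numbers."""
        return a + b

    def add_list(values: list[int]) -> int:
        """Less common: a single list of numbers."""
        return sum(values)

    def add_pair(pair: tuple[int, int]) -> int:
        """Least common: a single two-tuple."""
        return pair[0] + pair[1]

    # Without docs, a dev (or an agent) would typically try these in
    # order, falling back to the next shape on a TypeError.
    print(add_two(1, 2))        # 3
    print(add_list([1, 2, 3]))  # 6
    print(add_pair((1, 2)))     # 3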
