crazygringo
If you watch how agents attempt a task, fail, try to figure out what went wrong, try again, repeat a couple more times, then finally succeed -- you don't see the similarity?

dingnuts
No, I see something resembling gradient descent, which is fine, but it's hardly a child.

haskellshill
> try to figure out what went wrong

LLMs don't do this. They can't think. If you just use one for like five minutes, it's obvious that just because the text on the screen says "Sorry, I made a mistake, there are actually 5 r's in strawberry", doesn't mean there's any thought behind it.

crazygringo OP
I mean, you can literally watch their thought process. They try to figure out reasons why something went wrong, and then identify solutions, often in ways that require real deduction and creativity. And they have quite a high success rate.

If that's not thinking, then I don't know what is.

balder1991
No, because an agent doesn't learn; it's just continuing a story. A kid will learn from the experience and will be a different person at the end.

CaptainOfCoit
You just haven't added the right tools together with the right system/developer prompt. Add an `add_memory` and a `list_memory` tool (or automatically inject the right memories for the right prompts/LLM responses) and you have something that can learn.
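
A minimal sketch of what those two tools could look like; the function names and the JSON-file store are illustrative, not any particular framework's API:

```python
import json
from pathlib import Path

MEMORY_FILE = Path("agent_memory.json")

def _load() -> list[str]:
    if MEMORY_FILE.exists():
        return json.loads(MEMORY_FILE.read_text())
    return []

def add_memory(note: str) -> str:
    """Tool the model can call to persist a lesson across sessions."""
    memories = _load()
    memories.append(note)
    MEMORY_FILE.write_text(json.dumps(memories, indent=2))
    return f"Stored memory #{len(memories)}"

def list_memory() -> list[str]:
    """Tool the model can call to recall previously stored lessons."""
    return _load()

def memory_preamble() -> str:
    """The injection variant: prepend stored lessons to the system prompt."""
    notes = list_memory()
    if not notes:
        return ""
    return "Lessons from past sessions:\n" + "\n".join(f"- {n}" for n in notes)
```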

You can also take it a step further and add automatic fine-tuning once you start gathering a ton of data, which will rewire the model somewhat.
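
A sketch of the gathering step, assuming the chat-style "messages" JSONL shape that most fine-tuning pipelines accept; the field names would need adapting to whatever trainer is actually used:

```python
import json

def log_correction(prompt: str, bad: str, good: str,
                   path: str = "finetune_data.jsonl") -> None:
    """Append one (prompt, corrected answer) pair as a fine-tuning example."""
    example = {
        "messages": [
            {"role": "user", "content": prompt},
            {"role": "assistant", "content": good},  # train on the fix, not the failure
        ],
        "metadata": {"rejected": bad},  # kept around for preference-tuning methods
    }
    with open(path, "a") as f:
        f.write(json.dumps(example) + "\n")
```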

haskellshill
Perhaps it can improve, but it can't learn, because that requires thought. Would you say that a PID regulator can "learn"?
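
For concreteness, a toy PID regulator: it corrects its error on every step, yet its gains are fixed and it keeps no account of why it was wrong.

```python
def pid_step(error, state, kp=1.0, ki=0.1, kd=0.05):
    """One update of a PID loop: react to error, without any model of its cause."""
    integral, prev_error = state
    integral += error
    output = kp * error + ki * integral + kd * (error - prev_error)
    return output, (integral, error)

state = (0.0, 0.0)
setpoint, measurement = 10.0, 0.0
for _ in range(20):
    control, state = pid_step(setpoint - measurement, state)
    measurement += 0.5 * control  # toy plant: output drifts toward the setpoint
```
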
CaptainOfCoit
I guess it depends on what you understand "learn" to mean.

But in my mind, if I tell the LLM to do something, it does it wrong, I ask it to fix it, and then in the future I ask for the same thing and it avoids the mistake it made the first time, I'd say it has learned to avoid that pitfall. I know very well it hasn't "learned" like a human would, since I just added the correction to the right place, but for all intents and purposes, it "learned" how to avoid the same mistake.
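
A toy sketch of "added it to the right place": the correction is stored once and then rides along in the system prompt of every later request. `call_model` is a stand-in for a real chat API, here stubbed to echo its context:

```python
corrections: list[str] = []

def call_model(system: str, user: str) -> str:
    # Stand-in for a real chat-completion call; echoes the context it
    # received so the effect of the injected corrections is visible.
    return f"[system]\n{system}\n[user] {user}"

def ask(prompt: str) -> str:
    # Every stored correction is prepended to every subsequent request.
    system = "You are a coding assistant.\n" + "\n".join(
        f"Past correction: {c}" for c in corrections
    )
    return call_model(system, prompt)

# After the user fixes a mistake once:
corrections.append("Use pathlib, not os.path, in this codebase.")
print(ask("Write a function that reads config.toml"))
```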
