
Can someone explain to me in simple terms what an agent is?

Is for example Google’s crawl bot an agent?

Is there a prominent successful agent that I could test myself?

So many questions…


jeroenhd
An agent, as far as I've seen people use the term, is a script that adds some text to your prompt, monitors the LLM's output for a specific pattern, and executes code when it encounters that pattern.

For instance, you could have an "agent" that can read/edit files on your computer by adding something like "to read a file, issue `read_file $path`" to your prompt. Whenever a line of LLM output that starts with `read_file` is finished, the script running on your computer will read that file, paste it into the prompt, and let the LLM continue its autocomplete-on-steroids.
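
A minimal sketch of that monitoring script in Python. The `read_file <path>` line format comes from the example prompt above; the function name `handle_output_line` and the error-message format are assumptions made for illustration, and a real setup would receive lines streamed from an actual LLM API rather than plain strings.

```python
import re
from pathlib import Path

# Hypothetical convention: a finished line of model output of the form "read_file <path>".
READ_FILE = re.compile(r"^read_file\s+(?P<path>\S+)\s*$")


def handle_output_line(line: str, prompt: str) -> str:
    """Append a finished output line to the prompt.

    If the line is a read_file command, the file contents (or an error
    note) are pasted in after it so the model can keep going from there.
    """
    prompt += line + "\n"
    match = READ_FILE.match(line)
    if match:
        path = Path(match.group("path"))
        try:
            contents = path.read_text(encoding="utf-8")
        except OSError as exc:
            # Report the failure to the model instead of crashing the script.
            contents = f"[error reading {path}: {exc}]"
        prompt += contents + "\n"
    return prompt
```

Lines that do not match the pattern are simply appended, so ordinary model output passes through untouched.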

If you write enough tools and a complicated enough prompt, you end up with an LLM that can do stuff. By default, smart tools usually require user confirmation before actually doing stuff, but if you run the LLM in full agent mode, you trust the LLM not to do anything it shouldn't. curl2bash with LLMs, basically.
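
The confirmation step could look roughly like this. It is only a sketch: `run_tool`, `auto_approve`, and `ask` are hypothetical names, not taken from any particular tool, and `auto_approve=True` stands in for what the paragraph above calls full agent mode.

```python
from typing import Callable


def run_tool(
    name: str,
    action: Callable[[], str],
    auto_approve: bool = False,
    ask: Callable[[str], str] = input,
) -> str:
    """Run `action` only after the user approves it.

    With auto_approve=True the question is skipped entirely, which is
    the "trust the LLM" mode. `ask` defaults to the built-in input().
    """
    if not auto_approve:
        answer = ask(f"Allow tool '{name}' to run? [y/N] ")
        if answer.strip().lower() != "y":
            # The refusal is returned as text so it can be fed back to the model.
            return f"[tool '{name}' was denied by the user]"
    return action()
```

Anything other than an explicit "y" counts as a refusal, so the safe choice is the default.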

An LLM with significant training and access to files, HTTP(S) APIs, and some OS APIs can do a lot of work for you if you prompt it right. My experience with Claude/Copilot/etc. is that 75% of the time, the LLM will fail to do what it should be doing unless I manually repair its mistakes, but the other 25% of the time it does look rather sci-fi-ish.

With some tools you can tell your computer "take this directory, examine the EXIF data of each image, map the coordinates to the country and nearest town the picture was taken in, then make directories for each town and move the pictures to their corresponding locations". The LLM will type out shell commands (`ls /some/directory`), interpret the results as part of the prompt response that your computer sends back, and repeat that until its task has been completed. If you prepare a specific prompt and set of tools for the purpose of managing files, you could call that a "file management agent".

Generally, this works best for things you could do by hand in a couple of minutes, or maybe an hour if it's a big set of images, but that the computer can now probably take care of for you. That said, you're basically spending enough CO2 to drive to the store and back, so until we get more energy-efficient data centers I'm not too fond of using these tools for banal interactions like that.

dragonwriter
An agent (in the way the term is commonly currently used around LLMs) is a combination of LLM, external tools, and management framework such that the total system can make (and make use of the results of) one or more tool calls at the LLM's direction, without intervening user interaction, to serve user needs. (Usually, in practice, this takes place in between the request and response in what is otherwise a typical chatbot-style interaction, though there are other possibilities.)
bognition
Think of an agent as a standalone script or service. Each one has a single function: take inputs and create outputs.

You can chain agents together into a string to accomplish larger tasks.

Think of everything involved in booking travel. You have to set a budget, pick dates, choose a destination, etc. Each step can be defined as an agent, and then you chain them together into a tool that handles the entire task for you.
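
As a rough sketch of that chaining idea in Python: each step below is a plain function standing in for one "agent", and `chain` wires them together. All function names and the hard-coded budget, dates, and destination are placeholders invented for illustration; real steps would presumably call an LLM or a booking API.

```python
from functools import reduce
from typing import Callable

# In this sketch a step takes the trip-so-far and returns an updated copy.
Step = Callable[[dict], dict]


def set_budget(trip: dict) -> dict:
    return {**trip, "budget": 1500}  # placeholder value


def pick_dates(trip: dict) -> dict:
    return {**trip, "dates": ("2025-06-01", "2025-06-08")}  # placeholder value


def choose_destination(trip: dict) -> dict:
    # Later steps can depend on what earlier steps produced.
    destination = "Lisbon" if trip["budget"] >= 1000 else "somewhere nearby"
    return {**trip, "destination": destination}


def chain(*steps: Step) -> Step:
    """Compose steps left to right into a single callable."""
    return lambda trip: reduce(lambda acc, step: step(acc), steps, trip)


plan_trip = chain(set_budget, pick_dates, choose_destination)
print(plan_trip({}))
```

Note that the order of the steps is fixed by whoever writes the chain, not chosen by a model.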

LPisGood
The way everyone is using the term lately is to refer to an LLM that can use one or more tools: calculators, search engines, etc.
koakuma-chan
An agent in this context is simply an LLM that has tools.
XenophileJKO
There is an additional component. The LLM needs to determine when to use a tool and be capable of using more than one tool instance per logical task.
diggan
That's pretty much implicit when someone says "LLM that has tools" (what they mean between the lines is "An LLM that has been trained to do tool calling, and is used with a runner that can parse whatever tool calling/response format the model is trained for"). What would they refer to otherwise? Just that there is a list of tools but the LLM isn't even considering using them, or can only use one?
XenophileJKO
Certainly, for example I have created products that use tools, but in a workflow. It is common to give an LLM a tool or a few tools and make calling one of the tools the primary task of the prompt.

Arranging these in a workflow to automate processes is common, but not agentic.
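
To make the distinction concrete, here is a small Python sketch. The tools, the `decide` callback (standing in for the LLM), and the step limit are all hypothetical; the point is only who controls the sequence of tool calls.

```python
from typing import Callable, Optional, Tuple


def lookup_order(order_id: str) -> str:
    return f"order {order_id}: shipped"  # stubbed result for illustration


def draft_reply(status: str) -> str:
    return f"Hello, your {status}."


TOOLS = {"lookup_order": lookup_order, "draft_reply": draft_reply}

# decide(task, results_so_far) returns (tool_name, argument), or None to stop.
Decide = Callable[[str, list], Optional[Tuple[str, str]]]


def workflow(order_id: str) -> str:
    """Workflow: the developer fixes which tools run and in what order."""
    return draft_reply(lookup_order(order_id))


def agentic(task: str, decide: Decide, max_steps: int = 10) -> list:
    """Agentic: the model picks each tool call and decides when it is done."""
    results: list = []
    for _ in range(max_steps):
        choice = decide(task, results)
        if choice is None:
            break
        name, arg = choice
        results.append(TOOLS[name](arg))
    return results
```

Both versions use the same tools; only the second lets the model make more than one call per task on its own initiative.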

thrance
A marketing buzzword for when you have multiple prompts.
IncreasePosts

    prompt = user_input()
    while prompt != "exit":
      prompt = replace_tool_calls_with_results(call_llm(prompt))
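
A runnable take on the same loop, with stubs filled in. The `[[tool: argument]]` call syntax, the two toy tools, the step limit, and the scripted `fake_llm` are all assumptions for the sake of the sketch; `call_llm` is passed in so a real model client could be swapped in.

```python
import re
from typing import Callable

# Hypothetical tool-call syntax for this sketch: [[tool_name: argument]]
TOOL_CALL = re.compile(r"\[\[(\w+):\s*(.*?)\]\]")

TOOLS: dict[str, Callable[[str], str]] = {
    "upper": str.upper,
    "length": lambda text: str(len(text)),
}


def replace_tool_calls_with_results(text: str) -> str:
    """Replace each [[tool: arg]] in the text with that tool's output."""
    def run(match: re.Match) -> str:
        name, arg = match.group(1), match.group(2)
        tool = TOOLS.get(name)
        return tool(arg) if tool else f"[unknown tool {name}]"
    return TOOL_CALL.sub(run, text)


def agent_loop(prompt: str, call_llm: Callable[[str], str], max_steps: int = 10) -> list[str]:
    """The loop above, plus a step limit and a transcript of every prompt."""
    transcript = [prompt]
    for _ in range(max_steps):
        if prompt == "exit":
            break
        prompt = replace_tool_calls_with_results(call_llm(prompt))
        transcript.append(prompt)
    return transcript


def fake_llm(prompt: str) -> str:
    # Scripted stand-in for a model: one tool call, then stop.
    script = {"shout hello": "[[upper: hello]]", "HELLO": "exit"}
    return script.get(prompt, "exit")


print(agent_loop("shout hello", fake_llm))  # ['shout hello', 'HELLO', 'exit']
```
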
adastra22
Claude Code / Cursor / Windsurf are agents. LLMs with tools.
beacon294
An agent is a while loop.
QuadmasterXLII
Agent originally meant an ai that made decisions to optimize some utility function. This was seen as a problem: we don’t know how to pick a good utility function or even how to point an ai at a specific utility function, so any agent that was smarter than us was as likely as not to turn us all into paperclips, or carve smiles into our faces, or some other grim outcome.

With LLMs, this went through two phases of shittifaction: first, there was a window where the safety people were hopeful about LLMs because they weren’t agents, so everyone and their mother declared that they would create an agent out of an LLM explicitly because they heard it was dangerous.

This pleased the VCs.

Second, they failed to satisfy the original definition, so they changed the definition of agent to the thing that they made and declared victory. This pleased the VCs.

adastra22
"Agent" is a word with meaning that predates the LessWrong crowd. It is just an AI tool that performs actions to achieve its goal. That is all.
QuadmasterXLII
It had a meaning that predated the LessWrong crowd, but the LessWrong meaning had taken over pretty completely as of the GPT-4 paper, only to get swamped again by the new "agentic is good actually" wave. From the GPT-4 paper:

""" 2.9 Potential for Risky Emergent Behaviors Novel capabilities often emerge in more powerful models.[61, 62] Some that are particularly concerning are the ability to create and act on long-term plans,[63] to accrue power and resources (“power- seeking”),[64] and to exhibit behavior that is increasingly “agentic.”[65] Agentic in this context does not intend to humanize language models or refer to sentience but rather refers to systems characterized by ability to, e.g., accomplish goals which may not have been concretely specified and 54 which have not appeared in training; focus on achieving specific, quantifiable objectives; and do long-term planning. Some evidence already exists of such emergent behavior in models.[66, 67, 65] For most possible objectives, the best plans involve auxiliary power-seeking actions because this is inherently useful for furthering the objectives and avoiding changes or threats to them.19[68, 69] More specifically, power-seeking is optimal for most reward functions and many types of agents;[70, 71, 72] and there is evidence that existing models can identify power-seeking as an instrumentally useful strategy.[29] We are thus particularly interested in evaluating power-seeking behavior due to the high risks it could present.[73, 74]"""

adastra22
Maybe in some communities? Agent has been a standard term of art in computer science (even outside of AI) for half a century.
seba_dos1
Who remembers what Microsoft Agent was?

(many probably know it, but not necessarily under this name)

_Algernon_
This isn't strictly speaking true. An agent is merely something that acts (on its environment). Simple reflex agents (e.g. a simple robot vacuum with only reflexive collision detection) are also agents, though they don't, strictly speaking, attempt to maximize a utility function.

Ref: Artificial Intelligence - A Modern Approach.
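
For illustration, a simple reflex agent of the robot-vacuum kind might be sketched like this in Python. The percept keys and action names are made up for the example; what matters is that the action depends only on the current percept, with no memory and no utility function being maximized.

```python
def reflex_vacuum(percept: dict) -> str:
    """Map the current percept straight to an action via fixed rules."""
    if percept.get("bumped"):
        return "turn"      # reflexive collision response
    if percept.get("dirty"):
        return "suck"
    return "forward"       # default when no rule fires


# The agent acts on its environment one percept at a time.
for percept in ({"dirty": True}, {"bumped": True}, {}):
    print(percept, "->", reflex_vacuum(percept))
```
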

baxtr OP
Thanks to your comment I came across this article, which I think explains agents quite well. Some differences seem artificial, but it gets the point across.

Were you thinking along these lines?

https://medium.com/@tahirbalarabe2/five-types-of-ai-agents-e...

_Algernon_
Yes. This is in essence the same taxonomy used in A Modern Approach.
QuadmasterXLII
"Agent" in the context of LLMs has always been pretty closely intertwined with advertising how dangerous they are (exciting!), as opposed to connecting to earlier research on reflexes. The first viral LLM agent, AutoGPT, had the breathless " (skull and crossbones emoji) Continuous Mode Run the AI without user authorisation, 100% automated. Continuous mode is not recommended. It is potentially dangerous and may cause your AI to run forever or carry out actions you would not usually authorise. Use at your own risk. (Warning emoji)" in its readme within a week of going live, and was forked into ChaosGPT a week later with the explicit goal of going rogue and killing everyone
_Algernon_
I'm responding to this claim:

>Agent originally meant an ai that made decisions to optimize some utility function.

That's not what agents originally referred to, and I don't understand how your circling back to LLMs is relevant to the original definition of agent?

Mordisquitos
In other words, VC-backed tech companies decided to weaken the definition of 'Torment Nexus' after they failed to create the Torment Nexus inspired by the classic sci-fi novel 'Don't Create the Torment Nexus'.
