
I can't wait for something like this to be built.

People have tons of workflows that involve a lot of clicking and typing in response to data, and that are too difficult or too one-off to automate with fragile macros.

But if my computer can quickly realize that I'm deleting every odd-numbered page of a PDF, or renaming every file to add a prefix, or following each link on a website and saving an image... and then just instantly automate the next 100 times... that's going to be huge!


> But if my computer can quickly realize that I'm deleting every odd-numbered page of a PDF, or renaming every file to add a prefix, or following each link on a website and saving an image... and then just instantly automate the next 100 times... that's going to be huge!

The first two tasks could be easily done by asking ChatGPT to write a script for you. Scraping a website can be a bit more tricky. Still, I don't see why you have to rely on "computer use" for these tasks -- there are much more efficient and reliable approaches to the tasks.
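
For the file-renaming case, the kind of script being described might look roughly like this. It is only a sketch: the function name `add_prefix` and the choices to skip subdirectories, skip files that already carry the prefix, and never overwrite an existing file are my own assumptions, not something from the thread.

```python
from pathlib import Path


def add_prefix(directory: str, prefix: str) -> list:
    """Rename every regular file in `directory` so its name starts with `prefix`.

    Returns the new file names, in the order they were renamed.
    """
    renamed = []
    # sorted() materializes the listing first, so renames do not disturb iteration.
    for path in sorted(Path(directory).iterdir()):
        # Assumption: leave subdirectories and already-prefixed files alone.
        if not path.is_file() or path.name.startswith(prefix):
            continue
        target = path.with_name(prefix + path.name)
        # Assumption: never overwrite a file that already exists.
        if target.exists():
            continue
        path.rename(target)
        renamed.append(target.name)
    return renamed
```

Calling `add_prefix("some_folder", "2024_")` would turn `a.txt` into `2024_a.txt`, and so on. The folder name and prefix here are placeholders.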

Those are just simple examples. Most of the clicking I do on my computer doesn't have a command-line equivalent. Nor do I want to have to type out a request to ChatGPT, even if there is one.

There's a gigantic area of productivity improvement around repetitive actions that aren't easily scriptable, or for which no scripting interface exists, but where an AI assistant that interfaces with your screen, pointer and keyboard would be a huge help.

Automation is most needed by non-technical folks. For them, having to “write a script” is a huge hurdle.

Check out AutoGPT. You don't have to wait; it's already built.

https://github.com/Significant-Gravitas/AutoGPT

I don't think that's what I'm describing.

That's about manually setting up agents that run on a server and that seem to interact largely with the web (judging from the examples).

I'm talking about not manually setting up anything -- I'm talking about an AI that simply observes the repetitive actions you're taking on your computer, infers patterns from them, and then offers to take over and finish the job.
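
To make the "observe, infer, offer to take over" idea concrete, here is a toy sketch for the rename-with-prefix case only. Everything in it is an assumption for illustration: the names `infer_prefix` and `propose_renames`, the idea that the assistant sees each action as an (old name, new name) pair, and the threshold of three consistent examples before it offers anything. A real assistant watching the screen would need far more than this.

```python
from typing import List, Optional, Tuple


def infer_prefix(observed: List[Tuple[str, str]], min_examples: int = 3) -> Optional[str]:
    """Return the prefix the user keeps adding, or None if no clear pattern exists."""
    # Assumption: do not guess from too few examples.
    if len(observed) < min_examples:
        return None
    prefixes = set()
    for old, new in observed:
        # The action only fits the pattern if the new name is "<something>" + old name.
        if new == old or not new.endswith(old):
            return None
        prefixes.add(new[: len(new) - len(old)])
    # Every observed rename must have added the same prefix.
    return prefixes.pop() if len(prefixes) == 1 else None


def propose_renames(observed: List[Tuple[str, str]], remaining: List[str]) -> List[Tuple[str, str]]:
    """Suggest renames for the remaining files; empty list if no pattern was inferred."""
    prefix = infer_prefix(observed)
    if prefix is None:
        return []
    return [(name, prefix + name) for name in remaining]
```

The point of returning proposals instead of renaming directly is that the assistant offers to finish the job and the user still confirms.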

There was something like this on Macs in the mid-1990s that would watch you work and suggest timed automations. There was a bit of a developer panic the first time it told someone "I see you launch <popular desktop game> around 4:30pm every Friday, would you like to do that automatically?"

As a start, I want to see if the agent can figure out how I play a clicker-style game such as AdVenture Capitalist on the computer. I think I have a certain style of playing. I still don't understand how an agent could figure out valid gameplay (earth, moon, mars, events), much less play the game in my own style.

I think we should start with something simple, repeatable, and unlikely to do much harm if/when things go wrong.

Edit: repetitive -> repeatable

Does this do "Computer Use", in that it looks at the screen and controls the mouse and keyboard (e.g. the way Anthropic's computer use does)?

AutoGPT is an unserious project.
