A few ideas we've been thinking about: integrating a small LLM, building an MCP store into the browser, building a more AI-friendly DOM, etc.
Even today, we use Chrome's accessibility tree (a better representation of the DOM for LLMs), which is not exposed via the Chrome extension APIs.
You might consider the accessibility tree and its semantics. Plain divs are basically filtered out, so you're left with interactive objects and some structural/layout cues.
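To make the "plain divs are filtered out" idea concrete, here is a minimal sketch of that kind of pruning. The `Node` class, the `GENERIC_ROLES` set, and the `prune`/`render` helpers are all made up for illustration; this is not Chrome's actual filtering logic, just the general shape of it.

```python
from dataclasses import dataclass, field

# Roles treated as layout noise in this sketch.
# Assumption: a real accessibility tree uses a richer rule set than this.
GENERIC_ROLES = {"generic", "none", "presentation"}


@dataclass
class Node:
    """Hypothetical, simplified accessibility node."""
    role: str
    name: str = ""
    children: list["Node"] = field(default_factory=list)


def prune(node: Node) -> list[Node]:
    """Drop unnamed generic wrappers and hoist their children up a level."""
    kept = [c for child in node.children for c in prune(child)]
    if node.role in GENERIC_ROLES and not node.name:
        return kept  # the wrapper disappears, its useful descendants survive
    return [Node(node.role, node.name, kept)]


def render(nodes: list[Node], depth: int = 0) -> list[str]:
    """Flatten the pruned tree into indented lines suitable for a prompt."""
    lines = []
    for n in nodes:
        label = f"{n.role}: {n.name}" if n.name else n.role
        lines.append("  " * depth + label)
        lines.extend(render(n.children, depth + 1))
    return lines


# A div-heavy page: wrappers around one button and one nav link.
tree = Node("generic", children=[
    Node("generic", children=[Node("button", "Submit")]),
    Node("navigation", "Main", [
        Node("generic", children=[Node("link", "Home")]),
    ]),
])

print("\n".join(render(prune(tree))))
```

The wrappers vanish while the button, the navigation landmark, and the link keep their nesting, which is roughly what makes the result cheaper to feed to an LLM than raw DOM.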
The issue is that the Mac and Windows accessibility APIs are opaque and I have no idea what I'm doing, so I'm forced to vibe code it all, which is not turning out too well... :-)
I suffer from mild carpal tunnel, so I want to build a really low-latency computer-use agent that can do anything on my computer without me having to learn the Talon voice syntax or some other traditional accessibility software like Mac dictation.
Chrome has a built-in LLM: https://developer.chrome.com/docs/ai/built-in
A good place to start is to think about a concrete chore, for example copying and pasting info from 100 websites into a spreadsheet.
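For a sense of what that chore looks like when scripted by hand, here is a rough sketch that pulls one field (the page title) out of each page and writes a CSV. The `extract_title` and `pages_to_csv` names and the sample pages are made up for illustration, and the HTML is passed in as strings so the example runs offline; a real version would fetch each URL and would probably need per-site extraction logic.

```python
import csv
import io
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text inside the first-seen <title> element(s)."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def extract_title(html: str) -> str:
    parser = TitleParser()
    parser.feed(html)
    return parser.title.strip()


def pages_to_csv(pages: dict[str, str]) -> str:
    """Turn {url: html} into CSV text with one row per page."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["url", "title"])
    for url, html in pages.items():
        writer.writerow([url, extract_title(html)])
    return buf.getvalue()


# Hypothetical sample input standing in for the 100 fetched pages.
sample_pages = {
    "https://example.com/a": "<html><head><title> Page A </title></head></html>",
    "https://example.com/b": "<html><head><title>Page B</title></head></html>",
}

print(pages_to_csv(sample_pages))
```

The tedious part in practice is that every site needs its own extraction rule, which is the bit an agent would be taking over.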
These fancy words carry an intellectual/productive effect. When they're put to use it probably makes people feel like they're getting things done. And they never feel lazy because of this.
Does having access to Chromium internals give you any superpowers over connecting via the Chrome DevTools Protocol?