agentic AI dev tooling and workflows.
CI for your AI agent team: https://agent-ci.com
reach out: root@a10k.co
- Social movements don't need to be quantifiably better to take off.
When the relevant audience is bored enough to be open to something new, it only takes a few influential people to tip the scales.
People don't want to be truly revolutionary; that takes actual risk. They want the appearance of being revolutionary with minimal downside and social reassurance.
(w/r/t GitHub there's already enough buzz in the right circles and it will likely happen this year.)
- IDK all of my personal and professional projects involve pushing the SOTA to the absolute limit. Using anything other than the latest OpenAI or Anthropic model is out of the question.
Smaller open source models are a bit like 3d printing in the early days; fun to experiment with but really not that valuable for anything other than making toys.
Text summarization, maybe? But even then I want a model that understands the complete context and does a good job. Even things like "generate one sentence about the action we're performing" I usually find I can just incorporate it into the output schema of a larger request instead of making a separate request to a smaller model.
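
A minimal sketch of that last point, assuming a generic structured-output setup: the one-sentence description rides along as an extra field in the larger request's output schema instead of being a second call to a smaller model. The field names (`result`, `action_summary`) and the `parse_response` helper are hypothetical, and no particular provider's API is implied.

```python
import json

# Hypothetical output schema for the single, larger request.
# "action_summary" is the piece that would otherwise be a separate small-model call.
RESPONSE_SCHEMA = {
    "type": "object",
    "properties": {
        "result": {
            "type": "string",
            "description": "The main output of the request.",
        },
        "action_summary": {
            "type": "string",
            "description": "One sentence about the action we're performing.",
        },
    },
    "required": ["result", "action_summary"],
    "additionalProperties": False,
}


def parse_response(raw: str) -> tuple[str, str]:
    """Parse the model's JSON reply and return (result, action_summary)."""
    data = json.loads(raw)
    missing = [key for key in RESPONSE_SCHEMA["required"] if key not in data]
    if missing:
        raise ValueError(f"response is missing fields: {missing}")
    return data["result"], data["action_summary"]
```

The schema would be passed along with the main prompt using whatever structured-output mechanism your provider offers; the summary then costs a few extra output tokens rather than another round trip.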
- We're not yet to the point where a single PCIe device will get you anything meaningful; IMO 128 GB of RAM available to the GPU is essential.
So while you don't need a ton of compute on the CPU, you do need the ability to address multiple PCIe lanes. A relatively low-spec AMD EPYC processor is fine if the motherboard exposes enough lanes.
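
The lane arithmetic behind that can be sketched roughly as follows. The per-card memory sizes and the x16 link width are illustrative assumptions, not recommendations; plug in the numbers for your own cards and board.

```python
import math


def gpus_needed(target_gb: int, vram_per_gpu_gb: int) -> int:
    """Number of cards required to reach the target amount of GPU memory."""
    return math.ceil(target_gb / vram_per_gpu_gb)


def lanes_needed(gpu_count: int, lanes_per_gpu: int = 16) -> int:
    """Total PCIe lanes if every card gets a full-width link (assumed x16)."""
    return gpu_count * lanes_per_gpu


# Assumed example: 24 GB cards to reach 128 GB.
cards = gpus_needed(128, 24)   # 6 cards
lanes = lanes_needed(cards)    # 96 lanes
print(cards, lanes)
```

Compare the result against the lane count your CPU and motherboard actually expose; that number, not core count, is the constraint being described.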
- It might seem minor, but the little things add up. Making your dev environment mirror prod from the start will save you a bunch of headaches. Then, when you're ready to deploy, there is nothing to change.
Even better, stage to a production-like environment early, and then deploy day can be as simple as a DNS record change.
- Wait, if I am providing essential data to your service, why am I paying you?
Perfect opportunity to run a project that benefits its users (monetarily), if you only did the legwork to market that value to map consumers. And, as a consumer, you don't need the sophisticated hardware, anyway.
- That's definitely a possible future abstraction and one area of the future of technology I'm excited about.
First we get to tackle all of the small ideas and side projects we haven't had time to prioritize.
Then, we start taking ownership of all of the software systems that we interact with on a daily basis; hacking in modifications and reverse engineering protocols to suit our needs.
Finally our own interaction with software becomes entirely boutique: operating systems, firmware, user interfaces that we have directed ourselves to suit our individual tastes.
- I started using Django before the official 1.0 release and used it almost exclusively for years on web projects.
Lately I prefer to mix my own tooling with a couple of major packages for backends (FastAPI, SQLAlchemy); the result is still heavily inspired by patterns I picked up while using Django. I end up with a little more boilerplate, but I also end up with a little more stylistic flexibility.
- Rather than having multiple agents running inside of one IDE window, I structure my codebase in a way that is somewhat siloed to facilitate development by multiple agents. This is an obvious and common pattern when you have a front-end and a back-end. Super easy to just open up those directories of the repository in separate environments and have them work in their own siloed space.
Then I take it a step further and create core libraries that are structured like standalone packages and are architected like third-party libraries with their own documentation and public API, which gives clear boundaries of responsibility.
Then the only somewhat manual step you have is to copy/paste the agent's notes of the changes that they made so that dependent systems can integrate them.
I find this to be way more sustainable than spawning multiple agents on a single codebase and then having to rectify merge conflicts between them as each task is completed; it's not unlike traditional software development where a branch that needs review contains some general functionality that would be beneficial to another branch and then you're left either cherry-picking a commit, sharing it between PRs, or lumping your PRs together.
Depending on the project I might have 6-10 IDE sessions. Each agent then has its own history, and anything to do with running test harnesses or CLI interactions gets managed on that instance as well.
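
A rough sketch of the one manual step, working out which sessions need an agent's change notes. The package names and the `dependents` map are hypothetical; the actual copy/paste into each IDE session stays manual.

```python
def notes_to_forward(
    dependents: dict[str, list[str]], changed: list[str]
) -> dict[str, list[str]]:
    """Map each dependent package to the changed packages whose notes it needs."""
    inbox: dict[str, list[str]] = {}
    for package in changed:
        for dependent in dependents.get(package, []):
            inbox.setdefault(dependent, []).append(package)
    return inbox


# Hypothetical layout: two core libraries consumed by a back-end and a front-end.
dependents = {
    "core-auth": ["backend"],
    "core-models": ["backend", "frontend"],
}

print(notes_to_forward(dependents, ["core-models"]))
```

Because each core library has its own public API and documentation, the notes only need to describe changes at that boundary, which keeps what you paste into the dependent sessions short.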
- > Chinese brands
There is far more to the logistics and adoption of this than "Tesla failed to capture the region", as the article's title alludes to.
Bribery, government corruption, risky loans, undercutting. It is well documented in the case of large infrastructure projects and the same playbook will be revealed in time.
- Cursor may have had incredible growth, but so many companies are getting into these early and obvious products to enhance developer productivity, I don't think they're going to be dominant (independently at least) for much longer.
Google is a direct competitor now, every major model company has an agentic coder, tons of people are putting out small enhancements and useful tools to augment all of these.
In terms of creating a viable business, I would position (and have positioned) myself a step or two away from these obvious solutions. Further out in the ecosystem there's a ton of nuance around specific use cases, programming languages, development and deployment environments, all of which will be revolutionized (again) in the years to come.
- You're talking about technology that's only become realistic in the last couple years. Even then, there's probably nothing off-the-shelf that would serve the current need.
LAPD has been patrolling with helicopters for decades. I have yet to see a drone follow a car in high speed pursuit down the 5 at 100+ MPH.
- Hetzner certainly has this cult-like following mostly because of their low cost.
I assume it is a recent push toward this kind of open-frame, super minimalist, consumer-hardware-based system (I don't speak German and didn't translate the video).
It looks like they're using lots of consumer hardware and very little redundancy; you'll notice that the power supplies are generic ATX units and they're not doubled up. And then they're also running the onboard networking with a second connection which looks like it's for just a management system. Might not even be 10 gigabit networking.
It's interesting that in an era where almost all of the major players are moving toward cable-free arrangements, i.e. backplanes with fully integrated power and networking, they're instead opting for the rat's nest of cabling. It must have something to do with lower labor costs vs hardware costs. The density that they are achieving with those systems is also incredibly low, relatively speaking.
- As someone who's used Claude Code daily since the day it was released, the sentiment back then (sooo many months ago) was that the Agent CI coding TUIs were kind of experimental proof-of-concepts. We have seen them be incredibly effective and the CC team has continued to add features.
Tech debt isn't something that even experienced large teams are immune to. I'm not a huge TypeScript fan, so their choice to run their app on Node felt to me like a trade-off: development speed, given the experience that the team had, at the expense of long-term growth and performance. I regularly experience pretty intense flickering and rendering issues, high CPU usage, and even crashes, but that doesn't stop me from finding the product incredibly useful.
Developing good software, especially in a format that is relatively revolutionary, takes time to get right, and I'm sure whatever efforts they have internally to push forward a refactor will be worth it. But, just like in any software development, refactors are prone to timeline slips and scope creep. A company having tons of money doesn't change the nature of problem-solving in software development.
- This is essentially a solved problem. Whenever someone sends me a screenshot that contains any text information (tables, etc), I pass it to an LLM and it correctly interprets the content of it. On modern versions of macOS you can just select text in images relatively painlessly, too.
Linux desktop users will get there one day.
And then goes on to recommend AI Studio as a primary dev tool?! Baffling.