Preferences

It's the latest tech holy war. Tabs vs Spaces but more existential. I'm usually anti-hype, and I've been convinced of AI's usefulness over and over when it comes to coding. And whenever I talk about it, I see that I come across as an evangelist. Some people appreciate that, but online I get a lot of pushback despite having tangible examples of how it has been useful.

I don't see it that way. Tabs, spaces, curly brace placement, Vim, Emacs, VSCode, etc are largely aesthetic choices with some marginal unproven cognitive implications.

I find people mostly prefer what they are used to, and if your preference was so superior then how could so many people build fantastic software using the method you don't like?

AI isn't like that. AI is a bunch of people telling me this product can do wonderful things that will change society and replace workers, yet almost every time I use it, it falls far short of that promise. AI is certainly not reliable enough for me to jeopardize the quality of my work by using it heavily.

You can vibe-code a throwaway UI for investigating some complex data in less than 30 minutes. The code quality doesn't matter, and it will make your life much easier.

Rinse and repeat for many "one-off" tasks.

It's not going away, you need to learn how to use it. shrugs shoulders

The issue is people trying to use these AI tools to investigate complex data not the throwaway UI part.

I work as the non-software kind of engineer at an industrial plant. A trend is starting to emerge of people who blindly trust the output of AI chat sessions without understanding what the chatbot is echoing at them, which wastes their time and in some cases mine.

This is not new. In the past I have dealt with engineers who use (abuse) statistics/regression tools etc. without understanding what the output was telling them, but it is getting worse now.

It is not uncommon to hear something like: "Oh I investigated that problem and this particular issue we experienced was because of reasons x, y and z."

Then when you push back because what they've said sounds highly unlikely, it boils down to: "I don't know, that is what the AI told me".

Then if they are sufficiently optimistic they'll go back and prompt it with "please supply evidence for your conclusion" or some similar prompt and it will supply paragraphs of plausible sounding text but when you dig into what it is saying there are inconsistencies or made up citations. I've seen it say things that were straight up incorrect and went against Laws of Thermodynamics for example.

It has become the new "I threw the kitchen sink into a multivariate regression and X emerged as significant, therefore we should address X".

I'm not a complete skeptic. I think AI has some value, for example if you use it as a more powerful search engine by asking it something like "What are some suggested techniques for investigating x" or "What are the limitations of Method Y" etc. It can point you to the right place and assist you with research; it might find papers from other fields or similar. But it is not something you should be relying on to do all of the research for you.
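For anyone who hasn't run into the "kitchen sink" problem mentioned above, here is a minimal sketch of why it misleads. It is not a multivariate regression; it swaps in plain per-variable correlation screening, which shows the same multiple-comparisons effect with only the standard library. Every predictor is pure noise, yet some still clear the usual 5% significance bar. The function names, the sample sizes, and the threshold `r_crit=0.197` (roughly the two-sided 5% critical value of Pearson's r for 100 samples) are illustrative assumptions, not anything from the thread.

```python
import math
import random


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def count_spurious_hits(n_samples=100, n_predictors=400, r_crit=0.197, seed=0):
    """Count noise predictors that look 'significant' against a noise outcome.

    r_crit is an assumed, approximate 5% two-sided threshold for n_samples=100.
    """
    rng = random.Random(seed)
    outcome = [rng.gauss(0, 1) for _ in range(n_samples)]
    hits = 0
    for _ in range(n_predictors):
        # Each predictor is independent random noise, unrelated to the outcome.
        predictor = [rng.gauss(0, 1) for _ in range(n_samples)]
        if abs(pearson_r(predictor, outcome)) > r_crit:
            hits += 1
    return hits


if __name__ == "__main__":
    hits = count_spurious_hits()
    print(f"{hits} of 400 pure-noise predictors passed the 5% threshold")
```

With a 5% threshold you would expect somewhere around 20 of the 400 noise variables to "emerge as significant" by chance alone, which is the point: a hit from an unplanned screen is a hypothesis to check, not a root cause.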

But how do you know you're getting the correct picture from that throwaway UI? A little while back there was a blog post where the author praised AI for his vibe-coded earth-viewer app that used Vulkan to render inside a GUI window. Unfortunately, that wasn't the case: the AI had just copied code from somewhere and inserted rudimentary software rendering. The AI couldn't do what was asked because it had seldom been done. Nobody on the internet ever discussed that particular objective, so it wasn't in the training set.

The lesson to learn is that these are "large-language models." That means it can regurgitate what someone else has done before textually, but not actually create something novel. So it's fine if someone on the internet has posted or talked about a quick UI in whatever particular toolkit you're using to analyze data. But it'll throw out BS if you ask for something brand new. I suspect a lot of AI users are web developers who write a lot of repetitive rote boilerplate, and that's the kind of thing these LLMs really thrive with.

> But how do you know you're getting the correct picture from that throwaway UI?

You get the AI to generate code that lets you spot-check individual data points :-)

Most of my work these days is in fact that kind of code. I'm working on something research-y that requires a lot of visualization, and at this point I've actually produced more throwaway code than code in the project.

Here's an example: I had ChatGPT generate some relatively straightforward but cumbersome geometric code. Saved me 30 - 60 minutes right there, but to be sure, I had it generate tests, which all passed. Another 30 minutes saved.

I reviewed the code and the tests and felt it needed more edge cases, which I added manually. However, these started failing and it was really cumbersome to make sense of a bunch of coordinates in arrays.

So I had it generate code to visualize my test cases! That instantly showed me that some assertions in my manually added edge cases were incorrect, which became a quick fix.

The answer to "how do you trust AI" is human in the loop... AND MOAR AI!!! ;-)
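To make the "visualize the test cases" step above concrete, here is a minimal sketch of the idea. The thread doesn't say what the geometric code actually was, so this assumes a hypothetical point-in-polygon check (`point_in_polygon`, standard ray casting) and a hypothetical `render_case` helper that draws the polygon and the test points as ASCII art, so a wrong assertion is visible at a glance instead of buried in coordinate arrays.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def point_in_polygon(p: Point, poly: List[Point]) -> bool:
    """Ray casting: count how many edges a ray going right from p crosses."""
    x, y = p
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        # Only edges that straddle the horizontal line through p can be crossed.
        if (y1 > y) != (y2 > y):
            # x coordinate where this edge meets that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def render_case(poly: List[Point], points: List[Point],
                width: int = 12, height: int = 8) -> str:
    """Draw interior cells as '#', exterior as '.', and each test point as
    'I' (classified inside) or 'O' (classified outside)."""
    rows = []
    for gy in range(height - 1, -1, -1):  # top row is the largest y
        row = []
        for gx in range(width):
            centre = (gx + 0.5, gy + 0.5)
            row.append("#" if point_in_polygon(centre, poly) else ".")
        rows.append(row)
    for px, py in points:
        gx, gy = int(px), int(py)
        if 0 <= gx < width and 0 <= gy < height:
            mark = "I" if point_in_polygon((px, py), poly) else "O"
            rows[height - 1 - gy][gx] = mark
    return "\n".join("".join(r) for r in rows)


if __name__ == "__main__":
    square = [(2, 1), (9, 1), (9, 6), (2, 6)]
    print(render_case(square, [(5.5, 3.5), (10.5, 3.5)]))
```

If a hand-written edge case asserts that a point is inside but the picture shows an `O` sitting well outside the `#` region, the bug is in the assertion rather than the code, which is the kind of mistake described above.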

It’s kind of fun watching this comment go up and down :)

There’s so much evidence out there of people getting real value from the tools.

Some questions you can ask yourself are “why doesn’t it work for me?” and “what can I do differently?”.

Be curious, not dogmatic. Ignore the hype, find people doing real work.

They're good questions! The problem is that I've tried to talk to the people who are getting real value from it, and often the answer ends up being that the value is not as real as they think. One guy gave an excited presentation about how AI let him write 7k LOC per day, expounded for an entire session about how the rest of us should follow in his shoes, and then clarified only in Q&A that reviewers couldn't keep up so he exempted himself from code review.

I’m starting to believe there are situations where the human code review is genuinely not necessary. Here’s a concrete example of something that’s been blowing my mind. I have 25 years of professional coding experience but it’s almost all web, with a few years of iOS in the Objective-C era. I’m also an amateur electronic musician. A couple of weeks ago I was thinking about this plugin that I used to love until the company that made it went under. I’ve long considered trying to make a replacement but I don’t know the first thing about DSP or C++.

You know where this is going. I asked Claude if audio plugins were well represented in its training data, it said yes, off I went. I can’t review the code because I lack the expertise. It’s all C++ with a lot of math and the only math I’ve needed since college is addition and calculating percentages. However, I can have intelligent discussions about design and architecture and music UX. That’s been enough to get me a functional plugin that already does more in some respects than the original. I am (we are?) making it steadily more performant. It has only crashed twice and each time I just pasted the dump into Claude and it fixed the root cause.

Long story short: if you can verify the outcome, do you need to review the code? It helps that no one dies or gets underpaid if my audio plugin crashes. But still, you can’t tell me this isn’t remarkable. I think it’s clear there will be a massive proliferation of niche software.

Most enterprise software I use has serious defects. Professional CAD software for infrastructure is awful. Many are just incremental improvements piled upon software from the 1990s. Bugs last for decades because nobody can understand how the program works so they just work on one more little VBA plugin at a time. Meanwhile, the capabilities of these programs have fallen completely behind game studios with no budget and no business plan. Where are the results of this human excellence and code quality process? There are 10s of thousands of new CVEs every year from code hand crafted by artisans on their very own MacBooks. How? Perhaps there is the tiny possibility that maybe code quality is mostly an aesthetic judgment that nobody can really define, and just maybe this effort is mostly spent on vague concepts like maintainability or preferential decisions instead of the basics: does it meet the specification? Is the performance getting better or worse?

This is the game changer for me: I don’t have to evaluate tens or hundreds of market options that fit my problem. I tell the machine to solve it, and if it works, then I’m happy. If it doesn’t I throw it away. All in a few minutes and for a few cents. Code is going the way of the disposable diaper, and, if you ever washed a cloth diaper you will know, that’s a good thing.

Most people don't have a problem with using genai for stuff like throwaway UI's. That's not even remotely relevant to the criticisms. People reject having it forced down their throats by companies who are desperate to make us totally reliant on it to justify their insane investments. And people reject the evangelicals who claim that it's going to replace developers because it can spit out mostly working boilerplate.

I’m an AI skeptic. I like seeing what UIs it spits out, though, which defeats the blank page staring into my soul fear nicely. I don’t even use the code, just take inspiration from the layouts.

Yeah, it helps a lot to make first steps, to overcome writer's block, to make you put into words what you'd like to have built.

At some point you might take over, or ask it for specific refactors you'd do but are too lazy to do yourself. Or even toss it away entirely and start fresh with a better understanding, either yourself or again with an agent.

It's like watching somebody argue that code linting is going to change the face of the world and the rebuttals to the skeptics are arguing that akshually code linting is quite useful....

I have found value for one-off tasks. I forget the exact situation, but I wanted to do some data transformation, something that would normally take me a half hour of awk/sed/bash or python scripting. AI spit it out right away.

> You can vibe-code a throwaway UI for investigating some complex data in less than 30 minutes. The code quality doesn't matter, and it will make your life much easier.

I think the throwaway part is important here and people are missing it, particularly for non-programmers.

There's a lot of roles in the business world that would make great use of ephemeral little apps like this to do a specific task, then throw it away. Usually just running locally on someone's machine, or at most shared with a couple other folks in your department.

Code doesn't have to be good, hell it doesn't even have to be secure, and certainly doesn't need to look pretty. It just needs to work.

There's not enough engineering staff or time to turn every manager's pet excel sheet project into a temporary app, so LLMs make perfect sense here.

I'd go as far to say more effort should be put into ephemeral apps as a use case for LLMs over focusing on trying to use them in areas where a more permanent, high quality solution is needed.

Improve them for non-developers.

>You can vibe-code a throwaway UI

And then people create non-throwaway things with it and your job, performance report, bonus, and healthcare are tied to being compared to those people who just do what management says without arguing about the correct application of the tool.

If you keep your job, it's now tied to maintaining the garbage those coworkers checked in.

Perhaps. But does it matter? There are a million tools to investigate complex data already. Are you suggesting it is more useful to develop a new tool from scratch, using LLM-type tools, than it is to use a mature tool for data analysis?

If you don't know how to analyze data, and flat out refuse to invest in learning the skill, then I guess that could be really useful. Those users are likely the ones most enthusiastic about AI. But are those users close to as productive as someone who learns a mature tool? Not even close.

Lots of people appreciate an LLM to generate boilerplate code and establish frameworks for their data structures. But that's code that probably shouldn't be there in the first place. Vibe coding a game can be done impressively quickly, but have you tried using a game construction kit? That's much faster still.

Except when your AI psychosis PM / manager sees your throwaway vibe-coded garbage and demands it gets shipped to customers.

It's infinitely worse when your PM / manager vibe-codes some disgusting garbage, sees that it kind of looks like a real thing that solves about half of the requirements (badly) and demands engineers ship that and "fix the few remaining bugs later".

You should try telling management it’s throwaway

One thing people often don't realize or ignore: these LLMs are trained on the internet, the entire internet.

There's a shit-ton of bad and inefficient code on the internet. Lots of it. And it was used to train these LLMs as much as the good code.

In other words, the LLMs are great if you're OK with mediocrity at best. Mediocrity is occasionally good enough, but it can spell death for a company when key parts of it are mediocre.

I'm afraid a lot of the executives who fantasize about replacing humans with AI are going to have to learn this the hard way.

I would say it is like that. No one HAS to use AI. But the shared goal is to get a change to the codebase to achieve a desired outcome. Some will outsource a significant part of that to AI, some won't.

And it's tricky because I'm trying not to appeal to emotion despite being fascinated with how this tool has enabled me to do things in a short amount of time that would have taken me weeks of grinding to get to, and how it improves my communication with stakeholders. That feels world changing. Specifically my world and the day-to-day role I play when it comes to getting things done.

I think it is fine that it fell short of your expectations. It often does for me as well, but when it gets me 80% of the way there in less than a day's work, my mind is blown. It's an imperfect tool and I'm sorry for saying this but so are we. Treat its imperfections the same way you would a junior developer's: feedback, reframing, restrictions, and iteration.

> No one HAS to use AI.

Well… That's no longer true, is it?

My partner (IT analyst) works for a company owned by a multinational big corporation, and she got told during a meeting with her manager that use of AI is going to become mandatory next year. That's going to be a thing across the board.

And have you called a large company for any reason lately? Could be your telco provider, your bank, public transport company, whatever. You call them, because online contact means haggling with an AI chatbot first to finally give up and shunt you over to an actual person who can help, and contact forms and e-mail have been killed off. Calling is not exactly as bad, but step one nowadays is 'please describe what you're calling for', where some LLM will try to parse that, fail miserably, and then shunt you to an actual person.

AI is already unavoidable.

> My partner (IT analyst) works for a company owned by a multinational big corporation, and she got told during a meeting with her manager that use of AI is going to become mandatory next year. That's going to be a thing across the board.

My multinational big corporation employer has reporting about how much each employee uses AI, with a naughty list of employees who aren't meeting their quota of AI usage.

Nothing says "this product is useful" quite like forcing people to use it and punishing people who don't. If it was that good, there'd be organic demand to use it. People would be begging to use it, going around their boss's back to use it.

The fact that companies have to force you to use it with quotas and threats is damning.

> My multinational big corporation employer has reporting about how much each employee uses AI, with a naughty list of employees who aren't meeting their quota of AI usage.

“Why don’t you just make the minimum 37 pieces of flAIr?”

Yeah. Well. There are companies that require TPS reports, too.

It's mostly a sign leadership has lost reasoning capability if it's mandatory.

But no, reporting isn't necessarily the problem. There are plenty of places that use reporting to drive a conversation on what's broken, and why it's broken for their workflow, and then use that to drive improvement.

It's only a problem if the leadership stance is "Haha! We found underpants gnome step 2! Make underpants number go up, and we are geniuses". Sadly not as rare as one would hope, but still stupid.

Those kinds of reports seem to be a thing at all big tech corps now.

> And have you called a large company for any reason lately? Could be your telco provider, your bank, public transport company, whatever. You call them, because online contact means haggling with an AI chatbot first to finally give up and shunt you over to an actual person who can help, and contact forms and e-mail have been killed off. Calling is not exactly as bad, but step one nowadays is 'please describe what you're calling for', where some LLM will try to parse that, fail miserably, and then shunt you to an actual person

All of this predates LLMs (what “AI” means today) becoming a useful product. All of this happened already with previous generations of “AI”.

It was just even shittier than the version we have today.

It was also shittier than the version we had before it (human receptionists).

This is what I always think of when I imagine how AI will change the world and daily life. Automation doesn't have to be better (for the customer, for the person using it, for society) in order to push out the alternatives. If the automation is cheap enough, it can be worse for everyone, and still change everything. Those are the niches in which I'm most certain it will be here to stay, because sometimes it hardly matters if it's any good.

> where some LLM will try to parse that, fail miserably, and then shunt you to an actual person.

If you're lucky. I've had LLMs that just repeatedly hang up on me when they obviously hit a dead end.

It isn't a universal thing. I have no doubt there is a job out there where that isn't a requirement. I think the issue is the C-level folks are seeing how much more productive someone might be and making it a demand. That to me is the wrong approach. If you demonstrate and build interest, the adoption will happen.

As opposed to reaching, say, somebody in an offshored call center with an utterly undecipherable accent reading a script at you? Without any room for deviation?

AI's not exactly a step down from that.

> But the shared goal is to get a change to the codebase to achieve a desired outcome.

I'd argue that's not true. It's more of a stated goal. The actual goal is to achieve the desired outcome in a way that has manageable, understood side effects, and that can be maintained and built upon over time by all capable team members.

The difference between what business folks see as the "output" of software developers (code) and what (good) software developers actually deliver over time is significant. AI can definitely do the former. The latter is less clear. This is one of the fundamental disconnects in discussions about AI in software development.

In my personal use case, I work at a company that has SO MUCH process and documentation for coding standards. I made an AI agent that knows all that and used it to update legacy code to the new standard in a day. Something that would have taken weeks if not more. If your desire is manageable code, make that a requirement.

I'm going to say this next thing as someone with a lot of negative bias about corporations. I was laid off from Twitter when Elon bought the company and at a second company that was hemorrhaging users.

Our job isn't to write code, it's to make the machine do the thing. All the effort for clean, manageable, etc is purely in the interest of the programmer but at the end of the day, launching the feature that pulls in money is the point.

How did you verify that your AI agent performed the update correctly? I've experienced a number of cases where an AI agent made a change that seemed right at first glance, maybe even passed code review, but fell apart completely when it came time to build on top of it.

It's not just about coding standards. It's about, over time, having a team of people with a built-up set of knowledge about how things work and how they're expected to work. You don't get that by vibe coding and reviewing numerous PRs written by other people (or chatbots).

If everyone on your team is doing that, it's not long before huge chunks of your codebase are conceptually like stuff that was written a long time ago by people who left the company. Except those people may have actually known what they were doing. The AI chatbots are generating stuff that seems to plausibly work well enough based on however they were prompted.

There are intangible parts of software development that are difficult to measure but incredibly valuable beyond the code itself.

> Our job isn't to write code, it's to make the machine do the thing. All the effort for clean, manageable, etc is purely in the interest of the programmer but at the end of the day, launching the feature that pulls in money is the point.

This could be the vibe coder mantra. And it's true on day one. Once you've got reasonably complex software being maintained by one or more teams of developers who all need to be able to fix bugs and add features without breaking things, it's not quite as simple as "make the machine do the thing."

>AI is certainly not reliable enough for me to jeopardize the quality of my work by using it heavily.

I mean this in sincerity, and not at all snarky, but - have you considered that you haven't used the tools correctly or effectively? I find that I can get what I need from chatbots (and refuse to call them AI until we have general AI just to be contrary) if I spend a couple of minutes considering constraints and being careful with my prompt language.

When I've come across people in my real life who say they get no value from chatbots, it's because they're asking poorly formed questions, or haven't thought through the problem entirely. Working with chatbots is like working with a very bright lab puppy. They're willing to do whatever you want, but they'll definitely piss on the floor unless you tell them not to.

Or am I entirely off base with your experience?

It would be helpful if you would relate your own bad experiences and how you overcame them. Leading off with "do it better" isn't very instructive. Unfortunately there's no useful training for much of anything in our industry, much less AI.

I prefer to use LLM as a sock puppet to filter out implausible options in my problem space and to help me recall how to do boilerplate things. Like you, I think, I also tend to write multi-paragraph prompts repeating myself and calling back on every aspect to continuously hone in on the true subject I am interested in.

I don't trust LLM's enough to operate on my behalf agentically yet. And, LLM is uncreative and hallucinatory as heck whenever it strays into novel territory, which makes it a dangerous tool.

> have you considered that you haven't used the tools correctly or effectively?

The problem is that this comes off just as tone-deaf as "you're holding it wrong." In my experience, when people promote AI, it's sold as just having a regular conversation and then the AI does the thing. And when that doesn't work, the promoter goes into system prompts, MCP, agent files, etc. and entire workflows that are required to get it to do the correct thing. It ends up feeling like you're being lied to, even if there's some benefit out there.

There's also the fact that all programming workflows are not the same. I've found some areas where AI works well, but a lot of my work it does not. Usually things that wouldn't show up in a simple Google search back before it was enshittified are pretty spotty.

I suspect AI appeals very strongly to a certain personality type who revels in all the details in getting a proper agentic coding environment bootstrapped for AI to run amok in, and then supervises/guides the results.

Then there’s people like me, who you’d probably term as an old soul, who looks at all that and says, “I have to change my workflow, my environment, and babysit it? It is faster to simply just do the work.” My relationship with tech is I like using as little as possible, and what I use needs to be predictable and do something for me. AI doesn’t always work for me.

Yes, this rings true, it took me over a month to actually get to at least 1x of my usual productivity with Claude Code. There is a ton of setup and a ton of things to learn and try to see what works. What to watch out for and how to babysit it so it doesn't go off the rails (quite a heavy-handed approach works best for me). It's kind of like a shitty, but very fast and very knowledgeable junior developer. At this moment it still maybe isn't "worth it" for a lot of devs if productivity (and developer ergonomics) is the only goal, but it is clear to me that this is where the industry is heading and I think every dev will eventually have to get on board. These tools really just started to be somewhat decent this year. I'm 100% sure that in a year or two it will be the default for everyone in a way that you simply won't be able to compete without it at all. It would be like using a shovel instead of an excavator. Remember, right now is the worst it'll ever be.

> In my experience, when people promote AI, its sold as just having a regular conversation and then the AI does thing.

This is almost the complete opposite of my experience. I hear expressions about improvements and optimism for the future, but almost all of the discussion from people actively and productively using AI is about identifying the limits and seeing what benefits you can find within those limits.

They are not useless and they are also not a panacea. It feels like a lot of people consider those the only available options.

AI is okay (not great) at generating low- to mid-skill code. If you are working in a high-skill software domain that requires pervasive state-of-the-art or first-principles implementation then AI produces consistently terrible code. It frequently is flatly incorrect about esoteric technical details that really matter.

It can't reason from first principles and there isn't training data for a lot of state-of-the-art computer science and code implementations. Nothing you can prompt will make it produce non-naive output because it doesn't have that capability.

AI works for a lot of things because, if we are honest, AI generated slop is replacing human generated slop. But not all software is slop and there are software domains where slop is not even an option.

All good, no snark inferred. Yes I have considered this, and I keep considering it every time I get a bad result. Sorry this response is so long.

I think I have a good idea how these things work. I have run local LLMs for a couple of years on a pair of video cards here, trying out many open weight models. I have watched the 3blue1brown ML course. I have done several LinkedIn Learning courses (which weren't that helpful, just mandatory). I understand about prompting precisely and personas (though I am not sold personas are a good idea). I understand LLMs do not "know" anything, they just generate the next most likely token. I understand LLMs are not a database with accurate retrieval. I understand "reasoning" is not actual thinking just manipulating tokens to steer a conversation in vector space. I understand LLMs are better for some tasks (summarisation, sentiment analysis, etc) than others (retrieval, math, etc). I understand they can only predict what's in their training data. I feel I have a pretty good understanding of how to get results from LLMs (or at least the ways people say you can get results).

I have had some small success with LLMs. They are reasonably good at generating sub-100 line test code when given a precise prompt, probably because that is in training data scraped from StackOverflow. I did a certification earlier this year and threw ~1000 lines of Markdown notes into Gemini and had it quiz me which was very useful revision, it only got one question wrong of the couple of hundred I had it ask me.

I'll give a specific example of a recent failure. My job is mostly troubleshooting and reading code, all of which is public open source (so accessible via LLM search tooling). I was trying to understand something where I didn't know the answer, and this was difficult code to me so I was really not confident at all in my understanding. I wrote up my thoughts with references, the normal person I ask was busy so I asked Gemini Pro. It confidently told me "yep you got it!".

I asked someone else who saw a (now obvious) flaw in my reasoning. At some point I'd switched from a hash algorithm which generates Thing A, to a hash algorithm which generates Thing B. The error was clearly visible, one of my references had "Thing B" in the commit message title, which was in my notes with the public URL, when my whole argument was about "Thing A".

This wasn't even a technical or code error, it was a text analysis and pattern matching error, which I didn't see because I was so focused on algorithms. Even Gemini, the apparent best LLM in the world which is causing "code red" at OpenAI did not pick this up, when text analysis is supposed to be one of its core functionalities.

I also have a lot of LLM-generated summarisation forced on me at work, and it's often so bad I now don't even read it. I've seen it generate text which makes no logical sense and/or which uses so many words without really saying anything at all.

I have tried LLM-based products where someone else is supposed to have done all the prompt crafting and added RAG embeddings and I can just behave like a naive user asking questions. Even when I ask these things questions whose answers I know are in the RAG, they cannot retrieve an accurate answer ~80% of the time. I have read papers which support the idea that most RAG falls apart after about 40k words and our document set is much larger than that.

Generally I find LLMs are at the point where to evaluate the LLM response I need to either know the answer beforehand so it was pointless to ask, or I need to do all the work myself to verify the answer which doesn't improve my productivity at all.

About the only thing I find consistently useful about LLMs is writing my question down and not actually asking it, which is a form of Rubber Duck Debugging (https://en.wikipedia.org/wiki/Rubber_duck_debugging) which I have already practiced for many years because it's so helpful.

Meanwhile trillions of dollars of VC-backed marketing assures me that these things are a huge productivity increaser and will usher in 25% unemployment because they are so good at doing every task even very smart people can do. I just don't see it.

If you have any suggestions for me I will be very willing to look into them and try them.

I'm probably one of the people that would say AI (at least LLMs) isn't all it's cracked up to be and even I have examples where it has been useful to me.

I think the feeling stems from the exaggeration of the value it provides combined with a large number of internal corporate LLMs being absolute trash.

The overvaluation is seen in effect everywhere from the stock market, the price of RAM, the cost of energy as well as IP theft issues etc etc. AI has taken over and yet it still feels like just a really good fuzzy search. Like yeah I can search something 10x faster than before but might get a bad answer every now and then.

Yeah it's been useful (so have many other things). No it's not worth building trillion dollar data centers for. I would be happier if the spend went towards manufacturing or semiconductor fabs.

Lol you made me think my power bill has gone up but I didn't get a pay rise for my increased productivity.

I think it's more a continuation of IDE versus pure editor.

More precisely:

On one side, it's the "tools that build up critical mass" philosophy. AI firmly resides here.

On the other, it's the "all you need is brain and plain text" philosophy. We don't see much AI in this camp.

One thing I learned is that you should never underestimate the "all you need is brain and plain text" camp. That philosophy has survived many, many "fatal blows" and has come out on top several times. It has one unique feature: resilience to bloat, something that the current smart tools camp is obviously overlooking.

Similar experience. I think it's become an identity politics concept. To those who consider themselves to be anti AI, the concept of the tool having any use is haram.

It feels awkward living in the "LLMs are a useful tool for some tasks" experience. I suspect this is because the two tribes are the loudest.

I see LLMs as kinda the new hotness in IDEs. And some people will use vi forever.

Right, this is what I can’t quite understand. A lot of HN folks appear to have been burned by e.g. horrible corporate or business ideas by non technical people that don’t understand AI; that is completely understandable. What I never understand is the population of coders that don’t see any value in coding agents or are aggressively against them, or people that deride LLMs as failing to be able to do X (or hallucinate etc) and are therefore useless and everything is AI Slop, without recognizing that what we can do today is almost unrecognizable from the world of 3 years ago. The progress has moved astoundingly fast and the sheer amount of capital and competition and pressure means the train is not slowing down. Predictions of “2025 is the year of coding agents” from a chorus of otherwise unpalatable CEOs were in fact absolutely true…

There is zero guarantee that these tools will continue to be there. Those of us who are skeptical of the value of the tools may find them somewhat useful, but are quite wary of ripping up the workflows we've built for ourselves over decade(s)(+) in favor of something that might be 10-20% more useful, but could be taken away or charged greater fees or literally collapse in functionality at any moment, leaving us suddenly crippled. I'll keep the thing I know works, I know will always be there (because it's open source, etc), even if it means I'm slightly less productive over the next X amount of time otherwise.

What would you imagine a plausible scenario would possibly be that your tools would be taken away or “collapse in functionality”? I would say Claude right now has probably produced worse code and wasted more time than if I had coded things myself, but it’s because this is like the first few hundred days of this. Open weight models are also worse but they will never go away and improve steadily as well. I am all for people doing whatever works for them; I just don’t get the negativity or the skepticism when you look at the progress over what has been almost zero time. It’s crappy now in many respects but it’s like saying “my car is slow” in the one millisecond after I floor the gas pedal.

> What would you imagine a plausible scenario would possibly be that your tools would be taken away or “collapse in functionality”?

Simple. The company providing the tool suddenly needs actual earnings. Therefore, they need to raise the prices. They also need users to spend more tokens, so they will make the tool respond in a way that requires more refinement. After all, the latter is exactly what happened with Google search.

At this point, that is a pretty normal software cycle: try to attract a crowd by being free or cheap, then lock features behind a paywall. Then simultaneously raise prices more and more while making the product worse.

This literally NEEDS to happen, because these companies do not have any other path to profitability. So, it will happen at some point.

Sure, but you’re forgetting that competition exists. If Anthropic investors suddenly say “enough” and demand positive cash flow, it wouldn’t be that hard; everyone is capturing users for flywheels and spending capex on model improvements because if they don’t, they are guaranteed to lose.

It’s definitely going to be crappy: remember Google in 2003 with relevant results and no endless SEO, or Amazon reviews being reliable, or Uber being simple and cheap, etc. Once the growth phase ends, monetization begins and the experience declines, but this is guardrailed by the fact that there are many players.

Considering what I described is how tech companies actually function and functioned in the past, theoretical competition won't help.

They are competing themselves into massive unprofitability. Eventually they will die or do the above in cooperation. Maybe there will be a minor scandal about it, but that sort of collusion is not prosecuted or seriously investigated if done by big companies.

So, it will happen exactly as it always happens with tech.

My understanding is that all the big AI companies are currently offering services at a loss, following the classic Silicon Valley playbook of burning investor cash to get big and hoping to make a profit later. So any service you depend on could crash out of the race, and if one emerges as a victorious monopoly and you rely on them, they can charge you almost whatever they like.

To my mind, the 'only just started' argument is wearing off. It's software, it moves fast anyway, and all the giants of the tech world have been feverishly throwing money at AI for the last couple of years. I don't buy that we're still just at the beginning of some huge exponential improvement.

My understanding is they make a loss overall due to the spending on training new models, and that the API costs are profit making if considered in isolation. That said, this is based on guesstimates from the hosting costs of open-weight models, owing to a lack of financial transparency everywhere for the secret-weights models.

> that the API costs are profit making if considered in isolation.

no, they are currently losing money on inference too

> Predictions of “2025 is the year of coding agents” from a chorus of otherwise unpalatable CEOs was in fact absolutely true…

... but maybe not in the way that these CEOs had hoped.[0]

Part of the AI fatigue is that busy, competent devs are getting swarmed with massive amounts of slop from not-very-good developers. Or product managers getting 5 paragraphs of GenAI bug reports instead of a clear and concise explanation.

I have high hopes for AI and think generative tooling is extremely useful in the right hands. But it is extremely concerning that AI is allowing some of the worst, least competent people to generate an order of magnitude more "content" with little awareness of how bad it is.

[0] https://github.com/ocaml/ocaml/pull/14369

> busy, competent devs are getting swarmed with massive amounts of slop from not-very-good developers

that is a real issue and yet a normal problem and so has an obvious response.

oh wow that PR

> What I never understand is the population of coders that don’t see any value in coding agents or are aggressively against them, or people that deride LLMs as failing to be able to do X (or hallucinate etc) and are therefore useless and every thing is AI Slop, without recognizing that what we can do today is almost unrecognizeable from the world of 3 years ago.

I don't recognize that because it isn't true. I try the LLMs every now and then, and they still make the same stupid hallucinations that ChatGPT did on day 1. AI hype proponents love to make claims that the tech has improved a ton, but based on my experience trying to use it those claims are completely baseless.

> I try the LLMs every now and then, and they still make the same stupid hallucinations that ChatGPT did on day 1.

One of the tests I sometimes do of LLMs is a geometry puzzle:

  You're on the equator facing south. You move forward 10,000 km along the surface of the Earth. You rotate 90° clockwise. You move another 10,000 km forward along the surface of the Earth. Rotate another 90° clockwise, then move another 10,000 km forward along the surface of the Earth.

  Where are you now, and what direction are you facing?

They all used to get this wrong all the time. Now the best ones sometimes don't. (That said, the only one to succeed just now, as I write this comment, was DeepSeek; the first I saw succeed was one of ChatGPT's models, but that's now back to the usual error they all used to make.)

Anecdotes are of course a bad way to study this kind of thing.

Unfortunately, so are the benchmarks, because the models have quickly saturated most of them, including traditional IQ tests (on the plus side, this has demonstrated that IQ tests are definitely a learnable skill, as LLMs lose 40-50 IQ points when going from public to private IQ tests) and stuff like the maths olympiad.

Right now, AFAICT the only open benchmarks are the METR time horizon metric, the ARC-AGI family of tests, and the "make me an SVG of ${…}" stuff inspired by Simon Willison's pelican on a bike.

Out of interest, was your intended answer "where you started, facing east"?

FWIW, Claude Opus 4.5 gets this right for me, assuming that is the intended answer. On request, it also gave me a Mathematica program which (after I fixed some trivial exceptions due to errors in units) informs me that using the ITRF00 datum the actual answer is 0.0177593 degrees north and 0.168379 west of where you started (about 11.7 miles away from the starting point) and your rotation is 89.98 degrees rather than 90.

(ChatGPT 5.1 Thinking, for me, gets the wrong answer because it correctly gets near the South Pole and then follows a line of latitude 200 times round the South Pole for the second leg, which strikes me as a flatly incorrect interpretation of the words "move forward along the surface of the earth". Was that the "usual error they all used to make"?)

> Out of interest, was your intended answer "where you started, facing east"?

Or anything close to it so long as the logic is right, yes. I care about the reasoning failure, not the small difference between the exact quarter-circumferences of these great circles and 10,000 km. (Not that it really matters, but now you've said the answer, this test becomes even less reliable than it already was.)
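
For anyone who wants to check the reasoning, here is a minimal sketch of the idealised walk. It assumes a perfect sphere (not WGS-84 or any other datum), tracks position and heading as unit vectors, and uses the mean Earth radius of 6371.0088 km for the 10,000 km case. The function names (`walk`, `move`, `turn_right_90`, and so on) are made up for this illustration and come from no library.

```python
import math

# Assumption: the Earth is a perfect sphere. Positions and headings are unit
# vectors; x points to (lat 0, lon 0), y to (lat 0, lon 90E), z to the north pole.

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def move(p, h, angle):
    """Follow the great circle from p along heading h for `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    new_p = tuple(c * pi + s * hi for pi, hi in zip(p, h))
    new_h = tuple(-s * pi + c * hi for pi, hi in zip(p, h))
    return new_p, new_h

def turn_right_90(p, h):
    """Rotate the heading 90 degrees clockwise as seen from above the surface."""
    return cross(h, p)

def walk(leg_angle):
    """Run the puzzle: start at (0, 0) facing south, three legs, two right turns."""
    p, h = (1.0, 0.0, 0.0), (0.0, 0.0, -1.0)
    p, h = move(p, h, leg_angle)
    for _ in range(2):
        h = turn_right_90(p, h)
        p, h = move(p, h, leg_angle)
    return p, h

def bearing_deg(p, h):
    """Compass bearing of heading h at position p (0 = north, 90 = east)."""
    lat = math.asin(max(-1.0, min(1.0, p[2])))
    lon = math.atan2(p[1], p[0])
    north = (-math.sin(lat) * math.cos(lon),
             -math.sin(lat) * math.sin(lon),
             math.cos(lat))
    east = (-math.sin(lon), math.cos(lon), 0.0)
    return math.degrees(math.atan2(dot(h, east), dot(h, north))) % 360.0

def offset_km(p, radius_km):
    """Great-circle distance from the starting point (1, 0, 0)."""
    start = (1.0, 0.0, 0.0)
    c = cross(start, p)
    return radius_km * math.atan2(math.sqrt(dot(c, c)), dot(start, p))

R_KM = 6371.0088  # mean Earth radius, used here as the sphere's radius

# Legs of exactly a quarter circumference: the idealised puzzle.
p, h = walk(math.pi / 2)
print("quarter-circumference legs:", round(offset_km(p, R_KM), 6), "km off,",
      "bearing", round(bearing_deg(p, h), 6))

# Legs of 10,000 km on the same sphere: slightly short of a quarter turn.
p, h = walk(10000.0 / R_KM)
print("10,000 km legs:", round(offset_km(p, R_KM), 3), "km off,",
      "bearing", round(bearing_deg(p, h), 3))
```

With quarter-circumference legs the walk closes up at the start with a bearing of 90° (east), up to floating-point error. The second call suggests that 10,000 km legs alone leave the walker a few kilometres from the start on this sphere, before any ellipsoid effects are considered.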

> FWIW, Claude Opus 4.5 gets this right for me, assuming that is the intended answer.

Like I said, now the best ones sometimes don't [always get it wrong].

For me yesterday, Claude (albeit Sonnet 4.5, because my testing is cheap) avoided the south pole issue, but then got the third leg wrong and ended up at the north pole. A while back ChatGPT 5 (I looked the result up) got the answer right; yesterday GPT-5-thinking-mini (auto-selected by the system) got it wrong the same way as you report at the south pole, but then also got the equator wrong and ended up near the north pole.

"Never" to "unreliable success" is still an improvement.

Yeah, I'm pretty sure that's correct. Just whipped this up, using the WGS-84 datum.

  (use-modules (geo vincenty))
  
  (let walk ((p '(0 0 180))
             (i 0))
    (cond ((= i 3)
           (display p)
           (newline))
          (else
            (walk (apply vincenty
                         (list (car p) (cadr p) (+ 90 (caddr p)) 10000000))
                  (+ i 1)))))
Running this yields:

  (0.01777744062090717 0.16837322410251268 179.98234155229127)

Surely the discrepancy is down to spheroid vs sphere, yeah?

This fascinates me. Just observing but because it hasn't worked for you, everyone else must be lying? (I'm assuming that's what you mean by baseless)

How does that bridge get built? I can provide tangible real life examples but I've found push back from that in other online conversations.

My boss has been passing off Claude generated code and documentation to me all year. It is consistently garbage. It consistently hallucinates. I consistently have to rewrite most, if not all, of what I'm handed.

I do also try and use Claude Code for certain tasks. More often than not, I regret it, but I've started to zero in on tasks it's helpful with (configuration and debugging, not so much coding).

But it's very easy then for me to hear people saying that AI gives them so much useful code, and for me to assume that they are like my boss: not examining that code carefully, or not holding their output to particularly high standards, or aren't responsible for the maintenance and thus don't need to care. That doesn't mean they're lying, but it doesn't mean they're right.

Not everyone is your boss. I have 15 years of experience coding. So when the AI hallucinates, I call that out and it improves the code it does create. If someone is passing off AI's first pass as done, they are not using the tool correctly.

My boss has 28 years of experience coding so that clearly isn't the deciding factor here.

Yes, I suppose it is theoretically possible that you are that much better than my boss and I at coaxing good output from an LLM, but I'm going to continue to be skeptical until I see it with my own eyes.

"Claude Code" by itself is not specific enough; which model are we talking about?

> it hasn't worked for you, everyone else must be lying?

Well, some non-zero amount of you are probably very financially invested in AI, so lying is not out of the question

Or you simply have blinders on because of your financial investments. After all, emotional investment often follows financial investment

Or, you're just not as good as you think you are. Maybe you're talking to people who are much better at building software than you are, and they find the stuff the AI builds does not impress them, while you are not as skilled so you are impressed by it.

There are lots of reasons someone might disagree without thinking everyone else is lying

I think calling it baseless to claim benefits from AI is more than disagreeing. It's claiming a rightness that is just contrarian and hyperbolic. It's really interesting to me that the skeptics are exactly who should be using AI. Push back on it. Tell it that the code it made was wrong.

What have you tried? How much time have you spent? Using AI is its own skill set, separate from programming.

AI is in a hype bubble that will crash just like every other bubble. The underlying uses are there, but just like Dot Com, Tulips, subprime mortgages, and even Sir Isaac Newton's failings with the South Sea Company, the financial side will fall.

This will cause bankruptcies and huge job losses. The argument for and against AI doesn't really matter in the end, because the finances don't make a lick of sense.

Ok sure the bubble/non-bubble stuff, fine, but in terms of “things I’d like to be a part of” it’s hard to imagine a more transformative technology (not to again turn off the anti-hype crowd). But ok, say it’s 1997, you don’t like the valuations you see. But as a tech person you’re not excited by browsers, the internet, the possibilities? You don’t want to be a part of that even if it means a bubble pops? I also hear a lot of people argue “finances don’t make a lick of sense” but I don’t think things are that cut and dried and I don’t see this as obvious. I don’t think many people really know how things will evolve and what size a market correction or bubble would have.

What precisely about AI is transformative, compared to the internet? E-mail replaced so much of faxing, phoning and physical mail. Online shopping replaced going to stores and hoping they have what you want, and hoping it is in stock, and hoping it is a good price. It replaced travel agents to a significant degree and reoriented many industries. It was the vehicle that killed CDs and physical media in general.

With AI I can... generate slop. Sometimes that is helpful, but it isn't yet at the point where it's replacing anything for me aside from making google searches take a bit less time on things that I don't need a definitive answer for.

It's popping up in my music streams now and then, and I generally hate it. Mushy-mouthed fake vocals over fake instruments. It pops up online and aside from the occasional meme I hate it there too. It pops up all over blogs and emails and I profoundly hate it there, given that it encourages the actual author to silence themselves and replaces their thoughts with bland drivel.

Every single software product I use begs me to use their AI integration, and instead of "no" I'm given the option of "not now", despite me not needing it, and so I'm constantly being pestered about it by something.

It has, thus far, made nearly everything worse.

> With AI I can... generate slop. Sometimes that is helpful, but it isn't yet at the point where it's replacing anything for me aside from making google searches take a bit less time on things that I don't need a definitive answer for.

I think this is probably the disconnect, this seems so wildly different from my experience. Not only that, I’ll grant that there are a ton of limitations still but surely you’d concede that there has been an incredible amount of progress in a very short time? Like I can’t imagine someone who sits down with Claude like I do and gets up and says “this is crap and a fad and won’t go anywhere”.

As for generated content, I again agree with you and you’d be surprised to learn that _execs_ agree with you but look at models from 1, 2, 3 years ago and tell me you don’t see a frightening progression of quality. If you want to say “I’ll believe it when I see it” that’s fine but my god just look at the trajectory.

For AI slop text, once again agree, once again I think we all have to figure out how to use it, but it is great for e.g. helping me rewrite a wordy message quickly, making a paper or a doc more readable, combining my notes into something polished, etc, and it’s getting better and better and better.

So I disagree it has made everything worse but I definitely agree that it has made a lot of things worse and we have a lot of Pets.com ideas that are totally not viable today, but the point I think people are maybe missing (?) is that it’s not about where we are, it’s about the velocity and the future. You may be terrified and nauseated by $1T in capex on AI infra, fine, but what that tells you is the scale is going to grow even further _in addition_ to the methodological / algorithmic improvements to tackle things like continual learning, robustness, higher quality multimodal generation with e.g. true narrative consistency, etc etc etc. In 5 years I don’t think many people will think of “slop” so negatively.

Maybe those people do different work than you do? Coding agents don’t work well in every scenario.

Yet people imply that because it doesn’t work in their scenario that it’s not good?

Most of the people against “AI” are not against it because they think it doesn’t work.

It’s because they know it works better every day and the people controlling it are gleefully fucking over the rest of the world because they can.

The plainly stated goal is TO ELIMINATE ALL HUMAN EMPLOYEES, with no plan for how those people will feed, clothe, or house themselves.

The reaction the author was getting was the reaction of a horse talking to someone happily working for the glue factory.

I don't think you're qualified to speak for most of the people against AI.
