Profile: Game_Ender - Hacker Neue

Game_Ender

Joined May 30, 2011 1,611 karma

Game_Ender Dec 30, 2025 parent

Why should he put effort into measuring a tool that the author has not? The point is that there are so many of these tools that an objective measure their creators can use to compare them against each other would be better.
So a better question to ask is: do you have any ideas for an objective way to measure the performance of agentic coding tools? Then we can truly determine what improves performance and what does not.
I would hope that internally OpenAI and Anthropic use something similar to the harness/test cases they use for training their full models to determine whether changes to Claude Code result in better performance.
Game_Ender Oct 25, 2025 parent

Can you link some? I can only find the hip exoskeletons.
Game_Ender Oct 12, 2025 parent

Speed and simplicity. Now I can fetch one binary on a system and in seconds fetch everything needed to run a Python tool or work on a code base.
I can do all that without even having to worry about virtual envs or Python versions.
Game_Ender Oct 11, 2025 parent

https://archive.is/n08XL
Game_Ender Aug 7, 2025 parent

You are looking for Codex CLI [0].
0 - https://github.com/openai/codex
Game_Ender Jul 5, 2025 parent

I think the implicit take is that if your company hits AGI your equity package will do something like 10x-100x even if the company is already big. The only other way to do that is join a startup early enough to ride its growth wave.
Another way to say it is that people think it’s much more likely for a decent LLM startup to grow really strongly for its first several years and then plateau than for their current established player to hit hyper growth because of AGI.
Game_Ender Jun 26, 2025 parent

The link you posted 404s and I could not seem to find a command like that in your repos. Can you be more specific?
Game_Ender Jun 12, 2025 parent

Have you tried a smart watch? The Duo 2FA app lets you add an arbitrary code-based 2FA authenticator using the same QR code Google Authenticator supports, and generate those codes from their Apple watchOS [0] or Android Wear OS [1] apps. I have used it successfully for years; in fact it's a huge reason I got an Apple Watch. You'll have to configure your watch with a "work" focus mode that turns off all notifications and not install any fancy apps on the watch (do those still exist?), but it can free you from your phone.
Along the same lines, the Meta Wayfarer [2] smart glasses let you take slice-of-life photos and videos without needing to whip out your phone. You lose a ton of quality but stay in the moment more. The AI features are getting better, so eventually you'll be able to use them for basic information lookup.
0 - https://guide.duo.com/apple-watch
1 - https://guide.duo.com/duo-wear
2 - https://www.meta.com/ai-glasses/wayfarer
Game_Ender Jun 6, 2025 parent

This is really great. Reading the bill raw feels like reviewing a diff with context set to 0.
Game_Ender Jun 5, 2025 parent

I am not sure there is much value in this article for the typical Hacker News conversation on LLM-based tooling. Here we generally focus on whether the tooling is effective and whether it can be used to make software more quickly or cheaply. The problem is that the author is opposed to using the cutting-edge models on privacy and ethics grounds. So they say:
> I have woefully little experience with these tools.
> I do not want to be using the cloud versions of these models with their potentially hideous energy demands; I’d like to use a local model. But there is obviously not a nicely composed way to use local models like this.
> The models and tools that people are raving about are the big, expensive, harmful ones. If I proved to myself yet again that a small model with bad tools was unpleasant to use, I wouldn’t really be addressing my opponents’ views.
Then, without any real practical experience with the cutting-edge tooling, they predict:
> As I have written about before, I believe the mania will end. There will then be a crash, and a “winter”. But, as I may not have stressed sufficiently, this crash will be the biggest of its kind — so big, that it is arguably not of a kind at all. The level of investment in these technologies is bananas and the possibility that the investors will recoup their investment seems close to zero.
I think a more accurate take is that this will be like self driving: huge investments, many more losers than winners, and it will take longer than all the boosters think. In the end we did get actual self-driving cars, and this time, with LLMs, it is something anyone can use by clicking a link vs. waiting for lots of cars to be built and deployed.
Game_Ender Jun 2, 2025 parent

Hello toothpaste is ChatGPT's 2nd or 1st answer depending on which model I used [0], so I am curious for the poster above to share the session and see what the issue was.
There is known sensitivity (no pun intended ;) to wording of the prompt. I have also found if I am very quick and flippant it will totally miss my point and go off in the wrong direction entirely.
0 - https://www.hackerneue.com/item?id=44164633
Game_Ender Jun 2, 2025 parent

What model and query did you use? I used the prompt "find me a toothpaste that is both SLS free and has fluoride" and both GPT-4o [0] and o4-mini-high [1] gave me correct first answers. The 4o answer used the newish "show products inline" feature, which made it easier to jump to each product and check it out (I am putting aside my fear that this feature will end up killing their web product with monetization).
0 - https://chatgpt.com/share/683e3807-0bf8-800a-8bab-5089e4af51...
1 - https://chatgpt.com/share/683e3558-6738-800a-a8fb-3adc20b69d...
6 points May 30, 2025

Nuclear Outboard Motor Was a Terrible Idea

0 comments Game_Ender substack.com
Game_Ender Apr 30, 2025 parent

What is your preferred way to manage them?
Game_Ender Apr 19, 2025 parent

With Aider you pay API fees only. You can get simple tasks done for a few dollars. I suggest budgeting $20 or so and giving it a go.
Game_Ender Apr 19, 2025 parent

What are those extra things you have to do more of? I only have experience with Aider so I am curious what I am missing here.
Game_Ender Apr 18, 2025 parent

Can you describe the why of the policy and, if you are OK sharing it, the industry?
I am also curious if you have other restrictions on information sharing, API usage, and what reference documentation to use.
Game_Ender Apr 8, 2025 parent

Getting a 503 with that link.
Game_Ender Apr 5, 2025 parent

To help those who got a bit confused (like me): this is Groq, the company making accelerators designed specifically for LLMs that they call LPUs (Language Processing Units) [0]. They want to sell you their custom machines that, while expensive, will be much more efficient at running LLMs for you. There is also Grok [1], which is xAI's series of LLMs and competes with ChatGPT and other models like Claude and DeepSeek.
EDIT - Seems that Groq has stopped selling their chips and now will only partner to fund large build outs of their cloud [2].
0 - https://groq.com/the-groq-lpu-explained/
1 - https://grok.com/
2 - https://www.eetimes.com/groq-ceo-we-no-longer-sell-hardware
Game_Ender Apr 4, 2025 parent

There is a bigger safety margin for humans if you need to land in a relatively large area in the water somewhere with a larger range of acceptable velocities. I believe they considered a propulsive landing over land but decided against it to simplify the initial design.
10 years later though they have added this ability as a backup [0]. Which again shows how if human lives are on the line you want to favor redundancy and simplicity over flash.
0 - https://www.nasaspaceflight.com/2024/10/dragon-propulsive-la...
Game_Ender Mar 18, 2025 parent

I don’t know if it matters now but at some point certain targets were hardened to near misses of certain sizes but not direct strikes. So the better your accuracy the smaller the weapon (or fewer) you can use to take out those targets.
So you could say the use would be increased certainty that your enemy’s command and control and other bunkers would be destroyed, increasing the odds of “winning” whatever happens afterwards.
Game_Ender Jan 26, 2025 parent

It looks like o1 also gets the right answer after thinking about it for 14 seconds: https://chatgpt.com/share/67962ead-a5f8-800a-bd91-9a145b993e...
Game_Ender Nov 9, 2024 parent

The tool has an excellent architecture section [0] that goes into how it works under the hood. It stands out to me that a complex tool has an overview to this depth that allows you to grasp conceptually how it works.
0 - https://mergiraf.org/architecture.html
Game_Ender Sep 2, 2024 parent

Since they lack the noise isolation of over-ear headphones or tight-fitting buds, this can be a problem. I have the OpenSwim Pro and they are fine outside except in really high noise. But on a treadmill in the gym they could not overwhelm the background noise.
Game_Ender Jan 12, 2024 parent

The whole slew of layoff posts from Amazon, Google, and others.
Game_Ender Dec 10, 2023 parent

The author is positive because of all the safety layers that existed and stayed intact, despite how flawed humans and companies are. The culture of looking at previous accidents like UA232, where they lost an engine and ALL controls with it, meant the A380 control system was engineered to take even more damage, and it worked.
I do agree though it did not spend enough effort focusing on the areas to improve:
- A computer-controlled engine that runs for 60 seconds while on fire and lets a dangerous part spin too fast. It seems like something that should have been covered ahead of time.
- An engine manufacturing process that is so complex it’s almost impossible to validate.
- A fault management system that only shows you 1 or 2 at a time when you have 40.
Game_Ender Dec 10, 2023 parent

It would add up weight wise, and it’s one of the simpler parts. Jet engines are high performance precise machines with many quickly spinning parts. If you can’t bore a tube correctly how are you going to machine a high efficiency, balanced turbofan system?
That said, it seems like they did have a poor process where a part could be out of spec and they had no good way to check it. As they mentioned about Swiss cheese, you want as many layers as possible, and checks like that are needed.
Game_Ender Dec 9, 2023 parent

Part of the way I explain this is the amount of overhead in a company or position. Say you have 20 hours of coordination, planning, and meetings per week, and 20 hours of direct work. If you work 50 hours you now increase how much development you are doing by 50% by only working 25% more hours. Now the organization can do the same by cutting overhead and meetings, but that is usually not up to one high-performance contributor.
Like you said the impact of a top contributor doing 50% more work can be really large, entire new systems can be built, key features launched. It can get you promoted, but you definitely won’t get a 50% raise.
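To make the arithmetic above concrete, here is a minimal sketch. It assumes overhead stays fixed at 20 hours a week no matter how long you work, which is a simplification, and the function name `direct_work_gain` is made up for illustration.

```python
def direct_work_gain(overhead_hours, base_hours, new_hours):
    """Return (fractional increase in total hours, fractional increase in direct work).

    Assumes overhead is a fixed number of hours per week (hypothetical model).
    """
    base_direct = base_hours - overhead_hours
    new_direct = new_hours - overhead_hours
    hours_increase = (new_hours - base_hours) / base_hours
    direct_increase = (new_direct - base_direct) / base_direct
    return hours_increase, direct_increase


# The numbers from the comment: 20 hours overhead, going from 40 to 50 hours a week.
hours_up, direct_up = direct_work_gain(overhead_hours=20, base_hours=40, new_hours=50)
print(f"{hours_up:.0%} more hours -> {direct_up:.0%} more direct work")
```

The same function shows the organizational alternative: cutting overhead from 20 to 10 hours in a 40-hour week also moves direct work from 20 to 30 hours, with no extra hours worked.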
