- The issue with this is that if someone hacks one of the hosts, they now have access to the backups of all your other hosts. At least with borg and the standard setup; would be cool if I were wrong though.
- It's sort of a slow-motion avalanche. For example, the price of outsourcing app development is really coming down now, because it's one of the areas where the AI coding tools really excel. It's a lot of boilerplate-style code, pretty canonical stuff, no rocket science and, to be frank, not really that much that has to be clever. You can give the tools a screenshot of a UI and they gladly output correct styling code for it in seconds. It supercharges already skilled developers. And if you needed 6 in-house app developers before, now you only need 2. It isn't an immediate effect, but it is an effect that is slowly going to run through the business.
The question is: are companies going to use fewer people to do the same, or the same number of people and just create better products?
For prototyping new ideas this is also an invaluable turbocharge for startups that can't really afford hordes of developers trying out alternative solutions.
- Sorry for the late reply here, but let me develop a bit more what I meant. I have long and broad programming experience, but I'm also an entrepreneur with many projects that I switch between. I could have focused full-time on, say, React Native, and then I could churn out my monkey code all day long. But I don't spend full-time on RN, and then suddenly 12 months have passed where I didn't write any at all, so my specific knowledge of that domain is always a bit behind.
But something like o4-mini-high is a domain expert in all versions of React, Redux, RN etc. and knows every internal SDK change over the last 10 years (or goes out and reads the changelogs and code itself). Countless times I've had it port old code to new, and it figures it out 100%. It formulates good, modern, canonical ways to solve stuff. It knows all the stupid tricks you have to do to get RN stuff to run well on Android and iOS that I would never be able to keep in my head unless I worked full-time on that. And it writes the eye-wateringly boring styling code that nobody likes; you can even just upload a screenshot of another app or a sketch on paper and it will correctly output code for the style in a matter of seconds.
The end result is that I can, without investing full-time effort in keeping myself current, do a professional dual Android/iOS RN app development cycle, because I have the general skill to understand what to ask it and how to merge its output properly. This leaves me time to do other stuff and generally be more productive.
My guess is that many who gave up on the AI coding stuff tried the bad tools, like the default ChatGPT 4o-mini (or tried the tools available 2 years ago), and got a bad experience. There are light-years of difference between those and something like o4-mini-high.
TL;DR: use the correct model for the job, and it doesn't really need to be an argument - if it makes you more productive it's a good tool, and if it doesn't, nobody is forcing you to use it. But I don't think you should imply that everybody who likes these tools is stupid.
- I can see why you would chime in to say that in your experience you don't get any value out of it, but chiming in to say that the millions of people who do are "inexperienced" is pretty offensive. In the hands of skilled developers these tools are a complete game-changer.
- Why would you do that? A bus driver costs $10/h and the bus costs several $100k even if it's NOT self-driving. The cost of the driver must be minuscule in comparison, not to mention that the self-driving bus and the associated insurance will cost several $100k more...
- The point of the Narrative Clip (https://getnarrative.com) was to let people both live in the moment AND capture it. It was launched at the height of the life-logger/wearable wave 10 years ago, but it's still an interesting idea, and the thought of having a more thorough "log" of your life might get more and more relevant the closer we get to vivid AI recreations and even brain augmentation.
Some customers also just had poor memory and loved sort of re-living their day every evening, which made memories store more efficiently in the brain.
There are social "contracts" that users need to consider when using stuff like this though, as you do take photos of the people around you and those you interact with.
- Ok, so I'm thinking here that.. hmm... maybe.. just maybe... there is something that, kind of, steers the rest of the thought process into a, you know.. more open process? What do you think? What do I think?
As opposed to the more literary, authoritative prose from textbooks and papers, where the model output has to commit to a chain of thought from the get-go. Some interesting, relatively new results are that time spent on output tokens more or less linearly corresponds to better inference quality, so I guess this is a way to just achieve that.
The tokens are inserted artificially in some inference models: when the model wants to end the sentence, you swap out the end token for "hmmmm" and it will happily continue.
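A minimal sketch of that swap at decode time. Everything here (the token ids, the greedy pick, the "budget") is invented for illustration; real inference stacks do this inside their own sampling loop:

```python
# Hypothetical decode-time filter: while we still have "thinking budget"
# left, replace an attempted end-of-sequence token with a filler token
# like "hmmmm" so the model keeps reasoning. Token ids are made up.
EOS_ID = 2  # assumed end-of-sequence token id
HMM_ID = 3  # assumed token id for "hmmmm"

def next_token(logits: list[float], budget_left: int) -> int:
    """Greedy pick over the logits, but suppress EOS while budget_left > 0."""
    tok = max(range(len(logits)), key=lambda i: logits[i])
    if tok == EOS_ID and budget_left > 0:
        return HMM_ID  # the model wanted to stop; nudge it to continue
    return tok
```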
- You're partially correct, but describing it that way makes it sound like if you could "just look a little bit closer" the statistics would disappear, which doesn't happen. So it's more subtle than this. Fundamentally it's because QM doesn't add probabilities, but rather amplitudes, which are complex numbers; the probability is the squared magnitude of the sum of these, so you can get interference between amplitudes. You can never get interference by adding probabilities.
In the double-slit experiment this is visible: you can't get the interference effects by summing the probabilities for "particle through slit 1" and "particle through slit 2"; rather, you need to sum the amplitudes of the processes.
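The two sums can be checked with toy numbers (the amplitudes here are invented, equal-magnitude and opposite-phase, purely to illustrate the rule; this is not a physical calculation):

```python
import math
import cmath

# Two made-up path amplitudes with equal magnitude and opposite phase.
a1 = cmath.exp(1j * 0.0) / math.sqrt(2)      # "particle through slit 1"
a2 = cmath.exp(1j * math.pi) / math.sqrt(2)  # "particle through slit 2"

# Adding probabilities (no interference possible): 0.5 + 0.5 = 1.0
p_sum_probs = abs(a1) ** 2 + abs(a2) ** 2

# Adding amplitudes first, then taking the squared magnitude:
# full destructive interference, probability ~0 at this point.
p_sum_amps = abs(a1 + a2) ** 2
```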
Working physicists have just done this for 100 years; there is no practical need to interpret it further, but it would be cool if someone could figure out some prediction/experiment mismatch that does indeed require tweaking this!
- No, you're understanding it correctly (I think): the behaviour of a single detected particle depends on all possible paths it could take to reach the detection.
This has been fundamental to 100 years of quantum mechanics and underlies most of physics, including all semiconductors, materials science, chemistry, lasers, etc. The double-slit experiment is just a very good illustration of the principle boiled down to its essentials, which is why it's everywhere in pop-sci. It makes for a more accessible story than describing how a hydrogen atom works.
- For someone jumping back on the local LLM train after having been out for 2 years: what is the current best local web-server solution to host this for myself on a GPU (RTX 3080) Linux server? Preferably with support for multimodal image input and LaTeX rendering of the output...
I don't really care about insane "full kitchen sink" things that feature 100 plugins for all existing cloud AI services etc. Just running the released models the way they are intended, on a web server...
- There are a lot of Americans travelling to or working in China; it is in their interest, if nothing else, to know the true values.
- Don't know about Slack's videoconf, but Slack's cheap insistence that we pay a rip-off amount of money per month for storing some TEXT messages for more than 90 days has continuously degraded my appreciation for it over the last few years, to the point that I hate it now.
They're so cheap. Just put a quota on total storage or something that actually maps to their costs...
We have a Slack for a shared office of 10 people or so; we use it to ask each other where to go for lunch and general stuff. It must cost them $0.001/month to host, but you continuously get a banner all over it that says PAY TO UNLOCK THESE EXCITING OLD MESSAGES, and when you check what they want, it's some exorbitant amount like $10/month/user, so $100/month for a lunch-synchronization tool. For $100/month I can store like 5 TB on S3; that's a lot of texts.
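Back-of-the-envelope check of that comparison (the S3 price here is an assumption, roughly S3 Standard's ~$0.023/GB-month; actual pricing varies by region and storage tier):

```python
users = 10
slack_monthly = users * 10.0         # $10/user/month as quoted above
s3_usd_per_gb_month = 0.023          # assumed S3 Standard price

# How much S3 storage the same money buys, in GB.
gb_for_same_money = slack_monthly / s3_usd_per_gb_month
print(round(gb_for_same_money))      # 4348 GB, i.e. roughly 4-5 TB
```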
I'm genuinely curious why they don't have some other payment option, I'd be happy to pay $1/month/user for some basic level if they just don't want freeloaders there. Well, I wouldn't be happy.. but still :)
- I don't know about the US, but in other countries it is definitely by design that the departments and their data are separate. It is far too easy to abuse gathering and joining data on people otherwise. History did teach us these lessons, and it's continuously visible today as well, fortunately only at small scales precisely because they are separate.
It would truly be a nightmare scenario to have all government databases under a single, potentially corrupt roof, or to have someone with access to all of them... *cough*.
- In particle physics you just use GeV (with varying powers) for most parameters :)
- There is a legal distinction and definition; Legal Eagle on YouTube had an episode on exactly this a few weeks ago, about how the DA might have picked a more difficult crime to prove than murder. IANAL, but IIRC the terrorism charge has to prove intent to intimidate larger swaths of government or bodies of people. Just "other CEOs of health companies are now scared" is not enough.
- This is far from the truth in particle physics. The symmetries we've found there (together with the Lorentz symmetry from special relativity) guide and constrain the math very strongly, to the degree that they allow you to predict the photon and the other force-carrying particles; they even allowed predicting the existence and mass of the weak force carriers (discussed in the article), along with the Higgs mechanism that gives masses to them and most of the other particles. This is certainly a triumph of the Standard Model.
There are limits to how much you can do, though; I mean, at some point it's going to be "just math that fits reality". If you try to enumerate the mechanisms and realities that could give a decent enough diversity of composition for life to arise in some form, there are going to be more possibilities than just our universe.
- Yes, and it's exactly those "smart definitions" that are the Standard Model. The whole goal is to produce even smarter definitions, preferably including showing that as much of it as possible couldn't be any other way.
- It's a single Lagrangian with a couple of dozen constants, as in their pics there; it's just expanded out to different degrees.
I guess it makes sense; there is still tons of value to be created just by using the current LLMs for stuff, though maybe the low-hanging fruit has already been picked, who knows.
- I heard John Carmack talk a lot about his alternative (also neuroscience-inspired) ideas and it sounded just like my project, the main difference being that he's able to self-fund :) I guess funding an "outsider" non-LLM AI project now requires finding someone like Carmack to get on board - I still don't think traditional investors are disappointed enough yet to want to risk money on other types of projects...