kouteiheika
2,976 karma

  1. Unfortunately there are never any real specifics about how any of their models were trained. It's OpenAI we're talking about after all.
  2. There's no story. You need to remember - big corporations are not your friend. They're your enemy. They don't care about you. They don't care about doing good. They care about money. They care about control. They care about their stock price. That's it.

    You might ask - but what about the people who work at those corporations? That's also pretty simply explained by the classic Upton Sinclair quote: it is difficult to get a man to understand something when his salary depends on his not understanding it.

  3. > It also makes life harder for individuals and small companies, because this is not Open Source. It's incompatible with Open Source, it can't be reused in other Open Source projects.

    I'm amazed at the social engineering that the megacorps have done with the whole Open Source (TM) thing. They engineered a whole generation of engineers to advocate not in their own self-interest, nor for the interest of the little people, but instead for the interest of the megacorps.

    As soon as there is even the tiniest of restrictions, one which doesn't affect anyone besides a handful of the richest corporations in the world, a bunch of people immediately come out of the woodwork, shout "but it's not open source!" and start bullying everyone else to change their language. Because if you so much as slightly inconvenience a megacorporation, it's not Open Source (TM) anymore.

    If we're talking about ideals then this is something I find unsettling and dystopian.

    I hard disagree with your "It also makes life harder for individuals and small companies" statement. It's the opposite. It gives them a competitive advantage vs megacorps, however small it may be.

  4. > Open source is what it is today because it's built by people with a spine who stand tall for their ideals even if it means less money, less industry recognition, lots of unglorious work and lots of other negatives.

    With all due respect, don't you see the irony in saying "people with a spine who stand tall for their ideals", and then arguing that attaching "restrictions" which only affect the richest megacorporations in the world somehow makes the license not permissive anymore?

    What ideals are those, exactly? That megacorporations should have the right to use the software without restrictions? And why should we care about that?

  5. > I do not mind having a license like that, my gripe is with using the terms "permissive" and "open source" like that because such use dilutes them. I cannot think of any reason to do that aside from trying to dilute the term (especially when some laws, like the EU AI Act, are less restrictive when it comes to open source AIs specifically).

    Good. In this case, let it be diluted! These extra "restrictions" don't affect normal people at all, and won't even affect any small/medium businesses. I couldn't care less that the term is "diluted" and that makes it harder for those poor, poor megacorporations. They swim in money already, they can deal with it.

    We can discuss the exact threshold, but as long as these "restrictions" are so extreme that they only affect huge megacorporations, this is still "permissive" in my book. I will gladly die on this hill.

  6. > 10-13 minutes if I remember correctly from booting the game to actually being able to do anything besides mash buttons to try and skip the cutscenes.

    Genuinely curious - if you don't care about the story then why play an RPG? When you're speedrunning - sure, skip all of the cutscenes, but when you're playing casually - why would you want to do that?

  7. > How much would it cost the community to pretrain something with a more modern architecture?

    Quite a lot. Search for "Chroma" (a partial-ish retraining of Flux Schnell) or Pony (a partial-ish retraining of SDXL). You're probably looking at a cost of at least tens of thousands or even hundreds of thousands of dollars. Even bigger SDXL community finetunes like bigASP cost thousands.

    And it's not only the compute that's the issue. You also need a ton of data. You need a big dataset, with millions of images, and you need it cleaned, filtered, and labeled.

    And of course you need someone who knows what they're doing. Training these state-of-the-art models takes quite a bit of skill, especially since a lot of it is pretty much a black art.

  8. AFAIK a big part of it is that they distilled the guidance into the model.

    I'm going to simplify all of this a lot so please bear with me, but normally the equation to denoise an image would look something like this:

        pos = model(latent, positive_prompt_emb)    # prediction conditioned on the positive prompt
        neg = model(latent, negative_prompt_emb)    # prediction for the negative (or empty) prompt
        next_latent = latent + dt * (neg + cfg_scale * (pos - neg))    # classifier-free guidance step
    
    So what this does is: you run the model once with a negative prompt (which can be empty) to get the "starting point" for the prediction, then you run the model again with a positive prompt to get the direction in which you want to go, and then you combine the two.

    So, for example, let's assume your positive prompt is "dog" and your negative prompt is empty. Running the model with the empty prompt will generate a "neutral" prediction, and then you nudge it in the direction of your positive prompt, in the direction of a "dog". Do this for 20 steps and you get an image of a dog.
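
    To make this concrete, here's roughly what the full sampling loop looks like. This is a minimal sketch of my own; `model`, `random_noise`, `cfg_scale` and the fixed step size are stand-ins, not any particular library's API:

        # Hypothetical sketch of a CFG-guided Euler sampling loop.
        latent = random_noise()              # start from pure noise
        num_steps = 20
        dt = 1.0 / num_steps
        for _ in range(num_steps):
            pos = model(latent, positive_prompt_emb)  # direction towards "dog"
            neg = model(latent, negative_prompt_emb)  # "neutral" baseline (empty prompt)
            latent = latent + dt * (neg + cfg_scale * (pos - neg))
        # decode `latent` with the VAE to get the final image

    Note the two model calls per step - that doubled cost is exactly what guidance distillation gets rid of.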

    Now, for Flux the equation looks like this:

        next_latent = latent + dt * model(latent, positive_prompt_emb)    # a single model call per step
    
    The guidance here was distilled into the model itself. That makes inference cheaper (one model call per step instead of two), but now we can't really train the model much without destroying this embedded guidance (the model will just forget it and collapse).

    There's also the issue of training dynamics. We don't know exactly how they trained their models, so it's impossible for us to jerry-rig our training runs in a similar way. And if you don't match the original training dynamics when finetuning, that also negatively affects the model.

    So you might ask here - what if we just train the model for a really long time, will it be able to recover? And the answer is: yes, but at that point most of the original model will essentially be overwritten. People actually did this for Flux Schnell, but you need way more resources to pull it off and the results can be disappointing: https://huggingface.co/lodestones/Chroma

  9. > It's incredibly clear who the devs assume the target market is.

    Not "assume". That's what the target market is. Take a look at civitai and see what kind of images people generate and what LoRAs they train (just be sure to be logged in and disable all of the NSFW filters in the options).

  10. > it was difficult to fine tune

    Yep. It's pretty difficult to fine-tune, mostly because it's a distilled model. You can fine-tune it a little bit, but it will quickly collapse and start producing garbage, even though fundamentally it should have been an easier architecture to fine-tune than SDXL (since it uses the much more modern flow-matching paradigm).

    I think that's probably why we never really got any good anime Flux models (at least not as good as the ones for SDXL). You just don't have enough leeway to train the model long enough to make it great in a domain it's currently suboptimal for without completely collapsing it.

    If you like old computers and are interested in live-coding music then I highly recommend checking out this video (Making 8-bit Music From Scratch at the Commodore 64 BASIC Prompt):

    https://www.youtube.com/watch?v=ly5BhGOt2vE

  12. > Speaking of Anti-Cheat and secure boot, you need SB for Battlefield 6. The game won't start without it. So it's happening!

    I'm curious, does anyone know how exactly they check for this? How was it actually made unspoofable?

    > For 楷書 (regular script) type fonts this may be true, but you ought to know there’s more to it, don’t you?

    With all due respect, this is the type of comment that really makes online discourse so exhausting.

    Yes, I know! Unless you put up two pages of disclaimers and footnotes there's always someone who comes out of the woodwork and "um ackshually"ies you. It was only supposed to be a quick-and-dirty comment about the topic in general, not an in-depth, ten-page treatise on the subject.

    If you have something to add to what I wrote, then please do so, but heckling random people who post their comments in good faith is not helping anyone.

  14. Fully algorithmically? I have no idea, as I'm not really in the fonts business.

    But I'm pretty sure they're not actually redrawing every character from scratch, and are instead reusing subcomponents (at the very least for normal fonts). How much of that is automated, though - you'd have to ask actual font designers.

  15. > I guess designing a font for a language with 2100 different characters is probably a hassle.

    The ~2000 is the official count taught in schools (the jōyō kanji), but the number "commonly" used in literature is around ~3000. And you actually want more than that, because people's names can use rare kanji which are used nowhere else.

    On the other hand, the vast majority of kanji are actually composed of a limited set of "subcharacters". For example, picking a completely random one:

        徧  ⿰彳扁
    
    The '徧' is composed of '彳' and '扁' arranged in a horizontal pattern. Unicode even has special ideographic description characters (⿰, ⿱, ⿶, etc.) to describe these arrangements.

    So this actually makes creating a CJK font somewhat easier, because you can do it semi-algorithmically. You don't have to manually draw however many thousands of characters there are from scratch; you draw those "subcharacters" and then compose them together.
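
    As a toy illustration of the idea (everything here is hypothetical - real font tools also adjust stroke weights, spacing and optical balance, which this ignores entirely), composing a glyph from its parts could look roughly like this:

        # Hypothetical sketch: composing a kanji from reusable subcomponents.
        # A component is a list of strokes; a stroke is a list of (x, y)
        # points inside a 0..1 design square.

        def scale_x(strokes, x0, x1):
            # Squeeze a component horizontally into the [x0, x1] slice.
            return [[(x0 + x * (x1 - x0), y) for x, y in stroke] for stroke in strokes]

        def compose_lr(left, right, ratio=0.35):
            # The ⿰ (left-right) arrangement: the left part gets `ratio` of the width.
            return scale_x(left, 0.0, ratio) + scale_x(right, ratio, 1.0)

        # Crude stand-ins for the hand-drawn '彳' and '扁' component outlines:
        gyoninben = [[(0.6, 0.95), (0.2, 0.7)], [(0.5, 0.7), (0.2, 0.4)], [(0.35, 0.55), (0.35, 0.05)]]
        hen = [[(0.1, 0.9), (0.9, 0.9)], [(0.15, 0.7), (0.85, 0.7)], [(0.15, 0.7), (0.1, 0.1)]]
        glyph = compose_lr(gyoninben, hen)    # a (very) rough '徧', before manual touch-up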

  16. > The lack of competence from companies that acquire Japanese companies, and then fail to even price things in yen or offer support packages that cater to Japanese customers is really something.

    In general I don't think it's just that. Pretty much all font foundries have... insufferable business models.

    I once emailed a Japanese foundry asking to license one of their fonts for use on my website. I wanted a perpetual, one-time license for a single website, and I wanted to store and serve the font from my own server. I was even prepared to pay low four figures for it.

    Nope. I was told I'd need to pay a subscription fee and use their crappy JavaScript to serve it. Okay, if you don't want my money then I'm not going to insist.

  17. > Given the number of leaks, deliberate publications of weights, and worldwide competition, why do you believe this?

    So where can I find the leaked weights of GPT-3/GPT-4/GPT-5? Or Claude? Or Gemini?

    The only weights we are getting are those which the people on the top decided we can get, and precisely because they're not SOTA.

    If any of those companies stumbles upon true AGI (as unlikely as that is), you can bet it will be tightly controlled, and normal people will either have extremely limited access to it or none at all.

    > Even if by "lobotomised" you mean "refuses to assist with CNB weapon development"

    Right, because people who design/manufacture weapons of mass destruction will surely use ChatGPT to do it. The same ChatGPT that routinely hallucinates wildly incorrect details even for the most trifling queries. If anything, it would only sabotage their efforts if they're stupid enough to use an LLM for that.

    Nevertheless, it's always fun when you ask an LLM to translate something from another language, and the line you're trying to translate coincidentally contains some "unsafe" language, and your query gets deleted and you get a nice, red warning that "your request violates our terms and conditions". Ah, yes, I'm feeling "safe" already.

  18. > Is there any evidence that we're getting some crappy lobotomized models while the companies keep the best for themselves? It seems fairly obvious that they're tripping over each other in a race to give the market the highest intelligence at the lowest price.

    Yes? All of those models are behind an API, which can be taken away at any time, for any reason.

    Also, have you followed the release of gpt-oss, which the overlords at OpenAI graciously gave us (and only because Chinese open-weight releases lit a fire under them)? It was so heavily censored and lobotomized that it became a meme in the local LLM community. Even when people forcibly abliterate it to remove the censorship, it still wastes a ton of thinking tokens on checking whether the query is "compliant with policy".

    Do not be fooled. The whole "safety" talk isn't actually about making anything safe. It's just a smoke screen. It's about control. Remember back in the GPT-3 days how OpenAI said they wouldn't release the model because it would be terribly, terribly unsafe? And yet nowadays we have open-weight models orders of magnitude more intelligent than GPT-3, and the sky hasn't fallen.

    It never was about safety. It never will be. It's about control.

  19. > Anthropic occupies a peculiar position in the AI landscape: a company that genuinely believes it might be building one of the most transformative and potentially dangerous technologies in human history, yet presses forward anyway. This isn't cognitive dissonance but rather a calculated bet—if powerful AI is coming regardless, Anthropic believes it's better to have safety-focused labs at the frontier than to cede that ground to developers less focused on safety (see our core views).

    Ah, yes, safety, because what's safer than helping the DoD/Palantir kill people[1]?

    No, the real risk here is that this technology is going to be kept behind closed doors and monopolized by the rich and powerful, while us scrubs only get limited access to a lobotomized and heavily censored version of it, if we get any access at all.

    [1] - https://www.anthropic.com/news/anthropic-and-the-department-...

  20. > I have actual work to do.

    Me too. That's why I use Linux.

    If you decide to delve deeper into Linux then it pretty much becomes a high investment -> high reward thing. There's a learning curve, but you can customize everything to be exactly how you want, and there are no black boxes whatsoever.

    What does this mean in practice? You set it up once, exactly the way you like it, and it stays stable forever. There are no misfeatures constantly shoved down your throat (unlike, say, Windows with its AI, ads, telemetry and bloat). Your UI doesn't go through pointless redesigns every few years (like on Windows and macOS) unless you want it to. If you don't like something, you can peek under the hood and change it.

    The total amount of time I've spent maintaining my Linux system (or in your parlance: "getting it to work") over the last decade is, I don't know, maybe a dozen hours? But yes, if you're a beginner this is indeed unrealistic.
