
mynti
Joined 93 karma

  1. For all releases of this kind I ask myself: if it worked well, they would not release it for free
  2. Is it actually possible that Nvidia chips will have 50 TB/s of bandwidth by 2028? Right now the chart shows they are at 8 TB/s. To me the Nvidia forecast looks like a very, very optimistic exponential. Nonetheless, Huawei not matching that scale of production seems to be the biggest hurdle
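    As a rough sanity check on that forecast, here is a sketch of the implied annual growth rate, assuming the 8 TB/s figure is today's baseline (2025) and the 50 TB/s target is 3 years out:

```python
# Implied compound annual growth rate if memory bandwidth goes from
# 8 TB/s today to 50 TB/s by 2028 (assumed to be 3 years out).
current_tbps = 8.0
target_tbps = 50.0
years = 3

cagr = (target_tbps / current_tbps) ** (1 / years) - 1
print(f"implied growth: {cagr:.0%} per year")  # roughly 84% per year
```

    So the forecast requires bandwidth to nearly double every year, which is indeed a very aggressive exponential.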
  3. >> Disney and OpenAI affirm a shared commitment to responsible use of AI that protects the safety of users and the rights of creators.

    Wow, so Sora slop is coming to paid Disney+?

  4. It is really hard to figure out what is satire and what is actual news these days with the orange man..
  5. To me, this only makes sense on the word level, not the sentence level. I can understand that words, especially older ones, have evolved under the energy and comfort constraints of our physiology. But extending this to the sentence level is a rather big step. I would suppose it works for simple, short sentences that had to be efficient in the past. But imagine sentences about computer science, where most words are rather new and were chosen by arbitrary rules. To me it would be interesting to see whether this hypothesis holds when applied to longer and more complex sentences and "modern" words.
  6. That is partly true. High population density means a lot of roof area. Solar is perfect for roofs: you need no extra land. It is basically free (save the investment in the panels, which pays off quickly nowadays)
  7. If we think of every generation as a compression step of some form of information into our DNA, and early humans existed for ~1,000,000 years with a generation happening every ~20 years on average, then we have only ~50,000 compression steps to today. Of course, we have genes from both parents, so there is some overlap from others, but especially in the early days the pool of other humans was small. So that still does not look anywhere close in order of magnitude to modern machine learning. Sure, early humans already had a lot of information in their DNA, but still
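    The back-of-the-envelope count above, as a one-liner using the same figures:

```python
# Rough count of generational "compression steps" from early humans to
# today, using the figures from the comment above.
species_age_years = 1_000_000
generation_years = 20

steps = species_age_years // generation_years
print(steps)  # 50000
```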
  8. ".. Claude estimates that AI reduces task completion time by 80%. We use Claude to evaluate anonymized Claude.ai transcripts to estimate the productivity impact of AI."

    What is this? So they take Claude and ask it how much time it thinks was saved here? How can you take this seriously? Chatbots tend to exaggerate, especially about something positive like this.

  9. We are so much more productive and efficient, so we should even work on weekends to be even more productive.. and then they build an ID system for a problem that has been solved for ages with simple passports
  10. It is interesting that Gemini 3 beats every other model on these benchmarks, mostly by a wide margin, but not on SWE-bench. Sonnet is still king here, and all three look to be basically on the same level. Kind of wild to see them hit such a wall when it comes to agentic coding
  11. So after all those people killed themselves while ChatGPT encouraged them, they make their model, yet again, more 'conversational'. It is hard to believe how you could justify this.
  12. Thanks for creating this! I have been trying it out, and it is a denser monospace font than my others, so I can actually see more horizontally on my screen.
  13. Cool idea! I had a look at the code and have been wondering about the sigmoid gating: it is used to mix some of q_struct and k_struct into the original key and query. But why is this gating independent of the input? I would have expected it to depend on the input, so that if the model sees something more complex it takes more of this information (or something similar). But it is just a fixed, learnable parameter per layer, or am I mistaken? What is the intuition behind this?
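    To illustrate the question, a toy sketch in pure Python (all names and values hypothetical, not taken from the repo) of the fixed per-layer gate as described, next to the input-dependent variant being asked about:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

q = [0.5, -1.2, 0.3]         # original query vector (toy values)
q_struct = [0.1, 0.4, -0.2]  # structural component to mix in

# As described in the comment: one fixed, learnable scalar gate per layer,
# independent of whatever input the model currently sees.
g = 0.3  # learnable parameter
q_fixed = [qi + sigmoid(g) * si for qi, si in zip(q, q_struct)]

# Hypothetical input-dependent variant the commenter asks about: the gate
# is computed from the query itself via a learnable projection w_g, so
# different inputs get different amounts of the structural signal.
w_g = [0.2, -0.1, 0.5]  # learnable weights (toy values)
g_x = sigmoid(sum(wi * qi for wi, qi in zip(w_g, q)))
q_dynamic = [qi + g_x * si for qi, si in zip(q, q_struct)]
```

    The second variant costs a few extra parameters per layer but lets the gate vary per token, which is the behavior the comment is asking about.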
  14. If you put it against the value created from these hours, the graph almost flips entirely: https://figure.nz/chart/mMmSnWWbULiK4SvY-17BBScq4PaYeiUnz

    Also: in some countries, like Germany, there is a lot of part time work for mothers, which does impact this statistic quite a bit

  15. "Beats at writing" is kind of far-fetched, since the page provides three generic prompts with answers from Opus, GPT5 and K3. But a 3.5T model is crazy huge, and the answers still feel more or less the same as the other models'
  16. Super interesting blog post. I just wonder how this is actually different from LoRA, since LoRA also adds some parameters and freezes the rest of the model. This seems like a sparse, memory-efficient LoRA with a couple of extra steps, since it uses attention again to make the sparsity work. All while being a lot more effective than LoRA (a performance drop of only 11% compared to 71%).
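    For intuition on why LoRA-style adapters are cheap: they replace a full weight update with a low-rank one, W' = W + B @ A, training only A and B. A parameter-count sketch (the dimensions here are illustrative, not taken from the post):

```python
# Parameter-count comparison: full fine-tuning vs a LoRA-style low-rank
# update W' = W + B @ A with W frozen. Dimensions are illustrative only.
d_in, d_out, r = 4096, 4096, 16

full_params = d_in * d_out            # training W directly
lora_params = r * d_in + d_out * r    # training only A (r x d_in) and B (d_out x r)

print(full_params, lora_params, lora_params / full_params)
```

    At rank 16 the adapter trains well under 1% of the parameters of the full matrix, which is the baseline any sparse alternative has to beat.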
  17. Make others turn the web into a perfect LLM training set, perfect strategy
  18. This will probably take a little longer for private use, but the industrial sector is already doing this: cooling chambers being cooled down further when electricity is cheap (or when the sun is shining, if they have their own solar), or heat/"cool" being stored underground
  19. For reference: Circle (USDC) has about 74B market cap and is valued at 30B. Tether (USDT) has about 172B market cap and is about to be valued at 500B. I get that Tether is the market leader and all, but that valuation is totally nuts
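    Using the figures above, the valuation-to-market-cap ratios make the gap explicit:

```python
# Valuation relative to stablecoin market cap, using the figures from the
# comment above (billions of USD).
circle_ratio = 30 / 74     # Circle: $30B valuation on $74B USDC market cap
tether_ratio = 500 / 172   # Tether: $500B valuation on $172B USDT market cap

print(f"Circle: {circle_ratio:.2f}x, Tether: {tether_ratio:.2f}x")
```

    Tether would be valued at roughly seven times more per dollar of stablecoin outstanding than Circle.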
  20. No clue why they are calling a 560B model "Flash". But one very interesting thing: it beats all the other frontier models on safety, most of them by quite a margin. The same goes for formal theorem proving, where I would have expected OpenAI to do way better
  21. How does Memori choose what part of past conversations is relevant to the current conversation? Is there some maximum amount of memory it can feasibly handle before it will spam the context with irrelevant "memories"?
  22. “To be an artist means you must declare a loyalty to your art form and your vision that runs deeper than almost any other, even sometimes deeper than blood kinship.”

    That is absurd. Selling out your children's childhood for some sort of pseudo-deep "art" and defending it this way is almost psychopathic

  23. Balcony solar is absolutely awesome in Germany. I get about a 30% return on investment per year on my small solar panel. Hard not to do it. I have no idea why it is still a bit of a niche
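    A 30% annual return implies a payback period of roughly three years; a minimal sketch (the cost and savings figures are hypothetical, only the 30% is from the comment):

```python
# Payback period implied by a 30% annual return on investment.
panel_cost_eur = 500.0      # hypothetical cost of a small balcony setup
annual_savings_eur = 150.0  # 30% of the investment saved per year

roi = annual_savings_eur / panel_cost_eur
payback_years = panel_cost_eur / annual_savings_eur
print(f"ROI {roi:.0%}, payback in {payback_years:.1f} years")
```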
  24. For anyone curious about what the Gated Delta Network is: https://arxiv.org/pdf/2412.06464
  25. This is a bit disingenuous, since the tokens are free until September 10th, so of course usage is spiking and people are flocking to it.
  26. This is super cool! One thing I find counterintuitive is that GPT5 and o3 do not have better performance. GPT5 gets about 800k on average per round, but I would have expected it to be nearly perfect, since these are not particularly hard questions, mostly trivia or simple lookup-knowledge questions. There is little reasoning involved, so I expected the big models to do much better.
  27. I have been anticipating this release for some time now. I am an avid user of Ecosia and find the results passable in comparison to Google. A couple of queries a week I am unsatisfied with, and these days I switch to Google or a chatbot for those. I hope this new index is comparable in quality and one step closer to a bit more competition in this market. Sadly, I could not find an official blog post or anything on the Qwant news site..
  28. I clicked because I was very curious what this actually is, but it really is just what it says. The problem is, there is no twerking; the girls are just swaying from left to right really slowly, which is not really twerking, but that is beside the point. Why are things like this being built? How can someone look themselves in the eye after this?
  29. This seems very interesting: they took a big sensor dataset and generated some text from it. I guess this involves things like maximum values, mean values, maybe simple trends, and whether the person was walking or biking, etc. It would be interesting to see if the model identifies things that were not so easily provided in the training data. Otherwise this is just teaching the model to sort of calculate the mean from sensor data instead of using tools to do it
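    The kind of sensor-to-text pair being guessed at could look like this toy sketch (all values and wording hypothetical, not from the paper):

```python
# Toy sketch of a sensor-to-text training pair: compute simple statistics
# from a signal and render them as a caption, which is roughly what the
# comment suspects the dataset generation did.
readings = [0.9, 1.1, 1.4, 2.8, 3.0, 2.9, 1.2, 1.0]  # toy accelerometer magnitudes

mean_val = sum(readings) / len(readings)
max_val = max(readings)
peak_idx = readings.index(max_val)
trend = "rising then falling" if peak_idx not in (0, len(readings) - 1) else "monotonic"

caption = f"mean acceleration {mean_val:.2f} g, peak {max_val:.2f} g, {trend}"
print(caption)
```

    A model trained only on pairs like this would mostly be learning to reproduce the statistics that were already baked into the captions, which is the concern raised above.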

