
sakex
Joined · 338 karma

  1. Levels in big tech are just a way to keep you motivated. You'll work harder to get a promo.

    In the end it doesn't matter: you'll make more money by either leaving or getting a retention offer.

  2. Exactly like Trump is doing to you.
  3. I'd be surprised if they didn't scale it up.
  4. There are new things being tested and yielding results every month in modelling. We've deviated quite a bit from the original multi-head attention.
  5. Maybe add the date to the title, because it's hardly new at this point
  6. What about MRI? Just had one. Sorry if it's a stupid question, I don't know much about this
  7. The anomaly here is AI researchers
  8. > They won’t notice those extra 10 milliseconds you saved

    Depends what you're doing. In my case I'm saving microseconds on the step time of an LLM used by hundreds of millions of people.
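
    As a rough back-of-envelope (the numbers here are made up purely for illustration):

        # Why microseconds per decode step add up at scale.
        # All figures below are hypothetical assumptions.
        saved_per_step_s = 5e-6          # 5 microseconds saved per decode step
        steps_per_request = 500          # tokens generated per request (assumed)
        requests_per_day = 100_000_000   # daily requests (assumed)

        saved_s = saved_per_step_s * steps_per_request * requests_per_day
        print(f"{saved_s / 3600:.0f} accelerator-hours saved per day")
        # -> ~69 accelerator-hours per day under these assumptions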

  9. The point you're missing is that people were making the same kind of comments about Amazon and Uber not too long ago
  10. I was laid off from Google in January last year alongside 150 people in my extended team. I managed to find a different team in Gemini, so now I'm part of DeepMind. I have very conflicting feelings: on one hand, I really enjoy the work, the team, and the absolute genius of the people I get to talk to; on the other hand, I have some resentment for being laid off so inhumanely, I am sad for the people on my team who were not as lucky as me, and I know it can happen again at any time.
  11. > I have some doubt about #2. Weren't Big Tech companies paying senior engineers $300K+ - in 2025-adjusted dollars - back in 2013?

    Yes, but big tech got bigger. Google had a quarter of its current workforce back then, for instance, and Meta a tenth. It became much easier to get into those companies.

  12. Such an accurate description of what I went through in Switzerland. Kids are mean, and I had to be mean to survive too. It stained my character in ways I am still trying to overcome more than a decade later.
  13. Name and shame
  14. That's something I wish I had when I started looking at einsums. It gets interesting when you start thinking about optimal contraction paths (like in the opt_einsum package), sharded/distributed einsums, and ML accelerators.
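
    For example, a minimal sketch of why the contraction path matters (shapes are arbitrary, picked so the naive order is clearly worse):

        import numpy as np
        import opt_einsum as oe

        # Matrix chain where contraction order matters a lot:
        # (1000x2) @ (2x1000) @ (1000x2)
        A = np.random.rand(1000, 2)
        B = np.random.rand(2, 1000)
        C = np.random.rand(1000, 2)

        # Left-to-right builds a 1000x1000 intermediate (~4M multiplies);
        # contracting B@C first keeps intermediates tiny (~8K multiplies).
        path, info = oe.contract_path('ij,jk,kl->il', A, B, C)
        print(info)   # shows the chosen pairwise order and estimated FLOPs

        out = oe.contract('ij,jk,kl->il', A, B, C)                 # optimized path
        ref = np.einsum('ij,jk,kl->il', A, B, C, optimize=False)   # naive order
        assert np.allclose(out, ref)
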
  15. I have a hard time imagining who would suffer from this more, but I have an even harder time imagining anyone would benefit from this.
  16. Great article. I think what you should cover next is collective matmuls and sharding.
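
    For reference, a minimal sharded-matmul sketch in Jax (device count and shapes are placeholders, and this is just one way to set it up, not anything from the article):

        import jax
        import jax.numpy as jnp
        from jax.experimental import mesh_utils
        from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

        # 1-D device mesh over however many devices are available.
        mesh = Mesh(mesh_utils.create_device_mesh((jax.device_count(),)),
                    axis_names=('model',))

        x = jnp.ones((256, 512))
        w = jnp.ones((512, 1024))

        # Shard both operands along the contracting dimension: each device holds
        # a slice of x's columns and w's rows and computes a partial product.
        x = jax.device_put(x, NamedSharding(mesh, P(None, 'model')))
        w = jax.device_put(w, NamedSharding(mesh, P('model', None)))

        @jax.jit
        def matmul(x, w):
            # Under jit, XLA combines the per-device partial sums with a
            # collective (all-reduce) across the 'model' axis.
            return x @ w

        y = matmul(x, w)
        print(y.sharding)   # inspect how the result ended up sharded
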
  17. Using Jax, you should be able to get good performance out of the box
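
    For instance, a tiny generic example (not tied to this particular thread) of getting a compiled kernel with almost no effort:

        import jax
        import jax.numpy as jnp

        # jax.jit traces the function once and hands it to XLA, which fuses the
        # elementwise ops; no manual tuning needed to get a decent kernel.
        @jax.jit
        def gelu(x):
            return 0.5 * x * (1.0 + jnp.tanh(0.79788456 * (x + 0.044715 * x**3)))

        x = jnp.linspace(-3.0, 3.0, 1_000_000)
        y = gelu(x)              # first call compiles, subsequent calls are fast
        y.block_until_ready()
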
  18. Shameless plug:

    I have an implementation in Rust (CPU only) with added GRU gates

    https://crates.io/crates/neat-gru

  19. I agree to some extent, but I'd say it depends on the programming language. For instance, in Python, you absolutely need documentation because you don't know what type is expected, what values can be passed, or even which optional kwargs exist. It's also very common to pass strings as parameters where one should be using enums, which means you can't know what the possible values are without reading the doc or diving into the code.

    In Rust, however, you can get away with much less documentation because the type system and signatures are self-explanatory. And if you're doing things right, the set of values you can pass to your function is limited.
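
    A small sketch of the strings-vs-enums point in Python (the function and names are hypothetical):

        from enum import Enum

        # With a bare string parameter, nothing in the signature tells the
        # caller which values are legal; you need the docs or the source.
        def resize(image, mode: str):
            if mode not in ("nearest", "bilinear", "bicubic"):
                raise ValueError(f"unknown mode: {mode}")

        # With an enum, the full set of legal values is part of the signature,
        # and a typo fails immediately at the call site.
        class ResizeMode(Enum):
            NEAREST = "nearest"
            BILINEAR = "bilinear"
            BICUBIC = "bicubic"

        def resize_typed(image, mode: ResizeMode):
            pass

        resize_typed(image=None, mode=ResizeMode.BILINEAR)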

  20. Jax is really nice
  21. There is no preemption in Javascript. It is based on cooperative multitasking[1] (your await statements and non-blocking callbacks), which is the opposite of preemption.

    [1] https://en.wikipedia.org/wiki/Cooperative_multitasking#:~:te....
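
    The same cooperative model, sketched with Python's asyncio for concreteness (Javascript's event loop behaves analogously):

        import asyncio, time

        # Cooperative multitasking: a task only gives up the CPU at an await.
        # A task that never awaits starves everything else; nothing preempts it.
        async def polite():
            for i in range(3):
                print("polite tick", i)
                await asyncio.sleep(0)   # voluntarily yields to the event loop

        async def greedy():
            time.sleep(1)                # blocking call: nothing else can run
            print("greedy done")

        async def main():
            await asyncio.gather(polite(), greedy())

        asyncio.run(main())
        # polite tick 0, then a 1 s stall while greedy hogs the loop,
        # then greedy done, polite tick 1, polite tick 2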

  22. They don't bother explaining
  23. According to the article they link[1], the richest 1% is defined as anyone earning more than $109,000 in 2015. So basically, they want to tax the people who already pay the most taxes. Since the actually rich have the means to avoid paying taxes altogether, it's the educated middle class that's once again expected to pay for opaque, poorly managed, and unaccountable government programs.

    Glad I live in Switzerland.

    [1] https://www.oxfam.org/fr/communiques-presse/les-1-les-plus-r....

  24.     Location: Switzerland
        Remote: Yes
        Willing to relocate: Yes
        Technologies: Jax, Rust, C++, Python, JS, TS, Go
        Résumé/CV: Currently working on Gemini at Google, previous experience at a smaller company where I did computer vision
        https://www.linkedin.com/in/alexandre-senges-0a02a4111?utm_source=share&utm_campaign=share_via&utm_content=profile&utm_medium=android_app
        Email: senges.alex [at] gmail.com
  25. I think I heard Google was aiming to do that by 2030. Not sure though
  26. Another day I'm happy my country (Switzerland) respects its citizens' wish not to join the EU
  27. An interesting parameter that I don't read about a lot is vocab size. A larger vocab means you need to generate fewer tokens per word on average, and the same context window covers more text. This means that a model with a large vocab might be more expensive on a per-token basis, but it will generate fewer tokens for the same sentence, which can make it cheaper overall. This should be taken into consideration when comparing API prices.
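
    A toy cost comparison to make that concrete (all numbers are made up for illustration):

        # Model B charges more per token, but its larger vocab needs fewer
        # tokens to encode the same text, so it ends up cheaper overall.
        words = 10_000                           # length of the generated text

        tokens_per_word_a, price_a = 1.5, 2.00   # model A: small vocab, $2.00 / 1M tokens
        tokens_per_word_b, price_b = 1.2, 2.30   # model B: large vocab, $2.30 / 1M tokens

        cost_a = words * tokens_per_word_a * price_a / 1_000_000
        cost_b = words * tokens_per_word_b * price_b / 1_000_000
        print(f"A: ${cost_a:.4f}   B: ${cost_b:.4f}")
        # -> A: $0.0300   B: $0.0276  (B is cheaper despite the higher per-token price)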
