ankeshanand
250 karma
AI Researcher https://twitter.com/ankesh_anand

  1. If you're an individual developer and not an enterprise, just go straight to Google AI Studio or the Gemini API instead: https://aistudio.google.com/app/apikey. It's dead simple to get an API key and call it with a REST client.
  2. We've done extensive comparisons against GPT-4V for video inputs in our technical report: https://storage.googleapis.com/deepmind-media/gemini/gemini_....

    Most notably, at 1 FPS the GPT-4V API errors out around 3-4 minutes, while 1.5 Pro supports up to an hour of video input.

  3. Has anyone in this subthread actually read the papers and compared the benchmarks? Llama 2 is behind PaLM 2 on all major benchmarks; they spell this out explicitly in the paper.
  4. You can also rent a Cloud TPU v4 pod (https://cloud.google.com/tpu), which has 4096 TPU v4 chips with fast interconnect, amounting to around 1.1 exaflops of compute. It won't be cheap though (in excess of $20M/year, I believe).
  5. It's important in the context that RL does not have performance ceilings.
  6. Looks like any GitHub Pages sites served through Cloudflare are getting blocked; I'm trying out a fix.
  7. Yep, Karpathy has mentioned this multiple times in his AI talks.
  8. If you carefully curate who you follow, Twitter can be more like a bunch of subreddits, with the added signal of knowing who's posting. So it ends up being a great way to keep up with small communities.
  9. Sorry if it wasn't clear, I do mention the linear classification protocol several times in the post. If you want to evaluate performance on a classification task, you have to show it labels during evaluation; otherwise it's an impossible task. Note that the encoder is frozen during evaluation, and only a linear classifier is trained on top. Even when evaluated with a limited set of labels (as low as 1%), contrastive pretraining outperforms purely supervised training by a large margin (see Figure 1 in the Data-Efficient CPC paper: https://arxiv.org/abs/1905.09272).

    I did not get the second part unfortunately, could you elaborate more and clarify if you are talking about a specific paper?

  10. I didn't mean to convey that we should abandon generative self-supervised methods, but I can see how comparing them gives that impression.

    Agree that using them in conjunction would make sense, since generative methods could capture some features better and vice versa.

  11. Great piece. You might want to update the article to mention PyTorch Mobile, which was released today: https://pytorch.org/mobile/home/
  12. There's active research in Model-Based RL right now that tries to tackle 1) and 2) together.
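Item 1's suggestion — grab an API key and hit the Gemini API with a plain REST client — can be sketched with nothing but the standard library. This is a minimal sketch, assuming the public `v1beta` `generateContent` endpoint; the model name and payload shape here are illustrative, so check the official docs before relying on them:

```python
import json
import urllib.request

API_BASE = "https://generativelanguage.googleapis.com/v1beta/models"

def build_generate_request(api_key: str, prompt: str,
                           model: str = "gemini-1.5-pro"):
    """Build the URL and JSON body for a generateContent call."""
    url = f"{API_BASE}/{model}:generateContent?key={api_key}"
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return url, json.dumps(body).encode("utf-8")

def generate(api_key: str, prompt: str) -> str:
    """POST the request and return the first candidate's text."""
    url, data = build_generate_request(api_key, prompt)
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        out = json.load(resp)
    return out["candidates"][0]["content"]["parts"][0]["text"]
```

Any HTTP client (curl, Postman, `requests`) works the same way: a POST with the key as a query parameter and a small JSON body.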
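The "around 1.1 exaflops" figure in item 4 is easy to sanity-check, assuming a per-chip peak of roughly 275 bf16 TFLOPS for TPU v4 (that per-chip number is an assumption here, not stated in the comment):

```python
PEAK_TFLOPS_PER_CHIP = 275   # assumed TPU v4 peak bf16 throughput per chip
CHIPS_PER_POD = 4096         # chip count from the comment above

# Total pod throughput in FLOP/s, then expressed in exaflops.
pod_flops = CHIPS_PER_POD * PEAK_TFLOPS_PER_CHIP * 1e12
print(pod_flops / 1e18)  # ≈ 1.1264 exaflops, matching the ~1.1 figure
```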
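The linear evaluation protocol described in item 9 — freeze the pretrained encoder, train only a linear classifier on its features — can be sketched as follows. This uses NumPy stand-ins (a fixed random projection as the "pretrained" encoder, toy labeled data), so it illustrates the protocol's shape rather than any real pretraining setup:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a frozen pretrained encoder: fixed weights that are
# never updated during evaluation.
W_enc = rng.standard_normal((32, 8))

def encoder(x):
    return np.maximum(x @ W_enc, 0.0)  # frozen ReLU features

# Toy labeled data for the downstream classification task.
X = rng.standard_normal((256, 32))
y = (X[:, 0] > 0).astype(int)

# Linear probe: the ONLY trainable parameters are w and b.
feats = encoder(X)  # encoder applied once; its weights stay fixed
w = np.zeros(8)
b = 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(feats @ w + b)))  # sigmoid
    grad_w = feats.T @ (p - y) / len(y)         # logistic-loss gradients
    grad_b = np.mean(p - y)
    w -= 0.5 * grad_w
    b -= 0.5 * grad_b

acc = np.mean((feats @ w + b > 0) == y)  # probe accuracy on the toy data
```

The key point is that only `w` and `b` are updated, so the accuracy measures how linearly separable the frozen representation is — which is exactly what the protocol is designed to test.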
