ankeshanand
250 karma
AI Researcher
https://twitter.com/ankesh_anand
- 2 points
- If you're an individual developer and not an enterprise, just go straight to Google AI Studio or the Gemini API instead: https://aistudio.google.com/app/apikey. It's dead simple to get an API key and call it with a REST client.
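As a rough sketch of what "calling with a REST client" looks like, here is a minimal Python example against the `generateContent` endpoint. The model name `gemini-1.5-pro` and the exact request shape are assumptions based on the public docs; check AI Studio for the current models, and set `GEMINI_API_KEY` in your environment before running.

```python
import json
import os
import urllib.request

# Assumed model name; consult the AI Studio docs for currently available models.
MODEL = "gemini-1.5-pro"
API_KEY = os.environ.get("GEMINI_API_KEY", "")

def build_request(prompt: str):
    """Build the endpoint URL and JSON body for a generateContent call."""
    url = (f"https://generativelanguage.googleapis.com/v1beta/"
           f"models/{MODEL}:generateContent?key={API_KEY}")
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return url, json.dumps(body).encode("utf-8")

if __name__ == "__main__" and API_KEY:
    url, data = build_request("Say hello in one word.")
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        reply = json.load(resp)
    # The first candidate's text lives under candidates[0].content.parts.
    print(reply["candidates"][0]["content"]["parts"][0]["text"])
```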
- We've done extensive comparisons against GPT-4V for video inputs in our technical report: https://storage.googleapis.com/deepmind-media/gemini/gemini_....
Most notably, at 1 FPS the GPT-4V API errors out at around 3-4 minutes of video, while 1.5 Pro supports up to an hour of video input.
- Has anyone in this subthread actually read the papers and compared the benchmarks? Llama 2 is behind PaLM 2 on all major benchmarks; they spell this out explicitly in the paper.
- You can also rent a Cloud TPU v4 pod (https://cloud.google.com/tpu), which has 4,096 TPU v4 chips with fast interconnect, amounting to around 1.1 exaflops of compute. It won't be cheap though (in excess of $20M/year, I believe).
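The 1.1-exaflop figure checks out as back-of-the-envelope arithmetic. The per-chip number below (~275 TFLOPS peak bf16 for TPU v4) is my assumption, not something stated in the comment:

```python
# Pod-level compute = chips per pod * peak per-chip throughput.
CHIPS = 4096
PEAK_TFLOPS_PER_CHIP = 275  # assumed TPU v4 bf16 peak; approximate

pod_exaflops = CHIPS * PEAK_TFLOPS_PER_CHIP * 1e12 / 1e18
print(f"{pod_exaflops:.2f} exaflops")  # roughly 1.1 exaflops
```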
- It's important in this context that RL does not have a performance ceiling.
- Looks like any GitHub Pages sites served through Cloudflare are getting blocked; I'm trying out a fix.
- 22 points
- Yep, Karpathy has mentioned this multiple times in his AI talks.
- If you carefully curate who you follow, Twitter can be more like a bunch of subreddits, with the added signal of knowing who's posting. So it ends up being a great way to keep up with small communities.
- Sorry if that wasn't clear: I do mention the linear-classification protocol several times in the post. If you want to evaluate performance on a classification task, you have to show labels during evaluation; otherwise it's an impossible task. Note that the encoder is frozen during evaluation, and only a linear classifier is trained on top. Even when evaluated with a limited set of labels (as low as 1%), contrastive pretraining outperforms purely supervised training by a large margin (see Figure 1 in the Data-Efficient CPC paper: https://arxiv.org/abs/1905.09272).
I didn't get the second part, unfortunately. Could you elaborate and clarify whether you're referring to a specific paper?
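The linear-evaluation protocol described above can be sketched as a toy example: a frozen encoder whose weights are never updated, with only a linear classifier trained on top. Everything here (the 2-D data, the fixed encoder weights, the perceptron update) is illustrative and not from the paper:

```python
import random

random.seed(0)

# Toy stand-in for a pretrained encoder: a fixed (frozen) linear map.
# In the actual protocol this would come from contrastive pretraining,
# and it is never updated during evaluation.
W_ENC = [[0.9, -0.2], [0.1, 1.1]]

def encode(x):
    """Frozen encoder: applies the fixed linear map W_ENC to input x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W_ENC]

# Tiny labeled set: class 0 clustered near (-1,-1), class 1 near (+1,+1).
data = [([random.gauss(-1, 0.3), random.gauss(-1, 0.3)], 0) for _ in range(50)]
data += [([random.gauss(1, 0.3), random.gauss(1, 0.3)], 1) for _ in range(50)]

# Linear probe: the ONLY trainable parameters in this evaluation.
w, b, lr = [0.0, 0.0], 0.0, 0.1
for _ in range(200):
    for x, y in data:
        z = encode(x)  # encoder output; W_ENC itself is never touched
        pred = 1 if sum(wi * zi for wi, zi in zip(w, z)) + b > 0 else 0
        err = y - pred  # perceptron-style update on the probe only
        w = [wi + lr * err * zi for wi, zi in zip(w, z)]
        b += lr * err

acc = sum(
    (1 if sum(wi * zi for wi, zi in zip(w, encode(x))) + b > 0 else 0) == y
    for x, y in data) / len(data)
print(f"linear-probe accuracy: {acc:.2f}")
```

The point of the protocol is that a high probe accuracy reflects the quality of the frozen representation, since the classifier itself has almost no capacity.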
- I didn't mean to convey that we should abandon generative self-supervised methods, but I can see how the comparison gives that impression.
Agreed that using them in conjunction would make sense, since generative methods could capture some features better, and vice versa.
- 97 points
- Great piece! You might want to update the article to mention PyTorch Mobile, which was released today: https://pytorch.org/mobile/home/
- There's active research in model-based RL right now that tries to tackle (1) and (2) together.