howlgarnish
Coral is powered by an Edge TPU (Tensor Processing Unit), which wipes the floor with GPU boards like the Jetson Nano when it comes to running TensorFlow:

https://blog.usejournal.com/google-coral-edge-tpu-vs-nvidia-...

...and Google is pretty invested in TPUs, since it uses lots of them in house.

https://en.wikipedia.org/wiki/Tensor_Processing_Unit


michaelt
They might be great for inference with TensorFlow, but from what I can tell from Google's documentation, Coral doesn't support training at all.

I'm sure an ML accelerator that doesn't support training will be great for applications like mass-produced self-driving cars. But for hobbyists - the kind of people who care about the difference between a $170 dev board and a $100 dev board - being unable to train is a pretty glaring omission.

MichaelBurge
You wouldn't want to use it for training: this chip can do 4 INT8 TOPS at 2 watts. A Tesla T4 can do 130 INT8 TOPS at 70 watts, plus 8.1 FP32 TFLOPS.

Assuming that INT8-to-FP32 ratio holds, you'd maybe get 250 GFLOPS for training (4 × 8.1 / 130 ≈ 0.25 TFLOPS). The Nvidia GeForce 9800 GTX that I bought in 2008 gets 432 GFLOPS according to a quick Google search.

Hobbyists don't care about power efficiency for training, so buy any GPU made in the last 12 years instead, train on your desktop, and transfer the trained model to the board.
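
For what it's worth, here's the back-of-envelope arithmetic behind that estimate, assuming the T4's published FP32-to-INT8 ratio carries over to the Edge TPU (a rough assumption, not a measured figure):

    # Back-of-envelope: scale the Edge TPU's INT8 throughput by the T4's
    # FP32-to-INT8 ratio to guess a hypothetical FP32 figure for training.
    t4_int8_tops = 130.0      # Tesla T4 published INT8 TOPS
    t4_fp32_tflops = 8.1      # Tesla T4 published FP32 TFLOPS
    edge_tpu_int8_tops = 4.0  # Coral Edge TPU published INT8 TOPS

    est_fp32_tflops = edge_tpu_int8_tops * (t4_fp32_tflops / t4_int8_tops)
    print(f"~{est_fp32_tflops * 1000:.0f} GFLOPS")  # ~249 GFLOPS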

rewq4321
On the other hand, it would be useful for people experimenting with low-compute online learning. Also, those types of projects tend to have novel architectures that benefit from the generality of a GPU.

Last I've heard, COVID was making GPUs about as difficult to find as the other things it's jacked the prices up on, too.
gridlockd
You can get pretty much any GPU at pre-COVID prices right now, except for the newest-generation NVIDIA GPUs that just launched to higher-than-expected demand.
omgwtfbyobbq
As a hobbyist in a state with relatively high electricity prices, I do care about the power efficiency of training.
jnwatson
Training is what the cloud is for.
wongarsu
That makes a $170 board that can also do training look dirt cheap in comparison.
lawrenceyan
Good luck training anything in any reasonable time on it.
R0b0t1
Useful for adapting existing models. Not everything needs millions of hours of input.
tachyonbeam
If you want to train yet another convnet, sure, but there could be applications where you want to train directly on a robot with live data, as in interactive learning.

See this paper for an example of interactive RL: https://arxiv.org/abs/1807.00412

suyash
Or a highly rigged machine. This looks to be more for fast real-time ML inference on the edge.
debbiedowner
You can adapt the final layer of weights on the Edge TPU (see the sketch at the end of this comment).

Training on a dev board should be a last resort.

Even hobbyists can afford to rent GPUs for training on vast.ai or emrys.
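
A minimal sketch of that last-layer adaptation done offline in plain TensorFlow/Keras (the backbone, class count, and dataset name here are illustrative assumptions; the Coral on-device API does the equivalent on the Edge TPU itself):

    import tensorflow as tf

    # Minimal sketch of "adapt only the final layer": freeze a pretrained
    # backbone and train just a new classification head.
    base = tf.keras.applications.MobileNetV2(
        input_shape=(224, 224, 3), include_top=False, pooling="avg")
    base.trainable = False  # backbone stays fixed; only the head learns

    model = tf.keras.Sequential([
        base,
        tf.keras.layers.Dense(5, activation="softmax"),  # 5 classes, illustrative
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])

    # train_ds would be a tf.data.Dataset of (image, label) pairs, placeholder here:
    # model.fit(train_ds, epochs=5)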

pinewurst
Google is pretty invested in TPUs for their own workloads but I fail to see any durable encouragement of them as an external product. At best they're there to encourage standalone development of applications/frameworks to be deployed on Google Cloud (IMHO of course).
tachyonbeam
AFAIK, apart from toy dev boards like this, you can't buy a TPU; you can only rent access to them in the cloud. I wouldn't want my company to rely on that. What if Google decides to lock you out? If you've adapted your workload to rely on TPUs, you'd be fucked.
akiselev
What's the difference between Coral's production line of Edge TPU modules and chips [1] and Google's cloud TPU offering?

Note: I haven't tried sourcing these in production (100k+) quantities so I have no idea what guarantees that product line gives customers.

[1] https://coral.ai/products/#production-products

usmannk
They're nothing alike at all. It's similar to how a low-end laptop GPU differs from a top-of-the-line NVIDIA datacenter offering. Google's Cloud TPU offering is the strongest ML training hardware that exists; the edge devices simply support the same API.
debbiedowner
The Edge TPU is 2 TFLOPS at half precision; a Cloud TPU starts at 140 TFLOPS single precision and scales further.

Also, the Edge TPU runs at 2-5 watts. Supposedly Cloud TPUs are more power efficient than GPUs; for example, the 14 TFLOPS 2080 regularly ran at 300 W.

popinman322
Coral can only run inference, and is optimized for models using 8-bit integers (via quantization).

A full TPU v2/v3 can train models and use 16/32-bit floats. They also support bfloat16, Google's 16-bit floating point format that keeps FP32's exponent range but has reduced mantissa precision.
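
For context, the usual route to an Edge TPU model is post-training full-integer quantization with the TFLite converter, then compiling the result with the edgetpu_compiler tool. A rough sketch, using a toy Keras model and a random placeholder representative dataset (both purely illustrative):

    import numpy as np
    import tensorflow as tf

    # Sketch of post-training full-integer quantization for an Edge TPU target.
    model = tf.keras.Sequential([
        tf.keras.layers.InputLayer(input_shape=(224, 224, 3)),
        tf.keras.layers.Conv2D(8, 3, activation="relu"),
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(10),
    ])

    def representative_dataset():
        # Random data stands in for a small sample of real inputs.
        for _ in range(100):
            yield [np.random.rand(1, 224, 224, 3).astype(np.float32)]

    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = representative_dataset
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    converter.inference_input_type = tf.uint8   # integer-only I/O
    converter.inference_output_type = tf.uint8
    tflite_model = converter.convert()

    with open("model_quant.tflite", "wb") as f:
        f.write(tflite_model)
    # model_quant.tflite is then compiled for the Edge TPU with edgetpu_compiler.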

kordlessagain
And don't forget, TPUs are horrible at floating point math! The errors!
debbiedowner
Yeah, I've been wondering about charts I've seen comparing TPU-trained model quality to GPU-trained model quality, like here [1], and whether that could be due to error correction. At the same time, training on gaming GPUs like the 1080 Ti or 2080 Ti is widely popular, though they lack the ECC memory of the "professional" Quadro cards or the V100. I did think conventional DL wisdom said "precision doesn't matter" and "small errors don't matter", though.

I've noticed this difference in quality in my own experiments, TPU vs. gaming GPU, but don't know for sure what the cause is. I never did notice a difference between gaming-GPU-trained models and Quadro-trained models. Have more info/links?

1: https://github.com/tensorflow/gan/tree/master/tensorflow_gan...

kanwisher
Once you want to use PyTorch or another non-TensorFlow framework, the support drops off dramatically. The Jetson Nano supports more frameworks out of the box quite well, and it ends up being the same CUDA code you run on your big Nvidia cloud servers.
panpanna
Not only that, Nvidia cares deeply about PyTorch. Visit the PyTorch forums and look at the most-upvoted answers: all by Nvidia field engineers.
sorenbouma
That benchmark appears to compare full-precision FP32 inference on the Nano with uint8 inference on the Coral, so that floor-wiping comes with a lot of caveats.

There also seems to be more than one Jetson board.
