Comment by latenightcoding

latenightcoding Jun 26, 2017 parent

I am glad I am not the only one with questions about the external GPU, I had considered trying that, but came to the conclusion that the data transfer between CPU to GPU would be too slow for ML tasks. So, what is your opinion on this ? if you had to do it again would you use the eGPU or just use AWS or another GPU cloud service .

timanglade Jun 26, 2017

My takeaway is that local development has a huge developer experience advantage when you are going through your initial network design / data wrangling phase. You can iterate quickly on labeling images, develop using all your favorite tools/IDEs, and dealing with the lack of official eGPU support is bearable. Efficiency-wise it’s not bad. As far as I could tell the bottleneck ended up being on the GPU, even on a 2016 MacBook Pro with Thunderbolt 2 and tons of data augmentation done on CPU. It’s also a very lengthy phase so it helps that’s it’s a lot cheaper than cloud.

When you get into the final, long training runs, I would say the developer experience advantages start to come down, and not having to deal with the freezes/crashes or other eGPU disadvantages (like keeping your laptop powered on in one place for an 80-hour run) makes moving to the cloud (or a dedicated machine) become very appealing indeed. You will also sometimes be able to parallelize your training in such a way that the cloud will be more time-efficient (if still not quite money-efficient). For Cloud, I had my best experience using Paperspace [0]. I’m very interested to give Google Cloud’s Machine Learning API a try.

If you’re pressed for money, you can’t do better than buying a top of the line GPU once every year or every other year, and putting it in an eGPU enclosure.

If you want the absolute best experience, I’d build a local desktop machine with 2–4 GPUs (so you can do multiple training runs in parallel while you design, or do a faster, parallelized run when you are finalizing).

Cloud does not quite totally make sense to me until the costs come down, unless you are 1) pressed for time and 2) will not be doing more than 1 machine learning training in your lifetime. Building your own local cluster becomes cost-efficient after 2 or 3 AI projects per year, I’d say.

[0]: https://www.paperspace.com/ml

latenightcoding OP Jun 26, 2017

Awesome, thanks!

skraelingjar Jun 26, 2017

I have used the AWS machine learning API and would recommend it. The time savings using that vs running it on my hacked together ubuntu-chromebook-mashup is worth more than what I had to pay. I have also used Paperspace. My only issue was that whatever they use for streaming the virtual desktop to the browser didn't work over sub 4MB/s network connection.

This item has no comments currently.

Preferences

Keyboard Shortcuts

Story Lists

Navigation

Miscellaneous