As someone currently writing an app which uses a retrained Inception model, I watched the show pointing and laughing (and then crying) at the same issues and frustrations. The accuracy of the show and especially this episode has been just brilliant.
Thanks for sharing all the tech details too, it's been great to read. I'm even more amazed to see it as a real app, that I didn't expect!
I'm confused about the 80 hours to train. Wasn't this episode shown on HBO less than 48 hrs ago?
Edit: Just read your bio. Now it makes sense!
We ended up with a custom architecture trained from scratch due to runtime constraints more so than accuracy reasons (the inference runs on phones, so we have to be efficient with CPU + memory), but that model also ended up being the most accurate model we could build in the time we had. (With more time/resources I have no doubt I could have achieved better accuracy with a heavier model!)
Training the final model took about 80 hours on a single Nvidia GTX 980 Ti (the best thing I could hook to my MacBook Pro at the time). That's for 240 epochs (150k images in an epoch) ran in 3 rate annealing phases, each phase being a handful of CLR (cyclical learning rate) phases.
I'll answer in more detail in the full blogpost, it's a bit complicated to explain in a comment. I'll have charts & figures for y'all :)