Can you please elaborate a bit more on the model architecture and what you tried with respect to transfer learning?
Did you use an ImageNet architecture, e.g. VGG, and retrain it from scratch, or a custom architecture? Did you try chopping off the last 1/2/3 layers of a pretrained model and fine-tuning?
Bonus points: 1. How much better were your results training from scratch vs. fine-tuning? 2. How long did it take to train your model, and on what hardware?
:)
We ended up with a custom architecture trained from scratch, due to runtime constraints more than accuracy concerns (inference runs on phones, so we have to be efficient with CPU + memory). That said, it also ended up being the most accurate model we could build in the time we had. (With more time/resources I have no doubt I could have achieved better accuracy with a heavier model!)
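To give a sense of the size class (this is just an illustrative PyTorch sketch, not our actual architecture; the layer widths and class count are made up), something like this captures the idea of a small from-scratch CNN that fits a phone's CPU + memory budget:

    import torch
    import torch.nn as nn

    class SmallNet(nn.Module):
        """Hypothetical tiny CNN, only to illustrate the size class."""
        def __init__(self, num_classes=10):              # class count is made up
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
                nn.AdaptiveAvgPool2d(1),                  # global average pool keeps the head tiny
            )
            self.classifier = nn.Linear(64, num_classes)

        def forward(self, x):
            x = self.features(x)
            return self.classifier(torch.flatten(x, 1))

    model = SmallNet()
    print(sum(p.numel() for p in model.parameters()))     # ~24k params vs ~138M for VGG16

A few strided convs plus global average pooling keeps the parameter count in the tens of thousands, versus ~138M for something like VGG16, which is the main reason a big ImageNet architecture wasn't a great fit on-device.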
Training the final model took about 80 hours on a single Nvidia GTX 980 Ti (the best thing I could hook up to my MacBook Pro at the time). That's for 240 epochs (150k images per epoch), run in 3 learning-rate annealing phases, each phase consisting of a handful of CLR (cyclical learning rate) cycles.
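If it helps, the general shape of that schedule looks roughly like the sketch below (using PyTorch's CyclicLR purely as an example; the batch size, learning-rate ranges, and cycle/phase lengths are illustrative placeholders, not my actual values):

    import torch
    from torch.optim.lr_scheduler import CyclicLR

    model = torch.nn.Linear(10, 2)                       # stand-in model, not the real one
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

    steps_per_epoch = 150_000 // 256                     # assuming a batch size of 256
    phases = [(1e-2, 1e-1), (1e-3, 1e-2), (1e-4, 1e-3)]  # (base_lr, max_lr), lowered each annealing phase

    for base_lr, max_lr in phases:
        scheduler = CyclicLR(optimizer, base_lr=base_lr, max_lr=max_lr,
                             step_size_up=2 * steps_per_epoch)  # half a CLR cycle spans ~2 epochs
        for _ in range(4 * steps_per_epoch):             # a couple of CLR cycles per phase
            # real training would do a forward pass and loss.backward() before these two lines
            optimizer.step()
            scheduler.step()

The idea is just that the learning rate oscillates within each phase and the whole range gets annealed downward from phase to phase.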
I'll answer in more detail in the full blog post; it's a bit complicated to explain in a comment. I'll have charts & figures for y'all :)
Thanks for sharing all the tech details too; it's been great to read. I'm even more amazed to see it as a real app, which I didn't expect!
Edit: Just read your bio. Now it makes sense!