timanglade
Just wanted to say thanks for the warm welcome from HN when the app was released last month — I hope this blogpost answers the questions that were raised back then.

I’d be happy to answer anything else you’d like to know!

Original thread: https://news.ycombinator.com/item?id=14347211

Demo of the app (in the show): https://www.youtube.com/watch?v=ACmydtFDTGs

App for iOS: https://itunes.apple.com/app/not-hotdog/id1212457521

App for Android (just released yesterday): https://play.google.com/store/apps/details?id=com.seefoodtec...

zitterbewegung
I am making an app that takes pictures and tries to tell you if the food in the picture has allergens. I didn't know whether to feel humbled or just laugh. (I decided it was hilarious in the end.) But it made me aim higher at a hackathon last weekend. I also use your app in my elevator pitch to help people understand.
Great idea that I would be terrified to pursue from a legal perspective.
I am going to chime in and really suggest they hire a qualified legal professional - if they intend to share this application. A simple disclaimer may not be adequate.

Edit: Typo.

zitterbewegung
Yeah, I'm not going to release the app. I was already on the fence about it. I could have finished it months ago, but I was worried about the legal implications. Thank you.
zitterbewegung
Yes. I'm not going to release the app. The show already discouraged me and made me do another app that won't have legal implications.
azinman2
While it seems great to be able to take a picture, it's going to be very hard to know what's in a sauce, stuffed inside something, or hidden by a complex presentation.

Why not be able to search over a list, and once on an item, show frequently related items (e.g. garlic if you search onion)? It may not involve any “AI,” but it’ll be far more accurate and easier to implement.

chjohasbrouck
Is this possible? It doesn't seem like the necessary information is captured in a picture.

I don't think there's a person that could tell you whether food in a picture has allergens, let alone an app.

zitterbewegung
Yes, you are right. The app works by identifying the dish and then searching an ingredient database. If the dish looks visually identical to one with peanuts, then the app would give a false negative.
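
For illustration, here's a minimal sketch of that flow in Python/Keras (the model, labels, and ingredient database below are hypothetical placeholders, not my actual app code):

    import numpy as np
    from keras.models import load_model
    from keras.preprocessing import image

    ALLERGEN_DB = {  # hypothetical ingredient database
        "pad_thai": ["peanuts", "shellfish", "egg"],
        "caesar_salad": ["egg", "anchovy", "dairy"],
    }
    LABELS = ["pad_thai", "caesar_salad"]  # class index -> dish name

    model = load_model("dish_classifier.h5")  # hypothetical trained classifier

    def allergens_in_photo(path):
        img = image.img_to_array(image.load_img(path, target_size=(224, 224)))
        probs = model.predict(np.expand_dims(img / 255.0, axis=0))[0]
        dish = LABELS[int(np.argmax(probs))]
        # Visually identical dishes can differ in ingredients -> false negatives.
        return dish, ALLERGEN_DB.get(dish, [])
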
dagurp
Why is this not available outside the US and Canada?
timanglade
So the long version of this story is that the app was released in partnership with the HBO Go / HBO Now team, and using the terms of service they used for these apps (which are only available in the US & Canada). We’ve been working with lawyers to release the app worldwide without running afoul of any local laws, and I’m keeping my fingers crossed that will get cleared this week, we just gotta make sure our terms of service are up to par… After all, we wouldn’t want to fall prey to something we made fun of this very season ;) http://www.vulture.com/2017/04/silicon-valley-recap-season-4...
devoply
Do you have a special algorithm to determine Canada/US and not Canada/US?
ethbro
Their neural network localizes your submitted hotdog images. To get around the block, try showing it a Canadian hotdog.
timanglade
:D now, that would be an app worth investing in
OJFord
Big interest from the Trump administration! (Huge interest, so big, the biggest.)
rconti
Hmmm, BRB, writing pylsur or not pylsur.
Because HBO, I guess. A lot of content is filtered outside of the USA.
justifier
Can you speak to the origins of it being 'not hotdog'?

Did you, or the writing staff or other consultants, start with hotdogs or dick pics?

It made me laugh because it reminded me of this popular lecture(o) that was, and is, passed around tech circles

Any relation? Or fun coincidence?

(o) https://youtu.be/uJnd7GBgxuA?t=5555s .. the lecture is given by Andrej Karpathy on 2016-09-24, and the timestamp goes to the point where he discusses an interface made to have humans compete with convnets at identifying hotdogs

timanglade
Ha, seems like a fun coincidence. The writers came up with it early in the writing of season 4, and I started working on it sometime in the Summer of 2016 iirc. As far as the origin story goes, it was just great writers coming up with a great joke! Our lead technical consultant Todd mentioned to them we could actually build their joke for real, and the show jumped on the idea!
justifier
Classic.. yeah I think the lecturer says the project was from 2014

I suppose it's a clear testament to the quality of the show and its commentary.. thanks for your contribution to that end!

Slightly offtopic, but I wonder how hard it was to attach an eGPU to a MacBook Pro? Is there some official support, or was it just a dirty hack?
timanglade
Not offtopic at all! Dirty hack for sure. The enclosure I bought was a hack, the drivers were a hack, and there was software on top that was a hack as well. But the developer experience was totally awesome… Almost made the constant graphics crashes worth it. Almost.
rexreed
Can you provide some more details on your eGPU hack for those of us who might want to try our hand at that as well?
astrange
There is official support for eGPUs in the macOS High Sierra beta.

https://developer.apple.com/metal/

ckirksey
Thanks for sharing your process. I was inspired by the show to build my own app. It's pretty crazy how quickly you can build something like this now.
The Android app is not available in Germany. :(
plumeria
In which countries is it available? Mine isn't :(
richardkeller
What an ingenious way to covertly distribute the Pied Piper decentralized storage app.
timanglade
Boy I sure hope no one does a static analysis of the binary…
rattray
I really hope you've hidden an Easter egg in there
timanglade
a gentleman never tells
Retr0spectrum
If such a thing existed, would it be in the Android or iOS app, or both?
I'm waiting for the PiperCash (PiedPaper? PiperPound?) ICO to do p2p payments.
PaidPiper?
tnecniv
The series finale reveal is that Pied Piper exists and was using the show to fund their company and distribute their software.
x2398dh1
I don't relish saying this, but the "Not Hotdog" app does not cut the mustard in even the most rudimentary of tests:

https://twitter.com/iotmpls/status/879381125541613568/photo/...

Probably only 20% of the world's hot dogs are just a basic hot dog with mustard on it. Once you move past one or two condiments, the domain of hot dog identification, fixings and all, gets confusing from a computer vision standpoint.

Pinterest's similar images function is able to identify hotdogs with single condiments fairly well:

https://www.pinterest.com/pin/268175352794006376/visual-sear...

They appear to be using deep CNNs.

https://labs.pinterest.com/assets/paper/visual_search_at_pin...

Having embedded TensorFlow for on-device identification is all well and good for immediacy and cost, but if it can't properly identify whether something is a hotdog vs. a long skinny thing with a mustard squiggle, what good does that do me? What would be the next step up in your mind?

I ask this as someone who is sincerely interested in building low cost, fun, projects.

OJFord
> I don't relish saying this

My condiments to the author, I see what you did there ;)

timanglade
While we’re here and chatting about this, I should say most of the credit for this app should really go towards the following people:

Mike Judge, Alec Berg, Clay Tarver, and all the awesome writers that actually came up with the concept: Meghan Pleticha (who wrote the episode), Adam Countee, Carrie Kemper, Dan O’Keefe (of Festivus fame), Chris Provenzano (who wrote the amazing “Hooli-con” episode this season), Graham Wagner, Shawn Boxee, Rachele Lynn & Andrew Law…

Todd Silverstein, Jonathan Dotan, Amy Solomon, Jim Klever-Weis and our awesome Transmedia Producer Lisa Schomas for shepherding it through and making it real!

Our kick-ass production designers Dorothy Street & Rich Toyon.

Meaghan, Dana, David, Jay, Jonathan and the entire crew at HBO that worked hard to get the app published (yay! we did it!)

loader
Oh he's going long ... and cue music.
bluetwo
OK, but where are the eight octopus recipes?
latenightcoding
I'm glad I'm not the only one with questions about the external GPU. I had considered trying that, but came to the conclusion that the data transfer between CPU and GPU would be too slow for ML tasks. So, what is your opinion on this? If you had to do it again, would you use the eGPU, or just use AWS or another GPU cloud service?
timanglade
My takeaway is that local development has a huge developer experience advantage when you are going through your initial network design / data wrangling phase. You can iterate quickly on labeling images, develop using all your favorite tools/IDEs, and dealing with the lack of official eGPU support is bearable. Efficiency-wise it’s not bad: as far as I could tell, the bottleneck ended up being on the GPU, even on a 2016 MacBook Pro with Thunderbolt 2 and tons of data augmentation done on CPU. It’s also a very lengthy phase, so it helps that it’s a lot cheaper than cloud.
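
For anyone curious, the CPU-side augmentation was roughly along these lines in Keras (illustrative settings, not my exact values):

    from keras.preprocessing.image import ImageDataGenerator

    datagen = ImageDataGenerator(
        rotation_range=30,      # random rotations
        width_shift_range=0.1,  # random horizontal/vertical shifts
        height_shift_range=0.1,
        zoom_range=0.2,
        horizontal_flip=True,
        rescale=1.0 / 255)
    # train_gen = datagen.flow_from_directory("data/train", target_size=(128, 128))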

When you get into the final, long training runs, I would say the developer experience advantages start to come down, and not having to deal with the freezes/crashes or other eGPU disadvantages (like keeping your laptop powered on in one place for an 80-hour run) makes moving to the cloud (or a dedicated machine) become very appealing indeed. You will also sometimes be able to parallelize your training in such a way that the cloud will be more time-efficient (if still not quite money-efficient). For Cloud, I had my best experience using Paperspace [0]. I’m very interested to give Google Cloud’s Machine Learning API a try.

If you’re pressed for money, you can’t do better than buying a top of the line GPU once every year or every other year, and putting it in an eGPU enclosure.

If you want the absolute best experience, I’d build a local desktop machine with 2–4 GPUs (so you can do multiple training runs in parallel while you design, or do a faster, parallelized run when you are finalizing).

Cloud does not quite make sense to me until the costs come down, unless you are 1) pressed for time and 2) will not be doing more than one machine learning training run in your lifetime. Building your own local cluster becomes cost-efficient after 2 or 3 AI projects per year, I’d say.

[0]: https://www.paperspace.com/ml

latenightcoding
Awesome, thanks!
skraelingjar
I have used the AWS machine learning API and would recommend it. The time savings using that vs. running it on my hacked-together ubuntu-chromebook mashup is worth more than what I had to pay. I have also used Paperspace. My only issue was that whatever they use for streaming the virtual desktop to the browser didn't work over a sub-4 MB/s network connection.
thearn4
It's interesting how amenable image classification neural networks are to the "take working model, peel off last layer or two, retrain for a new application" approach. I've seen this suggested as working pretty well in a few instances.

I guess the interpretation is that the first few normalize->convolution->pool->dropout layers are basically achieving something broadly analogous to the initial feature extraction steps that used to be the mainstay in this area (PCA/ICA, HOG, SIFT/SURF, etc.), and are reasonably problem-independent.
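
A minimal Keras sketch of that approach, assuming an ImageNet-pretrained MobileNet (illustrative, not the app's actual code):

    from keras.applications import MobileNet
    from keras.layers import GlobalAveragePooling2D, Dense
    from keras.models import Model

    # Pretrained feature extractor, with the ImageNet classifier head peeled off.
    base = MobileNet(weights="imagenet", include_top=False,
                     input_shape=(224, 224, 3))
    for layer in base.layers:
        layer.trainable = False  # freeze the generic feature-extraction layers

    # New head for the new application (here: hotdog vs. not hotdog).
    x = GlobalAveragePooling2D()(base.output)
    out = Dense(2, activation="softmax")(x)

    model = Model(base.input, out)
    model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    # model.fit(train_images, train_labels, ...)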

timanglade
For sure, although I should say, for this specific instance I ended up training a network from scratch. I did get inspiration from the MobileNets architecture, but I did not keep any of the weights from their ImageNet training. That was shockingly affordable to do even on my very limited setup, and the results were better than what I could do with a retraining (mostly has to do with how finicky small networks can be when it comes to retraining).
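
To give a flavor of what "MobileNets-inspired, trained from scratch" means, here's a toy Keras sketch of the basic building block (illustrative only; the real DeepDog architecture is described in the blogpost):

    from keras.layers import (Input, Conv2D, DepthwiseConv2D, BatchNormalization,
                              Activation, GlobalAveragePooling2D, Dense)
    from keras.models import Model

    def separable_block(x, filters, stride=1):
        # Depthwise 3x3 followed by pointwise 1x1: the MobileNets trick
        # that keeps parameter counts (and the shipped binary) small.
        x = DepthwiseConv2D(3, strides=stride, padding="same", use_bias=False)(x)
        x = Activation("elu")(BatchNormalization()(x))
        x = Conv2D(filters, 1, padding="same", use_bias=False)(x)
        return Activation("elu")(BatchNormalization()(x))

    inp = Input((128, 128, 3))
    x = Conv2D(32, 3, strides=2, padding="same", use_bias=False)(inp)
    x = Activation("elu")(BatchNormalization()(x))
    for f, s in [(64, 1), (128, 2), (128, 1), (256, 2)]:
        x = separable_block(x, f, stride=s)
    out = Dense(2, activation="softmax")(GlobalAveragePooling2D()(x))
    model = Model(inp, out)  # then compile & train from random initialization
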
thearn4
That's very cool to hear, I'm a lot more interested in the eGPUs (vs. something like an AWS P2 instance) after reading this. Thanks again for sharing.
cedric
Interesting. What is the external GPU (eGPU) enclosure you used for the Nvidia GTX 980 Ti card? Is it this one? https://bizon-tech.com/us/bizonbox2s-egpu.html/
timanglade
Yes, that’s what you see in the picture, although as completely personal advice, I would stop short of recommending it. For one there are arguably better cases out there now, and you can sometimes build your own eGPU rig for less. Finally, the Mac software integration (with any eGPU) is very hacky at the moment despite the community’s best efforts, and I had to deal with a lot of kernel panics and graphics crashes, so overall I’m not sure I would recommend others attempt the same setup.
minimaxir
It's worth noting that High Sierra removes some of the hackiness of an eGPU.

I am waiting until the next-gen enclosures/cards come out which play nicer with the OS for deep learning.

rogerb
This should be the standard 'hello world' tutorial for pragmatic ML.
Nice write-up that should become the go-to tutorial for TF and local training. It helped me a lot with the mobile part; it was a bit strange to think about transferring the training when I first read it, but it became clear on the second reading.
nganig
Pretty fascinating and encouraging to see how much was accomplished with a laptop and consumer GPU. Gave me some great ideas. Also happy to see Chicago dogs properly identified.
timanglade
Ha, you have no idea how hard Chicago hotdogs made my life! There was a joke in the show about Dinesh having to stare at a lot of “adult” imagery for days on end to tune his AI, but my Waterloo was Chicago hotdogs — the stupid pickles end up hiding the sausage more often than not, which makes it hard to differentiate them from, say, a sandwich.

For those of you like me that never knew they existed, here is what they look like: http://img.food.com/img/recipes/25/97/2/large/picMKCD0m.jpg

ethbro
I feel your pain. I once watched a single episode of Jeopardy 50+ times in pursuit of timings for a prototype app demo.

I could have a good half-hour conversation about the nuances of Alex Trebek's vocal inflections... shudders

timanglade
Ha, I’d love to hear more. What was the app for? I can’t imagine why you’d have to pick up on Trebek’s elocution??
ethbro
Less interesting than you'd expect, as it was for a rapid mobile app prototyping class.

We had a telesync'd demo that let you play along with a Jeopardy episode by yelling answers at your phone. The app knew the timing markers for when the question was asked and when a contestant answered, so it would only give you credit if you beat the contestant with the correct answer.

Our model user was "people who yell answers at the screen when Jeopardy is on."

Still think it would have made a decent companion app to the show though...

Trebek's elocution is just something you pick up on after rewatching an episode enough times. He has really interesting ways of emphasizing things, but they seem normal if you're just listening to them once through.

tebica
Love the post! This explains how mobile TensorFlow can actually be used in daily life.
timanglade
One of my primary motivations behind writing this blogpost was to show exactly how one can use TensorFlow to ship a production mobile application. There’s certainly a lot of material out there, but a lot of it is either light on details, or only fit for prototypes/demos. There was quite a bit of work involved in making TensorFlow work well on a variety of devices, and I’m proud we managed to get it down to just 50MB or so of RAM usage (network included), and a very low crash rate. Hopefully things like CoreML on iOS and TensorFlow Lite on Android will make things even easier for developers in the future!
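
To give a sense of one of the fiddlier steps, here's a rough sketch of freezing a trained graph for mobile, TF 1.x style (the checkpoint and node names are hypothetical; the blogpost covers the actual export pipeline):

    import tensorflow as tf

    with tf.Session() as sess:
        # Restore the trained network from a checkpoint (hypothetical paths).
        saver = tf.train.import_meta_graph("deepdog.ckpt.meta")
        saver.restore(sess, "deepdog.ckpt")
        # Bake the variables into constants so the graph file is self-contained.
        frozen = tf.graph_util.convert_variables_to_constants(
            sess, sess.graph_def, ["softmax_output"])  # hypothetical output node
    tf.train.write_graph(frozen, ".", "deepdog_frozen.pb", as_text=False)
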
metakermit
yeah, that's my main pain with the TF docs – great if you just want to try one of the MNIST tutorial variations, but there's a lot more you need to figure out once you get beyond these "hello world" examples…
ckirksey
This is the Twitter bot I built a few days after the show (similar to Tim's original prototype with Google Cloud): https://hackernoon.com/building-silicon-valleys-hot-dog-app-...
laibert
This is amazing - I'm impressed by your persistence in sourcing the training data yourself; that must have been tedious!

Did you try quantizing the parameters to shrink the model size some more? If so, how did it affect the results? It also runs slightly faster on mobile, in my experience.

timanglade
Great question — I did not, because I had unfortunately spent all of my data on that last training run, and I did not have an untainted dataset left to measure the impact of quantization on. (Just poor planning on my part, really.)

It’s also my understanding at the moment that quantization does not help with inference speed or memory usage, which were my chief concerns. I was comfortable with the binary size (<20MB) that was being shipped and did not feel the need to save a few more MBs there. I was more worried about accuracy, and did not want to ship a quantized version of my network without being able to assess the impact.

Finally, it now seems that quantization may be best applied at training time rather than at shipping time, according to a recent paper by the University of Iowa & Snapchat [0], so I would probably want to bake that earlier into my design phase next time around.

[0]: https://arxiv.org/abs/1706.03912
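
For anyone curious what shipping-time weight quantization boils down to, here's a toy numpy illustration (not TensorFlow's actual tooling): store float32 weights as uint8 plus a per-tensor scale and offset.

    import numpy as np

    def quantize(w):
        lo, hi = float(w.min()), float(w.max())
        scale = (hi - lo) / 255.0 or 1.0  # avoid div-by-zero on constant tensors
        q = np.round((w - lo) / scale).astype(np.uint8)  # 4x smaller than float32
        return q, lo, scale

    def dequantize(q, lo, scale):
        return q.astype(np.float32) * scale + lo

    w = np.random.randn(3, 3, 32, 64).astype(np.float32)  # fake conv weights
    q, lo, scale = quantize(w)
    print("max abs error:", np.abs(dequantize(q, lo, scale) - w).max())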

laibert
Thanks! I haven't seen that paper, I'll check it out. I think quantization only helps with inference speed if the network is running on CPU, with negligible gains on GPU (TensorFlow only supported CPU on mobile last I looked, which was a while ago). However, your app is already super fast, so I don't think anyone would notice if it was marginally faster at this point!
vinum_sabbathi
At MongoDB World this past week they did a demo of Stitch where they built something similar with no backend code required, using the Clarifai API and an Angular front end. It took less than 80 minutes and could run in prod if I wanted.
kenwalger
MongoDB Stitch is a great new BaaS offering. They even have some great tutorials online to use it.

Have a look at their sample PlateSpace app: https://github.com/mongodb/platespace

Very cool new service and some excellent tutorials as well, for example for the PlateSpace web app: https://docs.mongodb.com/stitch/getting-started/platespace-w...

I'd definitely recommend having a look.

SmellTheGlove
Wow, great writeup. This is an area that I know nothing about but have wanted to learn - seems like this post is a good starting point.

Any chance the full source will ever be opened up? Would be an excellent companion to the article.

timanglade
That’s not in the cards, at least at the moment, although if we get enough requests for it, I may be able to convince the powers that be…

In the meantime, if there are any details you’d like to see, don’t hesitate to chime in and I’ll try to respond with details!

woodrowbarlow
what is the best avenue for making such requests?
Just want to say, awesome post. It's amazing how quickly you created this.
timanglade
Thanks for the kind words! To prevent impostor syndrome, I should clarify that I worked on the app for many, many months — basically since August of last year — as a nights/weekends thing. It’s true that the final version was built almost from scratch in a few weeks, but it wouldn’t have been possible without the time investment in the preceding months. Although for the most part I just wasted a lot of time because I had no idea what I was doing lol (still don’t)
subcosmos
I finally played with it this morning. I'm blown away by the speed and how smooth the experience is.
quotewall
Finally for Android! Cool to see a cross-platform implementation of this, and how much can be done by one person and some reasonable gear.
timanglade
Yes, I was very excited we were able to release it for Android… And even though we used React Native, there were so many native (and C++) bits, it ended up being quite complex!

As for the gear, I think it’s really damaging that so many people think Deep Learning is only for people with large datasets, cloud farms (and PhDs) — as the app proves, you can do a lot with just data you curate by hand, a laptop (and a lowly Master’s degree :p)

giantwolf
Do you think it's possible to generalize the way you handled the cross-platform complexity into a shared component?
subcosmos
Love this architecture. I think I'm going to adopt some of it for HungryBot, my nonprofit's diet-tracking research arm. I think on-phone prediction solves a lot of my affordability issues.

https://www.infino.me/hungrybot

Great work!

john_borkowski
Very informative write up. Thanks!

How did you source and categorize the initial 150k images of hotdogs & not-hotdogs?

tkrupicka
As someone who maintains a popular Android camera library: what is this app using to take photos on both iOS and Android? Android can be a bit tricky with device-specific differences and the Camera 1 vs. Camera 2 API changes.
timanglade
The amazing react-native-camera plugin! [0] I’m still getting a few camera-related crashes on Android right now, but overall I would say it makes things pretty smooth!

[0]: https://github.com/lwansbrough/react-native-camera

tkrupicka
Thanks for the response and the writeup! Glad to hear somebody has had success with that library.
bigfish24
What kind of accuracy did you get with the transfer learning attempts?
timanglade
Well, for a while I was lulled into complacency because the retrained networks would indicate 98%+ accuracy, but really that was just an artifact of my 49:1 nothotdog:hotdog image imbalance. When I started weighting proportionately, a lot of networks were measurably lower, although it’s obviously possible to get Inception or VGG back to a “true” 98% accuracy given enough training time.

That would have beat what I ended up shipping, but the problem of course was the size of those networks. So really, if we’re comparing apples to apples, I’ll say none of the “small”, mobile-friendly neural nets (e.g. SqueezeNet, MobileNet) I tried to retrain did anywhere near as well as my DeepDog network trained from scratch. The training runs were really erratic and never reached any sort of asymptotic upper bound as they should. I think this has to do with the fact that these very small networks contain data about a lot of ImageNet classes, and it’s very hard to tune what they should retain vs. what they should forget, so picking your learning rate (and possibly adjusting it on the fly) ends up being very critical. It’s like doing neurosurgery on a mouse vs. a human, I guess — the brain is much smaller, but the blade stays the same size :-/
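
To make the imbalance point concrete: with 49:1 data, a network that always answers “not hotdog” already scores 49/50 = 98% raw accuracy. One common mitigation (a sketch, not necessarily what I shipped) is per-class weighting during training, e.g. in Keras:

    # 0 = not hotdog (49x more frequent), 1 = hotdog
    class_weight = {0: 1.0, 1: 49.0}  # weigh errors on the rare class heavier
    # model.fit(x_train, y_train, class_weight=class_weight, ...)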

bigfish24
Very interesting! If you were to make a v2, would you adjust the 49:1 imbalance and add more hot dog images?
arrspdx
What’s your biggest regret with this app? What are you most proud of?
The iOS app is not available in the Finnish App Store, only the US :-( We have hotdogs here, too! (And not hotdogs.)
startupdiscuss
Okay, who is going to test this on you-know-what to see if Jian-Yang's pivot would have worked?
Tim: Any tips/online resources for someone starting out with ML? How did you learn?
longsangstan
great app! any plan to open source it?
timanglade
Not at the moment — although you’ll find the most critical aspects explained in detail in the post. The rest I fear will age very quickly… With stuff like CoreML and TensorFlow Lite on the immediate horizon, I can’t imagine people will want or need to use the cumbersome manual approach I had to use to ship this app. Anything in particular you’d like to see? I can try to share it in a follow-up post or in comments here.
lightbyte
I have a quick question on this. The blog post mentions that you guys went with CONV-BATCHNORM-ACTIVATION (I'm still unsure whether this is the better order), but in the model code that is posted, the batch norm and activation are the other way around. Which ordering did you end up using?
timanglade
Oops, good catch — I had posted the wrong definition. It's corrected now! It was convolution, batch norm, then ELU activation.
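
In Keras terms, the corrected ordering is simply (a sketch):

    from keras.layers import Conv2D, BatchNormalization, Activation

    def conv_bn_elu(x, filters, kernel=3):
        x = Conv2D(filters, kernel, padding="same", use_bias=False)(x)
        x = BatchNormalization()(x)  # normalize the pre-activations
        return Activation("elu")(x)  # activate last
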
lightbyte
Quick followup, what type of optimizer did you guys end up using?
megamindbrian
LOL, I love that part. So funny.
