lightbyte
I have a quick question on this. The blog post mentions that you guys went with CONV-BATCHNORM-ACTIVATION (I'm still unsure whether this is the better order), but in the posted model code the batch norm and activation are the other way around. Which ordering did you end up using?
Oops, good catch. I had posted the wrong definition. Corrected now! It was convolution, batch norm, then ELU activation.
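For anyone reading along, here is a minimal sketch of that ordering (convolution, then batch norm, then ELU). It is plain Python on a toy single-channel 1D signal, not the posted model code. The helper names (`conv1d`, `batch_norm`, `elu`, `conv_bn_elu`), the kernel, and the omission of learned scale/shift parameters are all assumptions made for illustration.

```python
import math

def conv1d(x, kernel):
    """'Valid' 1D cross-correlation of one signal with one kernel."""
    k = len(kernel)
    return [sum(x[i + j] * kernel[j] for j in range(k))
            for i in range(len(x) - k + 1)]

def batch_norm(batch, eps=1e-5):
    """Normalize with the mean/variance of the whole batch.

    Simplification (assumption): one channel, no learned scale or shift.
    """
    values = [v for row in batch for v in row]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [[(v - mean) / math.sqrt(var + eps) for v in row] for row in batch]

def elu(v, alpha=1.0):
    """ELU: identity for positive inputs, alpha * (exp(v) - 1) otherwise."""
    return v if v > 0 else alpha * (math.exp(v) - 1.0)

def conv_bn_elu(batch, kernel):
    """Apply the three steps in the order discussed: conv -> batch norm -> ELU."""
    conv_out = [conv1d(x, kernel) for x in batch]   # 1. convolution
    normed = batch_norm(conv_out)                   # 2. batch norm
    return [[elu(v) for v in row] for row in normed]  # 3. activation

# Hypothetical toy batch of two signals and a difference kernel.
out = conv_bn_elu([[1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0]], [1.0, -1.0])
```

With this order, batch norm sees the raw convolution outputs and the ELU is applied to normalized values; swapping steps 2 and 3 would normalize the activations instead.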
Quick follow-up: what type of optimizer did you guys end up using?
SGD with Cyclical Learning Rates [0]. Honestly, it’s the closest to a Machine Learning silver bullet I’ve found to date! That paper is awesome.
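For reference, a rough sketch of the basic "triangular" schedule from the cyclical learning rates paper, where the rate ramps linearly between a lower and an upper bound. Which policy variant and which bounds were used here is not stated in the thread, so the function name `triangular_clr` and the values for `base_lr`, `max_lr`, and `step_size` below are assumptions for illustration only.

```python
import math

def triangular_clr(iteration, base_lr, max_lr, step_size):
    """Learning rate at a given iteration under the triangular policy.

    step_size is the number of iterations in half a cycle, so the rate
    climbs from base_lr to max_lr over step_size iterations, then falls back.
    """
    cycle = math.floor(1 + iteration / (2 * step_size))
    x = abs(iteration / step_size - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)

# Hypothetical bounds and step size, not the values used by the authors.
base_lr, max_lr, step_size = 0.001, 0.006, 100
schedule = [triangular_clr(i, base_lr, max_lr, step_size) for i in range(0, 401, 50)]
```

In a training loop you would compute the rate from the current iteration count and assign it to the SGD optimizer before each update.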
Thanks! I'm trying out your model on a small dataset I've been playing with for identifying invasive species of flowers [0], and so far it's working way better than my initial version, which was based on ResNet (though it's slower)!