timanglade parent
Ooops, good catch — I had posted the wrong definition. Corrected now! It was convolution, batch norm, then elu activation.
Quick followup, what type of optimizer did you guys end up using?
SGD with Cyclical Learning Rates [0]. Honestly, it’s the closest to a Machine Learning silver bullet I’ve found to date! That paper is awesome.
Thanks! I'm trying out your model on a small data set I've been playing with for identifying invasive species of flowers [0] and so far it's working way better than my initial version that was based on resnet (though slower)!