Things to Consider When Converting Models
Image Input format
[0, 255] input or [0, 1] input
mean, variance
MXNet pre-trained models: All pre-trained models expect input images normalized in the same way, i.e. mini-batches of 3-channel RGB images of shape (N x 3 x H x W), where N is the batch size and H and W are expected to be at least 224. The images have to be loaded into the range [0, 1] and then normalized using mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224, 0.225].
Caffe: [0, 255] input, with only a single scale parameter shared across all three channels.
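The two input conventions differ only by a per-channel affine transform, so a converter can fold one into the other. Below is a minimal NumPy sketch, assuming an RGB image in HWC layout with uint8 values in [0, 255]. The function names are hypothetical and belong to neither framework.

```python
import numpy as np

# Per-channel statistics quoted above (RGB order, defined for inputs in [0, 1]).
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def normalize_unit_range(image_uint8):
    """Scale an HWC uint8 RGB image to [0, 1], normalize, and return CHW float32."""
    x = image_uint8.astype(np.float32) / 255.0
    x = (x - MEAN) / STD          # broadcasts over the last (channel) axis
    return x.transpose(2, 0, 1)   # HWC -> CHW


def normalize_byte_range(image_uint8):
    """Same result, computed directly on [0, 255] values with rescaled statistics."""
    x = image_uint8.astype(np.float32)
    x = (x - 255.0 * MEAN) / (255.0 * STD)
    return x.transpose(2, 0, 1)
```

The second form shows what a [0, 255] pipeline has to apply to match the [0, 1] one: a per-channel mean of 255 * mean and a per-channel scale of 1 / (255 * std). Because the three std values differ, a single scale shared across channels can only approximate this.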
Convolution Kernels
Row major or Column major?
Has a bias term or not?
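Both questions can be checked on a small array before converting real weights. The sketch below assumes the source stores a kernel as a flat buffer with a known logical shape (out_channels, in_channels, kH, kW). The helper names and the target axis order are illustrative assumptions, not any framework's API.

```python
import numpy as np


def load_kernel(flat, shape, order="C"):
    """Rebuild a kernel from a flat buffer.

    order="C" reads the buffer as row major (last index varies fastest),
    order="F" reads it as column major (first index varies fastest).
    """
    return np.asarray(flat, dtype=np.float32).reshape(shape, order=order)


def oihw_to_hwio(kernel):
    """Reorder axes from (out, in, kH, kW) to (kH, kW, in, out)."""
    return np.ascontiguousarray(kernel.transpose(2, 3, 1, 0))


def bias_or_zeros(bias, out_channels):
    """Return the bias if the source layer has one, otherwise a zero bias."""
    if bias is None:
        return np.zeros(out_channels, dtype=np.float32)
    return np.asarray(bias, dtype=np.float32)
```

Reading the same buffer with the wrong order yields a tensor of the correct shape but with scrambled values, so a shape check alone will not catch the mistake. Comparing the converted layer's output against the source layer on a known input is a more reliable test.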
Batch norm layer
Corresponding relations
moving mean, moving variance, gamma, beta
The Caffe batch norm layer has THREE blobs; the third one is counts.
The actual moving mean and variance are calculated as moving_mean / counts and moving_var / counts.
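The division above can be sketched as follows, assuming the three blobs have already been read into arrays. The function name is hypothetical, and the zero check is a defensive assumption for a layer whose counts value is still 0.

```python
import numpy as np


def caffe_bn_stats(mean_blob, var_blob, count_blob):
    """Convert the three Caffe batch norm blobs to the actual moving mean and variance."""
    # The third blob holds a single value: the accumulated counts.
    count = float(np.asarray(count_blob).reshape(-1)[0])
    # Guard against a zero count (e.g. an untrained layer) to avoid dividing by zero.
    scale = 0.0 if count == 0.0 else 1.0 / count
    moving_mean = np.asarray(mean_blob, dtype=np.float32) * scale
    moving_var = np.asarray(var_blob, dtype=np.float32) * scale
    return moving_mean, moving_var
```

For example, blobs of [2.0, 4.0], [8.0, 16.0] and [2.0] give a moving mean of [1.0, 2.0] and a moving variance of [4.0, 8.0]. Note that gamma and beta are not among these three blobs; in Caffe they usually live in a separate Scale layer placed after the batch norm layer.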
