Things to Consider When Converting Models

  1. Image input format

    • RGB or BGR

    • Preprocessing

      • [0, 255] input or [0, 1] input

      • mean, variance

        • MXNet pre-trained models: all pre-trained models expect input images normalized in the same way, i.e. mini-batches of 3-channel RGB images of shape (N x 3 x H x W), where N is the batch size and H and W are expected to be at least 224. The images have to be loaded into a range of [0, 1] and then normalized using mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224, 0.225].

        • Caffe: [0, 255] input; it only has a single scale parameter shared across all three channels (see the preprocessing sketch below).
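
A minimal sketch of how the two preprocessing conventions differ. The function names, and the assumption that the source image arrives as a BGR uint8 array (as OpenCV delivers it), are mine for illustration:

```python
import numpy as np

def preprocess_mxnet_style(img_bgr_uint8):
    """Sketch of the MXNet-style pipeline: BGR -> RGB, scale to [0, 1],
    then per-channel (mean, std) normalization."""
    img = img_bgr_uint8[:, :, ::-1].astype(np.float32) / 255.0   # BGR -> RGB, [0, 255] -> [0, 1]
    mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
    std = np.array([0.229, 0.224, 0.225], dtype=np.float32)
    img = (img - mean) / std
    return img.transpose(2, 0, 1)[np.newaxis]                    # HWC -> NCHW (N x 3 x H x W)

def preprocess_caffe_style(img_bgr_uint8, scale=1.0):
    """Sketch of the Caffe-style pipeline: keep BGR, keep the [0, 255] range,
    and apply a single scale shared across all three channels."""
    img = img_bgr_uint8.astype(np.float32) * scale
    return img.transpose(2, 0, 1)[np.newaxis]                    # HWC -> NCHW
```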

  2. Convolution Kernels

    • Row-major or column-major?

    • Has a bias term or not (a conversion sketch follows this list)
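
As a rough illustration, here is a sketch of re-ordering convolution kernels and handling a missing bias term. The OIHW/HWIO layout names and the choice of target layout are assumptions for the example, not a statement about any specific pair of frameworks:

```python
import numpy as np

def convert_conv(w_oihw, bias=None):
    """Sketch: re-order conv kernels from an OIHW layout to an HWIO layout,
    and insert a zero bias when the source layer has no bias term."""
    w_hwio = np.transpose(w_oihw, (2, 3, 1, 0))                # OIHW -> HWIO
    if bias is None:                                           # source layer had no bias term
        bias = np.zeros(w_oihw.shape[0], dtype=w_oihw.dtype)   # one zero bias per output channel
    return w_hwio, bias
```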

  3. Batch norm layer

    • How the parameters correspond between frameworks

      • moving mean, moving variance, gamma, beta

      • A Caffe batch norm layer has THREE blobs; the third one holds the counts

        • The actual moving mean/variance is calculated as moving_mean / counts and moving_var / counts (see the sketch below)
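
A minimal sketch of that division, assuming the three blobs have already been read out of the caffemodel as numpy arrays (the function and argument names are mine):

```python
import numpy as np

def caffe_bn_stats(blob_mean, blob_var, blob_counts):
    """Sketch: recover the usable moving statistics from Caffe's BatchNorm blobs.
    The third blob holds the counts (a single scale factor); the stored mean and
    variance must both be divided by it."""
    counts = float(np.ravel(blob_counts)[0])
    counts = counts if counts != 0 else 1.0                    # guard against an uninitialized blob
    moving_mean = np.asarray(blob_mean) / counts
    moving_var = np.asarray(blob_var) / counts
    return moving_mean, moving_var
```

Note that in Caffe the gamma/beta pair typically lives in a separate Scale layer that follows the BatchNorm layer, so both layers usually need to be read when mapping onto a single fused batch-norm operator.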
