Things to Consider When Converting Models

  1. Image input format

    • RGB or BGR

    • Preprocessing

      • [0, 255] input or [0, 1] input

      • mean, variance

        • MXNet pre-trained models: all pre-trained models expect input images normalized in the same way, i.e. mini-batches of 3-channel RGB images of shape (N x 3 x H x W), where N is the batch size and H and W are expected to be at least 224. The images have to be loaded into a range of [0, 1] and then normalized using mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224, 0.225].

        • Caffe: [0, 255] input; it only has a single scale parameter shared across all three channels (see the preprocessing sketch below).
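
A minimal sketch of how the two preprocessing conventions differ. The function names, and the assumption that the source image arrives as a BGR uint8 array (as OpenCV delivers it), are mine for illustration:

```python
import numpy as np

def preprocess_mxnet_style(img_bgr_uint8):
    """Sketch of the MXNet-style pipeline: BGR -> RGB, scale to [0, 1],
    then per-channel (mean, std) normalization."""
    img = img_bgr_uint8[:, :, ::-1].astype(np.float32) / 255.0   # BGR -> RGB, [0, 255] -> [0, 1]
    mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
    std = np.array([0.229, 0.224, 0.225], dtype=np.float32)
    img = (img - mean) / std
    return img.transpose(2, 0, 1)[np.newaxis]                    # HWC -> NCHW (N x 3 x H x W)

def preprocess_caffe_style(img_bgr_uint8, scale=1.0):
    """Sketch of the Caffe-style pipeline: keep BGR, keep the [0, 255] range,
    and apply a single scale shared across all three channels."""
    img = img_bgr_uint8.astype(np.float32) * scale
    return img.transpose(2, 0, 1)[np.newaxis]                    # HWC -> NCHW
```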

  2. Convolution Kernels

    • Row-major or column-major?

    • Has a bias term or not (a conversion sketch follows this list)
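
As a rough illustration, here is a sketch of re-ordering convolution kernels and handling a missing bias term. The OIHW/HWIO layout names and the choice of target layout are assumptions for the example, not a statement about any specific pair of frameworks:

```python
import numpy as np

def convert_conv(w_oihw, bias=None):
    """Sketch: re-order conv kernels from an OIHW layout to an HWIO layout,
    and insert a zero bias when the source layer has no bias term."""
    w_hwio = np.transpose(w_oihw, (2, 3, 1, 0))                # OIHW -> HWIO
    if bias is None:                                           # source layer had no bias term
        bias = np.zeros(w_oihw.shape[0], dtype=w_oihw.dtype)   # one zero bias per output channel
    return w_hwio, bias
```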

  3. Batch norm layer

    • How the parameters correspond between frameworks

      • moving mean, moving variance, gamma, beta

      • A Caffe batch norm layer has THREE blobs; the third one holds the counts

        • The actual moving mean/variance is calculated as moving_mean / counts and moving_var / counts (see the sketch below)
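
A minimal sketch of that division, assuming the three blobs have already been read out of the caffemodel as numpy arrays (the function and argument names are mine):

```python
import numpy as np

def caffe_bn_stats(blob_mean, blob_var, blob_counts):
    """Sketch: recover the usable moving statistics from Caffe's BatchNorm blobs.
    The third blob holds the counts (a single scale factor); the stored mean and
    variance must both be divided by it."""
    counts = float(np.ravel(blob_counts)[0])
    counts = counts if counts != 0 else 1.0                    # guard against an uninitialized blob
    moving_mean = np.asarray(blob_mean) / counts
    moving_var = np.asarray(blob_var) / counts
    return moving_mean, moving_var
```

Note that in Caffe the gamma/beta pair typically lives in a separate Scale layer that follows the BatchNorm layer, so both layers usually need to be read when mapping onto a single fused batch-norm operator.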
