Customized Caffe Memory Optimization

Usage

The behavior of the system in optimizing memory usage is controlled by the mem_param in the network config.

The following excerpt from a network config file shows an example:

name: "example_net"
mem_param {
  optimize_train: true
  optimize_test: true
  exclude_blob: "fc1"
  exclude_blob: "fc2"
}

The above example demonstrates how to control the memory optimization module. The fields have the following meanings:

  • optimize_train controls whether optimization is enabled for the training net. If set to true, it saves roughly half of the memory used for training.

  • optimize_test controls whether optimization is enabled for any test-phase network. Setting it to true saves most of the memory used for testing.

  • exclude_blob is a repeated field that lists blobs to be excluded from the optimization process. Excluding a blob guarantees its contents remain valid when extracted through the Python/Matlab interfaces. By default, all inputs, outputs, and losses of a net are excluded.
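The default exclusion rule above can be illustrated with a short sketch. This is a simplified model, not the actual Caffe implementation: the layer dictionaries and the `default_excluded_blobs` helper are assumptions made for illustration.

```python
# Hypothetical sketch of how the default exclusion set could be computed.
# A net's inputs, its outputs (blobs no later layer reads), its loss blobs,
# and anything listed via exclude_blob all keep dedicated memory.

def default_excluded_blobs(inputs, layers, extra_excludes=()):
    produced = set()
    consumed = set()
    losses = set()
    for layer in layers:
        consumed.update(layer["bottoms"])
        produced.update(layer["tops"])
        if layer["type"].endswith("Loss"):
            losses.update(layer["tops"])
    outputs = produced - consumed  # blobs no layer consumes: net outputs
    return set(inputs) | outputs | losses | set(extra_excludes)

# Tiny example net: data -> conv1 -> fc1 -> loss
layers = [
    {"type": "Convolution",     "bottoms": ["data"],         "tops": ["conv1"]},
    {"type": "InnerProduct",    "bottoms": ["conv1"],        "tops": ["fc1"]},
    {"type": "SoftmaxWithLoss", "bottoms": ["fc1", "label"], "tops": ["loss"]},
]
print(default_excluded_blobs(["data", "label"], layers, extra_excludes=["fc1"]))
```

With `exclude_blob: "fc1"`, the excluded set contains the inputs (data, label), the loss blob, and fc1; conv1 remains eligible for memory reuse.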

How does it work?

The reason we can reduce memory usage is that during each forward step (and each backward step in training), memory occupied by blobs that are no longer needed can be reused by later blobs. The memory optimization module tracks the dependencies among the blobs in a network instance to implement this reuse safely.

In particular, we do a "dry run" beforehand to determine how the memory block of each blob can be reused. This is a static optimization process. In contrast, our next-generation deep learning toolbox, Parrots, schedules memory usage dynamically, which can bring even greater memory reduction and more flexibility.
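The dry run described above can be sketched as a greedy slot assigner. This is a minimal illustration under assumed data structures, not the actual Caffe planner: each non-excluded blob is assigned a shared memory "slot", and a slot returns to the free pool once the layer that last reads its blob has run.

```python
# Simplified static "dry run" memory planner (an illustrative sketch,
# not the real Caffe implementation).

def plan_memory(layers, excluded):
    """Assign each produced blob a slot id, reusing freed slots greedily."""
    # Record the index of the last layer that reads each blob.
    last_use = {}
    for i, layer in enumerate(layers):
        for b in layer["bottoms"]:
            last_use[b] = i

    free_slots, next_slot = [], 0
    assignment = {}
    for i, layer in enumerate(layers):
        for top in layer["tops"]:
            if top in excluded:
                continue  # excluded blobs keep private storage
            if free_slots:
                assignment[top] = free_slots.pop()  # reuse a dead blob's slot
            else:
                assignment[top] = next_slot  # allocate a fresh slot
                next_slot += 1
        # Free the slots of blobs whose last consumer is this layer.
        for b, last in last_use.items():
            if last == i and b in assignment:
                free_slots.append(assignment[b])
    return assignment

# Example chain: data -> conv1 -> pool1 -> fc1 -> loss
layers = [
    {"bottoms": ["data"],  "tops": ["conv1"]},
    {"bottoms": ["conv1"], "tops": ["pool1"]},
    {"bottoms": ["pool1"], "tops": ["fc1"]},
    {"bottoms": ["fc1"],   "tops": ["loss"]},
]
print(plan_memory(layers, excluded={"data", "loss"}))
```

In this chain, conv1 is dead by the time fc1 is produced, so fc1 reuses conv1's slot: three intermediate blobs fit in two shared slots instead of three.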
