[Gluon] When to use (Hybrid)Sequential and (Hybrid)Block

In Gluon, networks are built using Blocks. If something is not a Block, it cannot be part of a Gluon network. A Dense layer is a Block, a Convolution layer is a Block, a Pooling layer is a Block, and so on.

Sometimes you might want a Block that is not a pre-defined block in Gluon but is a sequence of predefined Gluon blocks. For example,

Conv2D -> MaxPool2D -> Conv2D -> MaxPool2D -> Flatten -> Dense -> Dense

Gluon doesn’t have a pre-defined block that performs this sequence of operations, but it does have Blocks for each individual operation. So you can create your own block that performs the sequence by stringing together predefined Gluon blocks. Example:

from mxnet import gluon

num_outputs = 10  # placeholder: set this to the number of classes in your task

net = gluon.nn.HybridSequential()

with net.name_scope():

    # First convolution
    net.add(gluon.nn.Conv2D(channels=20, kernel_size=5, activation='relu'))
    net.add(gluon.nn.MaxPool2D(pool_size=2, strides=2))

    # Second convolution
    net.add(gluon.nn.Conv2D(channels=50, kernel_size=5, activation='relu'))
    net.add(gluon.nn.MaxPool2D(pool_size=2, strides=2))

    # Flatten the output before the fully connected layers
    net.add(gluon.nn.Flatten())

    # First fully connected layer with 512 neurons
    net.add(gluon.nn.Dense(512, activation="relu"))

    # Second fully connected layer with as many neurons as the number of classes
    net.add(gluon.nn.Dense(num_outputs))
  • HybridBlock is a Block that can be converted into a symbolic graph for faster execution. HybridSequential is a sequence of HybridBlocks.

  • A plain Block (not the hybrid kind) cannot be converted into a symbolic graph. Sequential is a sequence of non-hybrid Blocks.

Note that Sequential and HybridSequential are not just containers like a Python list. When you use one of them, you are actually creating a new Block out of preexisting blocks. This is why you cannot replace Sequential with a Python list.
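To make the hybrid distinction concrete, here is a minimal sketch (a hypothetical toy network, not the one above): the same HybridSequential runs imperatively until you call hybridize(), after which the forward pass is traced into a cached symbolic graph.

import mxnet as mx
from mxnet import gluon

# Hypothetical two-layer net, just to demonstrate hybridize().
net2 = gluon.nn.HybridSequential()
with net2.name_scope():
    net2.add(gluon.nn.Dense(64, activation='relu'))
    net2.add(gluon.nn.Dense(10))
net2.initialize()

x = mx.nd.random.uniform(shape=(4, 100))

y = net2(x)       # imperative: each operation executes eagerly
net2.hybridize()  # switch to symbolic mode
y = net2(x)       # first call traces a graph; later calls reuse it

# Once hybridized and run, the graph can be serialized:
# net2.export('toy-net')  # writes toy-net-symbol.json plus parameters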

Okay, so you know how to create your own block by stringing together preexisting blocks. Good. But what if you don’t want to just pass the data through a sequence of blocks? What if you want to conditionally pass the data through only some of those blocks? Here is an example from ResNet:

from mxnet.gluon import HybridBlock, nn

def _conv3x3(channels, stride, in_channels):
    # 3x3 convolution helper (as defined in the Gluon model zoo's resnet.py)
    return nn.Conv2D(channels, kernel_size=3, strides=stride, padding=1,
                     use_bias=False, in_channels=in_channels)

class BasicBlockV1(HybridBlock):
    def __init__(self, channels, stride, downsample=False, in_channels=0, **kwargs):
        super(BasicBlockV1, self).__init__(**kwargs)
        self.body = nn.HybridSequential(prefix='')
        self.body.add(_conv3x3(channels, stride, in_channels))
        self.body.add(nn.BatchNorm())
        self.body.add(nn.Activation('relu'))
        self.body.add(_conv3x3(channels, 1, channels))
        self.body.add(nn.BatchNorm())
        if downsample:
            self.downsample = nn.HybridSequential(prefix='')
            self.downsample.add(nn.Conv2D(channels, kernel_size=1, strides=stride,
                                          use_bias=False, in_channels=in_channels))
            self.downsample.add(nn.BatchNorm())
        else:
            self.downsample = None

    def hybrid_forward(self, F, x):
        residual = x

        x = self.body(x)

        if self.downsample:
            residual = self.downsample(residual)

        x = F.Activation(residual+x, act_type='relu')

        return x

This code creates a new Block using preexisting Gluon blocks, but it does more than just run the data through them. Given some data, the block always runs it through the body block, but it runs the data through downsample only if the Block was created with downsample set to True. It then adds the output of body to the (possibly downsampled) residual to produce the output. As you can see, more is happening than just passing data through a sequence of Blocks. This is when you create your own block by subclassing HybridBlock or Block.

Note that __init__ creates the necessary blocks, and hybrid_forward (forward for a plain Block) takes the input and runs it through the blocks created in __init__. hybrid_forward does not modify the blocks created in __init__; it only runs data through them.
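As a small usage sketch (assuming the imports and the _conv3x3 helper shown above), the block is called like any other Gluon block. Note that the downsample branch is fixed when the block is constructed, not chosen per call:

import mxnet as mx

# A stride-2 block halves the spatial size, so the residual needs downsampling.
blk = BasicBlockV1(channels=64, stride=2, downsample=True)
blk.initialize()

x = mx.nd.random.uniform(shape=(1, 32, 56, 56))
y = blk(x)
print(y.shape)  # (1, 64, 28, 28)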

When you create a sequence like that, you can use either HybridSequential or Sequential. To understand the difference, you need to understand the difference between symbolic and imperative programming.

Whether or not a block is Hybrid depends on how it is implemented. Almost all predefined Gluon blocks are also HybridBlocks. Sometimes there is a reason why a block cannot be Hybrid; Tree LSTM is one example. More often, something is not Hybrid simply because whoever wrote it didn’t put in the effort to make it Hybrid (e.g., making it hybrid wouldn’t give a big performance boost, or it is hard to make the block hybrid).
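Here is a hedged sketch of the underlying constraint. Inside hybrid_forward, F is mxnet.nd in imperative mode but mxnet.sym after hybridize(), and a Symbol carries no values at trace time. Python control flow that depends on the data therefore cannot be traced into a graph (control flow that depends only on constructor arguments, like downsample above, is fine because it is resolved in __init__):

import mxnet as mx
from mxnet.gluon import HybridBlock

class DataDependentBlock(HybridBlock):
    # Hypothetical block whose control flow depends on the data itself.
    def hybrid_forward(self, F, x):
        # Imperatively, x holds values and this comparison evaluates fine.
        # After hybridize(), x is a Symbol with no value yet, so the `if`
        # cannot be decided and tracing fails.
        if x.sum() > 0:
            return x * 2
        return x

blk = DataDependentBlock()
x = mx.nd.array([1.0, -2.0, 3.0])
print(blk(x))    # fine: imperative execution
blk.hybridize()
# blk(x)         # would now fail: the data-dependent branch cannot be traced

A Tree LSTM hits the same wall at a larger scale: the computation it performs depends on the structure of each input tree, so there is no single graph to trace.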

Source: "Is there any side effect by using 'Sequential()' or 'HybridSequential()' as a container only?" (MXNet Forum)