Create a Neural Network with Gluon

MXNet crash course

script for this page: mx-02_neural-network.pl

This second page of the MXNet tutorial explains how to create neural networks with Gluon.

use AI::MXNet qw(mx);

use AI::MXNet::Gluon::NN qw(nn);

Note that here, we must explicitly list the AI::MXNet::Gluon::NN module.

create the neural network's first layer

To begin, let's create a dense layer (i.e. a "fully-connected" layer) with two output units:

my $layer = nn->Dense(2);

print $layer;

which prints:

Dense(2 -> 0, linear)

Next, let's define the initialization method for the weights of the dense layer. Here, we'll use the default method, which randomly draws uniformly distributed values from the interval [-0.07, 0.07].

$layer->initialize;

The line above defines the initialization method (in this case, the default), but it does not create the weights yet. The number of weights depends on the number of inputs, which the layer does not know until it receives data.

So our next step is to create an NDArray of shape (3,4), called $zz, which represents a batch of three samples with four inputs each. To generate its data, we'll randomly draw from the uniform distribution on the interval [-1, 1].

my $zz = mx->nd->random->uniform(-1,1,[3,4]);

And we'll feed $zz into our dense layer:

my $zlayer = $layer->($zz);

print $zlayer->aspdl;

to obtain two outputs for each of the three samples:

[
[ -0.0252413 -0.00874885]
[ -0.0602654 -0.0130806]
[ 0.024684 -0.0218156]
]

Note that Gluon automatically inferred the number of weights to initialize. It initialized eight because there are four inputs and two outputs. Let's access those (randomly generated) weights:

my $layerwd = $layer->weight->data;

print $layerwd->aspdl;

and obtain those eight weights:

[
[-0.00873779 -0.0283452 0.0548482 -0.0620602]
[ 0.0649128 -0.0318281 -0.0163182 -0.00312688]
]
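
To make the arithmetic behind those shapes concrete, here is a minimal sketch in plain Perl (no AI::MXNet calls) of what a dense layer computes: each output is a weighted sum of the inputs plus a bias. The helper dense_forward and all of the numbers are made up for illustration; only the shapes (a batch of three, four inputs, two units) follow the example above.

```perl
use strict;
use warnings;

# Hypothetical helper: forward pass of a dense layer in plain Perl.
# $weights has shape (units, in_units); $bias has one entry per unit.
sub dense_forward {
    my ($batch, $weights, $bias) = @_;
    my @out;
    for my $sample (@$batch) {
        my @row;
        for my $u (0 .. $#$weights) {
            my $sum = $bias->[$u];
            $sum += $weights->[$u][$_] * $sample->[$_] for 0 .. $#$sample;
            push @row, $sum;
        }
        push @out, \@row;
    }
    return \@out;
}

# Made-up numbers: two units and four inputs, so 2 x 4 = 8 weights
my $weights = [ [ 1, 0, 0, 0 ], [ 0, 1, 0, 1 ] ];
my $bias    = [ 0, 0 ];
my $batch   = [ [ 1, 2, 3, 4 ], [ 5, 6, 7, 8 ], [ 0, 0, 0, 0 ] ];

my $out = dense_forward($batch, $weights, $bias);
print join(" ", @$_), "\n" for @$out;    # three rows of two outputs each
```

The weight matrix has one row per output unit and one column per input, which is why the weights printed above form a 2 x 4 array.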

chain layers into a neural network

In practice, a neural network would have a chain of layers, so let's consider the relatively simple case of the LeNet network and implement it with nn->Sequential.

my $net = nn->Sequential();

$net->name_scope(sub {
$net->add(nn->Conv2D(channels=>6, kernel_size=>5, activation=>'relu'));
$net->add(nn->MaxPool2D(pool_size=>2, strides=>2));
$net->add(nn->Conv2D(channels=>16, kernel_size=>3, activation=>'relu'));
$net->add(nn->MaxPool2D(pool_size=>2, strides=>2));
$net->add(nn->Flatten());
$net->add(nn->Dense(120, activation=>"relu"));
$net->add(nn->Dense(84, activation=>"relu"));
$net->add(nn->Dense(10));
});

For a concise summary of the model, we can print it:

print $net;

to obtain:

Sequential(
  (0): Conv2D(6, kernel_size=(5,5), stride=(1,1))
  (1): MaxPool2D(size=(2,2), stride=(2,2), padding=(0,0), ceil_mode=0)
  (2): Conv2D(16, kernel_size=(3,3), stride=(1,1))
  (3): MaxPool2D(size=(2,2), stride=(2,2), padding=(0,0), ceil_mode=0)
  (4): Flatten
  (5): Dense(120 -> 0, Activation(relu))
  (6): Dense(84 -> 0, Activation(relu))
  (7): Dense(10 -> 0, linear)
)

As before, our next step is to define the initialization method:

$net->initialize();

and pass some data $xx through the network:

my $xx = mx->nd->random->uniform(shape=>[4,1,28,28]);

my $yy = $net->($xx);
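
As a rough check on how the 28 x 28 input shrinks on its way through the network, the sketch below applies the usual output-size formula for convolution and pooling layers in plain Perl. The helper out_size is hypothetical, and the sketch assumes no padding and no dilation, which matches the stride and padding values in the summary printed above.

```perl
use strict;
use warnings;

# Output length along one spatial axis of a convolution or pooling layer,
# assuming no dilation: floor((in + 2*pad - kernel) / stride) + 1
sub out_size {
    my ($in, $kernel, $stride, $pad) = @_;
    return int( ($in + 2 * $pad - $kernel) / $stride ) + 1;
}

my $size = 28;                       # height and width of $xx
$size = out_size($size, 5, 1, 0);    # Conv2D, kernel_size 5:  28 -> 24
$size = out_size($size, 2, 2, 0);    # MaxPool2D, pool_size 2: 24 -> 12
$size = out_size($size, 3, 1, 0);    # Conv2D, kernel_size 3:  12 -> 10
$size = out_size($size, 2, 2, 0);    # MaxPool2D, pool_size 2: 10 -> 5

my $flat = 16 * $size * $size;       # Flatten: 16 channels x 5 x 5
print "spatial size: $size, flattened inputs: $flat\n";
```

Under these assumptions, the first dense layer receives 400 inputs per sample, and the final layer returns 10 outputs for each of the 4 samples in the batch.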

In Perl, the network behaves like an array reference, so we can access the i-th layer of the network just as we would access the i-th element of an array: ${$net}[$i]. For example, to print the shapes of the first layer's weight and the sixth layer's bias:

print 'shape of ${$net}[0] : ['. join(",", @{${$net}[0]->weight->data->shape }).']'."\n";

print 'shape of ${$net}[5] : ['. join(",", @{${$net}[5]->bias->data->shape }).']'."\n";

which returns:

shape of ${$net}[0] : [6,1,5,5]
shape of ${$net}[5] : [120]
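
A shape also tells us how many parameters an array holds: the count is the product of its dimensions. The plain-Perl sketch below (the helper num_elements is hypothetical) applies that to the two shapes printed above.

```perl
use strict;
use warnings;

# Number of elements in an array with the given shape
sub num_elements {
    my $count = 1;
    $count *= $_ for @_;
    return $count;
}

my $conv_weights = num_elements(6, 1, 5, 5);   # first layer's weight
my $dense_biases = num_elements(120);          # sixth layer's bias

print "first layer weights: $conv_weights\n";  # 6 * 1 * 5 * 5 = 150
print "sixth layer biases: $dense_biases\n";   # 120
```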

create a neural network flexibly

The nn->Sequential method that we used above automatically constructs the "forward" function that sequentially executes each added layer. For more flexibility, we can create a subclass of AI::MXNet::Gluon::Block that implements two methods: an initialization method that creates the layers and a method that defines the forward function.

{   package MixMLP;
use strict;
use warnings;
use AI::MXNet::Gluon::Mouse;
use AI::MXNet::Function::Parameters;
extends 'AI::MXNet::Gluon::Block';

sub BUILD {
  my $self = shift;
  $self->name_scope(
    sub {
      $self->blk( nn->Sequential );
      $self->blk->name_scope(
        sub {
          $self->blk->add( nn->Dense(3, activation=>'relu'));
          $self->blk->add( nn->Dense(4, activation=>'relu'));
        });
      $self->dense(nn->Dense(5));
    });
}

method forward($xx) {
  my $yy = mx->nd->relu( $self->blk->($xx) );
  return $self->dense->($yy);
}

}

The sequential approach that we used above adds layers (instances of classes derived from AI::MXNet::Gluon::Block) to a container and then runs them in order during the forward pass. This example demonstrates a more flexible way to define the forward function. In this case, it applies the relu activation function to the intermediate result from the blk container, then passes that result to the final dense layer.
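
The difference between the two approaches can be sketched in plain Perl, without AI::MXNet. Here the "layers" are ordinary code references with made-up arithmetic, and sequential_forward and custom_forward are hypothetical names; the point is only that a sequential container runs its layers in order, while a hand-written forward function is free to do extra work between steps.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for layers: each one is just a code reference
my @layers = (
    sub { $_[0] * 2 },    # first "layer" doubles its input
    sub { $_[0] + 3 },    # second "layer" adds three
);

# What a sequential container does: run the layers in order
sub sequential_forward {
    my ($input, @chain) = @_;
    $input = $_->($input) for @chain;
    return $input;
}

# A custom forward function can do extra work between the steps
sub custom_forward {
    my ($input) = @_;
    my $hidden = sequential_forward($input, @layers);
    $hidden = 0 if $hidden < 0;    # relu on the intermediate result
    return $hidden - 1;            # stand-in for the final layer
}

print sequential_forward(5, @layers), "\n";    # (5 * 2) + 3 = 13
print custom_forward(-4), "\n";                # -5 -> relu -> 0 -> -1
```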

So let's use our new class to create the network. And let's print a summary of it:

my $mixmlp = MixMLP->new();

print $mixmlp;

which returns:

MixMLP(
  (blk): Sequential(
    (0): Dense(3 -> 0, Activation(relu))
    (1): Dense(4 -> 0, Activation(relu))
  )
  (dense): Dense(5 -> 0, linear)
)

As before, we define the initialization method. And once again, we use the default.

$mixmlp->initialize();

For this example, we'll use random draws to simulate a batch of two samples with two inputs each:

my $ww = mx->nd->random->uniform(shape=>[2,2]);

and pass it to the network:

$mixmlp->($ww);

Having initialized the network, let's examine the weight of a particular layer:

print ${$mixmlp->blk}[1]->weight->data->aspdl;

which returns:

[
[ -0.0343901 -0.0580586 -0.0618759]
[ -0.0621014 -0.00918167 -0.00170272]
[ -0.0263486 0.0533406 0.0274881]
[ 0.0666966 -0.0171147 0.0164721]
]
