--- title: RNNPlus keywords: fastai sidebar: home_sidebar summary: "These are RNN, LSTM and GRU PyTorch implementations created by Ignacio Oguiza - timeseriesAI@gmail.com based on:" description: "These are RNN, LSTM and GRU PyTorch implementations created by Ignacio Oguiza - timeseriesAI@gmail.com based on:" nb_path: "nbs/105_models.RNNPlus.ipynb" ---
{% raw %}
{% endraw %}

The idea of adding a feature extractor in front of the RNN comes from the solution developed by the UPSTAGE team (https://www.kaggle.com/songwonho, https://www.kaggle.com/limerobot and https://www.kaggle.com/jungikhyo), who finished in 3rd place in Kaggle's Google Brain - Ventilator Pressure Prediction competition using a Conv1d + stacked LSTM architecture.
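As a minimal sketch of that idea (the Conv1d width and kernel size are illustrative choices, not the UPSTAGE team's exact settings), any module that maps (batch, c_in, seq_len) to (batch, new_channels, seq_len) can be passed as feature_extractor; the recurrent input size is inferred from its output, as the MultiConv1d tests further down show:

import torch
import torch.nn as nn
from tsai.models.RNNPlus import LSTMPlus

bs, c_in, seq_len, c_out = 16, 3, 12, 2
xb = torch.rand(bs, c_in, seq_len)

# Conv1d feature extractor: expands the raw channels before the stacked LSTM.
# padding=3 keeps the sequence length unchanged for kernel_size=7.
feature_extractor = nn.Conv1d(c_in, 64, kernel_size=7, padding=3)
model = LSTMPlus(c_in, c_out, seq_len, hidden_size=[128, 64],
                 feature_extractor=feature_extractor)
assert model(xb).shape == (bs, c_out)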

{% raw %}
{% endraw %} {% raw %}
{% endraw %} {% raw %}

class RNNPlus[source]

RNNPlus(c_in, c_out, seq_len=None, hidden_size=[100], n_layers=1, bias=True, rnn_dropout=0, bidirectional=False, n_embeds=None, embed_dims=None, padding_idxs=None, cat_pos=None, feature_extractor=None, fc_dropout=0.0, last_step=True, bn=False, custom_head=None, y_range=None, init_weights=True) :: _RNNPlus_Base

A sequential container. Modules will be added to it in the order they are passed in the constructor. Alternatively, an OrderedDict of modules can be passed in. The forward() method of Sequential accepts any input and forwards it to the first module it contains. It then "chains" outputs to inputs sequentially for each subsequent module, finally returning the output of the last module.

The value a Sequential provides over manually calling a sequence of modules is that it allows treating the whole container as a single module, such that performing a transformation on the Sequential applies to each of the modules it stores (which are each a registered submodule of the Sequential).

What's the difference between a Sequential and a torch.nn.ModuleList? A ModuleList is exactly what it sounds like: a list for storing Modules. On the other hand, the layers in a Sequential are connected in a cascading way.

Example:

# Using Sequential to create a small model. When `model` is run,
# input will first be passed to `Conv2d(1,20,5)`. The output of
# `Conv2d(1,20,5)` will be used as the input to the first
# `ReLU`; the output of the first `ReLU` will become the input
# for `Conv2d(20,64,5)`. Finally, the output of
# `Conv2d(20,64,5)` will be used as input to the second `ReLU`
model = nn.Sequential(
          nn.Conv2d(1,20,5),
          nn.ReLU(),
          nn.Conv2d(20,64,5),
          nn.ReLU()
        )

# Using Sequential with OrderedDict. This is functionally the
# same as the above code
model = nn.Sequential(OrderedDict([
          ('conv1', nn.Conv2d(1,20,5)),
          ('relu1', nn.ReLU()),
          ('conv2', nn.Conv2d(20,64,5)),
          ('relu2', nn.ReLU())
        ]))
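
One consequence worth seeing in code: because each stored module is a registered submodule, a function applied to the container reaches every layer. A small illustration using the standard Module.apply (the init_linear helper is hypothetical):

import torch.nn as nn

# .apply() visits the Sequential itself and every registered submodule,
# so both Linear layers below are re-initialized.
def init_linear(m):
    if isinstance(m, nn.Linear):
        nn.init.xavier_uniform_(m.weight)

model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
model.apply(init_linear)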
{% endraw %} {% raw %}

class LSTMPlus[source]

LSTMPlus(c_in, c_out, seq_len=None, hidden_size=[100], n_layers=1, bias=True, rnn_dropout=0, bidirectional=False, n_embeds=None, embed_dims=None, padding_idxs=None, cat_pos=None, feature_extractor=None, fc_dropout=0.0, last_step=True, bn=False, custom_head=None, y_range=None, init_weights=True) :: _RNNPlus_Base

A sequential container; see the nn.Sequential description under RNNPlus above.
{% endraw %} {% raw %}

class GRUPlus[source]

GRUPlus(c_in, c_out, seq_len=None, hidden_size=[100], n_layers=1, bias=True, rnn_dropout=0, bidirectional=False, n_embeds=None, embed_dims=None, padding_idxs=None, cat_pos=None, feature_extractor=None, fc_dropout=0.0, last_step=True, bn=False, custom_head=None, y_range=None, init_weights=True) :: _RNNPlus_Base

A sequential container; see the nn.Sequential description under RNNPlus above.
{% endraw %} {% raw %}
{% endraw %} {% raw %}
bs = 16
c_in = 3
seq_len = 12
c_out = 2
xb = torch.rand(bs, c_in, seq_len)
test_eq(RNNPlus(c_in, c_out)(xb).shape, [bs, c_out])
test_eq(RNNPlus(c_in, c_out, hidden_size=100, n_layers=2, bias=True, rnn_dropout=0.2, bidirectional=True, fc_dropout=0.5)(xb).shape, [bs, c_out])
test_eq(RNNPlus(c_in, c_out, hidden_size=[100, 50, 10], bias=True, rnn_dropout=0.2, bidirectional=True, fc_dropout=0.5)(xb).shape, [bs, c_out])
test_eq(RNNPlus(c_in, c_out, hidden_size=[100], n_layers=2, bias=True, rnn_dropout=0.2, bidirectional=True, fc_dropout=0.5)(xb).shape, 
        [bs, c_out])
test_eq(LSTMPlus(c_in, c_out, hidden_size=100, n_layers=2, bias=True, rnn_dropout=0.2, bidirectional=True, fc_dropout=0.5)(xb).shape, [bs, c_out])
test_eq(GRUPlus(c_in, c_out, hidden_size=100, n_layers=2, bias=True, rnn_dropout=0.2, bidirectional=True, fc_dropout=0.5)(xb).shape, [bs, c_out])
test_eq(RNNPlus(c_in, c_out, seq_len, last_step=False)(xb).shape, [bs, c_out])
test_eq(RNNPlus(c_in, c_out, seq_len, hidden_size=100, n_layers=2, bias=True, rnn_dropout=0.2, bidirectional=True, fc_dropout=0.5, last_step=False)(xb).shape, 
        [bs, c_out])
test_eq(LSTMPlus(c_in, c_out, seq_len, last_step=False)(xb).shape, [bs, c_out])
test_eq(GRUPlus(c_in, c_out, seq_len, last_step=False)(xb).shape, [bs, c_out])
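
Note the pattern in the tests above: with the default last_step=True the head consumes only the final RNN step, so seq_len may be omitted; with last_step=False the full output sequence feeds the head, which therefore needs seq_len to be sized. A minimal sketch:

import torch
from tsai.models.RNNPlus import GRUPlus

bs, c_in, seq_len, c_out = 16, 3, 12, 2
xb = torch.rand(bs, c_in, seq_len)

# Default head: last hidden state only (seq_len optional).
assert GRUPlus(c_in, c_out)(xb).shape == (bs, c_out)
# last_step=False: the whole output sequence is used (seq_len required).
assert GRUPlus(c_in, c_out, seq_len, last_step=False)(xb).shape == (bs, c_out)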
{% endraw %} {% raw %}
feature_extractor = MultiConv1d(c_in, kss=[1,3,5,7])
custom_head = nn.Sequential(Transpose(1,2), nn.Linear(8,8), nn.SELU(), nn.Linear(8, 1), Squeeze())
test_eq(LSTMPlus(c_in, c_out, seq_len, hidden_size=[32,16,8,4], bidirectional=True, 
                 feature_extractor=feature_extractor, custom_head=custom_head)(xb).shape, [bs, seq_len])
feature_extractor = MultiConv1d(c_in, kss=[1,3,5,7], keep_original=True)
custom_head = nn.Sequential(Transpose(1,2), nn.Linear(8,8), nn.SELU(), nn.Linear(8, 1), Squeeze())
test_eq(LSTMPlus(c_in, c_out, seq_len, hidden_size=[32,16,8,4], bidirectional=True, 
                 feature_extractor=feature_extractor, custom_head=custom_head)(xb).shape, [bs, seq_len])
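
The Linear(8, 8) at the start of both custom heads matches the backbone output width: with hidden_size=[32, 16, 8, 4] the last recurrent layer has 4 units, doubled to 8 by bidirectional=True. A comment-level trace of the head, assuming the tensors above:

# Backbone output:      (bs, 8, seq_len)   4 hidden units * 2 directions
# Transpose(1, 2):      (bs, seq_len, 8)
# Linear(8, 8) + SELU:  (bs, seq_len, 8)
# Linear(8, 1):         (bs, seq_len, 1)
# Squeeze():            (bs, seq_len)      one prediction per time step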
{% endraw %} {% raw %}
bs = 16
c_in = 3
seq_len = 12
c_out = 2
x1 = torch.rand(bs,1,seq_len)
x2 = torch.randint(0,3,(bs,1,seq_len))
x3 = torch.randint(0,5,(bs,1,seq_len))
xb = torch.cat([x1,x2,x3],1)

custom_head = partial(create_mlp_head, fc_dropout=0.5)
test_eq(LSTMPlus(c_in, c_out, seq_len, last_step=False, custom_head=custom_head)(xb).shape, [bs, c_out])
custom_head = partial(create_pool_head, concat_pool=True, fc_dropout=0.5)
test_eq(LSTMPlus(c_in, c_out, seq_len, last_step=False, custom_head=custom_head)(xb).shape, [bs, c_out])
custom_head = partial(create_pool_plus_head, fc_dropout=0.5)
test_eq(LSTMPlus(c_in, c_out, seq_len, last_step=False, custom_head=custom_head)(xb).shape, [bs, c_out])
custom_head = partial(create_conv_head)
test_eq(LSTMPlus(c_in, c_out, seq_len, last_step=False, custom_head=custom_head)(xb).shape, [bs, c_out])
test_eq(LSTMPlus(c_in, c_out, seq_len, hidden_size=[100, 50], n_layers=2, bias=True, rnn_dropout=0.2, bidirectional=True)(xb).shape, [bs, c_out])

n_embeds = [3, 5]
cat_pos = [1, 2]
m = LSTMPlus(c_in, c_out, seq_len, hidden_size=[100, 50], n_layers=2, bias=True, rnn_dropout=0.2, bidirectional=True, n_embeds=n_embeds, cat_pos=cat_pos)
test_eq(m(xb).shape, [bs, c_out])
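
In the last test, n_embeds=[3, 5] gives the cardinality of each categorical channel and cat_pos=[1, 2] their channel indices, matching the randint ranges used to build x2 and x3; each categorical channel is replaced by a learned embedding before the recurrent layers. A standalone sketch (the embed_dims values, one width per categorical variable, are an assumption for illustration; with the default embed_dims=None the library chooses them):

import torch
from tsai.models.RNNPlus import LSTMPlus

bs, c_in, seq_len, c_out = 16, 3, 12, 2
x1 = torch.rand(bs, 1, seq_len)             # continuous channel 0
x2 = torch.randint(0, 3, (bs, 1, seq_len))  # categorical channel 1, 3 levels
x3 = torch.randint(0, 5, (bs, 1, seq_len))  # categorical channel 2, 5 levels
xb = torch.cat([x1, x2, x3], 1)

# Channels 1 and 2 are embedded before the LSTM; embed_dims=[2, 3] is an
# illustrative assumption, not a library default.
m = LSTMPlus(c_in, c_out, seq_len, n_embeds=[3, 5], cat_pos=[1, 2], embed_dims=[2, 3])
assert m(xb).shape == (bs, c_out)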
{% endraw %} {% raw %}
from tsai.data.all import *
from tsai.models.utils import *
dsid = 'NATOPS' 
bs = 16
X, y, splits = get_UCR_data(dsid, return_split=False)
tfms  = [None, [Categorize()]]
dls = get_ts_dls(X, y, tfms=tfms, splits=splits, bs=bs)
{% endraw %} {% raw %}
model = build_ts_model(LSTMPlus, dls=dls)
print(model[-1])
learn = Learner(dls, model, metrics=accuracy)
learn.fit_one_cycle(1, 3e-3)
Sequential(
  (0): LastStep()
  (1): Linear(in_features=100, out_features=6, bias=True)
)
| epoch | train_loss | valid_loss | accuracy | time  |
|-------|------------|------------|----------|-------|
| 0     | 1.770549   | 1.695585   | 0.244444 | 00:01 |
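
After training, validation predictions follow the standard fastai workflow (a minimal sketch using the learn object above):

# Class probabilities and targets for the validation split (ds_idx=1).
probs, targets = learn.get_preds(ds_idx=1)
print(probs.shape)  # (n_valid, 6): NATOPS has 6 classes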
{% endraw %} {% raw %}
model = LSTMPlus(dls.vars, dls.c, dls.len, last_step=False)
learn = Learner(dls, model, metrics=accuracy)
learn.fit_one_cycle(1, 3e-3)
| epoch | train_loss | valid_loss | accuracy | time  |
|-------|------------|------------|----------|-------|
| 0     | 1.120653   | 0.634209   | 0.738889 | 00:02 |
{% endraw %} {% raw %}
custom_head = partial(create_pool_head, concat_pool=True)
model = LSTMPlus(dls.vars, dls.c, dls.len, last_step=False, custom_head=custom_head)
learn = Learner(dls, model, metrics=accuracy)
learn.fit_one_cycle(1, 3e-3)
| epoch | train_loss | valid_loss | accuracy | time  |
|-------|------------|------------|----------|-------|
| 0     | 1.711861   | 1.544655   | 0.477778 | 00:02 |
{% endraw %} {% raw %}
custom_head = partial(create_pool_plus_head, concat_pool=True)
model = LSTMPlus(dls.vars, dls.c, dls.len, last_step=False, custom_head=custom_head)
learn = Learner(dls, model, metrics=accuracy)
learn.fit_one_cycle(1, 3e-3)
| epoch | train_loss | valid_loss | accuracy | time  |
|-------|------------|------------|----------|-------|
| 0     | 0.973254   | 1.331597   | 0.672222 | 00:02 |
{% endraw %} {% raw %}
m = RNNPlus(c_in, c_out, seq_len, hidden_size=100, n_layers=2, bidirectional=True, rnn_dropout=.5, fc_dropout=.5)
print(m)
print(count_parameters(m))
m(xb).shape
RNNPlus(
  (backbone): _RNN_Backbone(
    (to_cat_embed): Identity()
    (feature_extractor): Identity()
    (rnn): Sequential(
      (0): RNN(3, 100, num_layers=2, batch_first=True, dropout=0.5, bidirectional=True)
      (1): LSTMOutput()
    )
    (transpose): Transpose(dims=-1, -2).contiguous()
  )
  (head): Sequential(
    (0): LastStep()
    (1): Dropout(p=0.5, inplace=False)
    (2): Linear(in_features=200, out_features=2, bias=True)
  )
)
81802
torch.Size([16, 2])
{% endraw %} {% raw %}
m = LSTMPlus(c_in, c_out, seq_len, hidden_size=100, n_layers=2, bidirectional=True, rnn_dropout=.5, fc_dropout=.5)
print(m)
print(count_parameters(m))
m(xb).shape
LSTMPlus(
  (backbone): _RNN_Backbone(
    (to_cat_embed): Identity()
    (feature_extractor): Identity()
    (rnn): Sequential(
      (0): LSTM(3, 100, num_layers=2, batch_first=True, dropout=0.5, bidirectional=True)
      (1): LSTMOutput()
    )
    (transpose): Transpose(dims=-1, -2).contiguous()
  )
  (head): Sequential(
    (0): LastStep()
    (1): Dropout(p=0.5, inplace=False)
    (2): Linear(in_features=200, out_features=2, bias=True)
  )
)
326002
torch.Size([16, 2])
{% endraw %} {% raw %}
m = GRUPlus(c_in, c_out, seq_len, hidden_size=100, n_layers=2, bidirectional=True, rnn_dropout=.5, fc_dropout=.5)
print(m)
print(count_parameters(m))
m(xb).shape
GRUPlus(
  (backbone): _RNN_Backbone(
    (to_cat_embed): Identity()
    (feature_extractor): Identity()
    (rnn): Sequential(
      (0): GRU(3, 100, num_layers=2, batch_first=True, dropout=0.5, bidirectional=True)
      (1): LSTMOutput()
    )
    (transpose): Transpose(dims=-1, -2).contiguous()
  )
  (head): Sequential(
    (0): LastStep()
    (1): Dropout(p=0.5, inplace=False)
    (2): Linear(in_features=200, out_features=2, bias=True)
  )
)
244602
torch.Size([16, 2])
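
The printed parameter counts can be verified by hand. Per direction, a recurrent layer holds g * (h*(i+h) + 2*h) parameters, where g is the number of gate blocks (RNN: 1, GRU: 3, LSTM: 4), i is the layer's input size and h the hidden size (PyTorch stores two bias vectors per gate block). A quick check for the three 2-layer bidirectional models above:

# g: gate blocks per layer; layer 2 receives 2*h features (both directions).
def n_params(g, i=3, h=100, n_dirs=2, c_out=2):
    layer1 = n_dirs * g * (h * (i + h) + 2 * h)
    layer2 = n_dirs * g * (h * (n_dirs * h + h) + 2 * h)
    head = n_dirs * h * c_out + c_out   # Linear(200 -> 2)
    return layer1 + layer2 + head

assert n_params(1) == 81802    # RNNPlus
assert n_params(4) == 326002   # LSTMPlus
assert n_params(3) == 244602   # GRUPlus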
{% endraw %}