Deep Gesture Recognition Utility (DeepGRU)¶
DeepGRU is a neural network architecture created by Mehran Maghoumi and Joseph J. LaViola Jr. Although originally designed for gesture recognition, it is widely applicable to general sequence classification tasks.
The architecture is essentially a recurrent neural network encoder combined with an attention module that learns to place more focus on the subsequences that are most important for classification.
Rather than the commonly used long short-term memory (LSTM) unit, the authors opt for the gated recurrent unit (GRU), which has fewer parameters and therefore makes the network faster to train. Interestingly, the encoder network used in DeepGRU is not bidirectional, which is typically the standard choice for recurrent neural networks in sequence classification and sequence-to-sequence modelling. The authors found that a unidirectional encoder was sufficient, faster to train, and similar in performance to a bidirectional one.
The DeepGRU class is a PyTorch implementation of the DeepGRU architecture.
A utility function collate_fn() is also provided, which is passed to a torch.utils.data.DataLoader and specifies how batches should be formed from the provided observation sequences.
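As a sketch of how these pieces fit together, consider a toy dataset of sequence-label pairs (the sequentia import and constructor arguments below follow the API reference further down this page, and are shown as comments since they assume sequentia is installed):

```python
import torch
from torch.utils.data import DataLoader

# Toy dataset of (sequence, label) pairs, the format collate_fn expects:
# each sequence is a (T_n x D) float tensor and each label is an integer.
data = [(torch.randn(t, 3), t % 2) for t in (5, 8, 6)]

# With sequentia installed, batches would then be formed and classified
# along these lines (assumption, based on the API reference below):
#   from sequentia.classifiers.rnn import DeepGRU, collate_fn
#   loader = DataLoader(data, batch_size=2, collate_fn=collate_fn)
#   model = DeepGRU(n_features=3, n_classes=2)
```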
Note
The existing preprocessing methods in sequentia.preprocessing are currently only applicable to lists of numpy.ndarray objects, and therefore cannot be applied as transformations to torch.Tensor objects.
Unfortunately this means that the preprocessing methods can only be used to preprocess data for sequentia.classifiers.knn.KNNClassifier and sequentia.classifiers.hmm.HMMClassifier, and not sequentia.classifiers.rnn.DeepGRU.
API Reference¶

class sequentia.classifiers.rnn.DeepGRU(n_features, n_classes, dims={'fc': 256, 'gru1': 512, 'gru2': 256, 'gru3': 128}, device=None)[source]¶
A modular PyTorch implementation of the DeepGRU (Deep Gesture Recognition Utility) recurrent neural network architecture designed by Maghoumi & LaViola Jr. [1], originally for gesture recognition, but applicable to general sequence classification tasks.
 Parameters
 n_features: int
The number of features that each observation within a sequence has.
 n_classes: int
The number of different sequence classes.
dims: dict
A dictionary consisting of the dimension configuration for the GRUs and fully-connected layers.
Note
Values for the keys 'gru1', 'gru2', 'gru3' and 'fc' must all be set.
device: str, optional
The device to send the model parameters to for computation.
If no device is specified, a check is made for any available CUDA device, otherwise the CPU is used.
Notes
[1] Mehran Maghoumi & Joseph J. LaViola Jr. “DeepGRU: Deep Gesture Recognition Utility”, Advances in Visual Computing, 14th International Symposium on Visual Computing, ISVC 2019, Lake Tahoe, NV, USA, October 7–9, 2019, Proceedings, Part I (pp. 16–31)
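For illustration, a custom dims configuration might look as follows. The layer sizes here are arbitrary example values; only the four keys required by the signature above matter, and the constructor call is shown as a comment since it assumes sequentia is installed:

```python
# Custom dimension configuration for the GRU and fully-connected layers.
# The sizes are arbitrary example choices; the four keys are required.
dims = {'gru1': 256, 'gru2': 128, 'gru3': 64, 'fc': 128}

required_keys = {'gru1', 'gru2', 'gru3', 'fc'}
missing = required_keys - dims.keys()
assert not missing, f"dims is missing keys: {missing}"

# With sequentia installed, the model would then be constructed per the
# signature above (assumption):
#   model = DeepGRU(n_features=10, n_classes=5, dims=dims, device='cpu')
```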

forward(x, x_lengths)[source]¶
Passes the batched input sequences through the encoder network, attention module and classifier to generate log-softmax scores.
Note
Since log-softmax scores are returned, it is advised to use the negative log-likelihood loss torch.nn.NLLLoss.
Parameters
 x: torch.nn.utils.rnn.PackedSequence
A packed representation of a batch of input observation sequences.
 x_lengths: torch.Tensor (long/int)
A tensor of the sequence lengths of the batch in descending order.
 Returns
log_softmax: torch.Tensor (float)
A \(B\times C\) tensor of \(C\) log-softmax scores (class predictions) for each observation sequence in the batch.
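The packed input that forward() expects can be built with PyTorch's standard padding and packing utilities, and the returned log-softmax scores pair naturally with torch.nn.NLLLoss. The snippet below is an illustrative sketch that fakes the model output with a random tensor rather than calling DeepGRU itself:

```python
import torch
import torch.nn.functional as F
from torch.nn.utils.rnn import pad_sequence, pack_padded_sequence

# Three observation sequences of descending length, each with D = 4 features.
seqs = [torch.randn(7, 4), torch.randn(5, 4), torch.randn(3, 4)]
lengths = torch.tensor([7, 5, 3])

# Pad to the longest sequence, then pack; `packed` is the kind of input
# that forward() takes as x, with `lengths` as x_lengths.
padded = pad_sequence(seqs, batch_first=True)           # B x T_max x D
packed = pack_padded_sequence(padded, lengths, batch_first=True)

# Fake B x C log-softmax scores in place of forward()'s output, and pair
# them with the negative log-likelihood loss as advised above.
log_softmax = F.log_softmax(torch.randn(3, 2), dim=1)   # B x C
loss = torch.nn.NLLLoss()(log_softmax, torch.tensor([0, 1, 1]))
```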
Batching and collation¶

sequentia.classifiers.rnn.collate_fn(batch)[source]¶
Collects together univariate or multivariate sequences into a single batch, arranged in descending order of length.
Also returns the corresponding lengths and labels as torch.LongTensor objects.
Parameters
 batch: list of tuple(torch.FloatTensor, int)
Collection of \(B\) sequence-label pairs, where the \(n^\text{th}\) sequence is of shape \((T_n \times D)\) or \((T_n,)\) and the label is an integer.
 Returns
padded_sequences: torch.Tensor (float)
A tensor of size \(B \times T_\text{max} \times D\) containing all of the sequences in descending length order, padded to the length of the longest sequence in the batch.
lengths: torch.Tensor (long/int)
A tensor of the \(B\) sequence lengths in descending order.
labels: torch.Tensor (long/int)
A tensor of the \(B\) sequence labels in descending length order.
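The documented behaviour can be sketched in plain PyTorch. The collate_sketch function below is an illustrative reimplementation (not sequentia's actual code) built on torch.nn.utils.rnn.pad_sequence:

```python
import torch
from torch.nn.utils.rnn import pad_sequence

def collate_sketch(batch):
    """Sketch of the documented collate_fn behaviour (not sequentia's code):
    sort the (sequence, label) pairs by descending sequence length, pad the
    sequences, and return lengths and labels as long tensors."""
    batch = sorted(batch, key=lambda pair: len(pair[0]), reverse=True)
    # Promote univariate (T_n,) sequences to (T_n x 1) before padding.
    sequences = [s.unsqueeze(-1) if s.ndim == 1 else s for s, _ in batch]
    padded = pad_sequence(sequences, batch_first=True)   # B x T_max x D
    lengths = torch.tensor([len(s) for s, _ in batch], dtype=torch.long)
    labels = torch.tensor([label for _, label in batch], dtype=torch.long)
    return padded, lengths, labels

batch = [(torch.randn(4, 2), 0), (torch.randn(6, 2), 1), (torch.randn(5, 2), 2)]
padded, lengths, labels = collate_sketch(batch)
# padded.shape == (3, 6, 2); lengths == [6, 5, 4]; labels == [1, 2, 0]
```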