Language Models
For your convenience, we have included some ready-to-use neural network models in NLarge.
Ready-to-Use Models
Several NLP models are implemented as ready-to-use classes. Each class provides a different neural network architecture for text classification, allowing you to select a model based on specific needs such as attention mechanisms, multi-head attention, or recurrent layers (LSTM, GRU, and vanilla RNN).
RNN.py
This module offers models based on a vanilla RNN for text classification, providing alternatives with and without max pooling.
1. TextClassifierRNN
A simple RNN-based classifier that applies a fully connected layer to the final hidden state for sequence classification. Suitable for shorter sequences, where RNN limitations such as vanishing gradients are less significant.
2. TextClassifierRNNMaxPool
Extends TextClassifierRNN by applying a max-pooling operation over the RNN outputs. This version can better capture the most relevant features across the entire sequence.
Example Usage:
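The snippet below is a minimal sketch of how the two RNN classifiers might be constructed and called. The import path and the constructor/forward signatures (vocabulary size, embedding and hidden dimensions, number of output classes) are assumptions made for illustration; check RNN.py for the exact API.

```python
import torch
from NLarge.model.RNN import TextClassifierRNN, TextClassifierRNNMaxPool  # assumed import path

# Hypothetical hyperparameters -- argument names are illustrative, not the documented API.
VOCAB_SIZE, EMBED_DIM, HIDDEN_DIM, OUTPUT_DIM = 20_000, 100, 256, 2

# Classifier that predicts from the final hidden state.
rnn_clf = TextClassifierRNN(VOCAB_SIZE, EMBED_DIM, HIDDEN_DIM, OUTPUT_DIM)

# Variant that max-pools over all time-step outputs instead.
rnn_maxpool_clf = TextClassifierRNNMaxPool(VOCAB_SIZE, EMBED_DIM, HIDDEN_DIM, OUTPUT_DIM)

# Dummy batch of token IDs with shape (batch_size, sequence_length).
token_ids = torch.randint(0, VOCAB_SIZE, (8, 50))
logits = rnn_clf(token_ids)              # expected shape: (8, OUTPUT_DIM)
logits_pool = rnn_maxpool_clf(token_ids)
```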
LSTM.py
This module provides LSTM-based models for text classification, with and without attention mechanisms.
1. TextClassifierLSTM
Implements a bidirectional LSTM classifier, where the final hidden state of the sequence is passed to a fully connected layer for binary classification.
2. Attention
An attention layer for use with LSTM outputs. It computes attention scores for each step in the sequence, providing a weighted sum of hidden states based on their importance.
3. TextClassifierLSTMWithAttention
An LSTM classifier incorporating the attention mechanism. It uses a bidirectional LSTM to capture contextual information, followed by attention to highlight critical sequence components, which enhances interpretability.
Example Usage:
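A comparable sketch for the LSTM classifiers is shown below. Again, the import path and signatures are assumptions, and the forward call may additionally expect sequence lengths for packed sequences; consult LSTM.py before use.

```python
import torch
from NLarge.model.LSTM import (  # assumed import path
    TextClassifierLSTM,
    TextClassifierLSTMWithAttention,
)

# Hypothetical hyperparameters -- verify the real constructor arguments in LSTM.py.
VOCAB_SIZE, EMBED_DIM, HIDDEN_DIM, OUTPUT_DIM = 20_000, 100, 256, 2

# Bidirectional LSTM classifier using the final hidden state.
lstm_clf = TextClassifierLSTM(VOCAB_SIZE, EMBED_DIM, HIDDEN_DIM, OUTPUT_DIM)

# Same backbone with an attention layer over the per-step outputs.
lstm_attn_clf = TextClassifierLSTMWithAttention(VOCAB_SIZE, EMBED_DIM, HIDDEN_DIM, OUTPUT_DIM)

token_ids = torch.randint(0, VOCAB_SIZE, (8, 50))  # (batch, seq_len)
with torch.no_grad():
    logits = lstm_attn_clf(token_ids)              # expected: (8, OUTPUT_DIM)
```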
GRU.py
This module contains GRU-based models for text classification, providing options with and without attention.
1. TextClassifierGRU
Implements a GRU-based classifier with bidirectional GRU layers, capturing context from both directions of the sequence. A fully connected layer is applied to the final hidden state to produce the classification.
2. Attention
This class defines an attention mechanism specific to the GRU output, helping focus on the most informative parts of the sequence.
3. TextClassifierGRUWithAttention
A GRU-based classifier that incorporates attention to focus on important parts of the sequence before the final classification. It uses a bidirectional GRU, applies attention to its outputs, and passes the resulting context vector to the final output layer.
Example Usage:
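The sketch below illustrates the GRU variants under the same assumed signatures and import path; the conversion of logits to probabilities at the end is only an example of how the output might be consumed.

```python
import torch
from NLarge.model.GRU import TextClassifierGRU, TextClassifierGRUWithAttention  # assumed import path

# Hypothetical hyperparameters -- check GRU.py for the actual constructor arguments.
VOCAB_SIZE, EMBED_DIM, HIDDEN_DIM, OUTPUT_DIM = 20_000, 100, 256, 2

# Bidirectional GRU classifier using the final hidden state.
gru_clf = TextClassifierGRU(VOCAB_SIZE, EMBED_DIM, HIDDEN_DIM, OUTPUT_DIM)

# Variant with attention over the GRU outputs.
gru_attn_clf = TextClassifierGRUWithAttention(VOCAB_SIZE, EMBED_DIM, HIDDEN_DIM, OUTPUT_DIM)

token_ids = torch.randint(0, VOCAB_SIZE, (4, 64))  # (batch, seq_len)
logits = gru_attn_clf(token_ids)
probs = torch.softmax(logits, dim=-1)   # per-class probabilities for each example
predictions = probs.argmax(dim=-1)      # predicted class index per example
```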
attention.py
This module provides attention-based models for text classification.
1. TextClassifierAttentionNetwork
Implements a basic attention mechanism with query, key, and value layers. The model computes attention weights, uses them to form a weighted context vector, and then applies a fully connected layer followed by a sigmoid activation for binary classification.
2. MultiHeadAttention
A multi-head attention layer that splits the input into multiple heads to capture various aspects of the input representation, making it effective in handling complex relationships in the input sequence.
3. TextClassifierMultiHeadAttentionNetwork
Uses the MultiHeadAttention class for multi-head attention, followed by a fully connected layer for classification. It aggregates attention across multiple heads to increase model interpretability and capture richer sequence-level information.
Example Usage:
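Finally, a sketch for the attention-based classifiers. The import path, the argument names, and in particular the num_heads argument of the multi-head variant are assumptions for illustration; since the single-head network is described as ending in a sigmoid, its output is treated as a probability here.

```python
import torch
from NLarge.model.attention import (  # assumed import path
    TextClassifierAttentionNetwork,
    TextClassifierMultiHeadAttentionNetwork,
)

# Hypothetical hyperparameters -- verify against attention.py.
VOCAB_SIZE, EMBED_DIM, HIDDEN_DIM = 20_000, 100, 256

# Single-head attention classifier (sigmoid output for binary classification).
attn_clf = TextClassifierAttentionNetwork(VOCAB_SIZE, EMBED_DIM, HIDDEN_DIM, 1)

# Multi-head variant; the num_heads keyword is a hypothetical argument name.
mha_clf = TextClassifierMultiHeadAttentionNetwork(
    VOCAB_SIZE, EMBED_DIM, HIDDEN_DIM, 1, num_heads=4
)

token_ids = torch.randint(0, VOCAB_SIZE, (8, 50))  # (batch, seq_len)
probs = attn_clf(token_ids)       # assumed sigmoid output in [0, 1]
probs_mha = mha_clf(token_ids)
```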