
Transformer Model for Language Understanding

A Transformer model implemented from scratch and trained on the Portuguese-English translation dataset from the TED Talks Open Translation Project, translating Portuguese phrases into English.

The Transformer is built from scratch, component by component: scaled dot-product attention, then multi-head attention, then the encoder and decoder layers, and finally the full encoder and decoder stacks. Input embeddings, positional encodings, and the final feed-forward layers are added before combining all components into a trainable Transformer network.
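The scaled dot-product attention at the core of those components can be sketched in plain NumPy as follows (a minimal illustration of the formula softmax(QKᵀ/√d_k)·V; the function name and shapes are illustrative, not the repo's actual code):

```python
import numpy as np

def scaled_dot_product_attention(q, k, v, mask=None):
    """Compute softmax(Q K^T / sqrt(d_k)) V and return (output, attention weights)."""
    d_k = q.shape[-1]
    scores = q @ k.swapaxes(-2, -1) / np.sqrt(d_k)  # (..., seq_q, seq_k)
    if mask is not None:
        scores = np.where(mask, scores, -1e9)  # block disallowed positions before softmax
    # numerically stable softmax over the key axis
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

# toy example: 3 query positions, 3 key/value positions, d_k = 4
rng = np.random.default_rng(0)
q = rng.standard_normal((3, 4))
k = rng.standard_normal((3, 4))
v = rng.standard_normal((3, 4))
out, w = scaled_dot_product_attention(q, k, v)
```

Multi-head attention then runs this routine in parallel over several learned projections of Q, K, and V and concatenates the results.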

Transformer Diagram

The raw text is tokenized with a pretrained BERT-style subword tokenizer optimized for the Portuguese-English translation dataset.
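Subword tokenization lets the model handle words it never saw in training by splitting them into known pieces. A toy greedy longest-match-first WordPiece split illustrates the idea (the vocabulary and helper below are invented for illustration; they are not the actual trained tokenizer):

```python
def wordpiece_tokenize(word, vocab):
    """Greedily split a word into the longest subword pieces found in vocab.

    Continuation pieces carry the conventional '##' prefix; a word with no
    valid split maps to the unknown token.
    """
    pieces, start = [], 0
    while start < len(word):
        end, piece = len(word), None
        while end > start:
            sub = ("##" if start > 0 else "") + word[start:end]
            if sub in vocab:
                piece = sub
                break
            end -= 1
        if piece is None:
            return ["[UNK]"]
        pieces.append(piece)
        start = end
    return pieces

# toy vocabulary: "triceratops" is out-of-vocabulary as a whole word,
# but decomposes into known subword pieces
vocab = {"livro", "tri", "##cer", "##atops"}
pieces = wordpiece_tokenize("triceratops", vocab)  # ["tri", "##cer", "##atops"]
```

This is why the "unknown vocabulary" example below still translates: rare words are represented as sequences of familiar subwords rather than a single unknown token.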

After training for 20 epochs, the Transformer reliably translates Portuguese sentences into English, achieving 68% accuracy on the Portuguese-English test set. Below are a few example translations, along with attention-weight plots that show how the model attends to the input while producing each output token.

Attention weight plots for a simple sentence (known vocabulary) (Portuguese: "este é o primeiro livro que eu fiz.", ground truth: "this is the first book i've ever done."):

Attention weights for simple sentence

Attention weight plots for a more complex sentence (contains unknown vocabulary) (Portuguese: "Eu li sobre triceratops na enciclopédia.", ground truth: "I read about triceratops in the encyclopedia."):

Attention weights for complex sentence
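Heatmaps like the ones above can be produced with matplotlib along these lines (a minimal sketch with invented toy weights and a hypothetical helper name, not the repo's plotting code):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display
import matplotlib.pyplot as plt

def plot_attention_head(weights, in_tokens, out_tokens, ax):
    """Render one attention head as a heatmap: rows = output tokens, cols = input tokens."""
    ax.matshow(weights)
    ax.set_xticks(range(len(in_tokens)))
    ax.set_yticks(range(len(out_tokens)))
    ax.set_xticklabels(in_tokens, rotation=90)
    ax.set_yticklabels(out_tokens)

# toy weights: each row sums to 1, as softmax over input positions would produce
in_toks = ["este", "é", "o", "primeiro", "livro"]
out_toks = ["this", "is", "the", "first", "book"]
w = np.full((len(out_toks), len(in_toks)), 0.1)
np.fill_diagonal(w, 0.6)

fig, ax = plt.subplots()
plot_attention_head(w, in_toks, out_toks, ax)
```

In a real plot the weights come straight out of the decoder's cross-attention layers; bright cells mark which Portuguese tokens the model focused on for each English token.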
