Course 2

Deep Learning for Machine Translation
Hassan Sajjad & Fahim Dalvi (Hamad Bin Khalifa University Qatar)

Statistical methods have dominated the field of machine translation for almost a decade now. These methods use a parallel corpus, i.e. a set of sentence pairs, where each pair consists of a source-language sentence and its corresponding target-language translation. The main objective of these methods has been to learn a mapping between source and target words, and then to use this mapping to generate translations of new source sentences. Over the last couple of years, deep neural networks have dethroned phrase-based methods and have been shown to give state-of-the-art results for machine translation.

In this lecture series, we will first cover the basics of statistical machine translation to establish the intuition behind machine translation. We will then cover the basics of neural network models: word embeddings and neural language models. Next, we will learn about an end-to-end translation system based completely on deep neural networks. In the last part of the lecture series, we will learn to peek into these neural systems and analyze what they learn about the intricacies of a language, such as morphology and syntax, without ever explicitly seeing these details in the training data. Finally, we will see how to adapt these models quickly to a required domain without retraining them from scratch.

Background reading