
Module Specifications.

Current Academic Year 2024 - 2025

All Module information is indicative, and this portal is an interim interface pending the full upgrade of Coursebuilder and subsequent integration to the new DCU Student Information System (DCU Key).

As such, this is a point in time view of data which will be refreshed periodically. Some fields/data may not yet be available pending the completion of the full Coursebuilder upgrade and integration project. We will post status updates as they become available. Thank you for your patience and understanding.

Date posted: September 2024

Module Title Machine Translation
Module Code CA4012 (ITS) / CSC1105 (Banner)
Faculty Engineering & Computing
School Computing
Module Co-ordinator Ellen Rushe
Module Teachers Andrew Way, Brian Davis, John McKenna, Kolawole John Adebayo, Maja Popovic
NFQ level 8
Credit Rating 7.5
Pre-requisite Not Available
Co-requisite Not Available
Compatibles Not Available
Incompatibles Not Available
Description

This course introduces the fundamentals of machine translation, including the neural approach that is now widely used.

Learning Outcomes

1. Discuss the challenges associated with machine translation including its evaluation.
2. Explain the concept of machine translation including approaches and the importance of language data.
3. Demonstrate how a statistical translation model can be inferred from a parallel corpus of texts using unsupervised machine learning techniques.
4. Explain how neural networks work in general and how they can be used for language-related tasks.
5. Explain the concepts of statistical language modelling and neural language modelling and their differences.
6. Explain the decoding process in NMT and understand the differences between decoding in SMT and decoding in NMT.
7. Demonstrate knowledge of state-of-the-art transformer-based neural machine translation.
8. Explain the differences between recurrent machine translation and transformer machine translation.
9. Train, test and evaluate an MT system using the open-source Joey NMT toolkit.



Workload Full-time hours per semester
Type                   Hours  Description
Lecture                24     Two lectures a week
Laboratory             24     One two-hour lab session a week
Group work             40     Group project
Assignment Completion  50     Individual assignment
Independent Study      50     Studying material presented in lectures, reading research papers
Total Workload: 188

All module information is indicative and subject to change. For further information, students are advised to refer to the University's Marks and Standards and Programme Specific Regulations at: http://www.dcu.ie/registry/examinations/index.shtml

Indicative Content and Learning Activities

Introduction to Machine Translation
What is machine translation? Overview of the three approaches: rule-based, statistical, neural. Importance of data for statistical and neural MT. Sentence alignment and preprocessing.

Evaluating MT systems
The relative advantages and disadvantages of human evaluation and automatic evaluation. Two main concepts used for automatic evaluation metrics: n-gram matching and edit distance.
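
As a rough illustration of the two concepts named above, the sketch below computes clipped n-gram precision (the building block of BLEU-style metrics) and word-level edit distance (the idea behind WER-style metrics). The function names and example sentences are illustrative assumptions, not part of the module materials.

```python
from collections import Counter


def ngram_precision(hypothesis, reference, n):
    """Clipped n-gram precision of a hypothesis against one reference."""
    hyp = Counter(tuple(hypothesis[i:i + n]) for i in range(len(hypothesis) - n + 1))
    ref = Counter(tuple(reference[i:i + n]) for i in range(len(reference) - n + 1))
    total = sum(hyp.values())
    if total == 0:
        return 0.0
    # Each hypothesis n-gram is credited at most as often as it occurs in the reference.
    matched = sum(min(count, ref[gram]) for gram, count in hyp.items())
    return matched / total


def edit_distance(hypothesis, reference):
    """Word-level Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(reference) + 1))
    for i, h in enumerate(hypothesis, 1):
        curr = [i]
        for j, r in enumerate(reference, 1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


hyp = "the cat sat on the mat".split()
ref = "the cat is on the mat".split()
print(ngram_precision(hyp, ref, 1), ngram_precision(hyp, ref, 2), edit_distance(hyp, ref))
```

Real metrics add further steps (for example a brevity penalty and averaging over several n-gram orders in BLEU), so this only shows the underlying matching idea.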

Statistical Machine Translation
Probability model for translation. Translation model and language model. Word alignments and IBM models. Phrase-based SMT. Decoding.
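
The probability model that ties the translation model and language model together is conventionally written in noisy-channel form. The symbols here are a notational assumption rather than something taken from the module text: f is the source sentence, e a candidate translation, P(f | e) the translation model and P(e) the language model.

```latex
\hat{e} = \arg\max_{e} P(e \mid f)
        = \arg\max_{e} \frac{P(f \mid e)\, P(e)}{P(f)}
        = \arg\max_{e} P(f \mid e)\, P(e)
```

The denominator P(f) can be dropped because it does not depend on e; decoding is the search for the e that maximises the remaining product.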

Introduction to Neural Networks
What are neural networks? Architectures: feed forward and recurrent networks. Training neural networks: back-propagation and gradient descent.
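
To make the training step concrete, here is a minimal sketch of gradient descent on a single linear unit with a mean-squared-error loss. The function name, data, learning rate and step count are all illustrative assumptions; a real network would have many units and would use back-propagation to obtain the gradients layer by layer.

```python
def train_linear_unit(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = w * x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(xs, ys):
            error = (w * x + b) - y
            # Derivatives of the mean of error**2 with respect to w and b.
            grad_w += 2.0 * error * x / n
            grad_b += 2.0 * error / n
        # Step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data generated from y = 2x + 1.
w, b = train_linear_unit([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(round(w, 3), round(b, 3))
```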

Neural Language Models
Word representations: why are they needed? Different types: one-hot, static, contextual, external vs internal representations. Feed-forward neural language models. Recurrent neural language models.
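
One way to see why representations beyond one-hot vectors are needed: every pair of distinct one-hot vectors has cosine similarity 0, so they carry no notion of word similarity, whereas dense vectors can. In the sketch below the vocabulary and the two-dimensional "embedding" values are hand-picked toy assumptions, not learned from data.

```python
import math


def one_hot(word, vocab):
    """Vector with a 1.0 at the word's vocabulary position and 0.0 elsewhere."""
    return [1.0 if w == word else 0.0 for w in vocab]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


vocab = ["cat", "dog", "car"]

# Hypothetical dense vectors, chosen by hand so that "cat" and "dog" are close.
toy_embeddings = {"cat": [0.9, 0.1], "dog": [0.8, 0.2], "car": [0.1, 0.9]}

print(cosine(one_hot("cat", vocab), one_hot("dog", vocab)))
print(cosine(toy_embeddings["cat"], toy_embeddings["dog"]))
print(cosine(toy_embeddings["cat"], toy_embeddings["car"]))
```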

Neural Machine Translation
Encoder-decoder architecture and sequence-to-sequence modelling. Decoding for NMT. Recurrent neural networks for MT. Recurrent neural MT with attention. Transformer neural networks for MT.
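
The attention step used in attention-based recurrent and transformer models can be sketched as follows: a decoder query is scored against each encoder key, the scores are normalised with softmax, and the result weights a sum over the encoder values. This is a simplified, unscaled dot-product variant with made-up vectors; the function names are assumptions, and transformer attention additionally scales the scores and uses multiple heads.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot_product_attention(query, keys, values):
    """Return (attention weights, context vector) for one query."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    # Context vector: weighted sum of the value vectors.
    context = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return weights, context


# One decoder query attending over two encoder positions (toy numbers).
weights, context = dot_product_attention(
    query=[1.0, 0.0],
    keys=[[1.0, 0.0], [0.0, 1.0]],
    values=[[1.0, 2.0], [3.0, 4.0]],
)
print(weights, context)
```

The query matches the first key more closely, so the first encoder position receives the larger weight and dominates the context vector.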

Assessment Breakdown
Continuous Assessment 30%
Examination Weight 70%
Course Work Breakdown
Type  Description  % of total  Assessment Date
Assignment  Students undertake a group project of their choosing, which involves training a machine translation system using Joey NMT, an open-source toolkit developed for educational purposes.  10%  Once per semester
Assignment  Students take on a significant individual project which involves calculations related to 1) automatic evaluation methods, 2) language model probabilities, 3) translation model probabilities, and 4) neural networks.  20%  Once per semester
Reassessment Requirement Type
Resit arrangements are explained by the following categories:
Resit category 1: A resit is available for both* components of the module.
Resit category 2: No resit is available for a 100% continuous assessment module.
Resit category 3: No resit is available for the continuous assessment component where there is a continuous assessment and examination element.
* ‘Both’ is used in the context of the module having a Continuous Assessment/Examination split; where the module is 100% continuous assessment, there will also be a resit of the assessment.
This module is category 1.
Indicative Reading List

  • Philipp Koehn, Statistical Machine Translation, ISBN 0521874157
  • Philipp Koehn, Neural Machine Translation, ISBN 9781108608480
Other Resources

None
