
Module Specifications

Current Academic Year 2023 - 2024

Please note that this information is subject to change.

Module Title Deep Learning for Natural Language Processing
Module Code CA6011
School School of Computing
Module Co-ordinator Semester 1: Anya Belz
Semester 2: Anya Belz
Autumn: Anya Belz
Module Teachers Brian Davis, Anya Belz
NFQ Level 9
Credit Rating 7.5
Pre-requisite None
Co-requisite None
Compatibles None
Incompatibles None
Repeat examination
Description

Neural natural language processing (NLP) underpins some of the most important technologies of the information age. It is found in tools for web search, advertising, email, customer service, translation, and virtual agents, among many other applications. Most recently, large language models (LLMs) like the ones powering ChatGPT have been shown to have surprisingly varied knowledge and abilities far beyond the tasks they were trained for, and this has opened up new and potentially very important application possibilities for NLP. This module will introduce students to the neural network architectures that power modern NLP, including LLMs like GPT. Students will learn how such networks function and will be given the opportunity to train NLP systems using popular open-source neural NLP toolkits and libraries.

The module will progress through three main learning blocks. The first block will impart a theoretical understanding of the principal neural network architectures used for NLP, including feed-forward, recurrent and transformer network architectures, graph-based neural networks, and large-scale pretrained language models. Students will be introduced to the mathematical foundations of the relevant machine learning models and their associated optimisation algorithms.

In the second learning block, students will gain practical understanding and skills in solving a number of NLP tasks by applying end-to-end neural architectures, fine-tuning existing neural language models on specific problems, and other approaches, covering a range of applications including analysing latent dimensions in text, transcribing speech to text, translating between languages, and answering questions. Students will learn about the challenges, risks, and opportunities arising from the application of deep learning techniques to such tasks.

The third learning block will cover recent applications of neural networks, including LLMs, to multimodal and multilingual tasks that were largely infeasible before the emergence of modern neural network architectures.

Learning Outcomes

1. Reflect on and assess the theoretical underpinnings and practical applications of a range of different neural models used to solve NLP tasks, and how to select and apply optimisation algorithms for them.
2. Design, test and implement neural attention mechanisms and sequence embedding models, and combine these modular components to build state-of-the-art NLP systems.
3. Critically assess the range of commonly used toolkits, libraries, reusable trained models and datasets available in neural NLP, understand their possible uses, and assess their limitations.
4. Critically assess and choose appropriate neural architectures for different NLP tasks, taking into account computational requirements, and adapting techniques from different subfields, languages and domains.
5. Design, test and implement common neural network models for NLP tasks, including those first introduced in the Foundations of NLP module (CA6010).
6. Critically assess and apply in practice reusable word and higher-level representations in neural NLP, including the difference between non-contextualised word vectors (word2vec, GloVe, etc.) and contextualised word vectors (ELMo, BERT, etc.), and the methods used to produce them.
7. Reflect on the challenges posed by pre-trained neural language models, including issues of bias and factual correctness in generated text.
8. Reflect on and apply in practice knowledge about the possibilities opened up by modern neural architectures in enabling learning across languages and modalities.
9. Reflect on and apply in practice learning relating to working and communicating effectively in a team to design and implement solutions for new domains or unfamiliar contexts, justifying the proposed design and development strategy.



Workload: Full-time hours per semester
Type                    Hours   Description
Lecture                 24      Twice-weekly lecture
Laboratory              24      2-hour lab once a week
Assignment Completion   80      Project work
Independent Study       59.5    Work carried out by students outside lectures (reading background material, finishing lab assignments)
Total Workload:         187.5

All module information is indicative and subject to change. For further information, students are advised to refer to the University's Marks and Standards and Programme Specific Regulations at: http://www.dcu.ie/registry/examinations/index.shtml

Indicative Content and Learning Activities

I. Neural Network Architectures for Natural Language Processing
In the first learning block, students will gain a theoretical understanding of the principal neural network architectures used for NLP, including feed-forward, recurrent, encoder-decoder and transformer network architectures, and large-scale pretrained language models. Students will be introduced to the mathematical definitions of the relevant machine learning models and their associated optimisation algorithms.
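
As an illustration of the level at which these architectures are treated, the sketch below implements the scaled dot-product attention at the heart of transformer models. It is a minimal example written in PyTorch; the choice of framework and the toy dimensions are assumptions for illustration, not prescribed course materials.

    import torch
    import torch.nn.functional as F

    def scaled_dot_product_attention(query, key, value, mask=None):
        # Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V
        d_k = query.size(-1)
        scores = torch.matmul(query, key.transpose(-2, -1)) / d_k ** 0.5
        if mask is not None:
            # Positions where mask == 0 receive zero attention weight.
            scores = scores.masked_fill(mask == 0, float("-inf"))
        weights = F.softmax(scores, dim=-1)   # attention distribution over keys
        return torch.matmul(weights, value), weights

    # Toy self-attention: batch of 1, sequence of 4 tokens, model dimension 8.
    x = torch.randn(1, 4, 8)
    output, weights = scaled_dot_product_attention(x, x, x)
    print(output.shape, weights.shape)        # (1, 4, 8) and (1, 4, 4)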

II. Applying Neural NLP Methods
In the second learning block, students will explore selected neural NLP methods in depth, including applying them in practical exercises. Topics covered will include learning neural word vectors, fine-tuning language models for NLP tasks and Neural Language Generation, as below.

Learning neural word vectors
Building on concepts introduced in CA6010 Foundations of NLP, students will learn about neural word vectors and the differences between static, non-contextualised word vectors (word2vec, GloVe, etc.) and contextualised word vectors (ELMo, BERT, etc.), and the methods used to produce them.
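
The contrast between static and contextualised vectors can be made concrete in a few lines of code. The sketch below uses the Hugging Face transformers library and the public bert-base-uncased checkpoint (both illustrative assumptions, not prescribed course materials): the same surface word receives different contextualised vectors in different sentences, whereas a static word2vec or GloVe vector would be identical in both.

    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModel.from_pretrained("bert-base-uncased")

    def contextual_vector(sentence, word):
        # Return the vector BERT assigns to `word` as it occurs in `sentence`.
        inputs = tokenizer(sentence, return_tensors="pt")
        with torch.no_grad():
            hidden = model(**inputs).last_hidden_state[0]     # (seq_len, 768)
        tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
        return hidden[tokens.index(word)]

    # "bank" gets a different contextualised vector in each sentence.
    v_river = contextual_vector("she sat on the bank of the river", "bank")
    v_money = contextual_vector("she paid the money into the bank", "bank")
    print(torch.cosine_similarity(v_river, v_money, dim=0).item())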

Fine-tuning language models for NLP tasks
Students will learn about one of the most common approaches in modern NLP, namely building NLP systems by fine-tuning large-scale language models (BERT, GPT, T5, XLNet), and how this can be used to build solutions for a range of different NLP tasks.
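
A minimal sketch of this workflow, using the Hugging Face transformers and datasets libraries with BERT and the IMDB sentiment dataset; the task, checkpoint and hyperparameters are assumptions chosen purely for illustration.

    from datasets import load_dataset
    from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                              Trainer, TrainingArguments)

    # Illustrative downstream task: binary sentiment classification.
    dataset = load_dataset("imdb")
    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

    def tokenize(batch):
        return tokenizer(batch["text"], truncation=True,
                         padding="max_length", max_length=256)

    dataset = dataset.map(tokenize, batched=True)

    # Pretrained encoder plus a freshly initialised classification head.
    model = AutoModelForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=2)

    args = TrainingArguments(output_dir="bert-imdb-demo",
                             num_train_epochs=1,
                             per_device_train_batch_size=16)

    # Small subsets keep the demonstration quick to run.
    trainer = Trainer(model=model, args=args,
                      train_dataset=dataset["train"].shuffle(seed=42).select(range(2000)),
                      eval_dataset=dataset["test"].shuffle(seed=42).select(range(500)))
    trainer.train()
    print(trainer.evaluate())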

Neural Language Generation
Basic concepts in NLG will be recapped, including task construals, applications and history, before introducing the main current neural techniques, architectures and resources in use in NLG. The power and shortcomings of large language models will be examined, and the issues of controllability, bias and transparency will be discussed.
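
A minimal sketch of open-ended neural generation, using the Hugging Face text-generation pipeline with the openly available GPT-2 model as a small stand-in for the much larger LLMs discussed in lectures; the model and prompt are illustrative assumptions.

    from transformers import pipeline

    generator = pipeline("text-generation", model="gpt2")

    prompt = "Neural language generation systems can"
    for sample in generator(prompt, max_new_tokens=40,
                            do_sample=True, num_return_sequences=3):
        print(sample["generated_text"])

    # Sampled continuations are typically fluent but can be factually wrong or
    # biased, which is why controllability, bias and transparency are discussed.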

III. Neural Methods for Multilingual and Multimodal NLP
In the third learning block, students will be introduced to multilingual systems which can work for several languages, including those for which training data is severely limited. Techniques explored will include transfer learning and multilingual language models (XLM-R, mBERT). Students will also explore developing foundational systems which use neural network architectures to integrate multiple modalities, including text, speech, image and video.
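
As an illustration of cross-lingual transfer, the sketch below uses the Hugging Face zero-shot-classification pipeline with a publicly shared XLM-R checkpoint fine-tuned for natural language inference (joeddav/xlm-roberta-large-xnli); the checkpoint and example text are illustrative assumptions. Because the encoder is multilingual, the classifier can label German text against English candidate labels.

    from transformers import pipeline

    classifier = pipeline("zero-shot-classification",
                          model="joeddav/xlm-roberta-large-xnli")

    # German input text, English candidate labels.
    result = classifier("Das neue Smartphone hat eine hervorragende Kamera.",
                        candidate_labels=["technology", "politics", "sport"])
    print(result["labels"][0], round(result["scores"][0], 3))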

Assessment Breakdown
Continuous Assessment 70%    Examination Weight 30%
Course Work Breakdown
Type: Laboratory Portfolio
Description: During the laboratory sessions, students will explore and learn how to implement neural systems for a selection of NLP problems.
% of total: 20%
Assessment Date: Every second week

Type: Group Project
Description: Working in groups, students will address a real research problem in the field involving an NLP task using neural architectures, e.g. sentiment analysis on Twitter, fake news detection, question answering, cross-lingual text summarisation, caption generation, etc. The assessment is split into 20% for the group project work and 10% for a short individual critical reflective scientific report on the student's contribution to the project (challenges as a learner, technical and scientific, and the solutions that worked for them).
% of total: 30%
Assessment Date: Once per semester
Reassessment Requirement Type
Resit arrangements are explained by the following categories:
1 = A resit is available for all components of the module
2 = No resit is available for 100% continuous assessment module
3 = No resit is available for the continuous assessment component
This module is category 1
Indicative Reading List

  • Ian Goodfellow, Yoshua Bengio, Aaron Courville: 2016, Deep Learning, MIT Press, ISBN 0262035618
  • Yoav Goldberg: 2017, Neural Network Methods in Natural Language Processing (Synthesis Lectures on Human Language Technologies), Morgan & Claypool Publishers, ISBN 1627052984
  • Shashi Narayan, Claire Gardent: 2020, Deep Learning Approaches to Text Production (Synthesis Lectures on Human Language Technologies), Morgan & Claypool Publishers, ISBN 9781681737607
Other Resources

  • Blog: Andrej Karpathy, 2015, The Unreasonable Effectiveness of Recurrent Neural Networks, http://karpathy.github.io/2015/05/21/rnn-effectiveness
  • Technical Report: Bommasani, R., Hudson, D.A., Adeli, E., Altman, R., Arora, S., von Arx, S., Bernstein, M.S., Bohg, J., Bosselut, A., Brunskill, E., Brynjolfsson, E., et al., 2021, On the Opportunities and Risks of Foundation Models, Center for Research on Foundation Models (CRFM), https://crfm.stanford.edu/report.html
  • Open-source neural NLP technologies: Hugging Face, https://huggingface.co/
Programme or List of Programmes
MCM    M.Sc. in Computing