Module Specifications.
Current Academic Year 2024 - 2025
All module information is indicative, and this portal is an interim interface pending the full upgrade of Coursebuilder and subsequent integration with the new DCU Student Information System (DCU Key).
As such, this is a point-in-time view of data which will be refreshed periodically. Some fields/data may not yet be available pending the completion of the full Coursebuilder upgrade and integration project. We will post status updates as they become available. Thank you for your patience and understanding.
Date posted: September 2024
Description

Neural natural language processing (NLP) underpins some of the most important technologies of the information age. It is found in tools for web search, advertising, email, customer service, translation, and virtual agents, among many other applications. Most recently, large language models (LLMs) like the ones powering ChatGPT have been shown to have surprisingly varied knowledge and abilities far beyond the tasks they were trained for, and this has opened new and potentially very important application possibilities for NLP. This module will introduce students to the neural network architectures that power modern NLP, including LLMs like GPT. Students will learn how such networks function and will be given the opportunity to train NLP systems using popular open-source neural NLP toolkits and libraries.

The module will progress through three main learning blocks. The first block will impart theoretical understanding of the principal neural network architectures used for NLP, including feed-forward, recurrent and transformer network architectures, graph-based neural networks, and large-scale pretrained language models. Students will be introduced to the mathematical foundations of the relevant machine learning models and their associated optimisation algorithms. In the second learning block, students will gain practical understanding and skills in solving a number of NLP tasks by applying end-to-end neural architectures, fine-tuning existing neural language models on specific problems, and other approaches, covering a range of applications including analysing latent dimensions in text, transcribing speech to text, translating between languages, and answering questions. Students will learn about the challenges, risks, and opportunities arising from the application of deep learning techniques to such tasks. The third learning block will cover recent applications of neural networks, including LLMs, to multimodal and multilingual tasks that were largely infeasible before the emergence of modern neural network architectures.
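To make the toolkit-based work described above concrete, the following is a minimal illustrative sketch in Python using the Hugging Face transformers library listed under Other Resources; the specific tasks (sentiment analysis, English-to-French translation) and the t5-small checkpoint are assumptions chosen for illustration, not prescribed module content.

# Illustrative only: applies pretrained models via the Hugging Face pipeline API.
# Task names and the "t5-small" checkpoint are demonstration choices, not module material.
from transformers import pipeline

# Sentiment analysis with a default pretrained checkpoint (downloaded on first use).
classifier = pipeline("sentiment-analysis")
print(classifier("Neural NLP underpins web search, translation and virtual agents."))

# Machine translation, one of the applications mentioned in the description.
translator = pipeline("translation_en_to_fr", model="t5-small")
print(translator("Large language models have surprisingly varied abilities."))

Pipelines of this kind rely on pretrained checkpoints downloaded at first use; the module's practical work goes further, training and fine-tuning such models directly.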
Learning Outcomes

1. Reflect on and assess the theoretical underpinnings and practical applications of a range of different neural models used to solve NLP tasks, and how to select and apply optimisation algorithms for them.
2. Design, test and implement neural attention mechanisms and sequence embedding models, and combine these modular components to build state-of-the-art NLP systems.
3. Critically assess the range of commonly used toolkits, libraries, reusable trained models and datasets available in neural NLP, understand their possible uses, and assess their limitations.
4. Critically assess and choose appropriate neural architectures for different NLP tasks, taking into account computational requirements, and adapting techniques from different subfields, languages and domains.
5. Design, test and implement common neural network models for NLP tasks, including those first introduced in the Foundations of NLP module (CA6010).
6. Critically assess and apply in practice reusable word and higher-level representations in neural NLP, including the difference between non-contextualised word vectors (word2vec, GloVe, etc.) and contextualised word vectors (ELMo, BERT, etc.), and the methods used to produce them.
7. Reflect on the challenges posed by pre-trained neural language models, including issues of bias and factual correctness in generated text.
8. Reflect on and apply in practice knowledge about the possibilities opened up by modern neural architectures in enabling learning across languages and modalities.
9. Reflect on and apply in practice learning relating to working and communicating effectively in a team to design and implement solutions for new domains or unfamiliar contexts, justifying the proposed design and development strategy.
All module information is indicative and subject to change. For further information, students are advised to refer to the University's Marks and Standards and Programme Specific Regulations at: http://www.dcu.ie/registry/examinations/index.shtml
Indicative Content and Learning Activities
I. Neural Network Architectures for Natural Language Processing

In the first learning block, students will gain a theoretical understanding of the principal neural network architectures used for NLP, including feed-forward, recurrent, encoder-decoder and transformer network architectures, and large-scale pretrained language models. Students will be introduced to the mathematical definitions of the relevant machine learning models and their associated optimisation algorithms.

II. Applying Neural NLP Methods

In the second learning block, students will explore selected neural NLP methods in depth, including applying them in practical exercises. Topics covered will include learning neural word vectors, fine-tuning language models for NLP tasks, and neural language generation, as below.

Learning neural word vectors
Building on concepts introduced in CA2010 Foundations of NLP, students will learn about neural word vectors and the differences between static, non-contextualised word vectors (word2vec, GloVe, etc.) and contextualised word vectors (ELMo, BERT, etc.), and the methods used to produce them.

Fine-tuning language models for NLP tasks
Students will learn about one of the most common approaches in modern NLP, namely building NLP systems by fine-tuning large-scale language models (BERT, GPT, T5, XLNet), and how this can be used to build solutions for a range of different NLP tasks (an illustrative fine-tuning sketch is given after this section).

Neural Language Generation
Basic concepts in NLG will be recapped, including task construals, applications and history, before introducing the main current neural techniques, architectures and resources in use in NLG. The power and shortcomings of large language models will be examined, and the issues of controllability, bias and transparency will be discussed.

III. Neural Methods for Multilingual and Multimodal NLP

In the third learning block, students will be introduced to multilingual systems which can work for several languages, including those for which training data is severely limited. Techniques explored will include transfer learning and multilingual language models (XLM-R, mBERT). Students will also explore developing foundational systems which use neural network architectures to integrate multiple modalities including text, speech, image and video.
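As a concrete companion to the fine-tuning topic above, the following is a minimal sketch assuming the Hugging Face transformers and datasets libraries; the checkpoint (bert-base-uncased), the dataset (imdb) and the hyperparameters are illustrative assumptions rather than prescribed module material.

# Minimal fine-tuning sketch (illustrative assumptions throughout): adapts a
# pretrained BERT encoder to a binary sentiment classification task.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

checkpoint = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

# Tokenise a small slice of a public sentiment dataset for demonstration purposes.
dataset = load_dataset("imdb", split="train[:2000]").train_test_split(test_size=0.1)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=256)

dataset = dataset.map(tokenize, batched=True)

# Fine-tune the pretrained encoder end-to-end on the downstream task.
args = TrainingArguments(output_dir="finetune-demo", num_train_epochs=1,
                         per_device_train_batch_size=8)
trainer = Trainer(model=model, args=args,
                  train_dataset=dataset["train"], eval_dataset=dataset["test"])
trainer.train()

The same pattern, a pretrained checkpoint plus a task-specific head trained on a labelled downstream dataset, carries over to the other task types mentioned above, with the model class and dataset swapped accordingly.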
Indicative Reading List
Other Resources

Blog: Andrej Karpathy (2015), The Unreasonable Effectiveness of Recurrent Neural Networks, http://karpathy.github.io/2015/05/21/rnn-effectiveness/
Technical Report: Bommasani, R., Hudson, D.A., Adeli, E., Altman, R., Arora, S., von Arx, S., Bernstein, M.S., Bohg, J., Bosselut, A., Brunskill, E. and Brynjolfsson, E. (2021), On the Opportunities and Risks of Foundation Models, Center for Research on Foundation Models (CRFM), https://crfm.stanford.edu/report.html
Open-source neural NLP technologies: Hugging Face, https://huggingface.co/