
Latest Module Specifications

Current Academic Year 2025 - 2026


Module Title
Module Code (ITS): CA6011
Faculty / School
NFQ Level / Credit Rating

Description

Neural natural language processing (NLP) underpins some of the most important technologies of the information age. It is found in tools for web search, advertising, email, customer service, translation, and virtual agents, among many other applications. Most recently, large language models (LLMs) like the ones powering ChatGPT have been shown to have surprisingly varied knowledge and abilities far beyond the tasks they were trained for, and this has opened new and potentially very important application possibilities for NLP.

This module will introduce students to the neural network architectures that power modern NLP, including LLMs like GPT. Students will learn how such networks function and will be given the opportunity to train NLP systems using popular open-source neural NLP toolkits and libraries.

The module will progress through three main learning blocks. The first block will impart theoretical understanding of the principal neural network architectures used for NLP, including feed-forward, recurrent and transformer network architectures, graph-based neural networks, and large-scale pretrained language models. Students will be introduced to the mathematical foundations of the relevant machine learning models and their associated optimisation algorithms. In the second learning block, students will gain practical understanding and skills in solving a number of NLP tasks by applying end-to-end neural architectures, fine-tuning existing neural language models on specific problems, and other approaches, covering a range of applications including analysing latent dimensions in text, transcribing speech to text, translating between languages, and answering questions. Students will learn about challenges, risks, and opportunities arising from the application of deep learning techniques to such tasks. The third learning block will cover recent applications of neural networks, including LLMs, to multimodal and multilingual tasks that were largely infeasible before the emergence of modern neural network architectures.
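As a first flavour of the open-source toolkits referred to above, the sketch below runs an off-the-shelf pretrained model through the Hugging Face transformers pipeline API (the library appears under Other Resources; the task, default model and example sentence are illustrative assumptions, not part of the module specification):

    # Minimal sketch: applying a pretrained neural NLP model via the Hugging Face
    # `transformers` pipeline API (assumes `pip install transformers`).
    from transformers import pipeline

    # Loads a default pretrained sentiment-analysis model on first use.
    classifier = pipeline("sentiment-analysis")

    # The model returns a predicted label and a confidence score.
    print(classifier("Neural NLP underpins modern search and translation tools."))
    # e.g. [{'label': 'POSITIVE', 'score': 0.99}]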

Learning Outcomes

1. Reflect on and assess the theoretical underpinnings and practical applications of a range of different neural models used to solve NLP tasks, and how to select and apply optimisation algorithms for them.
2. Design, test and implement neural attention mechanisms and sequence embedding models, and combine these modular components to build state-of-the-art NLP systems.
3. Critically assess the range of commonly used toolkits, libraries, reusable trained models and datasets available in neural NLP, understand their possible uses, and assess their limitations.
4. Critically assess and choose appropriate neural architectures for different NLP tasks, taking into account computational requirements, and adapting techniques from different subfields, languages and domains.
5. Design, test and implement common neural network models for NLP tasks, including those first introduced in the Foundations of NLP module (CA6010).
6. Critically assess and apply in practice reusable word and higher-level representations in neural NLP, including the difference between non-contextualised word vectors (word2vec, GloVe, etc.) and contextualised word vectors (ELMo, BERT, etc.), and the methods used to produce them.
7. Reflect on the challenges posed by pre-trained neural language models, including issues of bias and factual correctness in generated text.
8. Reflect on and apply in practice knowledge about the possibilities opened up by modern neural architectures in enabling learning across languages and modalities.
9. Reflect on and apply in practice learning relating to working and communicating effectively in a team to design and implement solutions for new domains or unfamiliar contexts, justifying the proposed design and development strategy.


Workload: Full-time hours per semester

Type | Hours | Description
Lecture | 24 | Twice-weekly lecture
Laboratory | 24 | 2-hour lab once a week
Assignment Completion | 80 | Project work
Independent Study | 59.5 | Work carried out by students outside lectures (reading background material, finishing lab assignments)

Total Workload: 187.5 hours
Assessment Breakdown

Type | Description | % of Total | Assessment Date
Laboratory Portfolio | During the first 7 laboratory sessions, students will explore and learn how to implement neural systems for a selection of NLP problems. | 40% | As required
Group Project | Working in groups, students will address a real research problem in the field involving an NLP task using neural architectures, e.g. sentiment analysis on Twitter, fake news detection, question answering, cross-lingual text summarisation, caption generation, etc. Assessment will be split into 20% for the group project work and 10% for a short individual critical reflective scientific report on the student's contribution to the project (challenges as a learner, technical and scientific, and the solutions that worked for them). | 30% | Once per semester
Formal Examination | End-of-semester final examination | 30% | End of semester
Reassessment Requirement Type
Resit arrangements are explained by the following categories:
RC1: A resit is available for both* components of the module.
RC2: No resit is available for a 100% coursework module.
RC3: No resit is available for the coursework component where there is a coursework and summative examination element.

* ‘Both’ is used in the context of the module having a coursework/summative examination split; where the module is 100% coursework, there will also be a resit of the assessment

Pre-requisite: None
Co-requisite: None
Compatibles: None
Incompatibles: None

All module information is indicative and subject to change. For further information, students are advised to refer to the University's Marks and Standards and Programme Specific Regulations at: http://www.dcu.ie/registry/examinations/index.shtml

Indicative Content and Learning Activities

I. Neural Network Architectures for Natural Language Processing
In the first learning block, students will gain a theoretical understanding of the principal neural network architectures used for NLP, including feed-forward, recurrent, encoder-decoder and transformer network architectures, and large-scale pretrained language models. Students will be introduced to the mathematical definitions of the relevant machine learning models and their associated optimisation algorithms.
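To make the first block concrete, the sketch below implements scaled dot-product attention, the core operation inside the transformer architecture, in plain NumPy; the array shapes and random inputs are illustrative assumptions rather than module-prescribed material:

    # Scaled dot-product attention: Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    import numpy as np

    def scaled_dot_product_attention(Q, K, V):
        d_k = Q.shape[-1]
        scores = Q @ K.swapaxes(-1, -2) / np.sqrt(d_k)            # query-key similarities
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights = weights / weights.sum(axis=-1, keepdims=True)   # softmax over the keys
        return weights @ V                                        # weighted sum of values

    # Toy example: 4 tokens with 8-dimensional queries, keys and values.
    rng = np.random.default_rng(0)
    Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))
    print(scaled_dot_product_attention(Q, K, V).shape)            # (4, 8)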

II. Applying Neural NLP Methods
In the second learning block, students will explore selected neural NLP methods in depth, including applying them in practical exercises. Topics covered will include learning neural word vectors, fine-tuning language models for NLP tasks and Neural Language Generation, as below.

Learning neural word vectors
Building on concepts introduced in CA6010 Foundations of NLP, students will learn about neural word vectors and the differences between static, non-contextualised word vectors (word2vec, GloVe, etc.) and contextualised word vectors (ELMo, BERT, etc.), and the methods used to produce them.
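A hedged sketch of this distinction, assuming the Hugging Face transformers library and PyTorch (the model name, sentences and the choice of the word "bank" are illustrative): a static embedding such as word2vec or GloVe assigns "bank" one vector regardless of context, whereas BERT produces a different vector for each occurrence.

    # Contextualised vs static word vectors: BERT gives "bank" a different vector
    # in each sentence; word2vec/GloVe would give it a single shared vector.
    import torch
    from transformers import AutoTokenizer, AutoModel

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModel.from_pretrained("bert-base-uncased")

    def bank_vector(sentence):
        """Return BERT's final-layer vector for the token 'bank' in `sentence`."""
        inputs = tokenizer(sentence, return_tensors="pt")
        with torch.no_grad():
            hidden = model(**inputs).last_hidden_state[0]          # (seq_len, 768)
        tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
        return hidden[tokens.index("bank")]

    v_river = bank_vector("She sat on the bank of the river.")
    v_money = bank_vector("He deposited the cash at the bank.")
    print(torch.cosine_similarity(v_river, v_money, dim=0).item())  # well below 1.0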

Fine-tuning language models for NLP tasks
Students will learn about one of the most common approaches in modern NLP, namely building NLP systems by fine-tuning large-scale language models (BERT, GPT, T5, XLNet), and how this can be used to build solutions for a range of different NLP tasks.
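A minimal sketch of this fine-tuning workflow, assuming the Hugging Face transformers and datasets libraries; the encoder (distilbert-base-uncased), the IMDB sentiment dataset and the hyperparameters are illustrative choices, not ones prescribed by the module:

    # Fine-tuning a pretrained language model for binary text classification.
    from datasets import load_dataset
    from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                              Trainer, TrainingArguments)

    model_name = "distilbert-base-uncased"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

    # Tokenise a small sentiment dataset into fixed-length model inputs.
    dataset = load_dataset("imdb")
    encoded = dataset.map(
        lambda batch: tokenizer(batch["text"], truncation=True,
                                padding="max_length", max_length=256),
        batched=True)

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="finetune-out", num_train_epochs=1,
                               per_device_train_batch_size=16),
        train_dataset=encoded["train"].shuffle(seed=0).select(range(2000)),  # subsampled for speed
        eval_dataset=encoded["test"].select(range(1000)),
    )
    trainer.train()
    print(trainer.evaluate())   # reports evaluation loss after one epoch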

Neural Language Generation
Basic concepts in NLG will be recapped, including task construals, applications and history, before introducing the main neural techniques, architectures and resources currently in use in NLG. The power and shortcomings of large language models will be examined, and the issues of controllability, bias and transparency will be discussed.
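One practical handle on controllability is the decoding strategy. The hedged sketch below (assuming the Hugging Face transformers library; GPT-2, the prompt and the sampling values are illustrative) generates text with nucleus (top-p) sampling and a temperature setting:

    # Neural language generation with a pretrained causal language model.
    from transformers import AutoTokenizer, AutoModelForCausalLM

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    inputs = tokenizer("Neural language generation systems can", return_tensors="pt")

    # Nucleus (top-p) sampling with temperature below 1.0: the kind of decoding
    # control whose effect on repetition and degeneration this block discusses.
    outputs = model.generate(**inputs, max_new_tokens=40, do_sample=True,
                             top_p=0.9, temperature=0.8,
                             pad_token_id=tokenizer.eos_token_id)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))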

III. Neural Methods for Multilingual and Multimodal NLP
In the third learning block, students will be introduced to multilingual systems which can work across several languages, including those for which training data is severely limited. Techniques explored will include transfer learning and multilingual language models (XLM-R, mBERT). Students will also explore developing foundational systems which use neural network architectures to integrate multiple modalities, including text, speech, image and video.
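As a hedged illustration of a single multilingual encoder, the sketch below mean-pools XLM-R representations of an English sentence and an Irish translation into sentence vectors and compares them; the model name and sentences are illustrative, and raw, un-fine-tuned embeddings are only roughly aligned across languages:

    # One multilingual encoder (XLM-R) embeds sentences from different languages
    # into a single shared representation space.
    import torch
    from transformers import AutoTokenizer, AutoModel

    tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
    model = AutoModel.from_pretrained("xlm-roberta-base")

    def embed(sentence):
        """Mean-pool the final hidden states into one sentence vector."""
        inputs = tokenizer(sentence, return_tensors="pt")
        with torch.no_grad():
            hidden = model(**inputs).last_hidden_state
        return hidden.mean(dim=1).squeeze(0)

    en = embed("The weather is nice today.")
    ga = embed("Tá an aimsir go deas inniu.")   # Irish translation of the same sentence
    print(torch.cosine_similarity(en, ga, dim=0).item())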

Indicative Reading List

Books:
  • Ian Goodfellow, Yoshua Bengio, Aaron Courville: 2016, Deep Learning, MIT Press, 800 pp., ISBN 0262035618
  • Yoav Goldberg: 2017, Neural Network Methods in Natural Language Processing (Synthesis Lectures on Human Language Technologies), Morgan & Claypool Publishers, ISBN 1627052984
  • Shashi Narayan, Claire Gardent: 2020, Deep Learning Approaches to Text Production (Synthesis Lectures on Human Language Technologies), Morgan & Claypool Publishers, ISBN 9781681737607


Articles:
  • Yoshua Bengio, Réjean Ducharme, Pascal Vincent, Christian Janvin: 2003, A Neural Probabilistic Language Model, Journal of Machine Learning Research, 3, https://dl.acm.org/doi/pdf/10.5555/944919.944966
  • Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, Illia Polosukhin: 2017, Attention Is All You Need, Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS 2017), https://proceedings.neurips.cc/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf
  • Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova: 2019, BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), https://aclanthology.org/N19-1423.pdf
  • Jeffrey Pennington, Richard Socher, Christopher Manning: 2014, GloVe: Global Vectors for Word Representation, Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), https://aclanthology.org/D14-1162
  • Omer Levy, Yoav Goldberg, Ido Dagan: 2015, Improving Distributional Similarity with Lessons Learned from Word Embeddings, Transactions of the Association for Computational Linguistics, 3, https://aclanthology.org/Q15-1016.pdf
  • Tobias Schnabel, Igor Labutov, David Mimno, Thorsten Joachims: 2015, Evaluation Methods for Unsupervised Word Embeddings, Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP), https://aclanthology.org/D15-1036.pdf
  • Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S. Corrado, Jeff Dean: 2013, Distributed Representations of Words and Phrases and their Compositionality, Advances in Neural Information Processing Systems 26 (NIPS 2013)
  • Jeremy Howard, Sebastian Ruder: 2018, Universal Language Model Fine-tuning for Text Classification, Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics
  • Yunhui Guo, Honghui Shi, Abhishek Kumar, Kristen Grauman, Tajana Rosing, Rogerio Feris: 2019, SpotTune: Transfer Learning through Adaptive Fine-Tuning, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), https://openaccess.thecvf.com/content_CVPR_2019/papers/Guo_SpotTune_Transfer_Learning_Through_Adaptive_Fine-Tuning_CVPR_2019_paper.pdf
  • Anna Rogers, Olga Kovaleva, Anna Rumshisky: 2020, A Primer in BERTology: What We Know About How BERT Works, Transactions of the Association for Computational Linguistics, 8, https://aclanthology.org/2020.tacl-1.54.pdf
  • How Fine-Tuning Allows for Effective Meta-Learning: 2021, Advances in Neural Information Processing Systems 34 (NeurIPS 2021), https://proceedings.neurips.cc/paper/2021/file/4a533591763dfa743a13affab1a85793-Paper.pdf
  • Ari Holtzman, Jan Buys, Li Du, Maxwell Forbes, Yejin Choi: 2020, The Curious Case of Neural Text Degeneration, Proceedings of the International Conference on Learning Representations (ICLR), https://pubs.cs.uct.ac.za/id/eprint/1407/1/the_curious_case_of_neural_text_degeneration.pdf
  • Abigail See, Peter J. Liu, Christopher D. Manning: 2017, Get To The Point: Summarization with Pointer-Generator Networks, Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, https://aclanthology.org/P17-1099.pdf
  • Angela Fan, Mike Lewis, Yann Dauphin: 2018, Hierarchical Neural Story Generation, Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, https://aclanthology.org/P18-1082.pdf
  • Chia-Wei Liu, Ryan Lowe, Iulian Serban, Mike Noseworthy, Laurent Charlin, Joelle Pineau: 2016, How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation, Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing (EMNLP), https://aclanthology.org/D16-1230.pdf
  • Emily M. Bender, Timnit Gebru, Angelina McMillan-Major, Shmargaret Shmitchell: 2021, On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?, Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (FAccT)
  • Andrew Ross, Finale Doshi-Velez: 2018, Improving the Adversarial Robustness and Interpretability of Deep Neural Networks by Regularizing Their Input Gradients, Proceedings of the AAAI Conference on Artificial Intelligence
Other Resources

  • Blog: Andrej Karpathy, 2015, The Unreasonable Effectiveness of Recurrent Neural Networks, http://karpathy.github.io/2015/05/21/rnn-effectiveness/
  • Technical Report: Bommasani, R., Hudson, D.A., Adeli, E., Altman, R., Arora, S., von Arx, S., Bernstein, M.S., Bohg, J., Bosselut, A., Brunskill, E. and Brynjolfsson, E., 2021, On the Opportunities and Risks of Foundation Models, Center for Research on Foundation Models (CRFM), https://crfm.stanford.edu/report.html
  • Website: Hugging Face, open-source neural NLP technologies, https://huggingface.co/
