
Latest Module Specifications

Current Academic Year 2025 - 2026

Module Title: Natural Language Technologies
Module Code: CSC1110 (ITS: CA4023)
Faculty: Engineering & Computing
School: Computing
NFQ level: 8
Credit Rating: 7.5

Description

This module provides students with a practical and theoretical grounding in the following core topics in modern Natural Language Processing: language modelling, language analysis, information extraction and language understanding. Students will learn how these problems are tackled using supervised and semi-supervised machine learning, and they will gain hands-on experience developing machine-learning solutions during the laboratory sessions. Popular benchmark datasets will be employed, including ‘noisy’ datasets containing text from sources such as Twitter and Reddit.

Learning Outcomes

1. Describe the applications of Natural Language Processing (NLP) in Data Science
2. Illustrate how neural word embeddings underpin modern NLP systems
3. Develop an English language model
4. Evaluate an English language model
5. Develop an English part-of-speech tagger
6. Evaluate an English part-of-speech tagger
7. Develop a sentiment analysis system for English
8. Evaluate a sentiment analysis system for English
9. Develop an English question answering/reading comprehension system
10. Evaluate an English question answering/reading comprehension system
11. Explain the unsolved problems in NLP research
12. Summarize the ethical issues surrounding modern data-driven NLP


Workload (full-time hours per semester)

Type | Hours | Description
Lecture | 24 | Formal lectures introducing NLP for data science
Laboratory | 12 | Series of laboratories introducing Python-based machine learning techniques for NLP
Assignment Completion | 21.5 | No Description
Independent Study | 130 | No Description
Total Workload: 187.5

Section Breakdown

CRN: 10610
Part of Term: Semester 1
Coursework: 40%
Examination Weight: 60%
Grade Scale: 40PASS
Pass Both Elements: N
Resit Category: RC1
Best Mark: N
Module Co-ordinator: Ellen Rushe
Module Teacher: (none listed)

Section Breakdown

CRN: 12069
Part of Term: Semester 1
Coursework: 40%
Examination Weight: 60%
Grade Scale: 40PASS
Pass Both Elements: N
Resit Category: RC1
Best Mark: N
Module Co-ordinator: Ellen Rushe
Module Teacher: Jennifer Foster

Assessment Breakdown

Type | Description | % of total | Assessment Date
Assignment | Language modelling | 10% | Week 21
Assignment | Part-of-speech tagging | 10% | Week 24
Assignment | Sentiment analysis | 10% | Week 27
Assignment | Question Answering/Machine Reading Comprehension | 10% | Week 30
Formal Examination | 3 hour written exam | 60% | End-of-Semester

Reassessment Requirement Type
Resit arrangements are explained by the following categories:
RC1: A resit is available for both* components of the module.
RC2: No resit is available for a 100% coursework module.
RC3: No resit is available for the coursework component where there is a coursework and summative examination element.

* ‘Both’ is used in the context of the module having a coursework/summative examination split; where the module is 100% coursework, there will also be a resit of the assessment

Pre-requisite: None
Co-requisite: None
Compatibles: None
Incompatibles: None

All module information is indicative and subject to change. For further information, students are advised to refer to the University's Marks and Standards and Programme Specific Regulations at: http://www.dcu.ie/registry/examinations/index.shtml

Indicative Content and Learning Activities

Language Modelling
Next word prediction using n-gram language models, modern word embeddings including word2vec (Mikolov et al. 2013) and contextualised word embeddings such as BERT (Devlin et al. 2019). Generating text using language models.
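
As an illustration of next word prediction with an n-gram language model, the following is a minimal sketch of a bigram model, assuming add-one (Laplace) smoothing and a three-sentence toy corpus. The function names (train_bigram, bigram_prob, predict_next, perplexity) and the corpus are hypothetical and are not taken from the module materials. Perplexity is included as one common way to evaluate a language model.

```python
import math
from collections import Counter, defaultdict

def train_bigram(sentences):
    """Count unigrams and bigrams over tokenised sentences, adding boundary markers."""
    unigrams = Counter()
    bigrams = defaultdict(Counter)
    for tokens in sentences:
        padded = ["<s>"] + list(tokens) + ["</s>"]
        unigrams.update(padded)
        for prev, curr in zip(padded, padded[1:]):
            bigrams[prev][curr] += 1
    return unigrams, bigrams

def bigram_prob(prev, curr, unigrams, bigrams):
    """Add-one smoothed estimate of P(curr | prev)."""
    vocab_size = len(unigrams)
    context = bigrams.get(prev, Counter())
    return (context[curr] + 1) / (sum(context.values()) + vocab_size)

def predict_next(prev, bigrams):
    """Return the word seen most often after `prev`, or None if `prev` was never a context."""
    followers = bigrams.get(prev)
    return followers.most_common(1)[0][0] if followers else None

def perplexity(tokens, unigrams, bigrams):
    """Perplexity of one sentence: exp of the negative mean log-probability per predicted token."""
    padded = ["<s>"] + list(tokens) + ["</s>"]
    pairs = list(zip(padded, padded[1:]))
    log_prob = sum(math.log(bigram_prob(p, c, unigrams, bigrams)) for p, c in pairs)
    return math.exp(-log_prob / len(pairs))

# Toy corpus (assumption: already tokenised and lower-cased).
corpus = [["the", "cat", "sat"], ["the", "dog", "sat"], ["the", "cat", "ran"]]
unigrams, bigrams = train_bigram(corpus)

print(predict_next("the", bigrams))
print(perplexity(["the", "cat", "sat"], unigrams, bigrams))
```

A sentence whose word order matches the training data should receive a lower perplexity than the same words in an unseen order, which gives a quick sanity check on the model.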

Language analysis and data extraction
Part-of-speech tagging, dependency parsing, named-entity recognition, semantic parsing. Sequence labelling and structured prediction using recurrent neural nets (LSTMs) and transformer networks.
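
To make the part-of-speech tagging task concrete, here is a minimal sketch of a most-frequent-tag baseline together with token-level accuracy. This is a deliberately simple stand-in and not the LSTM or transformer approach listed above; a baseline of this kind is often used as a reference point when evaluating a neural tagger. The tag names, the training sentences and the function names are assumptions made for illustration.

```python
from collections import Counter, defaultdict

def train_baseline_tagger(tagged_sentences):
    """Learn each word's most frequent tag, plus the overall most frequent tag as a fallback."""
    tag_counts = defaultdict(Counter)
    all_tags = Counter()
    for sentence in tagged_sentences:
        for word, tag in sentence:
            tag_counts[word.lower()][tag] += 1
            all_tags[tag] += 1
    default_tag = all_tags.most_common(1)[0][0]
    word_to_tag = {w: c.most_common(1)[0][0] for w, c in tag_counts.items()}
    return word_to_tag, default_tag

def tag_tokens(tokens, word_to_tag, default_tag):
    """Tag each token independently; unknown words receive the fallback tag."""
    return [(t, word_to_tag.get(t.lower(), default_tag)) for t in tokens]

def accuracy(gold, predicted):
    """Fraction of tokens whose predicted tag matches the gold tag."""
    correct = sum(g[1] == p[1] for g, p in zip(gold, predicted))
    return correct / len(gold)

# Hypothetical training data with a small illustrative tag set.
train = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("chases", "VERB"), ("birds", "NOUN")],
]
word_to_tag, default_tag = train_baseline_tagger(train)

gold = [("The", "DET"), ("cat", "NOUN"), ("barks", "VERB"), ("loudly", "ADV")]
predicted = tag_tokens([w for w, _ in gold], word_to_tag, default_tag)
print(predicted)
print(accuracy(gold, predicted))
```

The unknown word "loudly" falls back to the most frequent training tag, which shows why context-aware sequence models are needed for unseen or ambiguous words.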

Language understanding
Sentiment analysis, automatic reading comprehension and question answering. Using information retrieval techniques in question answering. Using language analysis and extraction tools (see above) to improve baseline models for language understanding applications.
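
One way information retrieval can support question answering is by selecting the passage most likely to contain the answer before any reading comprehension model is applied. The sketch below assumes TF-IDF weighting with cosine similarity, implemented with the standard library only. The smoothed IDF formula is one common variant, and the passages, question and function names are hypothetical.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lower-case the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_tfidf(passages):
    """Return IDF weights and one sparse TF-IDF vector (a dict) per passage."""
    docs = [Counter(tokenize(p)) for p in passages]
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(doc.keys())
    # Smoothed IDF (assumption: one common variant among several).
    idf = {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in doc.items()} for doc in docs]
    return idf, vectors

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def retrieve(question, passages, idf, vectors):
    """Return the passage most similar to the question, with its score."""
    counts = Counter(tokenize(question))
    # Question terms that never occur in the passages are ignored.
    q_vec = {t: tf * idf[t] for t, tf in counts.items() if t in idf}
    scores = [cosine(q_vec, v) for v in vectors]
    best = max(range(len(passages)), key=scores.__getitem__)
    return passages[best], scores[best]

passages = [
    "Dublin is the capital of Ireland.",
    "Word embeddings map words to dense vectors.",
    "A part-of-speech tagger assigns a grammatical category to each token.",
]
idf, vectors = build_tfidf(passages)
best_passage, best_score = retrieve("What is the capital of Ireland?", passages, idf, vectors)
print(best_passage)
```

In a fuller pipeline the retrieved passage would then be passed to an answer extraction step; this sketch stops at retrieval.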

Indicative Reading List

Books:
  • Jurafsky and Martin, Speech and Language Processing, Prentice Hall
  • Manning and Schütze: 1999, Foundations of Statistical Natural Language Processing, MIT Press
  • Goldberg, Neural Network Methods for Natural Language Processing, Morgan and Claypool


Articles:
  • Mikolov, Chen, Corrado and Dean: 2013, Efficient Estimation of Word Representations in Vector Space, https://arxiv.org/pdf/1301.3781.pdf
  • Devlin, Chang, Lee and Toutanova: 2019, BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, https://www.aclweb.org/anthology/N19-1423.pdf

Other Resources

None
