
Module Specifications

Current Academic Year 2023 - 2024

Please note that this information is subject to change.

Module Title Tools & Tech for Large Scale Data Analyt-NUIG
Module Code CA6008I
School School of Computing
Module Co-ordinator Semester 1: Annalina Caputo
Semester 2: Annalina Caputo
Autumn: Annalina Caputo
Module Teachers Denise Freir
Annalina Caputo
NFQ level 9 Credit Rating 5
Pre-requisite None
Co-requisite None
Compatibles None
Incompatibles None
Repeat examination
Description

This module is accredited by NUIG. Large-scale data analytics is concerned with the processing and analysis of large quantities of data, typically from distributed sources (such as data streams on the internet). This module introduces students to state-of-the-art approaches to large-scale data analytics. Students learn about foundational concepts, software tools and advanced programming techniques for the scalable storage, processing and predictive analysis of high-volume and high-velocity data, and how to apply them to practical problems.

** This module uses Java as its programming language. Knowledge of Java is a prerequisite for participation in this module. **

Planned topics include: Definition of large-scale computational data analytics; Overview of approaches to the processing and analysis of high-volume and high-velocity data from distributed sources; Applications of large-scale data analytics; Foundations of cluster computing and parallel data processing; The Hadoop and Spark ecosystems; MapReduce; Advanced programming concepts for large-scale data analytics; Concepts and tools for large-scale data storage; Stream data analytics and Complex Event Processing (CEP); Techniques and open-source tools for large-scale predictive analytics; Computational statistics and machine learning with large-scale data processing frameworks such as Spark; Privacy in the context of large-scale data analytics.

Further information pertaining to the module is available from NUIG.

Learning Outcomes

1. Be able to define large-scale data analytics and understand its characteristics
2. Be able to explain and apply concepts and tools for distributed and parallel processing of large-scale data
3. Know how to explain and apply concepts and tools for highly scalable collection, querying, filtering, sorting and synthesizing of data
4. Know how to describe and apply selected statistical and machine learning techniques and tools for the analysis of large-scale data
5. Know how to explain and apply approaches to stream data analytics and complex event processing
6. Understand and be able to discuss privacy issues in connection with large-scale data analytics



Workload Full-time hours per semester
Type Hours Description
Total Workload: 0

All module information is indicative and subject to change. For further information, students are advised to refer to the University's Marks and Standards and Programme Specific Regulations at: http://www.dcu.ie/registry/examinations/index.shtml

Indicative Content and Learning Activities

Assessment Breakdown
Continuous Assessment 30% Examination Weight 70%
Course Work Breakdown
Type | Description | % of total | Assessment Date
Assignment | Graded assignments on topics such as: Definition of large-scale computational data analytics. Overview of approaches to the processing and analysis of high-volume and high-velocity data from distributed sources. Applications of large-scale data analytics. Foundations of cluster computing and parallel data processing. The Hadoop and Spark ecosystems. MapReduce. Advanced programming concepts for large-scale data analytics. Concepts and tools for large-scale data storage. Stream data analytics. Complex Event Processing (CEP). Overview of computational statistics and machine learning in the Hadoop/Spark universe. Techniques and open-source tools for large-scale predictive analytics. | 30% | n/a
Reassessment Requirement Type
Resit arrangements are explained by the following categories:
1 = A resit is available for all components of the module
2 = No resit is available for a 100% continuous assessment module
3 = No resit is available for the continuous assessment component
This module is category 3
Indicative Reading List

Other Resources

None
Programme or List of Programmes
MCM M.Sc. in Computing