
Module Specifications.

Current Academic Year 2024 - 2025

All Module information is indicative, and this portal is an interim interface pending the full upgrade of Coursebuilder and subsequent integration to the new DCU Student Information System (DCU Key).

As such, this is a point in time view of data which will be refreshed periodically. Some fields/data may not yet be available pending the completion of the full Coursebuilder upgrade and integration project. We will post status updates as they become available. Thank you for your patience and understanding.

Date posted: September 2024

Module Title Programming for Data Analysis
Module Code CA274 (ITS) / CSC1033 (Banner)
Faculty Engineering & Computing School Computing
Module Co-ordinator Stephen Blott
Module Teachers -
NFQ level 8 Credit Rating 5
Pre-requisite Not Available
Co-requisite Not Available
Compatibles Not Available
Incompatibles Not Available
Coursework Only
Description

This module aims to give the student a background in using a programming language such as R to deliver a competent analysis of both structured and unstructured data.

Learning Outcomes

1. R Basics: The student should be able to manipulate and read data into various R dataframes and R Tables.
2. R Objects: Creating a library in R and using class objects.
3. Parallel Programming: Developing Parallel code to handle computationally intensive analysis.
4. Visualisation: Basic plots, Geographic Maps, Multi-Dimensional reduction, 3D plotting, Dynamic Graphics
5. Big Data in R: The student should be able to handle large data-sets and demonstrate the various techniques and libraries that can be used in R to analyze big data-sets.
6. Differential Equations and Linear Algebra in R: Understanding of the packages used in linear algebra and differential calculus.



Workload Full-time hours per semester
Type | Hours | Description
Lecture | 12 | Lectures will cover the material required for the course.
Laboratory | 24 | Laboratory sessions will be used to demonstrate the techniques and R packages taught in class.
Assignment Completion | 89 | There will be 4 assignments throughout the term. Assignments will vary in marks from 10% to 40% of the final mark. See Module Content and Assessment.
Total Workload: 125

All module information is indicative and subject to change. For further information, students are advised to refer to the University's Marks and Standards and Programme Specific Regulations at: http://www.dcu.ie/registry/examinations/index.shtml

Indicative Content and Learning Activities

R Basics
Introduction to R. Variable and function declaration; creating data-frames and matrices; reading data from outside sources such as databases and CSV/XML files.

R objects
Library development in R. How to get the best out of CRAN. Using R objects and classes to handle data.
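As an illustration of using R objects and classes to handle data, the following is a minimal sketch of an S3 class. The class name `smoother`, the constructor `new_smoother` and the generic `moving_average` are hypothetical names chosen for this example and are not part of the module material; only base R and the `stats` package are used.

```r
# Hypothetical S3 class "smoother" wrapping a numeric series.
new_smoother <- function(values) {
  stopifnot(is.numeric(values))
  structure(list(values = values), class = "smoother")
}

# Generic for a trailing moving average (illustrative name).
moving_average <- function(x, window, ...) UseMethod("moving_average")

# Method for "smoother" objects.
moving_average.smoother <- function(x, window, ...) {
  # With sides = 1, stats::filter averages the current value and the
  # previous (window - 1) values; the first (window - 1) results are NA.
  as.numeric(stats::filter(x$values, rep(1 / window, window), sides = 1))
}

# Print method so objects display compactly at the console.
print.smoother <- function(x, ...) {
  cat("<smoother> with", length(x$values), "observations\n")
  invisible(x)
}

s <- new_smoother(c(1, 2, 3, 4, 5))
ma <- moving_average(s, window = 3)
```

Functions of this kind could later be collected into a package, but packaging itself is outside the scope of this sketch.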

Parallel programming in R
Developing parallel code to handle simple, computationally intensive analysis. The following packages will be examined: pbdMPI/OpenMP, pbdSLAP, snowfall, foreach, future, Rborist, randomForestSRC
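To give a flavour of parallel code in R, here is a minimal sketch that uses the base `parallel` package rather than any of the packages listed above, since it ships with R and needs no installation. The bootstrap task, the helper `boot_mean` and the choice of two workers are assumptions made purely for illustration.

```r
library(parallel)  # ships with base R

# Toy "computationally intensive" task: one bootstrap estimate of the mean.
# The first argument is the replicate index and is not otherwise used.
boot_mean <- function(i, data) {
  mean(sample(data, length(data), replace = TRUE))
}

set.seed(1)
sample_data <- rnorm(1000)

cl <- makeCluster(2)          # two worker processes; adjust to the machine
clusterSetRNGStream(cl, 123)  # reproducible random streams on the workers

# Run 200 bootstrap replicates, spread across the workers.
boot <- parSapply(cl, 1:200, boot_mean, data = sample_data)

stopCluster(cl)               # always release the workers
```

The same pattern (create workers, map a function over independent tasks, stop the workers) carries over to packages such as snowfall, foreach and future, although their function names differ.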

R visualisation
Basic plots, Geographic Maps, Multi-Dimensional reduction, 3D plotting, Dynamic Graphics

R Big Data
Packages: rJava, Rcpp, pqR, pddR

Maths in R
Cover some Linear algebra and Differential Calculus techniques in R.
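As a small illustration of the kind of technique this topic covers, the sketch below solves a linear system and differentiates an expression using only base R and `stats`. The matrix, the right-hand side and the function being differentiated are arbitrary examples, not taken from the module material.

```r
# Solve the linear system A %*% sol = b.
A <- matrix(c(2, 1,
              1, 3), nrow = 2, byrow = TRUE)
b <- c(3, 5)
sol <- solve(A, b)

# Symbolic derivative of x^2 * sin(x) with respect to x.
dexpr <- D(expression(x^2 * sin(x)), "x")

# Evaluate the derivative at x = 1.
d_at_1 <- eval(dexpr, list(x = 1))
```

Here `sol` can be checked by confirming that `A %*% sol` reproduces `b`, and `d_at_1` can be compared against the hand-derived form `2 * sin(1) + cos(1)`.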

Assessment Breakdown
Continuous Assessment 100% Examination Weight 0%
Course Work Breakdown
Type | Description | % of total | Assessment Date
Assignment | Data manipulation with a specified data-set. One should demonstrate how to pivot a table and provide detailed univariate statistics with supplementary graphics. | 10% | Week 3
Assignment | Create an R library that will have a number of class objects that can be used to create new features for a chosen dataset. These features should include transformations such as moving averages, exponential averages and other potential smoothing utilities. | 20% | Week 6
Assignment | Demonstrate how simple linear regression can be implemented using parallel programming in R. | 30% | Week 8
Assignment | Complete a data analysis on a big dataset that combines both text and numeric data. | 40% | Week 12
Reassessment Requirement Type
Resit arrangements are explained by the following categories:
Resit category 1: A resit is available for both* components of the module.
Resit category 2: No resit is available for a 100% continuous assessment module.
Resit category 3: No resit is available for the continuous assessment component where there is a continuous assessment and examination element.
* ‘Both’ is used in the context of the module having a Continuous Assessment/Examination split; where the module is 100% continuous assessment, there will also be a resit of the assessment
This module is category 1
Indicative Reading List

  • Jane M. Horgan: 2009, Probability with R, Wiley, Hoboken, N.J., 9780470280737
  • Robert Kabacoff: R in Action, Manning Publications, 9781935182399
  • Tony Fischetti: 2015, Data Analysis with R, Packt Publishing, 9781785288142
Other Resources

None
