Registry
Module Specifications
Archived Version 2019 - 2020
Description
A Data Warehouse is the model or structure that supports data mining and decision support. This module teaches students how to build Data Warehouses by understanding their structures and the concept of multi-dimensional modelling. It also covers Data Mining to teach students how to extract knowledge from data warehouses using three different approaches: clustering, association rule mining and classification.
Learning Outcomes
1. Be able to build Data Warehouses for different application types.
2. Be able to deploy the Data Warehouse Bus Matrix to create individual data marts.
3. Be able to design a multi-dimensional schema model.
4. Analyse the different strategies and techniques involved in Data Mining, and choose the correct approach for each dataset.
5. Be able to construct and deploy data mining algorithms.
6. Be able to determine the predictive accuracy of data mining algorithms.
All module information is indicative and subject to change. For further information, students are advised to refer to the University's Marks and Standards and Programme Specific Regulations at: http://www.dcu.ie/registry/examinations/index.shtml
Indicative Content and Learning Activities

Data Mining Concepts
Introduction to terminology and basic concepts.

Classification
In this section, we describe two classification algorithms: one that can be used when all the attributes are categorical, the other when attributes are continuous.

Association Rule Mining
Unlike in classification, the left- and right-hand sides of rules can potentially include tests on the value of any attribute or combination of attributes. Rules of this more general kind represent an association between the values of certain attributes and those of others and are called association rules. The process of extracting such rules from a given dataset is called association rule mining. In this section, algorithms for efficient rule generation are described.

Clustering
Clustering is concerned with grouping together objects that are similar to each other and dissimilar to the objects belonging to other clusters. We will describe two methods for which the similarity between objects is based on a measure of the distance between them.

Predictive Accuracy
Two approaches to determining the quality of our data mining predictions are covered.

Overfitting Decision Trees
Many data mining methods suffer from the problem of overfitting to the training data, resulting in some cases in excessively large rule sets and/or rules with very low predictive power for previously unseen data. In this section, we look at ways of adjusting a decision tree, either while it is being generated or afterwards, in order to increase its predictive accuracy.

Data Warehouse Characteristics
An overview of the terminology, background and motivation for constructing a data warehouse.

Multidimensional Modelling
Students will cover the concepts of dimensions, pivots, fact table granularity, roll-up and drill-down functions.

Building the Data Warehouse
A step-by-step case study in creating fact and dimension tables.

Web Data Warehouses
In this section, we describe how click stream data can be included in a traditional data warehouse using new dimensions and a click stream fact table.
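As an informal illustration of the multidimensional concepts listed above (dimensions, fact table granularity and roll-up), the following minimal Python sketch models a click stream fact table with two dimensions. It is not taken from the module materials: the table names, columns, the `roll_up` helper and all data values are hypothetical, chosen only to show how a measure stored at a fine grain can be aggregated to a coarser level.

```python
from collections import defaultdict

# Hypothetical dimension tables, keyed by surrogate key.
date_dim = {
    1: {"day": "2020-01-01", "month": "2020-01"},
    2: {"day": "2020-01-02", "month": "2020-01"},
    3: {"day": "2020-02-01", "month": "2020-02"},
}
page_dim = {
    10: {"url": "/home", "section": "landing"},
    11: {"url": "/courses", "section": "catalogue"},
}

# Hypothetical click stream fact table. The grain is one row per
# (day, page); columns are (date_key, page_key, views).
click_facts = [
    (1, 10, 120),
    (1, 11, 45),
    (2, 10, 80),
    (3, 11, 60),
]


def roll_up(facts, dim, dim_index, attribute):
    """Sum the 'views' measure grouped by one attribute of a dimension."""
    totals = defaultdict(int)
    for row in facts:
        # Follow the foreign key into the dimension to find the group label.
        group = dim[row[dim_index]][attribute]
        totals[group] += row[2]
    return dict(totals)


# Roll up from daily grain to monthly totals, and from pages to sections.
views_by_month = roll_up(click_facts, date_dim, 0, "month")
views_by_section = roll_up(click_facts, page_dim, 1, "section")
print(views_by_month)
print(views_by_section)
```

Drill-down is the reverse direction: grouping by a finer attribute such as `day` or `url` instead of `month` or `section`. In a real warehouse the same operation would normally be expressed as a SQL `GROUP BY` over joined fact and dimension tables.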
Indicative Reading List
Other Resources
None
Programme or List of Programmes