Module Specifications

Current Academic Year 2023 - 2024

Please note that this information is subject to change.

Module Title Search Technologies
Module Code CA4009
School School of Computing
Module Co-ordinator Semester 1: Gareth Jones
Semester 2: Gareth Jones
Autumn: Gareth Jones
Module Teachers Denise Freir
Gareth Jones
NFQ level 8 Credit Rating 7.5
Pre-requisite None
Co-requisite None
Compatibles None
Incompatibles None
None
Students will undertake laboratories on a self-study basis. The project will be undertaken and assessed as an individual assignment.
Description

Search technologies for locating relevant information within increasingly voluminous archives of online digital media are rapidly becoming ubiquitous and vital to daily life, in both social and working environments. These archives include formally published text materials, heterogeneous web content, social media, audio-visual content, and various forms of enterprise content. The efficient location and delivery of content from these archives is enabling many exciting opportunities, including increased social engagement, creative exploitation of information, and improved efficiency in business operations. However, realizing systems that perform reliable search and discovery of information, and deliver it effectively to users, poses many challenges. This module introduces relevant search technologies and explores applications such as web search, image and video search, enterprise search and mobile search. The module covers key search topics, including content indexing, file structures and algorithms to support retrieval; related technologies such as content summarisation and speech and video processing; and user interaction in search and the evaluation of search systems.

Learning Outcomes

1. Explain the process of content indexing in information retrieval including stop word removal, conflation (stemming, string-comparison), and the language dependency of these methods.
2. Demonstrate an understanding of the importance and application of data structures in efficient information retrieval, in particular inverted file structures.
3. Have knowledge of the importance and operation of standard algorithms for ranked information retrieval, including term weighting and ranking models, e.g. tf-idf weighting, vector-space model, probabilistic model, language modeling.
4. Describe the process of relevance feedback for improved ranking in information retrieval, and apply standard relevance feedback algorithms, e.g. Rocchio and probabilistic methods.
5. Explain the principles of content summarization, be able to describe and apply standard extractive summarization methods, e.g. to form document snippets for web retrieval.
6. Describe the need for indexing of multimedia content, including spoken and visual content, and explain the impact of recognition errors on information retrieval behaviour.
7. Understand the importance of evaluation in the development of search engines, and the application of standard evaluation metrics, such as precision and recall, and of test collections in measuring the effectiveness of information retrieval systems.
8. Appreciate the application and operation of search engines in diverse environments, e.g. web search, audio-visual search, context-aware and mobile search, enterprise search, patent search, search in lifelogging.
9. Be able to begin to combine technologies relevant to search systems in novel ways to synthesise new information retrieval applications.



Workload Full-time hours per semester
Type | Hours | Description
Lecture | 24 | Lectures will present the core material from the module.
Laboratory | 12 | Students will undertake structured laboratories examining key elements of indexing, search and evaluation.
Tutorial | 12 | Students will be given practical guidance on the implementation and evaluation of their research project.
Assignment Completion | 48 | Students will work on completing and writing up individual laboratory exercises, and undertaking the group project.
Independent Study | 91.5 | Students will study course material and prepare for the final examination.
Total Workload: 187.5

All module information is indicative and subject to change. For further information, students are advised to refer to the University's Marks and Standards and Programme Specific Regulations at: http://www.dcu.ie/registry/examinations/index.shtml

Indicative Content and Learning Activities

indexing
tokenisation, stop word removal, conflation (e.g. stemming), data structures (e.g. inverted files)
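
As an illustration of the indexing steps listed above, the following is a minimal sketch of tokenisation, stop word removal, conflation and inverted file construction. It is not taken from the module materials: the stop word list, the suffix-stripping rules and the function names are all hypothetical and chosen only for illustration, and a real system would typically use an established stemmer such as Porter's.

```python
from collections import defaultdict

# Hypothetical stop word list and suffix rules, chosen only for illustration.
STOP_WORDS = {"the", "a", "of", "in", "and", "is", "to"}
SUFFIXES = ("ing", "es", "s")


def tokenise(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in text.lower())
    return cleaned.split()


def stem(token):
    """Strip the first matching suffix; a crude stand-in for a real stemmer."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def build_inverted_index(docs):
    """Map each index term to the sorted list of document ids containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenise(text):
            if token not in STOP_WORDS:
                postings[stem(token)].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


# A toy collection of three documents.
docs = {
    1: "Indexing the documents",
    2: "Searching indexed documents",
    3: "The evaluation of search engines",
}
index = build_inverted_index(docs)
```

With this toy collection, "documents" is conflated to the term "document" with postings [1, 2], while "indexed" is left unstemmed, which shows how a simplistic conflation rule can miss related word forms.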

information retrieval algorithms
Boolean search, term weighting (e.g. tf-idf), vector-space model, probabilistic model, language model, relevance feedback methods (e.g. Rocchio, probabilistic approaches)
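
To make two of these topics concrete, the sketch below ranks documents by cosine similarity over tf-idf vectors and then applies a Rocchio-style query update. It is one simple variant under stated assumptions, not the formulation used in the module: the weighting is raw term frequency multiplied by log(N / df), the Rocchio parameters alpha, beta and gamma are commonly quoted illustrative defaults, and all function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tf_idf_vectors(tokenised_docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    n_docs = len(tokenised_docs)
    df = Counter(term for doc in tokenised_docs for term in set(doc))
    idf = {term: math.log(n_docs / count) for term, count in df.items()}
    vectors = [
        {term: tf * idf[term] for term, tf in Counter(doc).items()}
        for doc in tokenised_docs
    ]
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query_tokens, tokenised_docs):
    """Return (document position, score) pairs, best match first."""
    vectors, idf = tf_idf_vectors(tokenised_docs)
    # Query terms that do not occur in the collection are ignored.
    query_vec = {
        term: tf * idf[term]
        for term, tf in Counter(query_tokens).items()
        if term in idf
    }
    scores = [(i, cosine(query_vec, vec)) for i, vec in enumerate(vectors)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


def rocchio(query_vec, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Move the query towards relevant and away from non-relevant vectors."""
    updated = defaultdict(float)
    for term, weight in query_vec.items():
        updated[term] += alpha * weight
    for vec in relevant:
        for term, weight in vec.items():
            updated[term] += beta * weight / len(relevant)
    for vec in non_relevant:
        for term, weight in vec.items():
            updated[term] -= gamma * weight / len(non_relevant)
    # Terms whose weight is not positive are dropped, a common simplification.
    return {term: weight for term, weight in updated.items() if weight > 0}


docs = [
    ["web", "search", "engine"],
    ["video", "search"],
    ["patent", "retrieval"],
]
ranking = rank(["web", "search"], docs)
```

For the query "web search", the first document ranks highest, the second scores lower because it shares only the common term "search", and the third scores zero. Passing the query vector and judged documents to `rocchio` returns an expanded query that can be scored with the same `cosine` function.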

multimedia indexing
speech recognition, processing of image and video

information retrieval evaluation
evaluation metrics (e.g. precision, recall), evaluation task and test collection development
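
The two metrics named above can be computed for a single query as in the following minimal sketch. Precision is the fraction of retrieved documents that are relevant, and recall is the fraction of relevant documents that are retrieved. The function name and the document identifiers are hypothetical, and the relevance judgements stand in for those a test collection would supply.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: iterable of document ids returned by the system.
    relevant: iterable of document ids judged relevant for the query.
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Four documents retrieved, three judged relevant, two in common.
precision, recall = precision_recall(
    ["d1", "d2", "d3", "d4"],
    {"d1", "d3", "d5"},
)
```

Here two of the four retrieved documents are relevant, giving a precision of 0.5, and two of the three relevant documents are retrieved, giving a recall of two thirds.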

Information retrieval applications
e.g. web search (including the learning-to-rank approach), audio-visual search, patent search, lifelog search, mobile search, enterprise search

Assessment Breakdown
Continuous Assessment 30% Examination Weight 70%
Course Work Breakdown
Type | Description | % of total | Assessment Date
Laboratory Portfolio | Undertake a series of individual laboratory exercises using a standard information retrieval toolkit to explore the operation of the indexing process, ranking models in information retrieval, relevance feedback, summary generation, and information retrieval evaluation for a small information retrieval test collection using standard evaluation metrics. | 15% | Week 25
Project | A research-style group project developing a simple novel information retrieval application. Project based on module material, but also taking in private research, with documentation of the design of the proposed application, and the evaluation of its search effectiveness using information retrieval evaluation methods. Students will be provided formative feedback on their project proposal in an assessed formative presentation. Assessment will be based on a group written report and group presentation. | 15% | Sem 2 End
Reassessment Requirement Type
Resit arrangements are explained by the following categories:
1 = A resit is available for all components of the module
2 = No resit is available for 100% continuous assessment module
3 = No resit is available for the continuous assessment component
This module is category 1
Indicative Reading List

  • Christopher D. Manning, Prabhakar Raghavan, Hinrich Schütze: 2008, Introduction to Information Retrieval, 1, Cambridge University Press, 506, 978-0521865715
  • Ricardo Baeza-Yates, Berthier Ribeiro-Neto: 2010, Modern Information Retrieval: The Concepts and Technology Behind Search, 2, Addison Wesley, 978-0321416919
  • Peter Jackson, Isabelle Moulinier: 2007, Natural Language Processing for Online Applications: Text Retrieval, Extraction, and Categorization, 2, John Benjamins Publishing Company, 978-9027249920
Other Resources

None
Programme or List of Programmes
CASE BSc in Computer Applications (Sft.Eng.)
DS BSc in Data Science
EC BSc in Enterprise Computing
ECSA Study Abroad (Engineering & Computing)
ECSAO Study Abroad (Engineering & Computing)