Module Overview
Programming for Big Data
Part-time / Level 9 / Online / 5 ECTS
Students taking this module will acquire the computer programming skills necessary to analyse and manipulate big data. Big data in this context refers to datasets that are too large for commonly used data analysis and manipulation tools to handle within a tolerable elapsed time. The context and challenges of processing large datasets form a core part of this course. Students will learn to select appropriate approaches, tools and methods for big data problems, and to implement and evaluate solutions using a variety of programming tools and techniques.
- Introduction to programming for big data
  - What is big data?
  - How is programming for big data different?
  - Distributed programming paradigms
  - Hadoop and HDFS
  - Map, Reduce, and Chained MapReduce Processes
  - Distributed programming tools for data storage and data analysis
- Advanced Big Data Analytics and Machine Learning
  - Examine various distributed analytics and machine learning frameworks (Mahout, Flink, Spark)
  - Spark architecture
  - RDDs
  - Creating Spark pipelines
  - Working with different data sources
  - Practical application of these various technologies for given problems and case studies
- Big Data Storage Engines
  - Examine various distributed data storage engines for big data
  - Examine the use of Object Storage in big data environments
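As an informal illustration of the "Map, Reduce, and Chained MapReduce Processes" topic listed above, the following is a minimal in-memory sketch in plain Python. It is not taken from the module materials and does not use Hadoop or Spark; the helper `run_mapreduce` and the sample input are hypothetical names chosen for this example. It chains two stages: the first counts words, and the second groups the words by their count.

```python
from collections import defaultdict
from itertools import chain


def run_mapreduce(records, mapper, reducer):
    """Run one in-memory MapReduce stage: map, shuffle by key, reduce.

    Hypothetical helper for illustration only; a real framework would
    distribute each of these steps across many machines.
    """
    groups = defaultdict(list)
    # Map: each record yields zero or more (key, value) pairs.
    for key, value in chain.from_iterable(mapper(r) for r in records):
        # Shuffle: collect all values that share a key.
        groups[key].append(value)
    # Reduce: combine the values for each key, in sorted key order.
    return [reducer(key, values) for key, values in sorted(groups.items())]


# Stage 1: word count.
def split_words(line):
    return [(word.lower(), 1) for word in line.split()]


def sum_counts(word, ones):
    return (word, sum(ones))


# Stage 2: group words by how often they occur.
def key_by_count(pair):
    word, count = pair
    return [(count, word)]


def collect_words(count, words):
    return (count, sorted(words))


lines = ["big data needs big tools", "data tools scale"]  # made-up sample input
word_counts = run_mapreduce(lines, split_words, sum_counts)
# Chaining: the output of stage 1 is the input of stage 2.
words_by_count = run_mapreduce(word_counts, key_by_count, collect_words)

print(word_counts)
print(words_by_count)
```

The point of the sketch is the chaining: each stage consumes a list of records and produces a list of records, so the output of one stage can be fed directly into the next.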
The module is designed to be delivered within a blended learning model, employing mixed modes (online and face-to-face) of learning, teaching and assessment.
TU059 will be delivered primarily in a face-to-face mode while TU060 will be delivered in a blended mode.
This module will employ traditional teaching methods such as lectures, seminars and tutorials, as well as more innovative, student-based learning methods such as problem solving in groups for both theoretical and practical situations.
Students will be encouraged to be pro-active in their approach to learning through the use of case studies and simulation exercises, working independently and in groups. In some cases students will be expected to use computer-based learning material to supplement studies.
There will be a strong emphasis on the practical element of the module, and this will be supported through supervised and independent practical sessions. Students will be able to explore the characteristics, advantages and limitations of the approaches learnt by applying them to suitable case studies and simulation exercises. Where appropriate, students will provide feedback from group research by cascading the knowledge to peers and through presentations. In-class discussions and reviews of leading research papers in each topic covered will also contribute towards the practical content.
Guest lecturers from industry and academia will be invited where appropriate to expose students to how topics covered in this module are used within the broader area of data analytics.
Materials will be distributed to students, between students and from students through the most appropriate channels, e.g. a VLE, blogs, Twitter or a forum.
Students will be expected to develop independence in, and responsibility for, their own learning.
Module Content & Assessment
| Assessment Breakdown | % |
|---|---|
| Other Assessment(s) | 100.00% |
Contact school.cs@tudublin.ie for further information.
EU students: €230
Non-EU students: Contact international.city@tudublin.ie for more details.