DSA Modules

Module 1: Introduction to Data Science and Big Data

This module introduces data science processes; the types, nature and sources of big data; and the associated techniques and tools, with examples. It is a foundation that helps students gain a deeper understanding of data science processes, tools, big data sources, flow, curation and analysis, as well as the associated ethics and challenges such as availability, reliability/quality, privacy and security. (Duration: up to 6 hours)

Module 2: Big Data Modeling, Management and Its Technologies

This topic covers one of the most common frameworks, Hadoop, which has made big data analysis easier and more accessible. It also covers tools including AsterixDB, HP Vertica, Impala, Neo4j, Redis, and SparkSQL. This module helps students understand the hardware and software technologies used to store, retrieve and process big data. (Duration: up to 5 hours)

Module 3: Data Analytics

This topic covers the data analytics process: asking interesting questions related to one’s discipline area, retrieving data, exploring data, modeling, and presenting the results of the analysis. This module helps students gain a deeper understanding of how large amounts of data can be analyzed to uncover hidden patterns, correlations and other insightful relationships. (Duration: up to 8 hours)
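
As a rough illustration of the process described for Module 3, the sketch below walks through retrieving, exploring, modeling and presenting on a tiny dataset. The dataset, the question ("does study time relate to exam score?") and every function name are assumptions made up for this example; they are not course material, and a real analysis would typically use a dedicated library rather than hand-written formulas.

```python
import statistics

# Hypothetical records, invented for illustration only:
# hours studied and the resulting exam score.
RECORDS = [
    {"hours": 1.0, "score": 52.0},
    {"hours": 2.0, "score": 54.0},
    {"hours": 3.0, "score": 56.0},
    {"hours": 4.0, "score": 58.0},
    {"hours": 5.0, "score": 60.0},
]


def retrieve():
    """Retrieve data: here, simply split the in-memory records into columns."""
    hours = [r["hours"] for r in RECORDS]
    scores = [r["score"] for r in RECORDS]
    return hours, scores


def explore(values):
    """Explore data: basic summary statistics for one column."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
    }


def fit_line(xs, ys):
    """Model: ordinary least-squares fit of y = slope * x + intercept."""
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def present(slope, intercept):
    """Present: turn the fitted model into a one-line summary."""
    return f"score = {slope:.2f} * hours + {intercept:.2f}"


if __name__ == "__main__":
    hours, scores = retrieve()
    print("hours:", explore(hours))
    print("scores:", explore(scores))
    print(present(*fit_line(hours, scores)))
```

Each function corresponds to one step of the process, so a step can be swapped out (for example, reading from a file instead of a constant) without touching the others.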

Module 4: Visualization and Visual Analytics

This topic covers design principles and techniques for visualizing data, as well as visualization techniques for spatial and geospatial data. (Duration: up to 4 hours)

Module 5: Data Mining (DM) and Machine Learning (ML)

This topic covers the data mining processes for discovering patterns in both structured and unstructured datasets, as well as machine learning approaches and techniques. It focuses on the applications of DM and ML to scientific and business problems. (Duration: up to 6 hours)