This schedule is tentative and will be updated as the course progresses.
Resource Management and Infrastructure for Big Data Systems and Cloud-Native Applications
Monday: Assignment #0 - Hadoop Cluster Setup is due!
An Introduction to ZooKeeper (ZK)
Chinese Lunar New Year
DAG-based Dataflow Systems: Dryad, DryadLINQ, Tez and Beyond
Sunday: Assignment #1 - Hadoop over Kubernetes is due!
High-level Big Data Query Languages: Pig and Hive
Wednesday: Assignment #1 - Similar Users Detection via MapReduce (updated on 30 Jan) is due!
BDAS and Spark
Tuesday: Assignment #2 - Pig and Hive is due!
Big Stream Processing Frameworks: Unified Log via Apache Kafka; Storm; Spark Streaming; Spark Structured Streaming; Lambda & Kappa Architecture
Big Graph Processing Frameworks: Pregel/Giraph and GraphLab; GraphX; GraphFrames
Monday: Assignment #3 - Spark (updated on 14 Mar) is due!
Optional Seminars in Reading Week (04/06/2022 and 04/08/2022): Generalized Streaming Model and Apache Beam; GraphLab 2.0: Challenges and Solutions for Processing Power-Law Graphs in Practice
Big Data Stores (aka NoSQL Databases)
Tuesday: Assignment #4 - Kafka is due!
(Cont'd) Big Data Stores (aka NoSQL Databases)
Lecture: Machine Learning Support and Beyond
Tuesday: Assignment #5 - GraphFrames, GraphX, HBase and SparkML is due!
Tuesday: Q&A Assignment is due!
Monday: Project (ESTR4316) is due!