Description
This is a graduate-level course in cloud computing. Topics to be discussed include:
- Computing as a Utility; Cloud Service Models (e.g. IaaS, PaaS, SaaS)
- Data Center Architecture
- Case studies of major Cloud Providers, e.g. Amazon Web Services (AWS)
- Data-Center Operating Systems/Management Platforms, e.g. Windows Azure, OpenStack
- Programming models and platforms for Cloud Computing and Big Data Processing (e.g. MapReduce/Hadoop, GFS/HDFS, and other components of the Cloud Computing/Big Data processing stack)
- Concurrency, Consistency and Replication/Fault-tolerance in the Cloud (Locks and Transactions, CAP Theorem, ACID vs. BASE, Consensus management: Paxos and ZooKeeper)
- Cloud-scale Datastores: NoSQL databases, e.g. Dynamo, BigTable/HBase, Cassandra
- High-level Data Query processing systems (e.g. Pig, Hive)
- Virtualization Technologies: Virtual Machine Monitors (e.g. Xen, VMware), Network Virtualization (e.g. VXLAN, SDN, NFV)
- Cloud Service Security and Privacy
Old Course Webpage for CMSC5735, Fall 2014
Course materials from Fall 2014 offering
Course Prerequisite:
This course contains substantial hands-on components that require a solid background in programming and hands-on operating systems experience. If you have never used a command-line interface to install, configure, or manage an operating system (e.g. a Linux-based one), you will need to pick up the skills yourself, and IT CAN BE VERY TIME-CONSUMING for you to complete the homework assignments. (In last year's offering, students without the aforementioned background took several tens of hours to finish EACH homework assignment.)
Course Information
Lecture time:
Thursday, 7:00pm - 10:00pm
Lab Workshop/Tutorial:
- To be scheduled
Instructor:
- Prof. Wing Cheong Lau.
wclau [at] ie [dot] cuhk [dot] edu [dot] hk
- Office hours: Thursday 11:00am to noon, or by appointment
Teaching Assistant:
- LI Guanchen
lg014 [at] ie [dot] cuhk [dot] edu [dot] hk
- Office hours: Thursday 3:00pm-4:00pm
- YANG Ronghai
yr013 [at] ie [dot] cuhk [dot] edu [dot] hk
- Office hours: Friday 3:00pm-4:00pm
Login info for the Protected parts of this website:
User: cmsc5735
Password: fall5735cmsc
Recommended Textbooks
[DataAlgorithms] Data Algorithms: Recipes for Scaling Up with Hadoop and Spark, by Mahmoud Parsian, Publisher: O'Reilly Media, Aug 2015
[CCHO] Cloud Computing: A Hands-On Approach, by Bahga and Madisetti, Publisher: CreateSpace Independent Publishing Platform, Dec 2013.
[KenBirman] Guide to Reliable Distributed Systems: Building High-Assurance Applications and Cloud-hosted Services, by Kenneth Birman, Publisher: Springer Verlag 2012.
[CCTP] Cloud Computing: Theory and Practice, by Dan C. Marinescu, Publisher: Morgan Kaufmann 2009.
[JLin] Data-Intensive Text Processing with MapReduce, by Jimmy Lin and Chris Dyer, Morgan and Claypool Publishers, 2010. Can be freely downloaded from http://lintool.github.io/MapReduceAlgorithms/
[Hadoop] Hadoop: The Definitive Guide, 4th Edition, by Tom White, published by O'Reilly, 2015.
[MMDS] Mining of Massive Datasets (Download version 1.3) by Anand Rajaraman, Jeff Ullman and Jure Leskovec, Cambridge University Press. Latest version can be downloaded from http://i.stanford.edu/~ullman/mmds.html#latest
[PaperTrailBlog2PC] http://the-paper-trail.org/blog/consensus-protocols-two-phase-commit
[PaperTrailBlogPaxos] http://the-paper-trail.org/blog/consensus-protocols-paxos ; http://the-paper-trail.org/blog/consensus-protocols-a-paxos-implementation
[NoSQL] NoSQL Overview, Appendix A of the book titled "Graph Databases", by Ian Robinson, Jim Webber and Emil Eifrem (Can request a free copy from http://graphdatabases.com)
[HBase] HBase: The Definitive Guide, by Lars George, published by O'Reilly.
[Cassandra] Cassandra: The Definitive Guide, by Eben Hewitt, published by O'Reilly.
[Pig] Programming Pig, by Alan Gates, published by O'Reilly.
[Hive] Programming Hive, by Edward Capriolo, Dean Wampler, Jason Rutherglen, published by O'Reilly.
[OpenStackOp] OpenStack Operations Guide, published by O'Reilly (current version available online at http://docs.openstack.org/openstack-ops/content)
Tentative Timetable
Lecture Date | Class Room | Topic | Period | Recommended Readings | Additional References |
---|---|---|---|---|---|
Sep 10 | HKPC Room 108 | Course Admin ; Computing as a Utility ; Cloud Service Models ; Data Center Architecture | 7:00pm - 10:00pm | [JLin]Ch1 | - |
Sep 17 | HKPC Room 108 | Case Study on major Cloud Providers ; Data Center Operating Systems | 7:00pm - 10:00pm | [DataCenter], [OpenStackOp] | - |
Sep 24 | HKPC Room 108 | Distributed/Parallel Programming Models for the Cloud: MapReduce/Hadoop, GFS/HDFS and the Big Data Processing Stack | 7:00pm - 10:00pm | [MMDS]Ch2.1-2.4 ; [JLin]Ch2, Ch3.1-3.4 ; [Hadoop]Ch.2-3 | - |
**Oct 1 National Day Holiday** | |||||
Oct 8 | HKPC Room 108 | Distributed/Parallel Programming Models for the Cloud: MapReduce/ Hadoop, GFS/HDFS and the Big Data Processing Stack (cont'd) | 7:00pm - 10:00pm | [Hadoop]Ch.2-3 ; [KenBirman] Ch.5 | - |
Oct 15 | HKPC Room 108 | Concurrency, Consistency, Transaction control in Cloud-based systems | 7:00pm - 10:00pm | [PaperTrailBlog2PC] | - |
Oct 22 | HKPC Room 108 | Fault-tolerance, Replication Consistency, Consensus Management for Cloud-based systems | 7:00pm - 10:00pm | [PaperTrailBlogPaxos] | [KenBirman]Ch.10 |
Oct 29 | HKPC Room 108 | CAP Theorem ; ACID vs. BASE ; The NoSQL movement | 7:00pm - 10:00pm | [CloudData] ; [NoSQL] | - |
Nov 5 | HKPC Room 108 | NoSQL Databases for the Cloud: Dynamo, HBase, Cassandra | 7:00pm - 10:00pm | [Hadoop] Ch.20 | [Dynamo] ; [HBase] ; [Cassandra] |
Nov 12 | CUHK WMY_506 | High-level Data Query Languages for the Clouds: Pig and Hive | 7:00pm - 10:00pm | [Hadoop]Ch.16-17 | [Pig] ; [Hive] |
Nov 19 | CUHK WMY_506 | Server and Network Virtualization Technologies | 7:00pm - 10:00pm | [CCTP]Ch.5 | - |
Nov 26 | CUHK WMY_506 | Cloud Service Security and Privacy | 7:00pm - 10:00pm | - | - |
Dec 3 | CUHK WMY_506 | Cloud Service Security and Privacy (cont'd) | 7:00pm - 10:00pm | - | - |
Dec 10 | Esther Lee Building (ELB) - Rm 401 (CUHK) | **Final examination on Dec 10 (Thu) 7:30pm to 9:30pm** | 7:30pm - 9:30pm | - | - |
Dec 17 | Cheng Yu Tung Building (CYT) - Rm 211 (CUHK) | **Project Presentations** | 7:00pm - 10:00pm | - | - |
Dec 19 | An Integrated Teaching Building (AIT) - Rm 211 (CUHK) | **Project Presentations** | 9:00am - 6:00pm | - | - |
Course Assessment
Your grade will be based on the following components:
- Homework & Programming assignments (about 3 sets in total): 40%
- Project: 20%
- Final Exam: 40% (2-hour final examination)
Student/Faculty Expectations on Teaching and Learning
http://www.erg.cuhk.edu.hk/Student-Faculty-Expectations
Academic Honesty
You are expected to do your own work and acknowledge the use of anyone else's words or ideas. You MUST put down in your submitted work the names of people with whom you have had discussions.
Refer to http://www.cuhk.edu.hk/policy/academichonesty for details.
When scholastic dishonesty is suspected, the matter will be turned over to the University authority for action.
You MUST include the following signed statement in all of your submitted homework, project assignments and examinations. Submission without a signed statement will not be graded.
I declare that the assignment here submitted is original except for source material explicitly acknowledged, and that the same or related material has not been previously submitted for another course. I also acknowledge that I am aware of University policy and regulations on honesty in academic work, and of the disciplinary guidelines and procedures applicable to breaches of such policy and regulations, as contained in the website http://www.cuhk.edu.hk/policy/academichonesty/.
Acknowledgement
Many thanks to Amazon Web Services for their great support of this course.