This past semester, I co-developed and co-taught a graduate class on cloud computing (15-719, Fall 2013) with three other professors at CMU. The class was targeted at Master’s students (mostly) and PhD students (somewhat). My responsibilities included developing an initial version of the syllabus, creating and supporting the projects, and lecturing. I also created a project for a separate cloud computing online course being developed at CMU. This was my first time teaching/developing a class, so I wanted to record some of my experiences here.
Developing the syllabus
Cloud computing isn’t really a well-defined topic. For example, NIST’s definition (Peter Mell and Timothy Grance, 2011) is very broad and allows for lots of leeway in deciding what constitutes cloud computing. At the same time, many of the concepts that enable cloud computing are already taught in other graduate classes, such as operating systems, networks, and databases. This lack of consensus on “what cloud computing is” makes coming up with a syllabus challenging. Overall, I think there are five options for a cloud-computing syllabus.
- It could focus on just the concepts related to cloud computing that aren’t covered in depth in other graduate classes (e.g., distributed systems, operating systems, and databases). This might include economic models for billing and scheduling, datacenter power management, analysis frameworks (e.g., Hadoop or Spark), and advanced scheduling (e.g., co-scheduling processes that run in different frameworks). I think this sort of syllabus would be appropriate for advanced PhD students.
- It could focus on recent papers from major conferences (e.g., SOSP, OSDI, SIGCOMM, and NSDI) regardless of topic, as long as they apply to large-scale systems. This class would be most appropriate for advanced PhD students.
- It could focus on one specific building-block of cloud computing (e.g., scheduling or networks) as it applies to large-scale systems. This class could be appropriate for Master’s students and PhD students with undergraduate-level knowledge of the chosen building block.
- It could focus on practical aspects of using cloud-computing services so as to prepare students for the many industry jobs in which they will use cloud computing. For example, it could concentrate on asking students to learn how to use AWS. This class would be most appropriate for undergraduates and professional Master’s students.
- It could present a broad overview of all the concepts that constitute cloud computing, regardless of whether they are also covered in other graduate classes. The goal would be to teach students the basics of what they would need to become designers of cloud-computing infrastructures. This syllabus would be most appropriate for Master’s students and entry-level PhD students. Master’s students will likely not need additional depth and PhD students can pick and choose which concepts they would like to focus on in the future.
Before creating our class, we conducted a quick survey of existing cloud-computing classes and found instances of options 2 (recent papers) and 3 (focus on one building block). Option 4 (practical aspects as training for industry) is already the basis of 15-319 and 15-619 at CMU. For 15-719, we went with option 5 (the broad overview approach). Every concept was allocated 1-2 lectures and we covered a wide variety of them, including virtualization, scheduling, monitoring/diagnosis, and key-value stores. I think it went well, but only time will tell if it was a good approach.
We created two projects. The first was designed to acquaint students with how to use cloud-computing infrastructures. It involved using AWS’s Elastic MapReduce to analyze Wikipedia articles. Amazon graciously provided $100 of AWS credits for each student, and students thought the project was straightforward. The second project was designed to acquaint students with designing and building cloud-computing infrastructures. It involved adding an auto scaler and load balancer to OpenStack, which was running within a single VM (set up via the DevStack script). Students did well on this project, but complained about its grittiness, especially that of setting up OpenStack and its confounding error messages.
It is relatively straightforward to create projects that teach students how to use real clouds. But, it is much more difficult to create projects that engage students in issues related to designing and building cloud infrastructures. Such projects (e.g., running and modifying OpenStack) cannot be done within AWS or Google Compute Engine because neither allows the underlying infrastructure to be modified. AWS also prohibits users from layering VMs on top of those it provides. When such projects are done with university resources, students don’t get exposure to the scaling issues that are fundamental to building clouds. I am hopeful that emerging bare-metal clouds will help, but at the same time, I am concerned that they will be prohibitively expensive. Of course, internships and co-ops can provide a much more thorough exposure to infrastructure-level issues and must be a fundamental part of any cloud-computing curriculum.
My PhD is in a cloud-computing-related area, so teaching a cloud-computing class immediately after graduating was a great experience. When working on my PhD, my focus was myopic. Teaching this class helped me understand how my PhD research fits into the big picture and helped me get a better understanding of the outstanding problems in cloud computing. In a blog post, Nick Feamster said that “efforts in teaching can ultimately result in better research,” an opinion I completely agree with (Nick Feamster, 2013).
I’ve heard many students claim that professors are often so entrenched in their research that they can’t relate to beginning (e.g., first-year PhD) students when teaching the fundamentals. I don’t know if this is true. But, I do know that, compared to other lectures that I gave, those I gave on problem diagnosis (my PhD topic) engendered the greatest student interest.
Except for serving as a TA a few times, I will readily admit that I didn’t have any prior experience with teaching a class. I feel like there is a fundamental difference between teaching and lecturing, and too often I found myself doing the latter. One way to aid pedagogy is to treat lectures as conversations during which you ask students to identify the next step instead of lecturing it to them. But, what do you do when you ask a question and no one wants to raise their hand? Hopefully, more experience teaching will help me come up with an answer to this question. I also think better tools will help. PowerPoint is clearly the wrong tool, as it imposes a rigid structure on the lecture and isn’t conducive to brainstorming answers with students. In the future, I hope to experiment with using slides that can be annotated or filled in dynamically during lecture.
In retrospect, I really enjoyed the opportunity to develop 15-719, as creating it helped me understand the “big picture” of cloud computing in a way that wouldn’t have been possible otherwise. I firmly believe that my work on this class will help my future long-term research endeavours.
But, here’s a word of caution for other postdocs: creating this class took a lot of effort and prevented me from making much research progress. In a world where we academic hopefuls are judged far more for our research than for our teaching, this is a very hard tradeoff to make. Why then did I initially agree to develop and teach 15-719? Well, for most of my PhD, I immersed myself in cloud-computing-related research. So, how could I say ‘no’ when given the privilege to develop an entire class on it? Plus, it seemed exciting and challenging, and like it could be a lot of fun. (And, yes, I didn’t quite realize how much work would be involved when I said ‘yes’. :))
- Nick Feamster (2013, November), The Relationship Between Teaching and Research.
- Peter Mell and Timothy Grance (2011, September), The NIST Definition of Cloud Computing, Special Publication SP800-145. NIST.