Thursday, November 28, 2013

Why Scala for CS1 and CS2?

+Thomas Lockney pointed me to a tweet by +martin odersky today that linked to a blog post. This has prompted me to write a bit about my own experience with teaching CS1 and CS2 using Scala. In this first post I will focus on why we chose Scala and why we continue to use it. In a second post I will run through what I see as the most significant benefits of Scala, especially in CS1.

I got the Odersky et al. book and started learning Scala in 2009 because I was working with students on some projects that involved the experimental X10 and Fortress languages, both of which borrowed ideas from Scala. Indeed, the big push for me was that X10 switched from a Java-based syntax to a Scala-based syntax around version 1.7. Since these experimental languages don't have learning guides, just language specs, I decided that I needed to learn Scala from a useful book before I tried to tackle X10 in its new form.

I will admit that for the first 200 pages or so I was very skeptical. There were a number of language decisions that simply didn't make sense to me. At that point they seemed ad hoc. However, by the time I was halfway through the book I was hooked. I had loved programming in ML a few years prior, but I couldn't do much in it because of the lack of library support. Scala brought me many of the things I loved from ML along with the full Java libraries that I had years of experience with. Even then, though, I was thinking about using it in upper-division classes, and I recall telling myself that it couldn't work for introductory courses.

Around this time our department also started evaluating our curriculum to make sure that everything was in order with ACM guidelines and to try to patch issues that we felt existed in how we were doing things. In Fall 2002 we had adopted an approach where we taught CS1 in C with no OO, CS2 in Java with heavy OO, and our Data Structures course in C++. In many ways we were happy with this setup, but there was one glaring issue. The breadth of languages had many benefits, but at the end of the day, students didn't feel comfortable in any single language. For that reason, we wanted to change things so that CS1 and CS2 were taught in one language while Data Structures and a new course on Algorithms were taught in another. It was fairly easy to pick C++ for the 3rd and 4th semesters. (I'm sure there could be endless discussion on that and comments are welcome, but I'll just say here that it works well for us and we believe it works well for the students.) However, the choice for CS1 and CS2 was more challenging.

We were happy with focusing on logic in the first semester and holding off on real OO until the second semester. After all, OO really shines on large projects and CS1 is not about writing long programs. CS1 is about figuring out how to break down problems and construct logic in the formal terms of a programming language. I'm also not a fan of using tools explicitly designed for education in courses for majors. Tools like BlueJ, Greenfoot, and DrJava can be great for getting around some of the problem areas of Java early on, and we are happy using them for non-majors, but we would rather use "real" tools with majors.

We also had to consider the requirements of CS2. This isn't just a class about OO. It is also about abstraction and polymorphism. We want students to understand things like inheritance hierarchies and parametric polymorphism (generics/templates/type parameters). We also like to include things like GUIs, graphics, multithreading, and networking along the way in CS1 and CS2. This lets students build applications they find more interesting, and it makes sure everyone sees those topics, even if they skip one or two of them in their upper-division courses.

We had pretty much ruled out Java because of the tool issue and the fact that the language just doesn't support programming in the small well on its own. The keyword soup that is "Hello World" in Java wasn't an option for us. In many ways, Python was the early front-runner at Trinity, as it has become in many other departments looking to change the early curriculum. Python works wonderfully for CS1 and has a feature set that fits well with what we want to teach in that course. Unfortunately, Python doesn't fit our learning objectives for CS2. The OO isn't very pure, and many of the topics we want to cover really only make sense in a statically typed language.

It was during these discussions about languages that I had my Scala educational epiphany. I realized that many of the things that made Python great for CS1 were also features of Scala. The availability of a REPL and a scripting environment meant that Scala was just as good for writing small programs and testing things out as Python or other scripting languages. Unlike Python, though, Scala was even better than Java at handling the OO and related topics in CS2.
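To make that contrast concrete, here is a minimal sketch of my own, not taken from any actual course material. It assumes the code is saved as a script (for example, a .sc file) or pasted into the REPL, where top-level statements are allowed. All of the names (average, Shape, Circle, Rectangle, largestBy) are hypothetical and chosen only for illustration. The first part shows the kind of small, boilerplate-free program that suits CS1; the second part shows an inheritance hierarchy and a type parameter of the sort CS2 needs.

```scala
// Script-style Scala: no enclosing class or main method is required
// when this runs as a script or in the REPL.
println("Hello, World!")

// A small CS1-style function (hypothetical example).
def average(values: Seq[Double]): Double =
  if (values.isEmpty) 0.0 else values.sum / values.size

println(average(Seq(90.0, 80.0, 70.0)))

// CS2-style material: an inheritance hierarchy with an abstract member.
trait Shape {
  def area: Double
}

class Circle(val radius: Double) extends Shape {
  def area: Double = math.Pi * radius * radius
}

class Rectangle(val width: Double, val height: Double) extends Shape {
  def area: Double = width * height
}

// Parametric polymorphism: A is a type parameter, checked at compile time.
// Returns None for an empty sequence, otherwise the item with the largest measure.
def largestBy[A](items: Seq[A])(measure: A => Double): Option[A] =
  items.reduceOption((a, b) => if (measure(a) >= measure(b)) a else b)

val shapes: Seq[Shape] = Seq(new Circle(1.0), new Rectangle(2.0, 3.0))
val biggest = largestBy(shapes)(_.area)
println(biggest.map(_.area))
```

The same file covers both ends of the range: the first few lines need nothing beyond what a scripting language would ask for, while the rest relies on static types that the compiler checks.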

What we decided to do was run an informal experiment. For 1.5 years we ran sections using our old approach, Python, and Scala to see if there were any glaring holes. While everyone felt that each style did a reasonable job, there was some anecdotal evidence that students who started in Python had a slightly harder time moving to C/C++ because of little things like having to declare variables and worry about types. This isn't likely to show up in final outcomes from the major, but it is the type of thing that can slow a class down the next semester and prevent it from reaching quite the same depth or breadth.

In many ways, the decision was made to go with Scala because the primary supporter of Python retired and I didn't. We are now in the first half of the 4th year using Scala and the 2nd full year where all sections are using Scala. In general, I think the decision to go with Scala has served us well. In my next blog post, I will go into the details of how I use Scala in CS1 and the features that I think are most useful for that course.