Saturday, October 20, 2018

Scala Numerical Performance with Scala Native and Graal

This post is a follow-on to my earlier post looking at the performance of different approaches to writing an n-body simulation using Scala. That post focused on the style of Scala code used and how that impacted the performance. This post uses that same code, so you should refer to it to see what the different types of simulations are doing.


Comparing Runtimes

For someone who is interested in doing numerical simulations, Scala and the JVM might seem like odd choices, but the benefits of having a good language that is highly expressive and maintainable can be significant, even for numerical work. In addition to the impact of programming style and the optimizations done by the compiler that produces JVM bytecode, performance can also be significantly impacted by the choice of runtime environment.

Historically, there wasn't much in the way of choice. Sun, then later Oracle, made the only JVM that was in even reasonably broad usage, and enough effort was put into making the hotspot optimizer work that it was generally a good choice. Today, however, there are a few more options. If you are running Linux, odds are good that you have OpenJDK by default and would have to specifically download and install the Oracle version if you want it. In addition, Oracle has recently been working on Graal, a new virtual environment for both JVM and non-JVM languages. Part of the argument for Graal was that the old C2 hotspot compiler, written in C++, had simply because too brittle and it was hard to add new optimizations. Graal is being built fresh from the ground up using Java, and many new types of analysis are included in it. While I have seen benchmarks indicating the Graal, though young, is already a faster option for many Scala workloads, I wasn't certain if that would be the case for numerical work. This is at least in part due to the fact that one of the Graal talks this last summer at Scala Days mentioned that Graal was not yet emitting SIMD instructions for numerical computations.

In addition, the newest addition to the list of supported environments for Scala is Scala Native. This project uses LLVM to compile Scala source to native executables. One of the main motivators for this right now is using Scala for batch processing because native executables don't suffer from the startup times of bringing up the JVM. This project is still in beta, but I wanted to see if it might be able to produce executables with good numerical performance as well.

For these benchmarks, the Scala code was compiled with -opt:_ and run on each JVM with no additional options. I am using a different machine from my earlier post, which explains the significant runtime differences between this post and the earlier one using a similar JVM. The following table gives timing results for the five approaches using the five different runtimes.

EnvironmentStyleAverage Time [s]Stdev [s]
Oracle JDK 8-191Value Class0.3940.012
Mutable Class0.6830.015
Immutable Class0.8090.010
Functional 14.2460.439
Functional 21.7230.027
Oracle JDK 11Value Class0.3780.006
Mutable Class0.6900.012
Immutable Class0.9400.083
Functional 14.4200.059
Functional 21.5890.021
OpenJDK 10Value Class0.3880.008
Mutable Class0.7150.006
Immutable Class0.8920.013
Functional 14.4050.039
Functional 21.6890.013
GraalVM 1.0.0-rc7, Java 1.8Value Class0.3770.003
Mutable Class0.3960.003
Immutable Class0.6940.108
Functional 14.0540.151
Functional 20.7930.016
Scala Native 0.3.8Value Class2.6030.185
Mutable Class1.0280.020
Immutable Class2.5950.020
Functional 116.841.39
Functional 25.2320.655

Looking at the first three runtimes, we notice that there is very little difference between Oracle and OpenJDK over various Java versions from 8 to 11. In all three, the value class approach is fastest by far followed by the version with the mutable classes, then the immutable classes with functional approaches being slowest by a fair margin.

Things get more interesting when we look at Graal. The performance of the value class version is roughly the same as for the other JVMs, but every other version runs significantly faster under Graal than in the others. The ordering stays the same as to which approaches are fastest, but the magnitude of how much slower each version is than the value class version changes dramatically. Under Graal, the mutable class version is almost as fast as the value class version instead of being almost a factor of two slower. Most impressive is that the second functional version is only a factor of two slower than the value class version instead of being four times slower. This is significant for two reasons. One is that it is the version written in the most idiomatic Scala style. The other is that this version literally does twice as much work in terms of distance calculations as the other versions. That means that we really can't expect it to do any better than being 2x as slow. The fact that it takes nearly twice as long to run means that using Graal there isn't a significant overhead to the functional approach the way there is using the older JVMs.

At the end of the table, we have the results for Scala Native. Unfortunately, it is clear that Scala Native is not yet ready for running with performance-critical numerical code. One result that stands out is that the value class version is not the fastest. Indeed, it runs at a speed roughly equal to the immutable class and 2.5x slower than the mutable class. I assume that this means that the value class optimizations have not yet been implemented in Scala Native. As to why even the mutable class version is more than 2x slower than Graal and at least 50% slower than the other VMs is a bit puzzling to me as I did the timing using a release build. I expect that this is something that will improve over time. Scala Native is still in the very early stages, and there is a lot of room for the project to grow.


Comparison to C++

As before, I also ran a test comparing these Scala results to C++ code compiled with the GNU compiler using the -Ofast flag. This uses a simpler test with the value class technique. You can see in the table below that the Scala code is performing about 15% slower than C++ in all of the environments except Scala Native, which is several times slower. Given the results above indicating that Scala Native isn't nicely optimizing value classes yet, this result for Scala Native isn't surprising.

EnvironmentAverage Time [s]
g++3.29
Oracle JDK 8-1913.88
Oracle JDK 113.77
OpenJDK 103.82
GraalVM 1.0.0-rc73.82
Scala Native21.6


Conclusions

For me, there are two main takeaway messages from this. The first is that while Scala Native holds the longterm potential to give Scala a higher performance platform for running computationally intensive jobs, it isn't there yet. A lot more work needs to go into optimization to get it to reach its potential of competing with other natively compiled languages. I firmly believe that it can get there and is moving in that direction, but it isn't ready yet.

On the other hand, these results indicate to me that if you are programming Scala, you should strongly consider using Graal, even if you are doing numeric work. Based on presentations at Scala Days 2018 in New York, I know that this is the case for non-numeric codes, but at the time Graal wasn't emitting SIMD instructions, so it wasn't clear if it would compare well to the old C2 hot-spot optimizer. These results show that regardless of style, Graal is at least as performant as the other JVM options and that in some cases it is much faster. Perhaps most significantly, the functional 2 style, which is written in a much more idiomatic style for Scala, is more than 2x as fast with Graal was with the other JVMs. I should also note that Graal still allows me to run graphical applications like my SwifVis2 plotting package, so there isn't any loss of overall functionality.

Going forward, I want to test the performance of more complex n-body simulations using trees and also look at multithreaded performance to see how Graal compares for those. Scala Native is still only single threaded for pure Scala code, so it will likely be left out of those tests.

GraalVM native images are another feature that I would really like to explore, but there are some challenges in building them from Scala code that I did take the time to overcome for this post.

Monday, August 6, 2018

Why is JavaFX so slow? SwiftVis2 and Speed Problems with JavaFX 2D Canvas

For the last year or so I've been working on a plotting package called SwiftVis2. There are a number of goals to this project, but a big one is to provide a solid plotting package for Scala that can be used with Spark projects in Scala and which can draw plots with a large number of points. My own personal use case for this is plotting ring simulations often involving millions of particles, each of which is a point in a scatter chart. The figure below shows an example of this made with SwiftVis2 using the Swing/Java2D renderer with 8.9 million particles.

SwiftVis2 plot of ring simulation using Swing renderer.

This particular use case pretty much precludes a lot of browser-based plotting libraries that seem to be popular these days as converting 8.9 million data points to JSON for plotting in JavaScript simply isn't a feasible thing to do for both memory and speed reasons.

As the name SwiftVis2 implies, there is an earlier SwiftVis plotting program. It is a GUI based program written in Java. One of the things that excited me about the upgrade was the ability to use JavaFX instead of Java2D. My understanding is that one of the main reasons for building JavaFX new from the ground up was to take better advantage of graphics cards, and I was really hoping that a JavaFX based rendering engine would outperform one based on the older Java2D library.

I didn't really test this until I was writing up a paper on SwiftVis2 for CSCE'18 and I did performance tests against some other plotting packages. In particular, I compared to Breeze-Viz, which is a Scala wrapper for JFreeChart. JFreeChart uses Swing and Java2D, so I was really hoping that SwiftVis2 would be faster. At the time, I only had a renderer that used JavaFX in SwiftVis2, and I was really disappointed when Breeze-Viz turned out to run roughly twice as fast on a plot like the one above. Tests of NSLP, a different Scala plotting package using Swing/Java2D, showed that it also ran at roughly twice the speed of SwiftVis2 using JavaFX.

Some searching on the internet showed me that there were known issues with drawing too many elements at once on a JavaFX canvas because of the queuing. So I enhanced my renderer to batch the drawing. This has a nice side effect that users can see the plot draw incrementally, so they know their program isn't frozen, but it didn't help the overall speed at all.

Since SwifVis2 was written to allow multiple types of renderers, I went ahead and wrote a Swing renderer that uses Java2D, just to see what the performance was like. The results, shown in the following table, were pretty astounding to me. Note that these times were for drawing plots like the one above. It is also worth noting that upgrading my graphics driver improves the performance for JavaFX more than it did for the Swing based libraries, but even with new drivers, JavaFX is still slower.


PackageRender Time for 8.9 million Points
Breeze-Viz80 secs
SwiftVis2 with JavaFX108 secs
SwiftVis2 with Swing/Java2D13 secs

Keep in mind here that the two SwiftVis2 options are running the exact same code for everything except the final drawing as the only difference is which subclass of my Renderer trait is being used. While I do feel a certain amount of happiness in the fact that SwiftVis2 using Swing is significantly faster than the Breeze-Viz wrapper for JFreeChart, I'm still astounded that JavaFX is nearly 10x slower than Swing/Java2D.

Not only is JavaFX slower, it does an inferior job of antialiasing when drawing circles that are sub-pixel in size. The following figure shows the plot created using the same data and the JavaFX renderer. The higher saturation is obvious. The rendering with Swing/Java2D is the more accurate of the two.

Ring simulation plot made using the JavaFX renderer. Note that the points are sub-pixel and this renderer over-saturates the drawing.
To me, this seems like a serious failing on the part of JavaFX. JavaFX is actually harder to use than Swing, and I always forgave that based on the assumption that special handling was needed to work with the GPU, but that the tradeoff would be significantly improved performance. I can only hope that whatever ails JavaFX in this regard can be fixed and that at some point in the future, JavaFX rendering can match the performance promise that comes with completely rebuilding a GUI/graphics library from the ground up.

Also, in case anyone is wondering why I'm bothering to create yet another plotting package, I will note that SwiftVis2 looks significantly better than JFreeChart, especially in the default behavior. For comparison, the figure below shows the output of the most basic invocation of Breeze-Viz for this plot, which is comparable to the above plots for SwiftVis2. Even if the range on the y-axis is adjusted, it still generally doesn't look as good as the SwiftVis2 output, especially in regards to axis labels. This is the reason I never really got into using JFreeChart in the past. SwiftVis2 is still in the early stages of development, but it already does a better job with my primary use cases.

This is the same figure as those above made with default settings using Breeze-Viz instead of SwiftVis2.
You can find the code for my basic performance tests on GitHub. The data file is not on GitHub because it is rather large. You can find a copy of it on my web space.

Sunday, June 24, 2018

Supporting Scala Adoption in Colleges

Scala Days North America just finished, and one of the big surprises for me was how many talks focused on teaching Scala. The first day had roughly one talk on teaching Scala in every time block.


The topic of increasing the use of Scala in colleges also came up in the final panel (Presentation Video).

I think it is pretty clear that the companies using Scala feel a need to get more developers who either know the language or can transition into it quickly and easily. Large companies, like Twitter, can run their own training programs, but that is out of reach for most companies. What most companies need is for more colleges to at least introduce Scala and the idea of functional programming in their curricula.

The Current State of Affairs

The current reality of college CS education is that imperative OO is king. For roughly two decades, Java dominated the landscape of college CS courses, especially at the introductory level. In the last ~5 years, Python has made significant inroads, again, especially at the introductory level. Let's be honest, that is really a move in the wrong direction for those who want graduates that are prepared to work with Scala because not only is Python not generally taught in a functional way, the dynamic typing doesn't prepare students to think in the type-oriented manner that is ideal for Scala programming. (I have to note that this move to Python has been predicated on the idea that Python is a simpler language to learn, but an analysis of a large dataset of student exercise solutions for zyBooks indicates this might not actually be the case.)

It is also true that some aspects of functional programming, namely the use of higher-order functions, has moved into pretty much all major programming languages. However, it isn't at all clear that these concepts are currently appearing in college curricula. Part of the reason for this is that in nearly all cases, features added to languages late are more complex than those that were part of the original development. Yes, Java 8 has lambdas and you can use map and filter on Streams, but there is a lot of overhead, both syntactically and cognitively, that makes it much harder to teach to students in Java than it is in Scala. That overhead means professors are less likely to "get to it" during any particular course.

The bottom line is that the vast majority of students coming out of colleges today have never heard of Scala, nor have they ever been taught to think about problems in a functional way.

What Can You Do?

If you work for a company that uses Scala, it would probably benefit you greatly if this state of affairs could be changed so that you can hire kids straight out of college who are ready to move into your development environment. The question is, how do you do this? I have a few suggestions.

First, contact the CS departments at local colleges and/or your alma mater. Tell them the importance of Scala and functional programming to your organization and that you would love to hire graduates with certain skills. This has to be done in the right way though, so let's break it down a bit.

Who Should You Talk To?

My guess is that a lot of people will immediately wonder who they should be reaching out to, but I'm pretty sure it doesn't matter, as long as it is a faculty member in Computer Science. You can always start with the chair and ask them if there is someone else in the department that would be better to talk to, but no matter who you start with, you'll probably wind up being directed to the most applicable person.

What Types Of Schools?

I know that it will be tempting to focus on big schools with the idea that you can get more bang for the buck if you do something with the local state school that graduates 300+ CS majors a year. The problem here is that many of the stereotypes for big organizations not being agile apply to colleges as well. The bigger the school, the harder it likely is to create change. Smaller schools might not graduate as many students, but they are probably more adaptable and open to change. Departments with <10 faculty members might not crank out hundreds of majors, but they can probably produce a few very good ones each year that you would love to have on your team.

This doesn't mean you skip on the big schools. If you can convince your local state school, the payoffs are huge. I would just urge you to cast a wide net. Any school with a CS department is worth reaching out to.

How To Start The Conversation

It probably isn't the best idea to just go to a college and tell them that they are doing things wrong and that you have ideas on how they can do it better. One way to start a conversation would be to say that you are interested in talking about pedagogy and their curriculum to get a better idea of what they do and how they do it. However, an even better idea would be to initiate the conversation by offering to do something for them.

Most colleges will have some type of venue for outside entities to give talks to their majors. Ask if they have a seminar/colloquium that you might be able to speak at. I am the current organizer for the CS Colloquium at Trinity and I would love to have speakers from industry offer to come to give interesting talks to our majors (hint, hint). I'm quite certain that I'm not alone. This gives you a great venue to say all the things that I mention in the next section, both to faculty and students, efficiently.

If you have a local Scala Meetup, you might see if they would be willing to send out an invitation to their majors for that as well.

What Should You Say?

This is where things are a bit more tricky. Most colleges do not view themselves as vocational training. They don't just put things in their curricula because companies would like it, though they do feel some pressure to make sure their graduates have the skills needed to find employment. College curricula focus on fundamental concepts, not tools because we know that technology is ever-changing, and our students are going to have to continue learning throughout their careers. We have to build the foundation for that learning. So the key is to show them how the tools that you are using represent the paradigms of the future of the industry.

With this in mind, your goal isn't to convince them to teach Scala, at least not directly. There is a reason that your company uses Scala, and my guess is that you believe that the reasons your company chose Scala are generally indicative of the way that a lot of the industry needs to move. It might be that you see that data keeps growing in importance and size and that processing that data in a way that scales is critical for the future. You know that using a functional approach is key to writing these types of applications both today and in the future. You know that while the framework you are currently using probably won't be dominant in 10 years, the ideas behind how it works and how you use it to solve real problems will still be folded into the frameworks of the future.

I would say that your first goal is to establish that in the highly parallel and distributed environment we live in, the functional paradigm has moved from an academic play area into an essential real-world skill. You might also tell them why the pragmatic approach of Scala makes sense for bringing the functional paradigm into practical usage. Highlight how it impacts your use case and how you see that being a use case that will only grow with time.

Going back to the fact that colleges want to teach things that are foundational, I would point out how many other languages are following in the footsteps of Scala. This is key because it makes it clear that learning Scala isn't just learning one language. Instead, it is opening the door to where most older languages are evolving as well as where many new languages are going. Knowing Scala helps future-proof students in this regard, and Scala isn't just sitting still and getting older. It is a dynamic language with creators who are working to keep it on the cutting else of language development while maintaining an eye to the practical aspects of what makes a language usable in industry.

If you get to the point where they are convinced but aren't certain how to put Scala and functional programming into their curricula, point them to academics who have been using Scala for teaching. I would personally welcome all such inquiries. Just tell them to write to mlewis@trinity.edu. In the US both Konstantin Läufer and George K. Thiruvathukal at Loyola University Chicago have experience in teaching with Scala and have an interest in spreading the word. So does Brian Howard at DePauw University. While Martin has the most experience of anyone teaching Scala, personally, I'd rather he focus his time on Scala 3 development. Also, while Heather might be a freshly minted academic at CMU, she is definitely a Scala expert and her experience with the MOOCs means that she has taught more people Scala than almost anyone in the world.

Going Further

Giving a talk every so often is a nice way to let schools know what types of technologies you use and why you see them being important in the future. As such, it can provide a subtle way to influence the curriculum. However, if you are willing and able to put in a bit more time, you can have a more explicit impact on what is being taught by offering to teach a course as an adjunct professor the way Austin is doing at UT Dallas and the way Twitter does at CU Boulder.

The challenge with this approach is that not everyone is a naturally gifted teacher and it will be a fairly significant time sink. Technically, whoever does it will get paid, just not that much. (Teaching one class as an adjunct pays well under $10,000 dollars and is often as low as $3000.) The real question is whether your employer is willing and able to let you do this. I will note that for various reasons, a developer doing this who doesn't have a Master's degree will need to be reasonably senior to make it work at a lot of Universities.

The upside of teaching a course is that you can probably get into an upper division course where you have fairly complete control over the curriculum and you can teach exactly the topics that you would most like new hires to arrive with. Due to growing enrollments, the vast majority of colleges are likely quite open to having an adjunct help out, and many will be open to an interesting upper division offering as they can probably move faculty around to cover other courses. (At Trinity, we are looking for an adjunct for next fall and while they are currently scheduled for an introductory course, if someone wanted to come in and teach my Big Data course using Scala, I could certainly switch to the introductory course to make things work.) Of course, if you want to teach CS1/CS2 with Scala, I've got a bunch of materials you can use to help organize the class.

You might also see if the department has an advisory board. Having a seat on such a board will give your company insight into the inner workings of the department and how they think about the field while also giving you a venue to talk about what you value and where you see the future of the field going.

In The Meantime

Until other colleges catch on to the value of Scala, remember that at Trinity University we are teaching all of our students both Scala and how to think functionally, so let me know when you are looking for interns or junior devs.