Friday, August 27, 2010

x86 Open64 Compiler

I recently updated both my laptop and my home machine to Ubuntu 10.04. This had one significant negative side effect on my workstation: it broke the Intel compiler that the system maker had installed. A bit of searching showed that this is a known problem with the newer system libraries, so it isn't clear that paying Intel more money would actually fix it. However, during my searching I came across the x86 Open64 compiler, an optimizing compiler from AMD. It is fairly new; I have vague memories of seeing release announcements back in 2009, but I didn't look closely at it at the time. Not having a working compiler other than gcc on my system made me take a second look.

While AMD says they have only really tested it under Red Hat and Fedora, the binaries work just fine for me on Ubuntu, and I have now also installed it on all the machines at Trinity. I still need to do a direct speed test against the Intel compiler on my cluster, but the set of optimizations they list is quite impressive, and from what I have seen it certainly isn't significantly slower than Intel on my home machine.

There is one very odd thing that I have noticed on my home machine though. My rings code uses OpenMP for multithreading, and big simulations typically run on all eight cores of my home machine, while smaller simulations often sit at a load of 5-7. I started a large simulation compiled with x86 Open64 and it keeps all eight cores busy. However, smaller simulations that run through steps really quickly are running at loads of 3-4. The load level is also far less ragged than what I had seen with the Intel compiler. At this point I can't say if this is a good thing or not. At first I worried it wasn't properly parallelizing. Now I'm wondering if it just does a better job of keeping threads on specific cores and keeping those cores loaded instead of having work jump around. That would certainly be a smart thing to do, as it would improve cache performance, but only benchmarking and profiling will tell me for certain.

Tuesday, August 24, 2010

Trying Out LyX

This blog has been dormant for a while because I spent a week in Oklahoma with the family visiting my wife's dad. It's a good place for me to get work done that can be done offline, and many of the tasks I normally procrastinate on get done there. This year I worked on papers and textbooks. The Scala book was getting a bit big, and I thought maybe I should try out a different tool. There are lots of reasons why big documents should be done in LaTeX, and I've been well aware of the arguments. I just hadn't been willing to clutter my brain with the extra commands. LaTeX, like many other tools along those lines, has a significant learning curve, and you can't do much of anything until you have climbed a fair bit of it. I also like OpenOffice for the GUI; I didn't want to edit my big documents in vi.

I had looked around for GUI programs that sit on top of LaTeX and already had a program called LyX installed. I had played with it just a bit, but not enough to figure out much about it. The documentation said to read the manuals, and after a few paragraphs they lost my attention, so I stuck with OpenOffice. Oklahoma provided more time, though, and with the Scala book reaching 100 pages I decided it was time to seriously try it out. The book was long enough to benefit from the advantages of LaTeX, and if it got much bigger I wasn't going to be willing to convert it.

So I did a massive cut and paste and started reformatting. It took a fair bit of a day, but the result was definitely worth it. Seeing the wonderful, automatically generated table of contents for the first time made it worthwhile. As I have learned more about LyX I have been able to use more of its capabilities. It is a really nice way to interact with LaTeX, and I highly recommend it to anyone who has thought that LaTeX might be nice but wasn't willing to climb the learning curve to get started. You can even have it show you the LaTeX source as you write, so it can help you learn LaTeX along the way.

Friday, August 6, 2010

Another Huge Simulation

I've talked some in earlier posts about the F ring simulation that I recently started. I've now posted a fair bit from that simulation, including plots and movies that are constantly being updated as the simulation advances. Just today I started another simulation. This one is related to my work on Negative Diffusion. I have a paper on the topic that I submitted to Icarus and recently got back a review for. The reviewers wanted a number of changes that I hope to get done before the beginning of classes.

One of the issues with this work is that I have to use some interesting boundary conditions to use a local cell with a perturbing moon. These boundary conditions wrap particles in the azimuthal direction such that they preserve the gradient in the epicyclic phase induced by the rate of passage of the cell by the moon. These boundary conditions aren't something that everyone uses, and one reviewer wanted me to explain them better, which I can certainly do. However, I came back to something I had thought about before: to really convince people, I need to do a large simulation without the periodic local cell, something more global, and show that the same thing happens. I know it will work because I have seen this behavior in global simulations of the F ring and nearly global simulations of the Keeler gap, so I know that it isn't a result of the boundary conditions. Still, I feel it would be good to do this with a simulation that basically matches one of the local cell simulations for a direct confirmation. The only problem is, there was a reason I was using a local cell. With a local cell these simulations only need 10^5 - 10^6 particles. Without one, the requirements get a lot higher.

So I just started the simulation using 14 nodes of my cluster. That's 112 cores and 112 GB of RAM. I tried a particle size at the top end of what I had used in the paper. Unfortunately, that ran into swapping: it produced nearly 400 million particles, and the master machine wasn't happy about that. It wasn't so big that it flat out crashed, but it was just big enough that the memory footprint made it run too slowly. So I increased the particle size a bit, to a 156 cm diameter. This gives me just under 266 million particles. That's still a huge simulation, but it is just small enough that it runs without the master doing significant swapping.

I started the simulation this morning, and the first output looks like things are going well. It looks like it will take about 5 hours per orbit and will reach the point where conclusions can be drawn in a little over a month. That's pretty good for the biggest simulation I've ever run.

Tuesday, August 3, 2010

First snapshots of F ring simulation (and a comment on moonlets)

I described in an earlier post the mother of all F ring simulations. This has now gone far enough that I feel I can say it is working properly, and I can make a plot of what is happening in the early going. I have put this up on my normal research site, with a fair bit of discussion along with two plots. The plots don't show much beyond the initial configuration because there simply hasn't been much time for the system to evolve yet. Still, I think they will give anyone familiar with the F ring a feel for what this simulation might be able to show us.

It also hit me this evening that while I feel like this simulation isn't running all that fast, the reality is that it is running in close to real time. It finishes an orbit in about 7 hours. The real material in the F ring takes almost that long to get around Saturn. So on the whole I'd say that is pretty good. Maybe after this one has gone far enough to wipe out the transients I can consider scaling it up a bit more. I'd be really interested in seeing what happens if I make it so that Prometheus really does run into the apron of material at the edge of the ring. I expect the computers will be less than happy about such an experiment though.

As a side note, on the recommendation of Matt Tiscareno I did some simulations on moonlet stability that didn't include the background. Unlike previous work by Porco et al., I am not putting a large central core into these moonlets. I am assuming all particles are of nearly the same size. This is significant because the Hill sphere of a large central body can enclose nearby smaller bodies, while two bodies of the same size sit largely outside of one another's Hill spheres. I was able to find a configuration where a moonlet was stable without background material and got broken up when background material was present. The moonlet was made as a lattice roughly filling a triaxial ellipsoid at 130,000 km from Saturn. The lowest density I could get to be stable in this configuration with a 2:1:1 ratio was a bulk density of 0.7 g/cm^3, or 1.0 g/cm^3 for the constituent bodies. Even with that high a density, it got knocked apart when I put it in a background. I'm not going to be doing much more work on this right now because Crosby found it interesting and will likely work on it as part of his senior research project in Physics.

Writing a Scala Textbook

So this coming fall I am going to be teaching the CS1 course at Trinity (CSCI 1320) using Scala. This is part of an experiment we are conducting to see how we might want to change the languages used in our introductory sequence. One of the challenges in using Scala is that there really aren't many resources currently available for the language, and none that work as a textbook for new programmers. All the current material assumes that students already know how to program in one or more other languages and builds on top of that. To address this, I am writing my own materials heading into the beginning of the semester.

I have done this previously for Java with our CS2 (CSCI 1321) and CS0.5 (CSCI 1311), the latter using Greenfoot, which also didn't have an existing text when I started using it. In some ways, the topics of introductory CS are the same no matter what language you use. However, the nature of the language can alter the ideal way to present information.

As it stands, we intend to keep our CS1 as a normal introduction to programming. We don't want this to be an objects-first type of class; the second semester will get into OO. The first semester is just about building logic and figuring out how to break problems down and use a programming language to construct solutions. So while Scala is 100% OO, that isn't being stressed early on. The functional aspects, on the other hand, can be.

Indeed, it is the more functional side that convinced me that Scala was worth trying in a CS1 course. Scala has a REPL and allows writing scripts, so it works very well for programming in the small. However, unlike normal scripting languages, Scala is statically typed, so students get to learn about types and see compile-time error messages. Indeed, I'm hoping to run the class very much like what I have done in C for the last decade or so: we will work in Linux at the command line, and students will edit scripts with vi. Unlike C, they can use the REPL to test out little things or load in code from files instead of running them as standalone scripts.
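To make this concrete, here is a minimal sketch of the kind of first-week script I have in mind; the specific example is my own invention, not from the course materials.

```scala
// A tiny Scala script: no class or main method needed, so students
// can focus on the logic itself.
val x = 5                  // the compiler infers that x is an Int
val msg = "x squared is "  // and that msg is a String
println(msg + x * x)       // prints "x squared is 25"

// Static typing means a line like the following is rejected at
// compile time instead of failing partway through a run:
// val y: Int = "hello"    // error: type mismatch
```

The same lines can be typed one at a time into the REPL, which echoes the inferred type of each value as it goes.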

Using Scala as the language does change some aspects of what I see as the optimal order of presentation, though. It also adds richness to certain other topics. The big ones I've run into so far are functions and collections. Normally I do my introduction to Linux and the basics of C, then I talk about binary numbers. That is followed by sequential execution, functions, conditional execution, and recursion. After that come loops and arrays. In Scala, the functions topic gets richer because it is a functional language: it has function literals that can be created on the fly, and it is easy to pass functions as arguments and to return them. In C, function pointers and void* were about the only topics I didn't cover. Those same ideas will be covered in Scala, and early on, because they are a lot smoother there.
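A quick sketch of what those early function examples might look like; the names square and applyTwice are hypothetical, not taken from the book.

```scala
// A function literal, created on the fly and bound to a name.
val square = (x: Int) => x * x

// A function that takes another function as an argument.
def applyTwice(f: Int => Int, x: Int) = f(f(x))

println(square(4))              // prints 16
println(applyTwice(square, 3))  // square(square(3)) = 81
```

Compare this to C, where the same idea requires function pointer syntax that is hard to justify showing to first-semester students.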

The biggest change so far though is where I am doing "arrays". In C there wasn't really any point introducing arrays until you had loops; you simply couldn't do much with an array without a loop. That isn't true in Scala. Collections in Scala, be they arrays, lists, or whatever, have a rich set of methods defined on them. Indeed, the Scala for loop does nothing more than translate into calls of these other methods. So in Scala the logic goes the other way: because the students know functions, it makes sense to introduce collections and talk about methods like map, filter, and reduce. Once those have been covered, we talk about loops and see how they provide a nicer way to write a lot of the iteration behavior we had previously done using either recursion or higher-order functions on collections.
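The collections-before-loops ordering can be sketched like this; assume a simple list of numbers as the running example.

```scala
val nums = List(1, 2, 3, 4, 5)

// Higher-order methods cover most early uses of iteration.
val doubled = nums.map(n => n * 2)          // List(2, 4, 6, 8, 10)
val evens   = nums.filter(n => n % 2 == 0)  // List(2, 4)
val total   = nums.reduce((a, b) => a + b)  // 15

// The for loop is sugar for calls to these same methods; these two
// lines print exactly the same output.
for (n <- nums) println(n)
nums.foreach(n => println(n))
```

Because students already know how to write and pass functions at this point, none of this requires new machinery, just new methods.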

One thing that I haven't figured out exactly how to include is Ranges. These are really significant in for loops. If you just want a counting for loop in Scala, you write for(i <- 1 to 10). The 1 to 10 creates a Range, which is an iterable over exactly those values, and Ranges have a lot more power too. The question is, do they go in the chapter on collections or in the chapter on loops with the for? The chapter on collections is already going to have quite a bit in it, and right now I was planning to cover only arrays and lists, which are basically the most fundamental mutable and immutable collections respectively. The Range almost seems out of place at that point, and it opens up the whole area of the many other types of iterables, sequences, etc. in Scala. That will have to be opened at some point, though. The question is whether to do it with the other collections or wait until Ranges are really needed with the for loop.

If anyone has suggestions on this, please post a comment. One thing I would point out is that higher order functions on ranges can be fun to play with as well. The for loop isn't really needed. To see this, just consider the following definition of factorial for positive numbers.

def fact(n:Int) = (1 to n).product

So the range isn't just for counting in a loop. That just happens to be the way most people will use it by default.
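To see the two styles side by side, here is a small sketch computing the same sum of squares with a counting for loop and with higher-order methods on the Range (the example itself is mine, just for illustration).

```scala
// Sum the squares of 1 through 10 with a counting for loop...
var loopSum = 0
for (i <- 1 to 10) loopSum += i * i

// ...and with higher-order methods on the Range itself.
val methodSum = (1 to 10).map(i => i * i).sum

println(loopSum)    // 385
println(methodSum)  // 385
```

Both forms iterate over the same Range; the difference is purely one of style, which is exactly why it is hard to decide which chapter Ranges belong in.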

Monday, August 2, 2010

Update on F ring

So after watching the F ring simulation I described earlier run for the weekend, I made some changes and updates. While this forced me to stop the simulation and lose what I had run over the weekend, one of the changes was to use a longer time step, since this is a sparse simulation with gravity. The result is that it now does an orbit in under 8 hours instead of 36. That factor of 4.5 is huge considering it was looking like the run would require 2 years to produce anything significant. Now it should have nice results in 6 months and be in a very good spot by next summer, when I will have the time to really look at what is happening.

The early stuff is fascinating. In particular, I gave the moons inclination, something I had never done before, and they are giving it to the ring. Because gravity is a 1/r^2 force, proximity matters more than elevation. So the ring gets its maximum inclination near the point of closest approach and not when the moon is at maximum elevation.

One of the big questions is how the inclination will impact negative diffusion. It is possible that it could lead to more vertical displacement and less migration of semimajor axis. Only by going through the simulations will I know for certain. I expect that I will have some results to put on my main web site before classes start. Of course, when I do I will put a post here telling all about it.