Wednesday, May 25, 2011

Computer Performance Future

The topics of automation, AI, and the impact these will have on society have been big on this blog because they are things I think about a fair bit. I'm not in AI (my interests are numerical work and programming languages), but I live in the world and I train the people who will be writing tomorrow's software, so these things interest me. I've been saying that things get interesting around 2020. I think the social changes become more visible around 2015 as automation begins to soak up more jobs, and by 2025 we are in a very odd place where a lot of people simply aren't employable unless we find ways to augment humans in dramatic ways.

Cray just announced a new supercomputer line that they say will scale to 50 petaflops.  No one has bought one yet, so there isn't one in existence, but they will be selling them by the end of the year, and I'm guessing that by next year someone builds one that goes over 10 petaflops.  That's at the high end of most estimates I've seen of the computing power of the human brain, so this is significant.

Thinking about the Cray announcement, it hit me that I can put my predicted dates to the test a bit, both to see how much I really believe them and as a way to help others decide whether they agree. We'll start with the following plot from top500.org, which shows the computing power of the top 500 computers in the world since 1993.
What we see is a really nice exponential trend that grows by an order of magnitude every 4 years. I couldn't find exact numbers for the FLOPS of the machine Watson ran on, but what I found tells me it would probably come in between 100 and 800 TFlops, though even that might be too high.

The thing is that the processing power of the top 500 machines in the world isn't really going to change the world.  McDonald's isn't going to replace its human employees if it costs several million dollars to buy the machine that can do the AI.  However, smaller machines are following the same curve as the big ones.  Right now you can get a machine that does ~1 TFlops for about $1k, assuming you put in a good graphics card and utilize it through OpenCL or CUDA based programs.  So workstation machines are less than 2 orders of magnitude behind the bottom of the Top500 list.  That means that in 8 years a workstation-class machine should have roughly the power of today's low-end supercomputer.  To be specific, in 2021 for under $10,000 you will probably be able to buy a machine that can pull 100 TFlops. So you can have roughly a Watson for a fraction of a human's annual salary, especially if you include employer contributions to taxes and such.  I'm guessing that running a McDonald's doesn't require a Watson's worth of computing power.  So if the reliability is good, by 2021 fast food companies would be stupid to employ humans.  The same will be true of lots of other businesses that currently don't pay well.
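As a sanity check on that arithmetic, the extrapolation can be sketched in a couple of lines of Scala. The ~1 TFlops 2011 baseline and the order-of-magnitude-every-4-years rate are the assumptions from above, not measurements:

```scala
// Assumed trend: workstation performance grows 10x every 4 years,
// starting from ~1 TFlops in 2011.
def projectedTFlops(year: Int): Double =
  1.0 * math.pow(10, (year - 2011) / 4.0)

// 2015 comes out at ~10 TFlops and 2019 at ~100 TFlops:
// roughly a Watson in a workstation by the end of the decade.
for (year <- List(2015, 2019, 2023))
  println(year + ": ~" + projectedTFlops(year).round + " TFlops")
```

Note that the raw trend actually puts 100 TFlops at 2019, so if anything the 2021 figure above is conservative.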

Comparing to Watson might not be the ideal comparison though.  What about the Google self-driving car or the Microsoft virtual receptionist?  In the latter case I know that it was a 2P machine with 8 cores and something like 16 GB of RAM.  That machine probably didn't do more than 100 GFlops max.  Google wasn't as forthcoming about their car, but it wasn't a supercomputer, so I'm guessing it was probably a current workstation-class machine.

What about the next step down in the processor/computer hierarchy?  The newest tablets and cell phones run dual-core ARM processors that only manage about 100 MFlops.  That's the bottom of the chart, so they are 3.5-4 orders of magnitude down from the workstation-class machines.  Keep in mind, though, that given the exponential growth rate, the low-power machines you carry around will hit 1 TFlops in 16 years, by 2027.  That means they can run their own virtual receptionist.

Networking and the cloud make this even more interesting, because a small device can simply collect data and send it to bigger computers that crunch the data and send back information on what to do.  What is significant is that the chips required to do serious AI will be extremely cheap within 8-16 years.  Cheap enough that, as long as the robotics side can make devices that are durable and dependable, it will be very inexpensive to have machines performing basic tasks all over the place.

So back to my timeline: a standard workstation-type machine should be able to pull 10 TFlops by 2015, four years from today. I think things like the virtual receptionist and the Google cars demonstrate that that will be sufficient power to take over a lot of easy tasks, and as prices come down, the automation will move in.  By 2020 the cost of machines to perform basic tasks will be negligible (though I can't be as certain about the robotics parts), and the machines you can put in a closet/office will be getting closer to 100 TFlops, enough to do Watson-like work, displacing quite a few jobs that require a fair knowledge base.  By 2025 you are looking at petaflop desktops and virtual assistants that have processing power similar to your own brain.

So I think the timeline is sound from the processing side.  I also have the feeling it will work out on the software side.  The robotics are less clear to me, and they might depend on some developments in materials.  However, graphene really appears to have some potential as a game changer, and if that pans out I don't see the materials side being a problem at all.

Sunday, May 15, 2011

Scala 2.9 and Typesafe

It is remarkable how far Scala has come in the 18 months or so since I first started learning it. The final release of Scala 2.9 just came out, and Odersky has started a company called Typesafe that is intended to get more companies on board with Scala. These things excite me because I see them being very beneficial both for my personal programming and for what I do in the classroom.

Having Typesafe should make it easier for companies to adopt Scala, and right now one of the very valid points against Scala is that it simply isn't used as much out in the marketplace as other options. I truly expect that to change with time, and I see this as a step in that direction. It will also make it easier for our sysadmins to get everything set up nicely, and that is a big plus.

The number of additions in Scala 2.9 is significant. You can read all about them on the Scala site, but I want to highlight the ones that I think will be good for my teaching. The first one is the additions to the REPL. The REPL is a great teaching tool. It truly allows the student to get started typing in single statements and then to keep playing around with things later on to see how they work. Through 2.8 the REPL in Scala had some rough spots. The list of fixes for 2.9 seems to cover most of the problems I've run into so I'm very hopeful that the students next semester will have a much better experience with it.

The key addition for most developers in 2.9 is parallel collections. These will impact the second semester and beyond, because I introduce parallelism in the second semester. Early on, this makes it easier to do parallel loops in Scala than it would be with even OpenMP. Consider this code that calculates and prints Fibonacci numbers.

for (i <- (0 to 15).par) println(fib(30 - i))

When you run this using the slower, recursive version of fib, you get the numbers back out of order with the biggest values near the end. Just adding the call to par is all it takes. Of course, the for loop and collections can do a whole lot more than this and they will also do their tasks with the simple addition of a call to the par method.
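For completeness, here is a full version you can run; the fib here is my own naive recursive definition, deliberately exponential-time so the per-call cost is uneven and the reordering shows up:

```scala
// Naive recursive Fibonacci: deliberately slow so the parallel
// version's out-of-order printing is visible.
def fib(n: Int): Long =
  if (n < 2) n else fib(n - 1) + fib(n - 2)

// Sequential: prints fib(30) down to fib(15), strictly in order.
for (i <- 0 to 15) println(fib(30 - i))

// Parallel version: for (i <- (0 to 15).par) println(fib(30 - i))
// On Scala 2.9-2.12 .par is built in; on 2.13+ it needs the separate
// scala-parallel-collections module.
```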

Not only did the collections get parallel, they got a new set of methods that come standard: collectFirst, maxBy, minBy, span, inits, tails, permutations, combinations, subsets. These just make the already rich set of methods on collections a bit richer. The last three, in particular, strike me as easily enabling some interesting problems.
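A few quick examples of what several of these do, on some throwaway data of my own:

```scala
val xs = List(3, 1, 4, 1, 5)

// collectFirst: first element matching a partial function, transformed.
println(xs.collectFirst { case x if x > 3 => x * 10 })   // Some(40)

// maxBy/minBy: extremes under a mapping function.
println(List("foo", "scala", "ab").maxBy(_.length))      // scala

// span: longest prefix satisfying the predicate, plus the rest.
println(xs.span(_ < 4))       // splits into (List(3, 1), List(4, 1, 5))

// permutations, combinations, and subsets all return iterators.
println(List(1, 2, 3).permutations.size)                 // 6
println(List(1, 2, 3, 4).combinations(2).size)           // 6
println(Set(1, 2, 3).subsets(2).size)                    // 3
```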

The last addition I want to highlight is one that I really don't know all that much about, and as such I'm not certain how much it will impact my teaching. However, I'm optimistic about it. This is the addition of the scala.sys and scala.sys.process packages. I use the scripting environment of Scala in the first semester. I love how it lets us write programs with low overhead. Up to now, though, Scala hasn't really been good for scripting in the sense of launching lots of processes and dealing with the OS. These packages look like they will help bridge that gap so that I can use Scala for those types of things instead of having to move to the ugliness that is Perl.
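Having poked at scala.sys.process only a little, basic use seems to go like this (the shell commands here are just placeholders; anything on the PATH works the same way):

```scala
import scala.sys.process._

// !! runs the command and captures its standard output as a String.
val greeting = Seq("echo", "hello").!!
println(greeting.trim)                     // hello

// ! runs it and returns the exit code; #| builds a pipeline.
val status = ("echo one two three" #| "wc -w").!
println(status)                            // 0
```

Both a bare String and a Seq[String] convert implicitly to a ProcessBuilder; the Seq form sidesteps shell word-splitting issues with arguments.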

Sunday, May 1, 2011

Real Implications of Automated Cars

In case anyone had forgotten, I really want a car that drives itself. I don't like to drive; I find it to be a waste of time. I liked the buses in the Denver-Boulder area, but the system in SA, especially where I live, isn't up to the same level, and my life here doesn't support using it. Just the ability to read during commutes was great for me. I would get through a lot more of my intended reading if I could sit back and read during my morning and evening commutes.

The ability to do other things while the car drives is only a first-order effect though. That is using the car like we do now, just more automated. Where things get interesting is when you look at the higher-order effects. The examples that come to my mind are uses of the car that occur without a licensed driver present. It might take a while longer for these to take effect, because people and society will have to truly trust the automated driving mechanism before it will be legal to have cars drive themselves without a person who is responsible for them being present. However, I think that point will be reached, and that is when the full impact of this change will become apparent.

There are two main categories that jump out:
  • No humans in the car at all.
  • Non-licensed drivers in the car.
My first thought on the first item comes from seeing automated cars as needing a high level of safety maintenance. Things like cleaning sensors and running regular checks that they work will be required for legal and insurance reasons. The work itself will be largely automated, and it won't take long before people want to just send the car out to do it while they are at home or at work.

Ever had one of those late-night cravings for some type of food that you don't have in the house? Maybe a fast-food pick-up. You don't really need to be there for that. You place the order online (possibly on a phone or tablet), then send the car to pick it up. A little extra automation will be needed, but nothing too difficult.

Remember how in the dot-com bubble there were companies that wanted to deliver groceries ordered online? It didn't scale well because of the cost of delivery. If people can simply send their cars to pick the groceries up, that problem is solved.

This also leads to a new specialized product: mini-cars that don't carry humans, only other stuff. The vehicle that goes to get your combo meal doesn't need to be big enough to carry a human or have the nice seats. Same for all these other tasks. You can have a much smaller device that exists just to transport goods to end users.

The second bullet was normal cars driving without a licensed driver. Driving the kids to school in the morning? Why does a parent have to be there physically? I can see all types of bad social implications of this with parents sending their kids out all the time and never seeing them. Then again, how different is it to do that with your self-driving car versus their bike? Less exercise for the kid, but not less contact with the parent. In fact, with video call capabilities the parent could be interacting with the kid while they are being transported.

Of course, automation is going to alter all types of other things in the world in the coming years. This was just a few thoughts on some of the less obvious implications of cars that drive themselves.