The Southern California Earthquake Center/Community Modeling Environment (SCEC/CME) collaboration has run enhanced simulations at SDSC using the improved TeraShake 2 earthquake model. The new simulations used the Anelastic Wave Model (AWM), a fourth-order finite-difference code developed by Kim Olsen, associate professor of geological sciences at San Diego State University (SDSU). They are the most realistic yet of where the most intense ground motion may occur in Southern California during a magnitude 7.7 San Andreas Fault earthquake. To make the simulations more realistic, the dynamic rupture is simulated on a 100 m grid, while the wave propagation uses a 200 m grid. That is, in a first step the dynamic rupture code is run in a smaller volume surrounding the fault using the finer grid; the computed rupture model is then fed into the wave propagation code, which computes ground shaking for a large area on the 200 m grid. The TeraShake 2 wave propagation simulation was run at 200 m resolution over an immense volume 600 km long, 300 km wide, and 80 km deep, with some 1.8 billion grid points. The computation took four days on 240 processors of the newly expanded 15.6-teraflops DataStar supercomputer. The run produced 10 TB of data, which were visualized on DataStar using about 20,000 hours of computation time and yielding about 100,000 images. Very impressive movies and images, which are also of general interest to those of us living in "earthquake country", can be viewed at http://visservices.sdsc.edu/projects/scec/terashake/ .
The next figure shows the decomposition of the computational domain:
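The figure of 1.8 billion grid points follows directly from the domain size and the 200 m grid spacing quoted above. As a quick check, the following minimal Python sketch recomputes it. The function name grid_points is hypothetical, and the count assumes one grid point per 200 m cell along each axis, ignoring any extra boundary or absorbing-layer points the actual AWM code may use. The per-processor figure further assumes an even split across the 240 processors, which the source does not state.

```python
def grid_points(length_km, width_km, depth_km, spacing_m):
    """Return (nx, ny, nz, total) for a regular grid.

    Assumption: one grid point per cell along each axis; extra
    boundary points used by a real solver are not counted.
    """
    nx = round(length_km * 1000 / spacing_m)
    ny = round(width_km * 1000 / spacing_m)
    nz = round(depth_km * 1000 / spacing_m)
    return nx, ny, nz, nx * ny * nz


# TeraShake 2 wave propagation domain: 600 km x 300 km x 80 km at 200 m
nx, ny, nz, total = grid_points(600, 300, 80, 200)
print(f"{nx} x {ny} x {nz} = {total:,} grid points")

# Hypothetical even split over the 240 processors used for the run
per_processor = total // 240
print(f"about {per_processor:,} grid points per processor")
```

This gives 3000 x 1500 x 400 = 1,800,000,000 grid points, or about 7.5 million per processor under the even-split assumption.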

Currently the dynamic rupture code scales up to 1024 processors, but there still seem to be issues in scaling the wave propagation code beyond a couple of hundred processors. Both codes are parallelized using MPI. The following figure from Cui (2006) shows the speedup of the dynamic rupture code:

For 1024 processors the speedup is about 50% of the ideal (linear) value. This seems good for such a large number of processors; the relative speedup is higher for smaller processor counts. The main problem is MPI-IO, which does not scale well. This becomes a problem especially for the wave propagation code, which produces a lot of data, whereas the dynamic rupture code is computationally intensive but produces less data. The next figure shows the speedup of the wave propagation code for up to 240 processors (Cui, 2006):

At about 87% of ideal, the speedup of this code on 240 processors is very good. The project brings together many people from different fields in order to optimize performance; hence, in my opinion, this project is a great success and will surely be improved further.
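The percentages quoted above (about 50% on 1024 processors and about 87% on 240) read most naturally as parallel efficiency, that is, speedup divided by the number of processors. Under that assumption, the short Python sketch below converts them into the speedup factors they imply. The function name speedup_from_efficiency is hypothetical, and the baseline for the speedup (a single processor or some small reference run) is not stated in the source, so the numbers are only indicative.

```python
def speedup_from_efficiency(efficiency, processors):
    """Speedup implied by a parallel efficiency (0..1) on `processors` CPUs.

    Assumption: efficiency = speedup / processors, measured against
    an ideal linear speedup from an unspecified baseline.
    """
    return efficiency * processors


# (code, efficiency read from the text, processor count)
runs = [
    ("dynamic rupture", 0.50, 1024),
    ("wave propagation", 0.87, 240),
]

for name, eff, procs in runs:
    s = speedup_from_efficiency(eff, procs)
    print(f"{name}: {procs} processors, efficiency {eff:.0%}, speedup ~{s:.0f}x")
```

Read this way, the dynamic rupture code runs roughly 512 times faster on 1024 processors, and the wave propagation code roughly 209 times faster on 240 processors, than the respective baseline.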