CSEP 1 STEPCoulombQuickFix
Contents
Introduction
RFC002 discusses the issues observed with the STEPCoulomb model in the CSEP New Zealand testing center. CSEP provides a computational infrastructure to perform and evaluate both prospective and retrospective earthquake forecasting experiments. In regards to a retrospective forecasting experiment, it was observed that the STEPCoulomb model installed in CSEP did not produce the expected output. This behavior was observed for the Darfield Experiment -- a retrospective forecasting experiment of the 2010 Christchurch Earthquake sequence addressing the information gained by including a physics-based component of the forecast.
The STEPCoulomb model represents a hybrid model consisting of a statistical STEP model and a physics-based Coulomb stress model. The executable that computes the STEP forecast obtains a Coulomb mask based on available slip models, and uses the information contained from the Coulomb stress model to provide physical constraints to the statistical STEP (short-term earthquake prediction) forecast. The STEPCoulomb forecast should produce two forecast outputs: (1) STEP forecast, and (2) STEPCoulomb forecast, which can be compared to determine the pairwise efficacy of the two forecasting models.
This document is laid out as follows: Problem Statement provides a description of the problems as communicated by the science groups to SCEC IT; and Proposed Solution provides an explanation of the proposed solution to the problem outlined in the problem statement. Implemented Solution outlines the specific steps taken to solve the problem.
Problem Statement
1. Overview
It was suspected the STEPCoulomb forecast, computed by the New Zealand CSEP testing center, did not produce the intended outputs. This issue affects an on-going manuscript review, so an urgent response is needed. The problem was confirmed by comparing the STEP forecast with the supposed STEPCoulomb forecast and determining that there were no differences between the forecasts. Computing differences between the two forecasts indicated that both files were identical. It is expected that the physics-based Coulomb stress model should influence the rates forecasted by the statistical STEP model.
2. Detailed Explanation
When invoking the STEPCoulomb model, CSEP makes sequential, external calls to two executables: (1) STEPCoulomb and (2) step-aftershock. In model (1), finite-fault slip models are used to produce bitwise masks based on Coulomb stress calculations. The masks produced by model (1) are meant to be read by model (2), and used to update the forecasted rates. To facilitate the interchange of these executables, CSEP creates 'CSEP-specific' input files to define the file paths for forecast files and associated metadata on the CSEP filesystem. For example, the file located at
/home/csep.canterbury/DarfieldExperiment/bestAvailableTC/batch/runs/csep/2015_11/20151109214739/pid_3368/scec.csep.STEPCOULOMBOneDay.CoulombPart_STEPCOULOMBOneDay_control.dat.1447165313.806906.1
provides the CSEP-specific inputs to forecast model (1). An example input file for (2) can be found at:
/home/csep.canterbury/DarfieldExperiment/bestAvailableTC/batch/runs/csep/2015_11/20151109214739/pid_3368/scec.csep.STEPCOULOMBOneDay.STEPCOULOMBOneDay_control.dat.1447165313.743799.1
If we compare the contents of this input file with the source code of executable (2), we find that there is a bug which would produce the issue described in the overview section. The contents of the linked input file for exectuable (2) are shown below:
1 1 2010 0 0 0 29 11 2010 16 35 42 1 /home/csep/DarfieldExperiment/bestAvailableTC/batch/runs/csep/2015_11/20151109214739/pid_3368/OneDayModelInputCatalog.dat /home/csep/DarfieldExperiment/bestAvailableTC/one-day-forecasts/forecasts/STEPCOULOMBOneDay_11_29_2010.16_35_42.dat /home/csep/csep-install/src/SCECModels/NewZealand/src/STEPCoulombModel/OpenSHA/NZZeroRate05.dat /home/csep/DarfieldExperiment/bestAvailableTC/one-day-forecasts/forecasts/CoulombPart_STEPCOULOMBOneDay_11_29_2010.16_35_42.dat
The code below shows the RegionDefaults.setCsepParams(...) function that is responsible for reading the CSEP generate input file and overwriting the default parameters of the model.
public static void setCsepParams(String filePath) throws FileNotFoundException, IOException { logger.info(">> setCsepParams filePath " + filePath); ArrayList<String> fileLines = FileUtils.loadFile(filePath); String line = (String)fileLines.get(0); logger.info("catalog start date " + line); EVENT_START_TIME = parseCsepTime2Cal(line); logger.info("EVENT_START_TIME " + dateformater.format(EVENT_START_TIME.getTime())); line = (String)fileLines.get(1); logger.info("catalog end date " + line); forecastStartTime = parseCsepTime2Cal(line); logger.info("forecastStartTime " + dateformater.format(forecastStartTime.getTime())); line = (String)fileLines.get(2); logger.info("forecast length " + line); forecastLengthDays = Integer.parseInt(line.trim()); line = (String)fileLines.get(3); logger.info("catalog input file " + line); cubeFilePath = line; line = (String)fileLines.get(4); logger.info("forecast output file " + line); outputAftershockRatePath = line; line = (String)fileLines.get(5); logger.info("background rate input file " + line); BACKGROUND_RATES_FILE_NAME = line; line = (String)fileLines.get(6); logger.info("Coulomb input file " + line); coulombFilterPath = line; }
We see that the output file name is assigned to the variable outputAftershockRatePath which defines the output for the pure STEP forecast. For more details on this aspect, view the method STEP_main.saveRatesFile(...). It shows that outputAftershockRatePath defines the output file path of the pure STEP forecast. The Coulomb portion of the STEPCoulomb model is implemented in "ApplyCoulombFilter.applyFilter(...)". The IO portion of this method is shown below:
try { FileWriter fr = new FileWriter(RegionDefaults.outputCoulombRatePath); bgRegionSize = bgLocList.size(); for (int k = 0; k < bgRegionSize; k++) { Location bgLoc = bgLocList.getLocationAt(k); HypoMagFreqDistAtLoc bgDistAtLoc = bggrid.getHypoMagFreqDistAtLoc(k); if (bgDistAtLoc != null) { int numForecastMags = 1 + (int)((RegionDefaults.maxForecastMag - RegionDefaults.minForecastMag) / RegionDefaults.deltaForecastMag); for (int index = 0; index < numForecastMags; index++) { double mag = RegionDefaults.minForecastMag + index * RegionDefaults.deltaForecastMag; double wrate = bgDistAtLoc.getFirstMagFreqDist().getIncrRate(mag); if (wrate == 0.0D) { wrate = 1.0E-11D; } fr.write(locFormat.format(bgLoc.getLongitude() - RegionDefaults.gridSpacing / 2.0D) + " " + locFormat.format(bgLoc.getLongitude() + RegionDefaults.gridSpacing / 2.0D) + " " + locFormat.format(bgLoc.getLatitude() - RegionDefaults.gridSpacing / 2.0D) + " " + locFormat.format(bgLoc.getLatitude() + RegionDefaults.gridSpacing / 2.0D) + " " + RegionDefaults.MIN_Z + " " + RegionDefaults.MAX_Z + " " + magFormat.format(mag - RegionDefaults.deltaForecastMag / 2.0D) + " " + magFormat.format(mag + RegionDefaults.deltaForecastMag / 2.0D) + " " + wrate + " 1\n"); } } } fr.close(); } catch (IOException ee) { ee.printStackTrace(); }
We see that a FileWriter is created using the filepath of "RegionDefaults.outputCoulombRatePath" which is hard-coded to be "./coulombRates.txt." As seen by the "RegionDefaults.setCsepParams(...)" code listen above, this location is not overwritten by the CSEP generate input files. This indicates that the Coulomb Rates are not being correctly stored to the CSEP file system and are being overwritten with each subsequent run.
Proposed Solution
There are two potential ways to solve this problem. The decision depends on how we want CSEP to treat the individual forecasting models, and considerations with modifying the step-aftershock.jar executable.
1. Output two forecast models from step-aftershock
This solution would require modifying STEPCoulombModel.writeSTEPFile(...) in the CSEP generic source code to accept the additional argument required to write out the STEPCoulomb. This would also require modifying RegionDefaults.setCsepParams(...) in the step-aftershock source code to accept this argument. It's currently uncertain if CSEP will respond well to a model producing two independent forecast files. This is not a concern for computing the forecasts, but could cause issues if trying to run evalulations within CSEP. This solution seems to be the most straightforward.
[Favored] 2. Have STEPCoulomb output one forecast and run STEP to output the STEP only forecast
This solution would only require changing the RegionDefaults.setCsepParams(...) method to read the CSEP command line argument to the correct variable "outputCoulombRatePath". We would also need to supply configuring the pure STEP model as an additional model in the forecast group configuration. This would require likely more than twice the computational time, because the currently installed STEP model is written in Matlab. I am unsure how long the forecasts take to run in Matlab. However, this solution seems more in line with the CSEP philosophy.
To implement this solution elegantly in CSEP we will register a new model that represents the true STEPCoulomb model. This model will use a separate executable containing the proposed solution outlined in this document. The CSEP system will treat the two models independently, and allow evaluations to be computed using the standard CSEP methodology.
Implemented Solution (Under review)
All changes made on Git branch wsavran/step-coulomb-rfc002. Branch has not yet been pushed to remote. Waiting for confirmation on proper way to proceed to uphold CSEP values of transparency and reproducibility.
1. Modifications to step-aftershock
git diff HEAD^ HEAD diff --git a/src/SCECModels/NewZealand/src/STEPCoulombModel/OpenSHA/classes/.gitignore b/src/SCECModels/NewZealand/src/STEPCoulombModel/OpenSHA/classes/.gitignore new file mode 100644 index 0000000..3f20248 --- /dev/null +++ b/src/SCECModels/NewZealand/src/STEPCoulombModel/OpenSHA/classes/.gitignore @@ -0,0 +1,6 @@ +/junk/ +/scratch/ +/Makefile +/log4j.properties +/org/ +/resources/ diff --git a/src/SCECModels/NewZealand/src/STEPCoulombModel/OpenSHA/src/org/opensha/step/calc/RegionDefaults.java b/src/SCECModels/NewZealand/src/STEPCoulombModel/OpenSHA/src/org/opensha/step/calc/RegionDefaults.java index 8943db8..37a3304 100644 --- a/src/SCECModels/NewZealand/src/STEPCoulombModel/OpenSHA/src/org/opensha/step/calc/RegionDefaults.java +++ b/src/SCECModels/NewZealand/src/STEPCoulombModel/OpenSHA/src/org/opensha/step/calc/RegionDefaults.java @@ -401,7 +401,7 @@ public class RegionDefaults { cubeFilePath = line; line = fileLines.get(4); logger.info("forecast output file " + line); - outputAftershockRatePath = line;//OUTPUT_DIR + "/" + props.getProperty("output.file.time.dep.rates", "TimeDepRates.txt"); + outputCoulombRatePath = line;//OUTPUT_DIR + "/" + props.getProperty("output.file.time.dep.rates", "TimeDepRates.txt"); line = fileLines.get(5); logger.info("background rate input file " + line);
This model must be compiled and be installed on the CSEP system where the reprocessing will occur.
2. Modifications to CSEP system
- We added the model src/SCECModels/NewZealand/STEPCoulombOneDayModel2.py to wrap the step-aftershock executable that writes out the true STEPCoulomb forecast. The implementation is shown below:
git diff HEAD^ src/SCECModels/NewZealand/STEPCoulombOneDayModel2.py | cat diff --git a/src/SCECModels/NewZealand/STEPCoulombOneDayModel2.py b/src/SCECModels/NewZealand/STEPCoulombOneDayModel2.py new file mode 100644 index 0000000..07dee7b --- /dev/null +++ b/src/SCECModels/NewZealand/STEPCoulombOneDayModel2.py @@ -0,0 +1,177 @@ +""" +Module STEPCoulombOneDayModel +""" + +__version__ = "$Revision$" +__revision__ = "$Id$" + + +import os + +import CSEPFile +from CSEPLogging import CSEPLogging +from Forecast import Forecast +from OneDayForecast import OneDayForecast +from ReproducibilityFiles import ReproducibilityFiles +from STEPCoulombModel import STEPCoulombModel + + +#------------------------------------------------------------------------------- +# +# STEPCoulombOneDayModel one-day forecast model. +# +# This class is designed to invoke one-day STEPCoulombOneDayModel forecast model +# by Matt Gerstenberger and Sandy Steacy. It prepares input catalog data, +# formats control file with model parameters, identifies list of available +# Coulomb and slip models and provides it as an input to the model. It places +# forecast file under user specified directory. +# +class STEPCoulombOneDayModel2 (OneDayForecast, STEPCoulombModel): + + # Static data of the class + + # Keyword identifying type of the class (Hacky implementation) + Type = STEPCoulombModel.Type + OneDayForecast.Type + '2' + + #--------------------------------------------------------------------------- + # + # Initialization. + # + # Input: + # dir_path - Directory to store forecast file to. + # args - Optional arguments for the model. Default is None. + # + def __init__ (self, dir_path, args = None): + """ Initialization for STEPCoulombOneDayModel class""" + + OneDayForecast.__init__(self, + dir_path) + STEPCoulombModel.__init__(self, + args) + + # Parameter file for the Coulomb part of the model + self.CoulombParamFile = None + + #--------------------------------------------------------------------------- + # + # Return keyword identifying the model. + # + # Input: None. + # + # Output: + # String identifying the type + # + def type (self): + """ Returns keyword identifying the forecast model type.""" + + return STEPCoulombOneDayModel.Type + + + #--------------------------------------------------------------------------- + # + # Write input parameter file for the model + # + def writeParameterFile (self, + filename = None): + """ Format input parameter file for the model. + Path to created input parameter file will be passed to the + model's executable.""" + + # Open STEPCoulomb param file + fhandle = Forecast.writeParameterFile(self, + filename) + + STEPCoulombModel.writeSTEPFile(self, + fhandle, + self.start_date, + self.numDays(self.start_date, + self.end_date), + os.path.join(self.catalogDir, + self.inputCatalogFilename()), + self.filename()) + + fhandle.close() + + # Write param file to invoke Coulomb part of the model first + param_path, param_file = os.path.split(self.parameterFile) + + self.CoulombParamFile = os.path.join(param_path, + "CoulombPart_" + param_file) + + # To avoid inheritance from two classes that are derived from the same + # base "Forecast" class, use this "pass all values" approach + STEPCoulombModel.writeCoulombFile(self, + self.CoulombParamFile, + self.start_date, + self.catalogDir) + + # Register input parameters file for reproducibility + info_msg = "Input parameters file used by Coulomb part of %s model \ +to generate '%s' forecast file for %s." %(self.type(), + self.rawFilename(), + self.start_date.date()) + + # Record parameter file with reproducibility registry + ReproducibilityFiles.add(self, + self.CoulombParamFile, + info_msg, + CSEPFile.Extension.ASCII) + + + #-------------------------------------------------------------------- + # + # Invoke the model to generate forecast. + # + # Input: None + # + def run(self): + """ Invoke model.""" + + coulomb_output = self.invokeModel(self, self.CoulombParamFile, self.parameterFile) + + # Move Coulomb file to the archive directory - should not be + # evaluated + if self.archive is not None: + + # Move file to the archive directory to avoid it + # in evaluation tests + CSEPLogging.getLogger(STEPCoulombOneDayModel.__name__).info("run(): moving %s to %s" + %(coulomb_output, + self.archive)) + + new_path = os.path.join(self.archive, + os.path.basename(coulomb_output)) + os.renames(coulomb_output, + new_path) + + + #-------------------------------------------------------------------------------------- + # + # Overwrites STEPModel.invokeForecast(...), to use different executable for STEPCoulomb + # + # Invokes different executable for the STEPCoulomb_2 model. + # + def invokeModel(self, parameter_file, step_parameter_file): + """ Invoke model.""" + command = "%s %s" %(os.path.join(STEPCoulombModel.Path, + STEPCoulombModel.__CoulombExecutableFile), + parameter_file) + Environment.invokeCommand(command) + + cwd = os.getcwd() + os.chdir(os.path.join(STEPCoulombModel.__JavaPath, + 'build')) + try: + # Model requests to change to the model installation directory + # since it's using relative paths to locate other files + __command = "java -Xms2048m -Xmx6144m -jar lib/step-aftershock_coul.jar 0 -f %s" %os.path.join(cwd, + step_parameter_file) + Environment.invokeCommand(__command) + finally: + # Go back to the directory + os.chdir(cwd) + + # Return path to the Coulomb part of the model that needs to be + # archived to avoid it's evaluation + return self.CoulombOutput +
- The expected executable should be placed in STEPCoulombModel.__JavaPath/build/lib/step-aftershock_coul.jar directory on the CSEP image. This would be in addition to the current "STEPCoulombModel.__JavaPath/build/lib/step-aftershock.jar". This new model was implemented by overriding the STEPCoulomb.invokeModel(...) in STEPCoulombOneDay2.py function to recognize the newly created executable. In addition the STEPCoulombOneDay2.run(...) is overwritten to properly called the new invokeModel(...) command. This was necessary due to STEPCoulomb calling the method on the parent and not self.
- ForecastFactory XML File was updated to include the new file. See updates below:
git diff HEAD^ src/SCECModels/NewZealand/DarfieldForecastFactory.init.xml | cat diff --git a/src/SCECModels/NewZealand/DarfieldForecastFactory.init.xml b/src/SCECModels/NewZealand/DarfieldForecastFactory.init.xml index a522dc5..fd5a138 100644 --- a/src/SCECModels/NewZealand/DarfieldForecastFactory.init.xml +++ b/src/SCECModels/NewZealand/DarfieldForecastFactory.init.xml @@ -42,6 +42,7 @@ <module>CattaniaOneYearModel.CattaniaOneYearModel</module> <module>STEPCoulombOneDayModel.STEPCoulombOneDayModel</module> + <module>STEPCoulombOneDayModel2.STEPCoulombOneDayModel2</module> <module>STEPCoulombOneMonthModel.STEPCoulombOneMonthModel</module> <module>STEPCoulombOneYearModel.STEPCoulombOneYearModel</module>
3. Reprocessing Specifics
Verification phase: Use the 03 September 2010 and 04 September 2010 using the bestAvailable and bestAvailable+10days data sets.