CyberShake DAX Generator

This page provides an overview of the CyberShake DAX generator, which includes all the Java classes associated with producing DAXes for any kind of CyberShake workflow.

The source for the DAX generator is available at http://source.usc.edu/svn/cybershake/import/trunk/dax-generator-3/ . All required dependencies are in the lib directory, though the Pegasus JAR may need to be updated if new Pegasus API features are used.

The DAX generator is fairly complex because it retains backwards compatibility with older options, some of which may not have been used in several years.

Java classes

The DAX generator package consists of DAX creation classes:

  • CyberShake_Integrated_DAXGen.java: Entry point for generating a full ('integrated') workflow.
  • CyberShake_SGT_DAXGen.java: Generates an SGT-only workflow by calling PreCVM and then, if running AWP-ODC-SGT, CyberShake_AWP_SGT_DAXGen.
  • CyberShake_AWP_SGT_DAXGen.java: Creates a workflow with all SGT jobs except PreCVM. PreCVM is kept separate because, when running AWP-ODC-SGT in CPU mode, the box dimensions must be determined in PreCVM before we can determine the number of processors needed in each dimension, so the AWP subworkflow is generated after PreCVM.
  • CyberShake_PP_DAXGen.java: Entry point for generating a post-processing workflow. It is also called as part of creating an integrated workflow.
  • CyberShake_Stochastic_DAXGen.java: Entry point for generating a stochastic-only post-processing workflow.
  • CyberShake_Sub_Stoch_DAXGen.java: Does most of the work of generating a stochastic-only post-processing workflow. Some of the stochastic codes need velocity model information as input, so we generate the velocity model information in CyberShake_Stochastic_DAXGen and pass the resulting file to this class as an input at creation time. That way the velocity values can be supplied as command-line parameters, rather than having every task read the same velocity file.
  • CyberShake_DB_DAXGen.java: Generates a sub-workflow with the database and data product jobs.

and helper classes:

  • CyberShake_Workflow_Container.java: Holds the sub-workflows that are part of a PP workflow. It is needed to set up dependencies correctly when running an integrated workflow, or when there are multiple post-processing workflows per run (a case that is no longer used).
  • DBConnect.java: Contains code for accessing databases.
  • PP_DAXParameters.java: Container class to track all science and technical parameters for a post-processing workflow that affect either which codes are called or what is passed on the command line to tasks.
  • RunIDQuery.java: Container class to track all the database parameters for a run.
  • RuptureVariationDB.java: Supports writing an SQLite database to track which events go with which subworkflow. Since we've moved to a single subworkflow with a single DirectSynth job, this class is no longer used.
  • SGT_DAXParameters.java: Container class to track all science and technical parameters for an SGT workflow that affect either which codes are called or what is passed on the command line to tasks.
  • Stochastic_DAXParameters.java: Container class to track all science and technical parameters for a stochastic workflow that affect either which codes are called or what is passed on the command line to tasks.

Below are some diagrams to help explain the relationship between these classes (original file here: ).

SGT workflow classes diagram
Post-processing workflow classes diagram
Stochastic workflow classes diagram
Full workflow classes diagram

Call Graphs

SGT Generation

  • CyberShake_SGT_DAXGen.main(): after workflows are created in subMain(), writes DAX object to file
    • CyberShake_SGT_DAXGen.subMain()
      • CyberShake_SGT_DAXGen.parseCommandLine(): parses command-line arguments, populates SGT_DAXParameters and RunIDQuery objects
      • CyberShake_SGT_DAXGen.makeWorkflows(): initializes the CyberShake_SGT_DAXGen object
        • CyberShake_SGT_DAXGen.makeDAX(): based on parameters, calls appropriate sequence of functions to create individual jobs with dependencies
          • CyberShake_SGT_DAXGen.addPreCVM()
          • CyberShake_SGT_DAXGen.addGenSGTDAX(): job which creates a workflow which contains the rest of the SGT jobs. At runtime, CyberShake_AWP_SGT_DAXGen.main() is called.
            • CyberShake_AWP_SGT_DAXGen.main(): parses command-line options, calls appropriate sequence of functions to create individual jobs with dependencies, initializes RunIDQuery object
              • CyberShake_AWP_SGT_DAXGen.getVolume(): determines number of grid points in each dimension, needed for determining processor layout
              • CyberShake_AWP_SGT_DAXGen.getProcessors(): determines number of processors in each dimension for AWP code
              • CyberShake_AWP_SGT_DAXGen.addVMeshSingle()
              • CyberShake_AWP_SGT_DAXGen.addPreSGT()
              • ...
              • CyberShake_AWP_SGT_DAXGen.addUpdate()
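
As a rough illustration of the getVolume() and getProcessors() steps in the call graph above, the sketch below picks a processor count in each dimension that evenly divides the number of grid points in that dimension. The divisor rule, the per-dimension caps, the example grid sizes, and all class and method names are assumptions made for illustration; the actual logic in CyberShake_AWP_SGT_DAXGen may differ.

```java
import java.util.Arrays;

// Hypothetical sketch; this is not the actual CyberShake_AWP_SGT_DAXGen code.
class ProcessorLayoutSketch {

    // Largest processor count <= maxProcs that evenly divides gridPoints.
    static int largestDivisorAtMost(int gridPoints, int maxProcs) {
        if (gridPoints < 1 || maxProcs < 1) {
            throw new IllegalArgumentException("gridPoints and maxProcs must be positive");
        }
        for (int p = Math.min(gridPoints, maxProcs); p > 1; p--) {
            if (gridPoints % p == 0) {
                return p;
            }
        }
        return 1; // fall back to a single processor in this dimension
    }

    // Applies the divisor rule independently to each dimension (assumed X, Y, Z order).
    static int[] getProcessors(int[] gridDims, int[] maxProcsPerDim) {
        if (gridDims.length != maxProcsPerDim.length) {
            throw new IllegalArgumentException("dimension counts must match");
        }
        int[] procs = new int[gridDims.length];
        for (int i = 0; i < gridDims.length; i++) {
            procs[i] = largestDivisorAtMost(gridDims[i], maxProcsPerDim[i]);
        }
        return procs;
    }

    public static void main(String[] args) {
        // Made-up volume dimensions, standing in for values found after PreCVM.
        int[] gridDims = {1200, 2400, 400};
        int[] procs = getProcessors(gridDims, new int[] {50, 50, 1});
        System.out.println("Processors per dimension: " + Arrays.toString(procs));
    }
}
```

The point of the sketch is the ordering constraint: the layout cannot be computed until the grid dimensions are known, which is why the AWP sub-workflow is generated at runtime after PreCVM.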

Post-processing

  • CyberShake_PP_DAXGen.main(): Sets up dependencies between DAXes created by subMain(), writes DAXes to files
    • CyberShake_PP_DAXGen.subMain(): initializes PP_DAXParameters and CyberShake_PP_DAXGen objects
      • CyberShake_PP_DAXGen.parseCommandLine(): parses command-line arguments, populates PP_DAXParameters and RunIDQuery objects
      • CyberShake_PP_DAXGen.makeDAX(): initializes RunIDQuery and CyberShake_Workflow_Container objects, and depending on parameters, calls functions to create DirectSynth, extraction, and synthesis jobs
        • CyberShake_PP_DAXGen.putFreqInDB(): populates database with Max_Frequency value
        • CyberShake_PP_DAXGen.makePreDax(): creates Pre DAX with Update job, MD5sum check jobs, set PP host job
        • CyberShake_PP_DAXGen.addDirectSynth()
        • CyberShake_PP_DAXGen.genDBProductsDAX(): creates CyberShake_DB_DAXGen object, initializes with values from PP_DAXParameters
          • CyberShake_DB_DAXGen.makeDAX(): based on parameters, creates jobs for database insertion, checking, and data product generation
            • CyberShake_DB_DAXGen.createDBInsertionJob()
            • CyberShake_DB_DAXGen.createDBCheckJob()
            • ...
            • CyberShake_DB_DAXGen.createDBReportJob()
        • CyberShake_PP_DAXGen.makePostDax(): creates Post DAX with Update job
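
To make the dependency setup in the call graph above more concrete, here is a minimal sketch of a container that holds the Pre DAX, the sub-workflows, the DB products DAX, and the Post DAX, and reports the parent/child edges between them. The class name, the file names, and the exact ordering (pre, then sub-workflows, then DB products, then post) are assumptions for illustration; this is not the real CyberShake_Workflow_Container, and it deliberately avoids the Pegasus API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; this is not the actual CyberShake_Workflow_Container class.
class WorkflowContainerSketch {
    private final String preDax;
    private final List<String> subDaxes = new ArrayList<>();
    private final String dbDax;
    private final String postDax;

    WorkflowContainerSketch(String preDax, String dbDax, String postDax) {
        this.preDax = preDax;
        this.dbDax = dbDax;
        this.postDax = postDax;
    }

    void addSubWorkflow(String daxName) {
        subDaxes.add(daxName);
    }

    // Returns {parent, child} pairs under the assumed ordering:
    // pre -> each sub-workflow -> db products -> post.
    List<String[]> dependencies() {
        List<String[]> edges = new ArrayList<>();
        if (subDaxes.isEmpty()) {
            edges.add(new String[] {preDax, dbDax});
        }
        for (String sub : subDaxes) {
            edges.add(new String[] {preDax, sub});
            edges.add(new String[] {sub, dbDax});
        }
        edges.add(new String[] {dbDax, postDax});
        return edges;
    }

    public static void main(String[] args) {
        // File names are placeholders, not the names the generator writes.
        WorkflowContainerSketch container =
            new WorkflowContainerSketch("pre.dax", "db_products.dax", "post.dax");
        container.addSubWorkflow("directsynth.dax");
        for (String[] edge : container.dependencies()) {
            System.out.println(edge[0] + " -> " + edge[1]);
        }
    }
}
```

Keeping the sub-workflows in one container means the caller, whether a PP-only or an integrated run, can wire up dependencies without knowing how many sub-workflows were created.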

Stochastic post-processing

  • CyberShake_Stochastic_DAXGen.main(): create Stochastic_DAXParameters and CyberShake_Stochastic_DAXGen objects
    • CyberShake_Stochastic_DAXGen.parseCommandLine(): parse the command-line, populate Stochastic_DAXParameters
    • CyberShake_Stochastic_DAXGen.makeDax(): Check that low-frequency run has the same parameters as the proposed stochastic run; create starting jobs; write DAXes to files
      • CyberShake_Stochastic_DAXGen.addUpdate()
      • CyberShake_Stochastic_DAXGen.getVelInfo(): Job to query UCVM and calculate velocity parameters needed by site response jobs
      • CyberShake_Stochastic_DAXGen.genStochDAX(): job which creates a workflow which contains the rest of the stochastic jobs, which need the velocity information as input. At runtime, CyberShake_Sub_Stoch_DAXGen.main() is called.
        • CyberShake_Sub_Stoch_DAXGen.main(): parses command-line, populates Stochastic_DAXParameters, creates CyberShake_Sub_Stoch_DAXGen object, writes DAX to file
          • CyberShake_Sub_Stoch_DAXGen.createJobs(): based on parameters, creates stochastic jobs
            • CyberShake_Sub_Stoch_DAXGen.processVelocityFile(): reads velocity information to use in stochastic jobs as command-line arguments
            • CyberShake_Sub_Stoch_DAXGen.addUpdate()
            • CyberShake_Sub_Stoch_DAXGen.createLocalVMJob()
            • CyberShake_Sub_Stoch_DAXGen.createHFSynthJob()
            • ...
            • CyberShake_Sub_Stoch_DAXGen.createLFSiteResponse()
      • CyberShake_DB_DAXGen.makeDAX(): based on parameters, creates jobs for database insertion, checking, and data product generation
        • CyberShake_DB_DAXGen.createDBInsertionJob()
        • CyberShake_DB_DAXGen.createDBCheckJob()
        • ...
        • CyberShake_DB_DAXGen.createDBReportJob()
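
The processVelocityFile() step above reads velocity information once at DAX-generation time so that it can be handed to tasks as command-line arguments. A minimal sketch of that idea follows, assuming a simple "key = value" file layout. The file format, the parameter names (vs30, vref), and the "--key value" argument style are all hypothetical; the real velocity file and task arguments may look different.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; this is not the actual CyberShake_Sub_Stoch_DAXGen code.
class VelocityArgsSketch {

    // Parses "key = value" lines, skipping blank lines and '#' comments (assumed format).
    static Map<String, String> parseVelocityLines(List<String> lines) {
        Map<String, String> values = new LinkedHashMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue;
            }
            String[] parts = trimmed.split("=", 2);
            if (parts.length != 2) {
                throw new IllegalArgumentException("Malformed line: " + line);
            }
            values.put(parts[0].trim(), parts[1].trim());
        }
        return values;
    }

    // Turns the parsed values into a flat argument list: --key value --key value ...
    static List<String> toArguments(Map<String, String> values) {
        List<String> args = new ArrayList<>();
        for (Map.Entry<String, String> entry : values.entrySet()) {
            args.add("--" + entry.getKey());
            args.add(entry.getValue());
        }
        return args;
    }

    public static void main(String[] args) {
        // In a real run these lines would come from the velocity file passed to the generator.
        List<String> lines = Arrays.asList("# made-up values", "vs30 = 500.0", "vref = 865.0");
        System.out.println(toArguments(parseVelocityLines(lines)));
    }
}
```

Because the values are baked into each job's arguments when the DAX is written, the tasks themselves never need to open the velocity file.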

Full integrated workflow

  • CyberShake_Integrated_DAXGen.main(): calls functions to create workflows, sets up dependencies between sub-workflows, writes to files
    • CyberShake_Integrated_DAXGen.parseCommandLine(): parse the command-line, create RunIDQuery object, separate arguments into SGT and PP args
    • CyberShake_SGT_DAXGen.subMain(): Create SGT workflow
    • <Same call graph as in SGT workflow, above>
    • CyberShake_Integrated_DAXGen.createJobIDJob(): sets JobID in the database between SGT and PP, since the last SGT workflow job removes the Job ID
    • CyberShake_PP_DAXGen.subMain(): Create PP workflow
    • <Same call graph as in PP workflow, above>
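
CyberShake_Integrated_DAXGen.parseCommandLine() separates the incoming arguments into SGT and PP arguments before handing them to the two subMain() calls. One way this could work is sketched below, using marker flags to delimit each group. The marker names (--sgtargs, --ppargs), the "common" group, and the example options are hypothetical and chosen only for illustration; the actual option names and splitting rules are defined in the generator source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; this is not the actual CyberShake_Integrated_DAXGen code.
class ArgSplitSketch {

    // Groups arguments by the most recent marker seen; arguments before any
    // marker go into the "common" group.
    static Map<String, List<String>> splitArgs(String[] args, Set<String> markers) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        String current = "common";
        groups.put(current, new ArrayList<>());
        for (String arg : args) {
            if (markers.contains(arg)) {
                current = arg;
                groups.putIfAbsent(current, new ArrayList<>());
            } else {
                groups.get(current).add(arg);
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        Set<String> markers = new HashSet<>(Arrays.asList("--sgtargs", "--ppargs"));
        // Made-up command line; the real options differ.
        String[] example = {"-r", "1234", "--sgtargs", "-f", "1.0", "--ppargs", "-ds"};
        Map<String, List<String>> groups = splitArgs(example, markers);
        System.out.println("SGT args: " + groups.get("--sgtargs"));
        System.out.println("PP args:  " + groups.get("--ppargs"));
    }
}
```

Each group could then be converted back to a String[] and passed to the corresponding subMain(), so the SGT and PP generators parse only the options they understand.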