Difference between revisions of "Rupture Variation Generator v5.5.2 code changes"

Latest revision as of 20:18, 2 November 2022

This page details the changes that were made to the Graves & Pitarka rupture generator v5.5.2, as shipped in the BBP, to make it CyberShake-ready. These changes are required to:

Create an API
Support using the code in two different modes:
- Determine the number of rupture variations for a given rupture
- Generate a rupture variation from a given geometry
Support the use of memcached
Pass in the rvfrac and seed value, if desired

In this document, $ROOT_DIR is taken to be the <CyberShake home>/software/RuptureCodes/RupGen-api-5.5.2 directory. GenRandV5.0 is the code as obtained from the BBP v22.4.

Installing files

Create src, include, lib, and bin directories inside $ROOT_DIR .
Inside src, create GenRandV5.0.
Copy rupgen_api.c from a previous API version into src.
Copy rupgen_defs.h from a previous API version into src.
Create a symlink to rupgen_defs.h in src/GenRand5.0.
Copy rupgen_api.h from a previous API version into src.
Create a symlink to rupgen_api.h in src/GenRand5.0.
Confusingly, there are two files named structure.h. One is in StandRupFormat and contains the structures needed for an SRF. The other is in GenRandV5.0 and contains the definition for complex, along with a few other supporting structures.
1. Copy the BBP StandRupFormat/structure.h into src/srf_structure.h .
2. Copy the BBP GenRandV5.0/structure.h into src/structure.h .
3. Create a symlink to both files in src/GenRandV5.0.
There are also two files named defs.h.
1. Copy StandRupFormat/defs.h into src/GenRandV5.0/srf_defs.h .
2. Copy GenRandV5.0/defs.h into src/GenRandV5.0/defs.h .
From the code provided by Rob, copy the following files into src/GenRandV5.0. Some of them are only lightly used, but it's easier to copy in the entire file rather than only taking the needed functions:
- function.h
- include.h
- fourg.f
- genslip-v5.5.2.c
- gslip_srf_subs.c
- iofunc.c
- misc.c
- ruptime.c
- slip.c
- sliprate_subs.c
- srf_subs.c (from StandRupFormat)
- gslip_sliprate_subs.c
- geoproj_subs.c
- wafront2d-rwg.f
- makefile
Copy Makefile from a previous API version into src.

Editing files

In $ROOT_DIR/src

structure.h

At the top of the file, change '#include "StandRupFormat/structure.h"' to '#include srf_structure.h'
Above that line, add:
```
#ifndef STRUCT_H
#define STRUCT_H
```
At the end of the file, add:
```
#endif
```

srf_structure.h

At the top of the file, add:

#ifndef SRF_STRUCT_H
#define SRF_STRUCT_H

At the end of the file, add:
```
#endif
```

Makefile

Edit the GETPAR values to point to the getpar install.
Edit GENSLIP to have the same name as the GenRandV5.0 directory.
Edit the library creation line (under the librupgen.a target, starts with $(AR)) to point to the correct version of genslip.

In GenRandV5.0

defs.h

At the top of the file, add:

#ifndef DEFS_H
#define DEFS_H

#include "srf_defs.h"

Delete DEFAULT_SHAL_VRUP_FRAC, as it is not used anymore.
At the end of the file, add:
```
#endif
```

srf_defs.h

At the top of the file, add:
```
#ifndef SRF_DEFS_H
#define SRF_DEFS_H
```
Delete DEFAULT_SHAL_VRUP_FRAC, as it is not used anymore.
Delete the lines corresponding to RDONLY_FLAGS, RDWR_FLAGS, CROPTR_FLAGS, and the _FILE_OFFSET_BITS if statement, as this already exists in defs.h.
At the end of the file, add:
```
#endif
```

include.h

At the top of the file, add:

#ifndef RUPGEN_INCLUDE_H
#define RUPGEN_INCLUDE_H

At the end of the file, add:

#include <string.h>
#include "fftw3.h"

#include "rupgen_api.h"
#include "rupgen_defs.h"

#ifdef _USE_MEMCACHED
#include <libmemcached/memcached.h>
#endif

#endif

function.h

At the top of the file, add:

#ifndef RUPGEN_FUNC_H
#define RUPGEN_FUNC_H

#include "structure.h"

Change the entry for read_ruppars to _read_ruppars, like:

struct pointsource *_read_ruppars(char *,struct pointsource *,float *,int *,int *,float *,float *,float *,float *,float *,float *,float *);

Add an entry for get_rupt:

void get_rupt(struct velmodel* vm, float* h, float* srcd, float* recd, float* srcr, float* recr, double* p, double* rad, float* tt);

Add an entry for zapit:
```
void zapit(float* s,int n);
```

Add entries for load_command_srf() and load_seed_srf():

void load_command_srf(struct standrupformat *srf,int ac,char **av);
void load_seed_srf(struct standrupformat *srf,int starting_seed,int ending_seed);

At the end of the file, add:

void write_srf1(struct standrupformat *srf,char *file,int bflag);
void write_srf2(struct standrupformat *srf,char *file,int bflag);

#ifdef _USE_MEMCACHED
int mc_genslip(int ac,char **av, rg_stats_t *stats, struct standrupformat* srf, int state, char* mc_server);
struct pointsource* _mc_read_ruppars(char *file,struct pointsource *psrc,float *mag,int *nx,int *ny,float *dx,float *dy,float *dtop,float *stk,float *dip,float *elon,float *elat, char* mc_server);
#else
int genslip(int ac,char **av, rg_stats_t *stats, struct standrupformat* srf, int state);
#endif

#endif

slip.c

In line 850, just before

arr = check_realloc(arr, n2*sizeof(fftwf_complex));

insert

fftwf_destroy_plan(plan);

In line 881, change the 'fftwf_free(arr)' to 'free(arr)'; since arr was allocated using check_malloc/check_realloc it should be freed with just free().

makefile

Edit the 2nd line that begins with OBJS so it contains

OBJS = iofunc.o misc.o slip.o ruptime.o srf_subs.o gslip_srf_subs.o geoproj_subs.o gslip_sliprate_subs.o wafront2d-rwg.o fourg.o

Delete the SRF_OBJS and FDRT_OBJS lines, 3-4.
Edit GETPAR and INCPAR so they correctly point to the Getpar install in <CyberShake home>/software/Getpar .
Modify FFTW_INCDIR and FFTW_LIBDIR to point to the FFTW directories on the system.

Add:

# Uncomment for memcached support
MEMC = ../../../../../utils/libmemcached_1.0.18
CFLAGS := $(CFLAGS) -I${MEMC}/include -D_USE_MEMCACHED
LDLIBS := ${LDLIBS} -L${MEMC}/lib -lmemcached

Edit the 'all' target to only be genslip-v5.5.2.
Modify the genslip-v5.5.2 target. We need to remove unnecessary objects, and compile it to an object rather than an executable for inclusion in the genslip library. It should look like:

genslip-v5.4.2.o : genslip-v5.4.2.c ${OBJS}
        ${CC} ${CFLAGS} -c -o genslip-v5.4.2.o genslip-v5.4.2.c ${LDLIBS}

If you choose to keep the other targets, remove ${SRF_OBJS} and ${FDRT_OBJS} and replace with just ${OBJS}.

gslip_srf_subs.c

In line 188, at the end of init_plane_srf(), add initialization for srf.np_seg, so we can free it without worry:
```
if (strcmp(srf[0].version, "2.0")==0) {
        srf[0].np_seg = NULL;
        srf[0].srf_hcmnt.cbuf = NULL;
}
```

Enclose lines 1188-1198 in an if statement:

if (strcmp(srf[0].version, "2.0")==0) {
        srf[0].srf_hcmnt.nline = 0;

        /* 2018-04-02 updated for multiple segs */
        srf[0].nseg = srf[0].srf_prect.nseg;
        srf[0].np_seg = (int *)check_malloc((srf[0].nseg)*sizeof(int));

        for(iseg=0;iseg<nseg;iseg++)
           srf[0].np_seg[iseg] = prseg_ptr[iseg].nstk*prseg_ptr[0].ndip;

        /* end version 2.0 stuff, just in case */
}

iofunc.c

In line 179, change the function name read_ruppars to _read_ruppars:

struct pointsource *_read_ruppars(char *file,struct pointsource *psrc,float *mag,int *nx,int *ny,float *dx,float *dy,float *dtop,float *stk,float *dip,float *elon,float *elat)

Add the function _mc_add_file to the end of the file:

#ifdef _USE_MEMCACHED
void _mc_add_file(char* filename, char** file_buffer, memcached_st* mst) {
        memcached_return ret;
        uint32_t flags = 0;

        FILE* fpr = fopfile(filename, "rb");
        //Get file size so we know how big to make the file buffer
        fseek(fpr, 0, SEEK_END);
        long file_length = ftell(fpr);
        fseek(fpr, 0, SEEK_SET);
        *file_buffer = check_malloc(sizeof(char) * (file_length+1));
        memset(*file_buffer, '\0', file_length+1);
        int part_id = 0;
        long read = fread(*file_buffer, 1, file_length, fpr);
        if (read!=file_length) {
                fprintf(stderr, "READ ERROR - %ld attempted %ld read.\n", file_length, read);
                exit(1);
        }
        char* file_key = check_malloc(sizeof(char) * (strlen(filename) + 4));
        char buffer[1000*1000];
        //Memcached thinks this is 1 MB
        int ONE_MB = 1000*1000;
        int i=0;
        while (i*ONE_MB < file_length) {
                int to_copy = ONE_MB;
                if (file_length - i*ONE_MB < ONE_MB) {
                        to_copy = file_length - i*ONE_MB;
                }
                sprintf(file_key, "%s.%d", filename, i);
                memcpy(buffer, (*file_buffer)+i*ONE_MB, to_copy);

                ret = memcached_set(mst, file_key, strlen(file_key), buffer, strlen(buffer), 0, flags);
                if (ret!=MEMCACHED_SUCCESS) {
                        printf("Caching of %s failed.\n", file_key);
                        printf("%s\n", memcached_strerror(mst, ret));
                        printf("value length: %d bytes\n", strlen(buffer));
                }
                i++;
                memset(buffer, '\0', ONE_MB);
        }
        //Commit number of parts
        int num_parts = i;
        char* parts_key = check_malloc(sizeof(char) * (strlen(filename) + 1));
        strcpy(parts_key, filename);
        ret = memcached_set(mst, parts_key, strlen(parts_key), (char*)(&num_parts), sizeof(int), 0, flags);
        if (ret!=MEMCACHED_SUCCESS) {
                printf("Caching of %s failed.\n", parts_key);
                printf("%s\n", memcached_strerror(mst, ret));
        }
        free(file_key);
        free(parts_key);
        fclose(fpr);
}

Add the definition of _mc_read_ruppars:

struct pointsource* _mc_read_ruppars(char *file,struct pointsource *psrc,float *mag,int *nx,int *ny,float *dx,float *dy,float *dtop,float *stk,float *dip,float *elon,float *elat, char* mc_server)
{
       FILE *fpr, *fopfile();
       float area;
       int i, nn;
       char str[1024];

       double rperd = 0.017453293;
       int ONE_MB = 1000*1000; //This is what memcached thinks 1 MB is
       *dtop = 1.0e+15;
       *stk = 0.0;
       *dip = 0.0;
       *elon = 0.0;
       *elat = 0.0;

       char* buffer;
       char* file_data;
       int* num_parts;

        printf("Using %s as memcached server.\n", mc_server);

        memcached_return ret;
        memcached_st* mst;

        mst = memcached_create(NULL);
        memcached_server_st *server = memcached_server_list_append(NULL, mc_server, 11211, &ret);

        ret = memcached_server_push(mst, server);
        memcached_server_free(server);

        uint32_t flags = 0;

        //See if file parts is in cache
        //key is filename + "_" + part
        char* parts_key = check_malloc(sizeof(char) * (strlen(file) + 1));
        strcpy(parts_key, file);

        size_t incoming_size;
        num_parts = (int*) memcached_get(mst, parts_key, strlen(parts_key), &incoming_size, &flags, &ret);
        if (num_parts==NULL) {
                printf("Key %s hasn't been cached, adding.\n", parts_key);
                _mc_add_file(file, &file_data, mst);
        } else {
                printf("Found key %s, num parts is %d.\n", parts_key, *num_parts);
                printf("Looking for parts.\n");
                //See if all the parts are actually around
                file_data = check_malloc(sizeof(char) * (ONE_MB * *num_parts + 1));
                memset(file_data, '\0', ONE_MB* *num_parts + 1);
                char* file_key = check_malloc(sizeof(char) * (strlen(file) + 4));
                size_t incoming;
                for (i=0; i<*num_parts; i++) {
                        sprintf(file_key, "%s.%d", file, i);
                        buffer = (char*) memcached_get(mst, file_key, strlen(file_key), &incoming, &flags, &ret);
                        if (buffer==NULL) {
                                printf("Piece %s was missing, adding file %s.\n", file_key, file);
                                //This piece is missing
                                //Easiest to add the whole file
                                free(file_data);
                                _mc_add_file(file, &file_data, mst);
                                break;
                        } else {
                                memcpy(file_data+i*ONE_MB, buffer, incoming);
                        }
                }
                free(file_key);
        }

        free(parts_key);
        memcached_free(mst);

        char* tok;
        tok = strtok(file_data, "\n"); /* Probability = <float> */

        tok = strtok(NULL, "\n"); /* Magnitude = <float> */
        sscanf(tok,"%*s %*s %f",mag);

        tok = strtok(NULL, "\n");   /* GridSpacing = <float> */
        sscanf(tok,"%*s %*s %f",dx);

        if(*dx == (float)(0.0))
           {
           fprintf(stderr,"***** input error\n");
           fprintf(stderr,"      GridSpacing = 0.0, exiting...\n");
           exit(-1);
           }

        *dy = *dx;

        tok = strtok(NULL, "\n");   /* NumRows = <int> */
        sscanf(tok,"%*s %*s %d",ny);

        tok = strtok(NULL, "\n");   /* NumCols = <int> */
        sscanf(tok,"%*s %*s %d",nx);

        tok = strtok(NULL, "\n");/* header comment */
        psrc = (struct pointsource *)check_realloc(psrc,(*nx)*(*ny)*sizeof(struct pointsource));

        area = (*dx)*(*dy)*1.0e+10;  /* km -> cm */

        for(i=0;i<(*nx)*(*ny);i++)
           {
           tok = strtok(NULL, "\n");   /* Lat , Lon , Depth , Rake , Dip , Strike */
           sscanf(tok,"%f %f %f %f %f %f",&psrc[i].lat,
                                          &psrc[i].lon,
                                          &psrc[i].dep,
                                          &psrc[i].rak,
                                          &psrc[i].dip,
                                          &psrc[i].stk);

           psrc[i].area = area;

           if(psrc[i].dep < *dtop)
              *dtop = psrc[i].dep;

           *stk = *stk + psrc[i].stk;
           *dip = *dip + psrc[i].dip;
           }
        *stk = *stk/((*nx)*(*ny));
        *dip = *dip/((*nx)*(*ny));

        nn = 0;
        for(i=0;i<(*nx)*(*ny);i++)
           {
           if(psrc[i].dep < (*dtop + 0.01))
              {
              *elon = *elon + psrc[i].lon;
              *elat = *elat + psrc[i].lat;
              nn++;
              }
           }

        /* adjust for half subfault width */
        *dtop = (*dtop) - 0.5*(*dx)*sin((*dip)*rperd);

        if(nn == 0)
           nn++;

        *elon = *elon/nn;
        *elat = *elat/nn;

        free(file_data);

        return(psrc);
}
#endif

misc.c

In line 68, modify zapit to use a for-loop and specify a return type. The function should become:
```
void zapit(float* s,int n)
{
int i;
for (i=0; i<n; i++) {
        s[i] = 0.0;
}
}
```

Add the free_srf_ptrs function to clean up everything:

void free_srf_ptrs(struct standrupformat* srf) {
        struct srf_planerectangle srf_prect = srf->srf_prect;
        struct srf_allpoints srf_apnts = srf->srf_apnts;

        free(srf_prect.prectseg);
        int i;
        for (i=0; i<srf_apnts.np; i++) {
                free(srf_apnts.apntvals[i].stf1);
                free(srf_apnts.apntvals[i].stf2);
                free(srf_apnts.apntvals[i].stf3);
        }
        free(srf_apnts.apntvals);
        if (strcmp(srf->version,"2.0")==0) {
                free(srf->np_seg);
                free(srf->srf_hcmnt.cbuf);
        }
}

genslip-v5.5.2.c

This is where the majority of changes are made to turn this into an API. I recommend having an original copy for reference.

Note that line numbers are approximate, and reflect the line number after having made the previous edits.

In line 352, change from

int main(int ac,char **av)

to

#ifdef _USE_MEMCACHED
int mc_genslip(int ac, char** av, rg_stats_t *stats, struct standrupformat* srf, int state, char* memcached_server)
#else
int genslip(int ac,char **av, rg_stats_t *stats, struct standrupformat* srf, int state)
#endif

In line 415, after

char seedfile[1024];

add

//Used to bypass iterating over slips to get to the 'right' seed
int use_unmodified_seed = 0;

In line 444, after

struct velmodel rvmod;

add initializations:

rvmod.vs = rvmod.invb2 = NULL;
rvmod.th = NULL;

In line 451, after

float delh, hx, hg, gwid2, *gbnd, *gwid, shypo_mseg, *rvfac_seg;

add initializations:

rvfac_seg = gbnd = gwid = NULL;

Delete SRF definition in line 602:
```
struct standrupformat srf;
```

After line 621:

int outbin = 0;

Add

int doslip = -1;
int dohypo = -1;

In line 638, after

double dh, *fdrt, *rslw, *fspace;

add initializations:

fdrt = rslw = fspace = NULL;

In line 642, after

int ixs, iys, nsring, ntot, *ispace;

add initialization:

ispace = NULL;

Change 'srf.version' to 'srf->version' in line 653.
In line 653, change the default version to 2.0:
```
sprintf(srf->version,"2.0");
```

In line 741, after 'getpar("outbin","d",&outbin);', add:

getpar("doslip","d",&doslip);
getpar("dohypo","d",&dohypo);

In line 750-52, change all the 'srf.version' to 'srf->version'.

In line 915, after

getpar("seed","d",&seed);

add

getpar("use_unmodified_seed","d",&use_unmodified_seed);

In line 1025, change

 psrc_orig = read_ruppars(infile,psrc_orig,&mag,&nstk,&ndip,&dstk,&ddip,&dtop,&avgstk,&avgdip,&elon,&elat);

to

#ifdef _USE_MEMCACHED
   psrc_orig = _mc_read_ruppars(infile,psrc_orig,&mag,&nstk,&ndip,&dstk,&ddip,&dtop,&avgstk,&avgdip,&elon,&elat,memcached_server);
#else
   psrc_orig = _read_ruppars(infile,psrc_orig,&mag,&nstk,&ndip,&dstk,&ddip,&dtop,&avgstk,&avgdip,&elon,&elat);
#endif

In line 1077, in the init_plane_srf() call, change '&srf' to 'srf'.
In line 1081, in the get_seg_bounds() call, change '&srf' to 'srf'.

In line 1393, after 'dhypo = dhypo_frac*fwid;', add:

stats->numslip = ns;
stats->numhypo = nh;

if (state==GET_STATS) {
   free(rvfac_seg);
   free(gbnd);
   free(gwid);
   free(psrc);
   free(psrc_orig);
   free(slip_c);
   free(rake_c);
   free(tsfac1_c);
   free(rtime1_c);
   free(rough_c);
   free(tsfac2_c);
   free(rtime2_c);
   free(slip_r);
   free(rake_r);
   free(tsfac1_r);
   free(rtime1_r);
   free(rough_r);
   free(tsfac2_r);
   free(rtime2_r);
   free(stk_r);
   free(dip_r);
   free(psrc_rake);
   free(vmod.vp);
   free(vmod.vs);
   free(vmod.den);
   free(vmod.th);
   free(vmod.dep);
   free(vmod.mu);
   free(vmod.invb2);
   free_srf_ptrs(srf);
   return 0;
}


if (dohypo>=nh) {
        fprintf(stderr,"Error:  You requested hypocenter ID %d but from infile %s genslip calculates %d hypocenters.\n", dohypo, infile, nh);
        exit(1);
}

if (doslip>=ns) {
        fprintf(stderr,"Error:  You requested slip ID %d but from infile %s genslip calculates %d slips.\n", doslip, infile, ns);
        exit(2);
}

if (ns<-1 || nh<-1) {
        fprintf(stderr, "Error: hypocenter and slip ID need to be >= -1.");
        exit(3);
}

In line 1470, after:

for(js=0;js<ns;js++)    /* loop over slip/rupture realizations */
   {

add

   if (doslip!=-1 && js>doslip) {
        //We've already done the slip we need, break out of the loop
        break;
   }

   if (doslip!=-1 && js<doslip) {
        //Haven't gotten to the right slip yet, continue
        continue;
   }

In lines 1500-1509, change iterative calculation of shypo and dhypo to explicit. Delete

if(ih_scnt == nhypo_s)
         {
         ih_scnt = 0;
         ih_dcnt++;
         }

      shypo = hypo_s0 + ih_scnt*hypo_ds;
      dhypo = hypo_d0 + ih_dcnt*hypo_dd;

      ih_scnt++;

and replace with

      shypo = hypo_s0 + (js%nhypo_s)*hypo_ds;
      dhypo = hypo_d0 + (js/nhypo_s)*hypo_dd;

In line 1496, put the for loop that calls _sfrand() into an if statement. It should look like:

   if (use_unmodified_seed==0) {
       for(k=0;k<10*js;k++)
          sval =  _sfrand(&seed);
   }

In line 2653, in the load_slip_srf_dd4() call, change '&srf' to 'srf'.

In the for loop starting in line 2742, starting with for(j=0;j<ndip;j++), we replace some of these calculations with refactored ones to pull out constants and take advantage of FMA. Specifically, make the following changes:

Before the for loop, define the following variables:

float half_ddip = 0.5*ddip;
float half_dstk_minus_half_flen = 0.5*dstk - 0.5*flen;
float half_flen_plus_shypo = 0.5*flen + shypo;
y0 = (deep_vrdep1-dtop)/sin(avgdip*rperd);
float deep_alpha = 1.0 + (deep_vrdep1-deep_vrup*deep_vrdep1)/(deep_vrdep2-deep_vrdep1);
float deep_const = (deep_vrup - 1.0)/(deep_vrdep2-deep_vrdep1);

Replace

yy = (j + 0.5)*ddip;

with

yy = j*ddip + half_ddip;

Replace

xx = (i+0.5)*dstk - 0.5*flen;

with

xx = i*dstk + half_dstk_minus_half_flen;

Replace

hg = (gbnd[ig]-0.5*flen)-shypo;

with

hg = gbnd[ig] - half_flen_plus_shypo;

Replace

vrfac = 1.0 + (deep_vrup - 1.0)*((psrc[ip].dep)-deep_vrdep1)/(deep_vrdep2-deep_vrdep1);

with

vrfac = deep_alpha + deep_const*psrc[ip].dep;

In line 2862, after 'psrc[j].rupt = psrc[j].rupt - tsmin + rupture_delay;', add:
```
if (dohypo!=ih || doslip!=js) {
    continue;
}
```
In line 2869, in the load_rupt_srf() call, change '&srf' to 'srf'.
In line 2892, remove:
```
 write2srf(srf,str,outbin);
```
Writing out SRFs is done by calling write_srf() on the srf object.
In line 2906, remove
```
free_srf_stf(&srf);
```
In lines 2887 and 2890, change '&srf' to 'srf'.

At the very end of the function, we want to free everything not needed by the SRF object. Add:

free(rvfac_seg);
free(gbnd);
free(gwid);
free(psrc);
free(psrc_orig);
free(slip_c);
free(rake_c);
free(tsfac1_c);
free(rtime1_c);
free(rough_c);
free(tsfac2_c);
free(rtime2_c);
free(slip_r);
free(rake_r);
free(tsfac1_r);
free(rtime1_r);
free(rough_r);
free(tsfac2_r);
free(rtime2_r);
free(stk_r);
free(dip_r);
free(psrc_rake);
free(rspd);
free(vmod.vp);
free(vmod.vs);
free(vmod.den);
free(vmod.th);
free(vmod.dep);
free(vmod.mu);
free(vmod.invb2);
free(rvmod.vs);
free(rvmod.th);
free(rvmod.invb2);
free(rslw);
free(fdrt);
free(fspace);
free(ispace);
fftwf_cleanup();

return 0;

Once verification is complete, you may wish to comment out some of the print statements.

Function name changes

Since Rob uses some of the same helper functions in his genslip code as in his jbsim3d code, we prepend an underscore to some of the function names in the genslip library to avoid namespace conflicts when compiling DirectSynth.

From the src directory, run the following sed command

sed -i 's/<function name>/_<function name>/g' *.c *.h */*.c */*.h

for each of these function names:

fopfile
opfile_ro
opfile
croptrfile
reed
rite
check_malloc
check_realloc
sfrand
gaus_rand
zapit
set_ne
set_ll
init_plane_srf
load_slip_srf
load_rupt_srf
gen_2tri_stf
gen_brune_stf

This might accidentally convert a 'write' to 'w_rite', and you might end up with _f_opfile and __opfile_ro; fix those with

sed -i 's/w_rite/write/g' *.c *.h */*.c */*.h
sed -i 's/__opfile_ro/_opfile_ro/g' *.c *.h */*.c */*.h
sed -i 's/_f_opfile/_fopfile/g' *.c *.h */*.c */*.h

and also run (since Fortran function names can't start with an underscore, but are case insensitive):

sed -i 's/sort/sort1/g' */*.f
sed -i 's/SORT/SORT1/g' */*.f

Optimizations

Based on work done with v5.4.2, we found that we can reduce the runtime of kfilt_beta2() by 75%, speeding up the entire code for large ruptures by 15-20%.

Specifically, we took (slip.c, line 1469)

k2 = kx*kx + ky*ky;
fac = 1.0/((1.0 + exp(ord*log(k2*lmin2)))*(1.0 + exp(-ord*log(k2*lmax2))));
amp = fac*exp(-0.5*beta2*log(k2));

and modified it via algebraic manipulation and pulling constants out of the loops:

//New constants
float lmin2_ord = pow(lmin2, ord);
float lmax2_neg_ord = pow(lmax2, -1.0*ord);
float k2_ord;

for(j=0;j<=ny0/2;j++)  /* only do positive half, then use symmetry */
<snip>
        k2 = kx*kx + ky*ky;
        k2_ord = k2*k2*k2*k2;
        fac = 1.0/((1.0+k2_ord*lmin2_ord)*(1.0+1.0/k2_ord*lmax2_neg_ord));
        amp = fac*1.0/k2;

These optimizations depend on beta=2 and ord=4, so before the for loops I added checks that this is in fact true:

if (ord!=4) {
        fprintf(stderr, "The optimized calculation of amp in kfilt_beta2 depends on ord=4.  Ord is actually %d, so we are aborting and you should change this calculation back to the unoptimized version.\n", ord);
        exit(5);
}

if (fabs(beta2-2.0)>0.000001) {
        fprintf(stderr, "The optimized calculation of amp in kfilt_beta2 depends on beta2 = 2.  beta2 is actually %d, so we are aborting and you should change this calculation.\n", beta2);
        exit(6);
}

Compiling

Go back into the src directory and try to compile it with 'make'.

Difference between revisions of "Rupture Variation Generator v5.5.2 code changes"

Latest revision as of 20:18, 2 November 2022

Contents

Installing files

Editing files

In $ROOT_DIR/src

structure.h

srf_structure.h

Makefile

In GenRandV5.0

defs.h

srf_defs.h

include.h

function.h

slip.c

makefile

gslip_srf_subs.c

iofunc.c

misc.c

genslip-v5.5.2.c

Function name changes

Optimizations

Compiling

Navigation menu

Page actions

Page actions

Personal tools

Navigation

Search

Tools

@@ Line 44: / Line 44: @@
 <li>sliprate_subs.c</li>
 <li>srf_subs.c (from StandRupFormat)</li>
+<li>gslip_sliprate_subs.c</li>
+<li>geoproj_subs.c</li>
 <li>wafront2d-rwg.f</li>
 <li>makefile</li>
@@ Line 167: / Line 169: @@
 <li>Add an entry for zapit:
 <pre>void zapit(float* s,int n);</pre></li>
+<li>Add entries for load_command_srf() and load_seed_srf():
+<pre>void load_command_srf(struct standrupformat *srf,int ac,char **av);
+void load_seed_srf(struct standrupformat *srf,int starting_seed,int ending_seed);</pre></li>
 <li>At the end of the file, add:
 <pre>
@@ Line 509: / Line 514: @@
 #endif
 </pre></li>
+<li>In line 415, after
+<pre>char seedfile[1024];</pre>
+add
+<pre>//Used to bypass iterating over slips to get to the 'right' seed
+int use_unmodified_seed = 0;</pre>
 <li>In line 444, after
 <pre>struct velmodel rvmod;</pre>
@@ Line 520: / Line 530: @@
 <li>Delete SRF definition in line 602:
 <pre>struct standrupformat srf;</pre></li>
-<li>After line 601:
+<li>After line 621:
 <pre>int outbin = 0;</pre>
 Add
@@ Line 527: / Line 537: @@
 int dohypo = -1;
 </pre></li>
-<li>In line 622, after
+<li>In line 638, after
 <pre>double dh, *fdrt, *rslw, *fspace;</pre>
 add initializations:
 <pre>fdrt = rslw = fspace = NULL;</pre></li>
-<li>In line 626, after
+<li>In line 642, after
 <pre>int ixs, iys, nsring, ntot, *ispace;</pre>
 add initialization:
 <pre>ispace = NULL;</pre></li>
-<li>Change 'srf.version' to 'srf->version' in line 631.</li>
+<li>Change 'srf.version' to 'srf->version' in line 653.</li>
-<li>In line 636, change the default version to 2.0:
+<li>In line 653, change the default version to 2.0:
 <pre>sprintf(srf->version,"2.0");</pre></li>
-<li>In line 699, after 'getpar("outbin","d",&outbin);', add:
+<li>In line 741, after 'getpar("outbin","d",&outbin);', add:
 <pre>
 getpar("doslip","d",&doslip);
 getpar("dohypo","d",&dohypo);
 </pre></li>
-<li>In line 707-9, change all the 'srf.version' to 'srf->version'.
+<li>In line 750-52, change all the 'srf.version' to 'srf->version'.
-<li>In line 959, change
+<li>In line 915, after
+<pre>getpar("seed","d",&seed);</pre>
+add
+<pre>getpar("use_unmodified_seed","d",&use_unmodified_seed);</pre>
+<li>In line 1025, change
 <pre> psrc_orig = read_ruppars(infile,psrc_orig,&mag,&nstk,&ndip,&dstk,&ddip,&dtop,&avgstk,&avgdip,&elon,&elat);</pre>
 to
@@ Line 554: / Line 568: @@
 #endif
 </pre></li>
-<li>In line 1012, change '&srf' to 'srf'.</li>
+<li>In line 1077, in the init_plane_srf() call, change '&srf' to 'srf'.</li>
-<li>In line 1016, change '&srf' to 'srf'.</li>
+<li>In line 1081, in the get_seg_bounds() call, change '&srf' to 'srf'.</li>
-<li>In line 1315, after 'dhypo = dhypo_frac*fwid;', add:
+<li>In line 1393, after 'dhypo = dhypo_frac*fwid;', add:
 <pre>
 stats->numslip = ns;
@@ Line 584: / Line 598: @@
     free(dip_r);
     free(psrc_rake);
+   free(vmod.vp);
+   free(vmod.vs);
+   free(vmod.den);
+   free(vmod.th);
+   free(vmod.dep);
+   free(vmod.mu);
+   free(vmod.invb2);
+   free_srf_ptrs(srf);
     return 0;
 }
 if (dohypo>=nh) {
@@ Line 602: / Line 625: @@
 }
 </pre></li>
-<li>In line 1363, after:
+<li>In line 1470, after:
 <pre>
 for(js=0;js<ns;js++)    /* loop over slip/rupture realizations */
@@ Line 619: / Line 642: @@
     }
 </pre></li>
-<li>In lines 1394-1403, change iterative calculation of shypo and dhypo to explicit.  Delete
+<li>In lines 1500-1509, change iterative calculation of shypo and dhypo to explicit.  Delete
 <pre>
 if(ih_scnt == nhypo_s)
@@ Line 637: / Line 660: @@
        dhypo = hypo_d0 + (js/nhypo_s)*hypo_dd;
 </pre></li>
-<li>In line 2394, change '&srf' to 'srf'.</li>
+<li>In line 1496, put the for loop that calls _sfrand() into an if statement.  It should look like:
-<li>In the for loop starting in line 2452, we replace some of these calculations with refactored ones to pull out constants and take advantage of FMA.  Specifically, make the following changes:
+<pre>
+   if (use_unmodified_seed==0) {
+       for(k=0;k<10*js;k++)
+          sval =  _sfrand(&seed);
+   }
+</pre>
+<li>In line 2653, in the load_slip_srf_dd4() call, change '&srf' to 'srf'.</li>
+<li>In the for loop starting in line 2742, starting with for(j=0;j<ndip;j++), we replace some of these calculations with refactored ones to pull out constants and take advantage of FMA.  Specifically, make the following changes:
 <ol>
 <li>Before the for loop, define the following variables:
@@ Line 646: / Line 676: @@
 float half_flen_plus_shypo = 0.5*flen + shypo;
 y0 = (deep_vrdep1-dtop)/sin(avgdip*rperd);
-float deep_alpha = 1.0 + (deep_vrdep1-deep_rup*deep_vrdep1)/(deep_vrdep2-deep_vrdep1);
+float deep_alpha = 1.0 + (deep_vrdep1-deep_vrup*deep_vrdep1)/(deep_vrdep2-deep_vrdep1);
-float deep_const = (deep_rup - 1.0)/(deep_vrdep2-deep_vrdep1);
+float deep_const = (deep_vrup - 1.0)/(deep_vrdep2-deep_vrdep1);
 </pre></li>
 <li>Replace
@@ Line 671: / Line 701: @@
 </ol>
 </li>
-<li>In line 2573, after 'psrc[j].rupt = psrc[j].rupt - tsmin + rupture_delay;', add:
+<li>In line 2862, after 'psrc[j].rupt = psrc[j].rupt - tsmin + rupture_delay;', add:
 <pre>
 if (dohypo!=ih || doslip!=js) {
@@ Line 677: / Line 707: @@
 }
 </pre></li>
-<li>In line 2601, change '&srf' to 'srf'.</li>
+<li>In line 2869, in the load_rupt_srf() call, change '&srf' to 'srf'.</li>
-<li>In line 2611, remove
+<li>In line 2892, remove:
-<pre>free_srf_stf(&srf);</pre></li>
-<li>In line 2618, change '&srf' to 'srf'.</li>
-<li>In line 2631, remove:
 <pre> write2srf(srf,str,outbin);</pre>
 Writing out SRFs is done by calling write_srf() on the srf object.</li>
+<li>In line 2906, remove
+<pre>free_srf_stf(&srf);</pre></li>
+<li>In lines 2887 and 2890, change '&srf' to 'srf'.
 <li>At the very end of the function, we want to free everything not needed by the SRF object.  Add:
 <pre>
@@ Line 719: / Line 749: @@
 free(rvmod.th);
 free(rvmod.invb2);
+free(rslw);
+free(fdrt);
+free(fspace);
+free(ispace);
+fftwf_cleanup();
 return 0;
@@ Line 753: / Line 788: @@
 *gen_brune_stf
-This might accidentally convert a 'write' to 'w_rite'; fix that with
+This might accidentally convert a 'write' to 'w_rite', and you might end up with _f_opfile and __opfile_ro; fix those with
-<pre> sed -i 's/w_rite/write/g' *.c *.h */*.c */*.h</pre>
+<pre>sed -i 's/w_rite/write/g' *.c *.h */*.c */*.h
+sed -i 's/__opfile_ro/_opfile_ro/g' *.c *.h */*.c */*.h
+sed -i 's/_f_opfile/_fopfile/g' *.c *.h */*.c */*.h
+</pre>
 and also run (since Fortran function names can't start with an underscore, but are case insensitive):
@@ Line 760: / Line 798: @@
 sed -i 's/SORT/SORT1/g' */*.f
 </pre>
+=== Optimizations ===
+Based on work done with v5.4.2, we found that we can reduce the runtime of kfilt_beta2() by 75%, speeding up the entire code for large ruptures by 15-20%.
+Specifically, we took (slip.c, line 1469)
+<pre>
+k2 = kx*kx + ky*ky;
+fac = 1.0/((1.0 + exp(ord*log(k2*lmin2)))*(1.0 + exp(-ord*log(k2*lmax2))));
+amp = fac*exp(-0.5*beta2*log(k2));
+</pre>
+and modified it via algebraic manipulation and pulling constants out of the loops:
+<pre>
+//New constants
+float lmin2_ord = pow(lmin2, ord);
+float lmax2_neg_ord = pow(lmax2, -1.0*ord);
+float k2_ord;
+for(j=0;j<=ny0/2;j++)  /* only do positive half, then use symmetry */
+<snip>
+        k2 = kx*kx + ky*ky;
+        k2_ord = k2*k2*k2*k2;
+        fac = 1.0/((1.0+k2_ord*lmin2_ord)*(1.0+1.0/k2_ord*lmax2_neg_ord));
+        amp = fac*1.0/k2;
+</pre>
+These optimizations depend on beta=2 and ord=4, so before the for loops I added checks that this is in fact true:
+<pre>
+if (ord!=4) {
+        fprintf(stderr, "The optimized calculation of amp in kfilt_beta2 depends on ord=4.  Ord is actually %d, so we are aborting and you should change this calculation back to the unoptimized version.\n", ord);
+        exit(5);
+}
+if (fabs(beta2-2.0)>0.000001) {
+        fprintf(stderr, "The optimized calculation of amp in kfilt_beta2 depends on beta2 = 2.  beta2 is actually %d, so we are aborting and you should change this calculation.\n", beta2);
+        exit(6);
+}
+</pre>
 == Compiling ==
 Go back into the src directory and try to compile it with 'make'.
-With v5.4.2, we are not including the -fno-expensive-optimizations flag, since backwards compatibility with Blue Waters and Titan isn't an issue.