Difference between revisions of "CSEP Powell Center 2018"

From SCECpedia

Revision as of 22:41, 14 March 2018

CSEP2 Challenges

  • Testing fault and simulation-based models
  • Care about low prob. events
  • Should we be testing something other than nucleation?
  • Is UCERF3-ETAS more valuable given the alternatives?
  • Epistemic Uncertainties

Day 1

Reasenberg & Jones for USGS OAF
  • Rate of ≥M aftershocks at time t after mainshock with given magnitude
  • Improvements made to the Reasenberg & Jones model to update generic parameters for California
  • Aftershock forecast for Mw ≥ 5 using improved R&J model
  • Automating aftershock forecasts for the US (in progress w/ code development challenges)
  • Moving past R&J in favor of ETAS, but could be useful for UCERF3-ETAS testing
  • Testability challenges:
    • Overlapping, non-independent forecasts
    • EQ prob. dist. not necessarily Poissonian
    • Temporal forecasts with poorly defined spatial area
    • R&J not great with substantial triggering (e.g., swarms)
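
The aftershock rate described at the top of this section is commonly written as λ(t, M) = 10^(a + b(Mm − M)) · (t + c)^(−p), where Mm is the mainshock magnitude. The following is a minimal sketch, not the USGS OAF code. It assumes the generic California values published by Reasenberg & Jones (1989) (a = −1.67, b = 0.91, c = 0.05 days, p = 1.08), not the updated parameters mentioned above, and the function names are illustrative.

```python
import math

# Generic California parameters from Reasenberg & Jones (1989); assumed here,
# the updated values discussed in the notes are not reproduced.
A, B, C, P = -1.67, 0.91, 0.05, 1.08

def rj_rate(t, mag, mainshock_mag, a=A, b=B, c=C, p=P):
    """Rate (events/day) of aftershocks with magnitude >= mag, t days after the mainshock."""
    return 10 ** (a + b * (mainshock_mag - mag)) * (t + c) ** (-p)

def rj_expected_count(t_start, t_end, mag, mainshock_mag, a=A, b=B, c=C, p=P):
    """Expected number of >= mag aftershocks in [t_start, t_end] (closed-form integral of rj_rate)."""
    k = 10 ** (a + b * (mainshock_mag - mag))
    if abs(p - 1.0) < 1e-12:
        # Special case p = 1: the integral of 1/(t + c) is a logarithm.
        return k * (math.log(t_end + c) - math.log(t_start + c))
    return k * ((t_end + c) ** (1 - p) - (t_start + c) ** (1 - p)) / (1 - p)

def prob_at_least_one(expected_count):
    """P(N >= 1) under a Poisson assumption, which these notes flag as questionable."""
    return 1.0 - math.exp(-expected_count)

# Example: chance of at least one M >= 5 aftershock in days 1-8 after an M 7 mainshock.
expected = rj_expected_count(1.0, 8.0, 5.0, 7.0)
probability = prob_at_least_one(expected)
```

The last step converts an expected count to a probability using the Poisson assumption, which is exactly the step the testability challenges above call into question.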
Update on ETAS Forecasting
  • GUI interface to compute manual forecasts for external uses.
  • AIC prefers ETAS, however more complicated models not favored over simple 3 param model
  • Performs better than R&J
  • Issues with "supercriticality"
    • Could solve by fitting mainshock separately
  • For global problem (and local): estimating magnitude of completeness and b-value
  • Need to limit supercriticality before OEF can be given to non-experts
  • "Similarity forecast" can be implemented as mask/failsafe to reduce surprises
    • Defined as having "similar number of earthquakes in binned magnitudes"
    • ETAS has ½ the surprise rate of R&J
    • Or could be included in ensemble
Spatial ETAS
  • ETAS type models can zero in on aftershock hot-spots
    • Using spatial Omori type
    • Need some spatial kernel
  • Moving from spatial rates to hazards
    • Couple forecasts with GMPE to produce ground motions
    • MMI regression-based models
  • Testability
    • Challenges associated with incorporating hazard, because it eliminates some granularity in the forecast model
    • Worried about Type II error
Time-Dependent Background seismicity
  • Particularly useful for earthquake swarms where background seismicity differs from 'normal' rate
    • Could determine rate from previous swarms
    • Potential issues:
      • Swarm duration
      • Considerable variability in swarm durations
      • Solving using "life expectancy" table, but limited data in southern California
      • Likely need some physical constraints on distribution functions
  • Using STETAS to use standard catalog without needing declustering
    • Hydro mechanical models for stressing-rate can be used for induced seismicity
    • rate-and-state framework
  • Testing strategy:
    • Given a swarm; how long should we provide forecasts?

Day 2

UCERF3
  • Three models
    • Time-independent
      • Fault-based approach that splits faults into subsections
      • Rate of rupture computed from Grand Inversion (see pub for details)
      • Add gridded off-fault seismicity
      • Logic tree used to capture the epistemic uncertainty
      • Fault participation most important. (ie., what is the prob. of a particular fault hosting an eq ≥ Mw)
    • Time-dependent
      • Based on Reid renewal statistics
      • Additional logic tree branches added
    • ETAS
      • Ignoring faults gives rise to discrepancy between ETAS and elastic rebound type models
      • Combines UCERF3-TD with an ETAS model and produces synthetic catalog
      • Issues:
        • Variability of MFD throughout CA
        • GR not consistent with data
        • Main question: what is the conditional prob of observing large eq given an observed small eq?
        • Determined that elastic rebound necessary
        • Rate of small events not always consistent with rate of expected aftershocks
      • Operationalizable, but needs significant resources
        • Major question: Does it have value?
        • HayWired scenario recently published in SRL
        • Shows value if interested in severe shaking.
        • Faults important for low prob high ground motions.
      • Testing UCERF3-ETAS
        • Fault participation, not nucleation
        • Logic-tree branches
        • Elastic rebound/aperiodicity
        • Characteristic behavior near faults
        • Retrospective testing
        • Aleatory variability and sequence-specific ETAS parameters
RSQSim Rate-State earthquake Simulator
  • Physics-based forecasting model based on R&S statistics
  • Using RSQSim ruptures in hazard assessments
    • Need to create UCERF3-style ruptures
    • Do RSQSim ruptures pass UCERF3 plausibility criteria?
      • Surprisingly, most RSQSim ruptures didn't pass the Coulomb criterion. ~17.5% did not pass.
      • Multi-fault ruptures tend to agree between UCERF3 and RSQSim
  • RSQSim agrees well with UCERF3 without specific tuning of recurrence intervals
    • Also: repeat times and short period ground motions
    • but starts to disagree at longer spectral periods
  • Interesting conditional probs:
    • What is the prob. of having two Mw 7 events on the Mojave within 1 week?
      • 4.5% in UCERF3 and 5.6% in RSQSim
  • Could look at two-point statistics pairwise difference between centroids
  • CSEP2 needs to be able to compare synthetic catalogs coming from RSQSim, ie., the ability to handle non-standard catalog sources.
Objectives & Challenges of Model Testing
  • Not enough data
  • Expand data sources in space and time
    • i.e., incorp. South America
    • retrospective testing experiments
    • time-dependence
    • extend authoritative data sets
  • Issue 2: primitive models based on point-based models (i.e., hypocentral/nucleation)
    • need: simulation- , fault-, and physics-based models
    • 3d models
    • Need to account for non-poissonian & correlation structures
    • [DJ] expand UCERF3 approach to new locations to understand principal components
  • Issue 3: need to build complete prob. models accounting for ontological error (unknown unknowns)
    • How to go from logic tree to continuous pdf?
    • Need to consider correlations within tree
  • Issue 4: testing fault-based models
    • Lots of work to be done
    • 'Turing' style testing can help… Page (2018)
Turing Tests of UCERF3
  • properly accounting for spatial diffusivity
  • Inter-sequence aftershock productivity
  • Foreshock and aftershock productivity as function of differential magnitude
  • Nearest neighbor separations
  • Analysis of clusters
  • Paleo hiatus
  • [NF] what are the possible explanations of hiatus?
  • Super cycle: extreme clustering over extreme period
  • CSEP testing should be more visual, and this should be included in CSEP2
Comparing R&J with ETAS
  • R&J -> ETAS
    • Secondary sequences
    • Faster adaptation
    • Spatial forecasts
    • Better estimates of the range of outcomes
  • CSEP 1day forecasts begin at start of day; lose some power
  • Challenges for USGS testing:
    • Overlapping windows
    • Update forecasts within window
    • New RJ89 method no longer Poissonian
    • ETAS forecasts are not Poissonian
    • All violate CSEP testing methods
  • Likelihood based on Poisson distribution using standard statistical test
  • CSEP strategy:
    • Poisson numbers based on RJ forecasts
  • Strategy:
  1. Want dist. of events in window that starts at t with duration d
  2. Using R&J and ETAS to simulate "real" observations
  3. Fit R&J to ETAS model: fit is the mode of the individual ETAS runs
  • With overlapping windows, CSEP assumptions of independence fail; solved by incorporating R&J simulations, forecast, and observations
  • ETAS observations fail because they are not Poisson distributed
  • R&J fails when considering large-magnitude mainshocks
  • Solution: all ETAS all the time
  • Takeaway: forecasts and observations must be consistent
  • Conclusion
    • Non-Poissonian behavior
    • Simulation based forecasts could address some issues
    • Will also handle overlapping time windows
    • RJ will fail assuming that the world is like ETAS
  • CSEP must take the forecasts in as simulations in order to test
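
The non-Poissonian behavior noted above can be illustrated with simulated event counts. The sketch below is a toy under stated assumptions, not the ETAS or R&J simulators discussed in the talk: each Poisson "mainshock" triggers a geometric number of extra events, and the variance-to-mean ratio of the window counts shows the overdispersion. All function names and parameter values are hypothetical.

```python
import math
import random
import statistics

def sample_poisson(mean, rng):
    """Draw one Poisson variate with Knuth's multiplication method (fine for small means)."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def sample_clustered_count(mainshock_mean, trigger_prob, rng):
    """Toy clustered count: each Poisson 'mainshock' triggers a geometric number of extra events."""
    total = 0
    for _ in range(sample_poisson(mainshock_mean, rng)):
        total += 1
        while rng.random() < trigger_prob:
            total += 1
    return total

def dispersion_index(counts):
    """Variance-to-mean ratio: about 1 for Poisson counts, above 1 when events cluster."""
    return statistics.pvariance(counts) / statistics.mean(counts)

rng = random.Random(0)
# Both sets of window counts have an expected value of 10 events.
poisson_counts = [sample_poisson(10.0, rng) for _ in range(2000)]
clustered_counts = [sample_clustered_count(5.0, 0.5, rng) for _ in range(2000)]

poisson_dispersion = dispersion_index(poisson_counts)
clustered_dispersion = dispersion_index(clustered_counts)
```

Both sets of counts share the same mean, so a test that only compares the expected number with a Poisson distribution would treat them alike even though their spreads differ substantially.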
Moving past Poisson
  • Poisson likelihood does not allow for clustering
  • Three ways to eliminate:
    • Adjusted likelihood simulations (in other words, remove likelihood)
    • Normal approximation
    • K-S test (could work with Turing style tests too)
  • N-test could be fixed by using negative binomial if dispersion is supplied
  • Accommodating simulation-based models better solution
  • Simulations can preserve space-time clustering
  • (CSEP needs to separate forecasting from modelling)
  • Consistency tests of simulation-based tests
    • General approach is to compare statistic computed from simulated catalog with same statistics from observed catalog
    • For example, inter-event time distribution
    • P-values should be uniform on [0,1]
  • Interested in improving models: looking at information gained
    • No obvious way to transparently estimate without gridding
    • Standard CSEP information not restricted to Poisson
  • CSEP needs to be able to retroactively evaluate new tests, in other words become a testing center.
  • Moving past parametric distribution functions in favor of non-parametric simulation-based models
  • CSEP could support individual testing, will be more straightforward with agreed upon simulated catalog formats
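
The general approach to consistency tests for simulation-based forecasts can be sketched as follows: compute a statistic on every simulated catalog, then find where the observed value falls in that distribution. This is a minimal illustration that assumes a catalog is simply a list of event times. The function names are hypothetical and not part of any CSEP software.

```python
import statistics

def quantile_score(simulated_stats, observed_stat):
    """Fraction of simulated statistics that are <= the observed statistic."""
    return sum(1 for s in simulated_stats if s <= observed_stat) / len(simulated_stats)

def median_interevent_time(times):
    """Median gap between consecutive events; needs at least two events."""
    ordered = sorted(times)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return statistics.median(gaps)

def consistency_test(simulated_catalogs, observed_catalog, statistic):
    """Locate the observed statistic within the distribution from the simulated catalogs."""
    simulated_stats = [statistic(catalog) for catalog in simulated_catalogs]
    return quantile_score(simulated_stats, statistic(observed_catalog))

# Toy example: four simulated catalogs (event times in days) and one observed catalog.
simulated = [[0, 1, 2], [0, 2, 4], [0, 3, 6], [0, 4, 8]]
observed = [0, 2.5, 5]
score_time = consistency_test(simulated, observed, median_interevent_time)
score_count = consistency_test(simulated, observed, len)
```

Any statistic can be passed in, for example the event count (`len`) or an inter-event time summary. If the model is consistent with the data, scores collected over many independent test periods should be roughly uniform on [0,1].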
Current CSEP Testing Approaches
  • Two approaches
    • Establishing discrepancies/agreement with observations
      • E.g., number of earthquakes
      • Likelihood
    • Comparing against other models
      • How much better or worse does one model do
  • Installed methods
    • Number test: compares number of epicenter forecasts in bin
    • Likelihood test: based on RELM model setup using mainshock and mainshock+aftershock class, mainshocks declustered using Reasenberg. Assume that the model is the data generating process.
    • Conditional likelihood test: set simulated = observed, and place sim eqs in bins according to relative rates
    • Space test: collapses forecast into spatial domain. Integrate over magnitude and set simulated = observed. Use relative rates. Calculate simulated LL scores
      • One of the more interesting tests
    • Magnitude test:
      • Same as S test but integrating over space.
      • Not particularly powerful, could be using a more powerful KS test.
    • Information gain per earthquake: is "rate-corrected" information gain significantly greater than 0?
      • Paired t-test (T-test)
        • Differences must be approximately independent
        • If differences are not iid normal, CLT!
      • Wilcoxon signed rank test (W-test)
        • Less powerful
        • Requires symmetric data
      • Differences are proportional to error bounds, i.e., large difference -> large error bounds
      • Error bounds only apply to forecast pairs
  • Residuals based:
    • Residual: difference between local forecast and observation
    • Raw residual: bin-wise difference between observed # and forecast
    • Pearson residuals: normalized cell-wise difference between rate and observed number
    • Deviance residuals: difference between (point-process) log-likelihood Scores.
  • Hit&miss tests
    • Receiver-operating characteristic
    • Molchan error diagram
    • Area-skill-score
  • Goal is to evaluate different aspects of the forecasting model
  • Interpreting results requires going back to the models. A shortcoming of CSEP results is that there is not enough scientific discussion of the evaluations within the context of the models.
  • 10 years of data collected by testing centers @ SCEC and GNS Science. Need more results.
    • 1 day forecasting for California.
    • Over 200 eqs for New Zealand, lots of science to be done here. Kaikoura and Christchurch…
    • Curated dataset valuable resource for the scientific community.
  • Next steps:
    • Ensemble modeling
      • Marzocchi et al., 2012
      • BMA: averaged based on previously best performing model, which makes it better for selecting models
      • Using additive or multiplicative models for combining models
    • Simulation-based forecasts
      • See previous lecture from Morgan Page and David Rhoades
      • NSIM: number of target eqs
      • Earthquake rate distribution
      • Inter-event time distribution
      • Inter-event distance distribution
    • External forecasts and predictions
      • Quakefinder type predictions.
      • No implemented evaluation method.
      • Critical for real-time forecast and predictions that are generated externally to CSEP platform.
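
Two of the installed methods listed above, the number test and the bin-wise residuals, can be sketched for a gridded Poisson forecast. This is an illustrative sketch, not the CSEP testing-center implementation. It assumes each bin has a positive forecast rate and an integer observed count, and that the number test reports the two Poisson tail probabilities δ1 = P(N ≥ N_obs) and δ2 = P(N ≤ N_obs).

```python
import math

def poisson_cdf(k, mean):
    """P(N <= k) for a Poisson variable; direct summation, adequate for modest means."""
    if k < 0:
        return 0.0
    term = math.exp(-mean)
    total = term
    for i in range(1, k + 1):
        term *= mean / i
        total += term
    return min(total, 1.0)

def number_test(forecast_rates, observed_counts):
    """Return (delta1, delta2) = (P(N >= n_obs), P(N <= n_obs)) under the forecast total."""
    n_forecast = sum(forecast_rates)
    n_observed = sum(observed_counts)
    delta1 = 1.0 - poisson_cdf(n_observed - 1, n_forecast)
    delta2 = poisson_cdf(n_observed, n_forecast)
    return delta1, delta2

def raw_residuals(forecast_rates, observed_counts):
    """Bin-wise difference between observed number and forecast rate."""
    return [n - r for r, n in zip(forecast_rates, observed_counts)]

def pearson_residuals(forecast_rates, observed_counts):
    """Raw residuals normalized by the Poisson standard deviation sqrt(rate); rates must be > 0."""
    return [(n - r) / math.sqrt(r) for r, n in zip(forecast_rates, observed_counts)]

# Toy forecast with three bins (expected counts) and the observed counts in those bins.
rates = [0.5, 1.5, 2.0]
counts = [0, 2, 2]
delta1, delta2 = number_test(rates, counts)
residuals = pearson_residuals(rates, counts)
```

A very small δ1 suggests the forecast total is too low, and a very small δ2 suggests it is too high. The Pearson residuals show which bins contribute most to any mismatch.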

Testing Fault Based Models
  • Association problem: mapping an eq to the UCERF3 fault model
  • Need to understand the probabilities of ruptures stopping between fault segments
  • Proposed procedure:
  1. Separate linear fault into sections
  2. For each section: estimate nucleation rate for eqs of interest
  3. Estimate conditional probabilities of earthquake stopping
  4. Evaluate frequency of eqs for each pair of sections
  • Could rely on aftershocks to determine the extent of the rupture plane, or maybe a finite-fault inversion
  • Fault participation is the most important thing to test for fault-based models.
  • Null hypothesis can be established using the following assumptions
    • Known magnitude distribution
    • Known scaling between Mw and length
    • Uniform distribution of rupture locations on fault

Considering Epistemic Uncertainty
  • Aleatory variability: inherent complexity or randomness in some physical process
  • Epistemic uncertainty: comes from our lack of knowledge about the process
  • Exchangeable events allow testing of Bayesian models in a frequentist framework
  • Modifying the experimental concept allows for ontological testing of exchangeable sequences
  • Hierarchy of uncert. necessary for testing
    • Aleatory variability -> frequentist
    • Epistemic uncertainty -> Bayesian methods
    • Ontological error -> rejection of 'ontological' null hypothesis
      • States that the true hazard is a realization of the extended experts' distribution (EED)
      • Rejection of this null hypothesis implies ontological error
    • Ontological tests require an 'experimental concept' that conditions the aleatory variability of the natural system.

Ensemble Modeling and Hybrid Models
  • Definition: inferring the extended experts' distribution from the sample provided by any set of models that sample the epistemic uncertainty
  • Combining models allows the ensemble to perform only slightly worse than the best performing model. Useful when not sure what is the 'correct' model.
  • Two main features:
    • Describes epistemic uncertainty
    • Significantly increases the skill of the forecast
  • Hybrid models to increase information gain
    • Additive hybrid
      • Best fitting linear combination of models
    • Maximum hybrid
    • Multiplicative hybrids
      • Exploit independent information, e.g., GPS and smoothed seismicity
  • Form hybrids for better information gain
  • Does not require a choice of best model but leverages all models
  • Could be a target for CSEP to help gain hybrid models -> improve collaborations
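
The additive and multiplicative hybrids mentioned above can be sketched for two gridded rate forecasts. This is a simplified illustration under loudly stated assumptions, not the method presented at the meeting: the additive weight is fit by a grid search over a Poisson log-likelihood, and the multiplicative hybrid is a plain bin-wise product rescaled to a chosen total. All names are hypothetical.

```python
import math

def poisson_log_likelihood(rates, counts):
    """Joint Poisson log-likelihood of observed bin counts given positive forecast rates."""
    return sum(-r + n * math.log(r) - math.lgamma(n + 1) for r, n in zip(rates, counts))

def additive_hybrid(rates_a, rates_b, weight):
    """Linear combination: weight * A + (1 - weight) * B in every bin."""
    return [weight * a + (1.0 - weight) * b for a, b in zip(rates_a, rates_b)]

def multiplicative_hybrid(rates_a, rates_b, total):
    """Bin-wise product of the two forecasts, rescaled so the rates sum to `total`."""
    product = [a * b for a, b in zip(rates_a, rates_b)]
    scale = total / sum(product)
    return [scale * p for p in product]

def fit_additive_weight(rates_a, rates_b, counts, steps=101):
    """Grid search for the weight in [0, 1] that maximizes the Poisson log-likelihood."""
    weights = [i / (steps - 1) for i in range(steps)]
    return max(
        weights,
        key=lambda w: poisson_log_likelihood(additive_hybrid(rates_a, rates_b, w), counts),
    )

# Toy example: model A puts its rate where the events occurred, model B does not.
model_a = [0.1, 2.0]
model_b = [2.0, 0.1]
observed = [0, 2]
best_weight = fit_additive_weight(model_a, model_b, observed)
mixed = additive_hybrid([1.0, 3.0], [3.0, 1.0], 0.5)
product_forecast = multiplicative_hybrid(model_a, model_b, 2.1)
```

In practice the weight would be fit on past data and evaluated on later data. Fitting and scoring on the same observations, as in this toy, overstates the gain.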

Event based testing
  • Could move to a solution where you can make updates to the forecast during the forecast period. Likely more important for long-term models
  • Might want to update the forecast both when an event happens and when an event doesn't happen.

Milestones for UCERF3 Testing Program (U3-TP)
  1. Goals:
    • Verify
    • Validate
    • Valuate
  2. Milestones
    • Develop infrastructure
    • Retrospective testing of UCERF3
    • Prospective testing of UCERF3
    • Comparatively evaluate U3 against empirical models and physics-based models
  3. 5 types of testing
    • Exploratory testing: Turing
    • Comparative testing: T and W tests
    • Mean-hazard testing: null hypothesis significance testing
    • Ontological testing: including epistemic uncertainty
    • Sequence-specific testing: testing U3-ETAS against observed aftershock sequences
  • Guiding principle: all OEF models should be under continual prospective testing, put ETAS under operational testing. Need to find the value for UCERF3-ETAS testing.
  • Could possibly find out what aspects of U3 are superfluous and could simplify the model for use in other locales.
  • Slip-rate data are special for California data sets
  • Important to compare against physics-based earthquake simulations such as RSQSim to evaluate certain assumptions in the model.
  • U3 drivers of CSEP2
    • Datasets used in prospective testing must be versioned and archived. Should include analyzed datasets and raw catalogs
    • Testing on simulated event catalogs
  • Benefits of U3-TP to the USGS
    • Scientific value
    • Software infrastructure

List of Possible Milestones
  • CSEP1.0. What products would be useful?
    • Do we need to keep operationalizing CSEP1.0?
      • [Matt] Mistake to shut down CSEP1.0
    • New models necessary?
      • [Max] CSEP1 big achievement/success was incorporating new models.
      • [Ned] Should support new models but need to be more selective.
    • [Max] Wants to publish the CA 1-day forecast results.
    • [Morgan] Value in providing CSEP1.0 data set publicly.
    • [Mike B.] Need some clear and provocative scientific findings about this. Need to find models that can be rejected and are not worth pursuing. Needs to be curated in a way that allows scientists to access and work with it. Would build community.
    • [Peter] Need products that are digestible by the public.
  • Simulation based testing
    • Methods:
      • ETAS, U3-ETAS,
    • Need: Process for defining timelines, and which models we would be evaluating.
  • Event-triggered/sequence specific testing
  • Comparative valuations
    • Turing tests
    • Verification
    • Inter-comparison of models
  • Modeling catalog completeness
  • Modeling epistemic uncertainties: Important, but IT challenges.
  • Fault/cell participation
  • [Ned] Testing usefulness!
  • Rupture association problem
  • Fault characteristics
  • U3-TD elastic-rebound testing
  • [Peter] Declustering work could help with the USGS. Lots of ambiguity on different declustering models.
  • [Phil] CSEP should expose its methods so users could leverage the algorithm
  • [Mike B.] Valuation question most important. Should physics-based simulations be used for hazard?
  • [Kevin] Errors need to be propagated moving forward!
  • [Phil] Need to figure out how the data will be provided to the scientist.
  • [All] Web based interface would be valuable.


Notes:
  • Targeting publication for July/August issue of SRL. Have some time to develop the webpage.
  • Expect press material surrounding the SRL release, so need to be prepared with digestible figures.
  • Schedule IT call with Peter, Ned, Phil, Bill, Max, Fabio