CSEP Powell Center 2018

CSEP2 Challenges

  • Testing fault and simulation-based models
  • Care about low prob. events
  • Should we be testing something other than nucleation?
  • Is UCERF3-ETAS more valuable given the alternatives?
  • Epistemic Uncertainties

Day 1

Reasenberg & Jones for USGS OAF
  • Rate of aftershocks ≥ M at time t after a mainshock of given magnitude (see the sketch after this list)
  • Improvements made to the Reasenberg & Jones model to update generic parameters for California
  • Aftershock forecast for Mw ≥ 5 using improved R&J model
  • Automating aftershock forecasts for the US (in progress w/ code development challenges)
  • Moving past R&J in favor of ETAS, but R&J could still be useful for UCERF3-ETAS testing
  • Testability challenges:
    • Overlapping, non-independent forecasts
    • EQ prob. dist. not necessarily Poissonian
    • Temporal forecasts with poorly defined spatial area
    • R&J not great with substantial triggering (e.g., swarms)
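
As a worked example of the R&J rate above, a minimal Python sketch. The parameter values (a ≈ -1.67, b ≈ 0.91, c ≈ 0.05 days, p ≈ 1.08) are the commonly quoted Reasenberg & Jones (1989) generic California values, not the updated ones discussed above; treat them as placeholders.

<syntaxhighlight lang="python">
def rj_rate(t_days, m, mainshock_mag, a=-1.67, b=0.91, c=0.05, p=1.08):
    """Reasenberg & Jones (1989) rate of aftershocks with magnitude >= m,
    in events/day, at t_days after a mainshock of magnitude mainshock_mag:
    lambda(t, M) = 10**(a + b*(Mm - M)) * (t + c)**(-p)."""
    return 10.0 ** (a + b * (mainshock_mag - m)) * (t_days + c) ** (-p)

def rj_expected_count(t1, t2, m, mainshock_mag, a=-1.67, b=0.91, c=0.05, p=1.08):
    """Expected number of aftershocks >= m in the window [t1, t2] days,
    from the closed-form integral of the modified Omori decay (p != 1)."""
    k = 10.0 ** (a + b * (mainshock_mag - m))
    return k * ((t2 + c) ** (1.0 - p) - (t1 + c) ** (1.0 - p)) / (1.0 - p)

# Example: expected number of M >= 5 aftershocks in the week after an Mw 7.
print(rj_expected_count(0.0, 7.0, 5.0, 7.0))
</syntaxhighlight>
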
Update on ETAS Forecasting
  • GUI interface to compute manual forecasts for external uses.
  • AIC prefers ETAS; more complicated models are not favored over the simple 3-parameter model (see the sketch after this list)
  • Performs better than R&J
  • Issues with "supercriticality"
    • Could solve by fitting mainshock separately
  • For global problem (and local): estimating magnitude of completeness and b-value
  • Need to limit supercriticality before OEF can be given to non-experts
  • "Similarity forecast" can be implemented as mask/failsafe to reduce surprises
    • Defined as having "similar number of earthquakes in binned magnitudes"
    • ETAS has ½ the surprise rate of R&J
    • Or could be included in ensemble
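
For reference, a minimal temporal-ETAS intensity sketch. The "3-parameter" fit above presumably corresponds to (k, c, p) with the remaining terms fixed, but that mapping, and all names here, are illustrative. Supercriticality corresponds to each event triggering, on average, one or more direct aftershocks, so simulated sequences never die out.

<syntaxhighlight lang="python">
import numpy as np

def etas_intensity(t, event_times, event_mags, mu, k, c, p, alpha, m_c):
    """Temporal ETAS conditional intensity at time t (events per unit time):
    lambda(t) = mu + sum_{t_i < t} k * 10**(alpha*(M_i - m_c)) * (t - t_i + c)**(-p).
    mu is the background rate; the sum is the triggered (aftershock) part."""
    event_times = np.asarray(event_times)
    event_mags = np.asarray(event_mags)
    past = event_times < t
    dt = t - event_times[past]
    trig = k * 10.0 ** (alpha * (event_mags[past] - m_c)) * (dt + c) ** (-p)
    return mu + trig.sum()
</syntaxhighlight>
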
Spatial ETAS
  • ETAS type models can zero in on aftershock hot-spots
    • Using spatial Omori-type decay
    • Need some spatial kernel
  • Moving from spatial rates to hazards
    • Couple forecasts with GMPE to produce ground motions
    • MMI regression-based models
  • Testability
    • Challenges associated with incorporating hazard, because it eliminates some granularity in the forecast model
    • Worried about Type II error
Time-Dependent Background seismicity
  • Particularly useful for earthquake swarms where background seismicity differs from 'normal' rate
    • Could determine rate from previous swarms
    • Potential issues:
      • Swarm duration
      • Considerable variability in swarm durations
      • Solving using "life expectancy" table, but limited data in southern California
      • Likely need some physical constraints on distribution functions
  • STETAS allows using the standard catalog without needing declustering
    • Hydro mechanical models for stressing-rate can be used for induced seismicity
    • rate-and-state framework
  • Testing strategy:
    • Given a swarm; how long should we provide forecasts?

Day 2

UCERF3
  • Three models
    • Time-independent
      • Fault-based approach that splits faults into subsections
      • Rate of rupture computed from Grand Inversion (see pub for details)
      • Add gridded off-fault seismicity
      • Logic tree used to capture the epistemic uncertainty
      • Fault participation most important (i.e., what is the prob. of a particular fault hosting an eq above a given Mw)
    • Time-dependent
      • Based on Reid renewal statistics
      • Additional logic tree branches added
    • ETAS
      • Ignoring faults gives rise to discrepancy between ETAS and elastic rebound type models
      • Combines UCERF3-TD with an ETAS model and produces synthetic catalog
      • Issues:
        • Variability of MFD throughout CA
        • GR not consistent with data
        • Main question: what is the conditional prob of observing large eq given an observed small eq?
        • Determined that elastic rebound necessary
        • Rate of small events not always consistent with rate of expected aftershocks
      • Operationalizable, but needs significant resources
        • Major question: Does it have value?
        • HayWired scenario recently published in SRL
        • Shows value if interested in severe shaking.
        • Faults important for low prob high ground motions.
      • Testing UCERF3-ETAS
        • Fault participation, not nucleation
        • Logic-tree branches
        • Elastic rebound/aperiodicity
        • Characteristic behavior near faults
        • Retrospective testing
        • Aleatory variability and sequence-specific ETAS parameters
RSQSim Rate-State Earthquake Simulator
  • Physics-based forecasting model based on rate-and-state (R&S) friction
  • Using RSQSim ruptures in hazard assessments
    • Need to create UCERF3-style ruptures
    • Do RSQSim ruptures pass UCERF3 plausibility criteria?
      • Surprisingly, not all RSQSim ruptures pass the Coulomb criterion: ~17.5% did not pass
      • Multi-fault ruptures tend to agree between UCERF3 and RSQSim
  • RSQSim agrees well with UCERF3 without specific tuning of recurrence intervals
    • Also: repeat times and short period ground motions
    • but starts to disagree at longer spectral periods
  • Interesting conditional probs (see the sketch after this list):
    • What is the prob. of having two Mw 7 earthquakes on the Mojave section within 1 week?
      • 4.5% in UCERF3 and 5.6% in RSQSim
  • Could look at two-point statistics: pairwise differences between rupture centroids
  • CSEP2 needs to be able to compare synthetic catalogs coming from RSQSim, i.e., the ability to handle non-standard catalog sources.
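
A sketch of how a conditional probability like the Mojave pair above can be read off a long synthetic catalog. The catalog arrays and the restriction to one fault section are placeholders; the real RSQSim/UCERF3 comparison requires rupture-to-section association first.

<syntaxhighlight lang="python">
import numpy as np

def prob_pair_within_window(times_days, mags, m_min=7.0, window_days=7.0):
    """Empirical probability, given an event with M >= m_min, that the
    next M >= m_min event follows within window_days, estimated from one
    long synthetic catalog already filtered to the region of interest
    (e.g., Mojave-section ruptures only)."""
    t = np.sort(np.asarray(times_days)[np.asarray(mags) >= m_min])
    if t.size < 2:
        return 0.0
    return float(np.mean(np.diff(t) <= window_days))
</syntaxhighlight>
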
Objectives & Challenges of Model Testing
  • Not enough data
  • Expand data sources in space and time
    • e.g., incorporate South America
    • retrospective testing experiments
    • time-dependence
    • extend authoritative data sets
  • Issue 2: primitive, point-based models (i.e., hypocentral/nucleation)
    • need: simulation- , fault-, and physics-based models
    • 3d models
    • Need to account for non-Poissonian & correlation structures
    • [DJ] Expand UCERF3 approach to new locations to understand principal components
  • Issue 3: need to build complete prob. models accounting for ontological error (unknown unknowns)
    • How to go from logic tree to continuous pdf?
    • Need to consider correlations within tree
  • Issue 4: testing fault-based models
    • Lots of work to be done
    • 'Turing' style testing can help… Page (2018)
Turing Tests of UCERF3
  • properly accounting for spatial diffusivity
  • Inter-sequence aftershock productivity
  • Foreshock and aftershock productivity as function of differential magnitude
  • Nearest neighbor separations
  • Analysis of clusters
  • Paleo hiatus
  • [NF] What are the possible explanations of the hiatus?
  • Super cycle: extreme clustering over extreme period
  • CSEP testing should be more visual; such visual testing should be included in CSEP2
Comparing R&J with ETAS
  • R&J -> ETAS
    • Secondary sequences
    • Faster adaptation
    • Spatial forecasts
    • Better estimates of the range of outcomes
  • CSEP 1-day forecasts begin at the start of the day; lose some power
  • Challenges for USGS testing:
    • Overlapping windows
    • Update forecasts within window
    • New RJ89 method no longer Poissonian
    • ETAS forecasts are not Poissonian
    • All violate CSEP testing methods
  • Likelihood based on Poisson distribution using standard statistical test
  • CSEP strategy:
    • Poisson numbers based on RJ forecasts
  • Strategy:
  1. Want dist. of events in a window that starts at t with duration d
  2. Using R&J and ETAS to simulate "real" observations
  3. Fit R&J to ETAS model: fit is the mode of the individual ETAS runs
  • With overlapping windows, CSEP assumptions of independence fail; solved by incorporating R&J simulations, forecasts, and observations
  • ETAS observations fail because they are not Poisson distributed
  • R&J fails when considering large-magnitude mainshocks
  • Solution: all ETAS all the time
  • Takeaway: forecasts and observations must be consistent
  • Conclusion
    • Non-Poissonian behavior
    • Simulation based forecasts could address some issues
    • Will also handle overlapping time windows
    • RJ will fail assuming that the world is like ETAS
  • CSEP must take the forecasts in as simulations in order to test (see the sketch below)
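
One reading of "take the forecasts in as simulations": test observed counts directly against the distribution of simulated counts, with no Poisson assumption. A minimal sketch; names are illustrative, not a specific CSEP implementation.

<syntaxhighlight lang="python">
import numpy as np

def simulation_n_test(observed_count, simulated_counts, alpha=0.05):
    """Number test against a forecast supplied as simulated catalogs.
    delta1: fraction of simulations with at least the observed count
    (small delta1 suggests the forecast under-predicts); delta2: fraction
    with at most the observed count (over-prediction). Two-sided test."""
    sims = np.asarray(simulated_counts)
    delta1 = float(np.mean(sims >= observed_count))
    delta2 = float(np.mean(sims <= observed_count))
    reject = (delta1 < alpha / 2.0) or (delta2 < alpha / 2.0)
    return delta1, delta2, reject
</syntaxhighlight>
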
Moving past Poisson
  • Poisson likelihood does not allow for clustering
  • Three ways to eliminate it:
    • Adjusted likelihood simulations (in other words, remove likelihood)
    • Normal approximation
    • K-S test (could work with Turing style tests too)
  • N-test could be fixed by using a negative binomial if the dispersion is supplied (see the sketch after this list)
  • Accommodating simulation-based models is a better solution
  • Simulations can preserve space-time clustering
  • (CSEP needs to separate forecasting from modelling)
  • Consistency tests of simulation-based tests
    • General approach is to compare statistic computed from simulated catalog with same statistics from observed catalog
    • For example, inter-event time distribution
    • P-values should be uniform on [0,1]
  • Interested in improving models: look at information gain
    • Not obvious way to transparently estimate without gridding
    • Standard CSEP information not restricted to Poisson
  • CSEP needs to be able to retroactively evaluate new tests, in other words become a testing center.
  • Moving past parametric distribution functions in favor of non-parametric simulation-based models
  • CSEP could support individual testing, will be more straightforward with agreed upon simulated catalog formats
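
A sketch of the negative-binomial fix to the N-test mentioned above, assuming the forecast supplies a mean and a variance (the dispersion); the conversion p = mean/var, n = mean^2/(var - mean) is the standard mean/variance parameterization of the negative binomial.

<syntaxhighlight lang="python">
from scipy.stats import nbinom

def nbinom_n_test(observed_count, forecast_mean, forecast_var, alpha=0.05):
    """N-test with a negative binomial in place of the Poisson, usable
    once the forecast supplies a dispersion (variance > mean)."""
    assert forecast_var > forecast_mean, "negative binomial needs overdispersion"
    p = forecast_mean / forecast_var
    n = forecast_mean ** 2 / (forecast_var - forecast_mean)
    delta1 = 1.0 - nbinom.cdf(observed_count - 1, n, p)  # P(N >= observed)
    delta2 = nbinom.cdf(observed_count, n, p)            # P(N <= observed)
    reject = (delta1 < alpha / 2.0) or (delta2 < alpha / 2.0)
    return delta1, delta2, reject
</syntaxhighlight>
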
Current CSEP Testing Approaches
  • Two approaches
    • Establishing discrepancies/agreement with observations
      • E.g., number of earthquakes
      • Likelihood
    • Comparing against other models
      • How much better or worse does one model do
  • Installed methods
    • Number test: compares forecast and observed numbers of epicenters per bin
    • Likelihood test: based on RELM model setup using mainshock and mainshock+aftershock classes, mainshocks declustered using Reasenberg. Assume that the model is the data-generating process.
    • Conditional likelihood test: set simulated = observed, and place sim eqs in bins according to relative rates
    • Space test: collapses forecast into spatial domain. Integrate over magnitude and set simulated = observed. Use relative rates. Calculate simulated LL scores
      • One of the more interesting tests
    • Magnitude test:
      • Same as S test but integrating over space.
      • Not particularly powerful, could be using a more powerful KS test.
    • Information gain per earthquake: is the "rate-corrected" information gain significantly greater than 0? (See the sketch after this list.)
      • Paired t-test (T-test)
        • Differences must be approximately independent
        • If differences are not iid normal, CLT!
      • Wilcoxon signed rank test (W-test)
        • Less powerful
        • Requires symmetric data
      • Differences are proportional to error bounds, i.e., large difference -> large error bounds
      • Error bounds only apply to forecast pairs
  • Residuals based:
    • Residual: difference between local forecast and observation
    • Raw residual: bin-wise difference between observed # and forecast
    • Pearson residuals: normalized cell-wise difference between rate and observed number
    • Deviance residuals: difference between (point-process) log-likelihood scores.
  • Hit&miss tests
    • Receiver-operating characteristic
    • Molchan error diagram
    • Area-skill-score
  • Goal is to evaluate different aspects of the forecasting model
  • Interpreting results requires going back to the models. A shortcoming of CSEP results is that there is not enough scientific discussion of the evaluations within the context of the models.
  • 10 years of data collected by testing centers at SCEC and GNS Science. Need more results.
    • 1 day forecasting for California.
    • Over 200 eqs for New Zealand, lots of science to be done here. Kaikoura and Christchurch…
    • Curated dataset valuable resource for the scientific community.
  • Next steps:
    • Ensemble modeling
      • Marzocchi et al., 2012
      • BMA: averaging based on the previously best-performing model, which makes it better for selecting models
      • Using additive or multiplicative models for combining models
    • Simulation-based forecasts
      • See previous lecture from Morgan Page and David Rhoades
      • NSIM: number of target eqs
      • Earthquake rate distribution
      • Inter-event time distribution
      • Inter-event distance distribution
    • External forecasts and predictions
      • Quakefinder type predictions.
      • No implemented evaluation method.
      • Critical for real-time forecast and predictions that are generated externally to CSEP platform.
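
A sketch of the information-gain-per-earthquake comparison above, with the paired T-test and the W-test as backup, following the Rhoades et al.-style rate correction. The per-event log-likelihood inputs and total-rate bookkeeping are assumptions for illustration, not a specific CSEP implementation.

<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import ttest_1samp, wilcoxon

def information_gain_tests(ll_a, ll_b, total_rate_a, total_rate_b):
    """Rate-corrected information gain per earthquake of model A over B.
    ll_a, ll_b: per-event log-likelihoods under each forecast; the
    correction subtracts the difference in total forecast rates per event.
    T-test needs approximately independent differences (CLT if not
    normal); W-test is less powerful and requires symmetric differences."""
    diffs = np.asarray(ll_a) - np.asarray(ll_b)
    correction = (total_rate_a - total_rate_b) / diffs.size
    ign = diffs.mean() - correction
    t_p = ttest_1samp(diffs, popmean=correction).pvalue
    w_p = wilcoxon(diffs - correction).pvalue
    return ign, t_p, w_p
</syntaxhighlight>
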
Testing Fault Based Models
  • Association problem: mapping an eq to the UCERF3 fault model
  • Need to understand the probabilities associated with ruptures stopping between fault segments
  • Proposed Procedure:
  1. Separate linear fault into sections
  2. For each section: estimate nucleation rate for eqs of interest
  3. Estimate conditional probabilities of earthquake stopping
  4. Evaluate frequency of eqs for each pair of sections
  • Could rely on aftershocks to determine the extent of the rupture plane or maybe a finite-fault inversion
  • Fault participation is the most important thing to test for fault-based models.
  • Null hypothesis can be established using the following assumptions (see the sketch after this list):
    • Known magnitude distribution
    • Known scaling between Mw and length
    • Uniform distribution of rupture locations on fault
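
A Monte Carlo sketch of that null hypothesis: sample a magnitude, convert it to a rupture length with a scaling law, place the rupture uniformly on the fault, and record which sections participate. The magnitude-to-length constants are illustrative (Wells & Coppersmith-style), and all inputs are placeholders.

<syntaxhighlight lang="python">
import numpy as np

def null_section_participation(fault_km, section_edges_km, mag_samples,
                               n_sims=10000, seed=0):
    """Null-model section participation rates under the assumptions above:
    known magnitude distribution (mag_samples), known Mw-to-length scaling,
    and uniform rupture placement along a linear fault."""
    rng = np.random.default_rng(seed)
    edges = np.asarray(section_edges_km, dtype=float)
    hits = np.zeros(edges.size - 1)
    for _ in range(n_sims):
        m = rng.choice(mag_samples)                          # assumed mag dist.
        length = min(10.0 ** (-2.44 + 0.59 * m), fault_km)   # illustrative scaling (km)
        start = rng.uniform(0.0, fault_km - length)          # uniform placement
        end = start + length
        hits += (edges[:-1] < end) & (edges[1:] > start)     # sections touched
    return hits / n_sims
</syntaxhighlight>
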
Considering Epistemic Uncertainty
  • Aleatory variability: inherent complexity or randomness in some physical process
  • Epistemic uncertainty: comes from our lack of knowledge about the process
  • Exchangeable events allow testing of Bayesian models in a frequentist framework
  • Modifying the experimental concept allows ontological testing of exchangeable sequences
  • Hierarchy of uncertainty necessary for testing
    • Aleatory variability -> frequentist
    • Epistemic uncertainty -> Bayesian methods
    • Ontological error -> rejection of 'ontological' null hypothesis
      • States that the true hazard is a realization of the extended expert's distribution (EED)
      • Rejection of this null hypothesis implies ontological error
    • Ontological tests require an 'experimental concept' that conditions the aleatory variability of the natural system.
Ensemble Modeling and Hybrid Models
  • Definition: inferring the extended expert's distribution from the sample provided by any set of models that sample the epistemic uncertainty
  • Combining models allows the ensemble to perform only slightly worse than the best-performing model. Useful when it is not clear which model is 'correct'.
  • Two main features:
    • Describe epistemic uncertainty
    • Significantly increases the skill of the forecast
  • Hybrid models to increase information gain (see the sketch after this list)
    • Additive hybrid
      • Best fitting linear combination of models
    • Maximum hybrid
    • Multiplicative hybrids
      • Exploit independent information, e.g., GPS and smoothed seismicity
  • Form hybrids for better information gain
  • Does not require a choice of best model but leverages all models
  • Could be a target for CSEP to help generate hybrid models -> improve collaborations
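
A minimal sketch of the additive and multiplicative hybrids over gridded rate forecasts. In practice the weights are fitted (e.g., by maximizing retrospective likelihood), so the fixed weights here are placeholders.

<syntaxhighlight lang="python">
import numpy as np

def additive_hybrid(rate_grids, weights):
    """Additive hybrid: convex combination sum_i w_i * f_i of gridded
    rate forecasts (weights normalized to sum to 1)."""
    w = np.asarray(weights, dtype=float)
    w /= w.sum()
    return sum(wi * np.asarray(g, dtype=float) for wi, g in zip(w, rate_grids))

def multiplicative_hybrid(rate_grids, weights, total_rate):
    """Multiplicative hybrid: cell-wise product of forecasts raised to
    their weights, renormalized to a chosen total rate. Exploits
    independent information (e.g., GPS strain and smoothed seismicity).
    Assumes strictly positive rates in every cell."""
    log_mix = sum(wi * np.log(np.asarray(g, dtype=float))
                  for wi, g in zip(weights, rate_grids))
    mix = np.exp(log_mix)
    return mix * (total_rate / mix.sum())
</syntaxhighlight>
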
Event-based testing
  • Could move to a solution where updates can be made to the forecast during the forecast period. Likely more important for long-term models
  • Might want to update forecast when event happens and when event doesn't happen.

Day 3

Milestones for UCERF3 Testing Program (U3-TP)
  1. Goals:
    • Verify
    • Validate
    • Valuate
  2. Milestones
    • Develop infrastructure
    • Retrospective testing of UCERF3
    • Prospective testing of UCERF3
    • Comparatively evaluate U3 against empirical models and physics-based models
  3. 5 types of testing
    • Exploratory testing: Turing
    • Comparative testing: T- and W-tests
    • Mean-Hazard testing: null hypothesis significance testing
    • Ontological testing: including epistemic uncertainty
    • Sequence-specific testing: testing U3-ETAS against observed aftershock sequences
  • Guiding principle: all OEF models should be under continual prospective testing; put ETAS under operational testing. Need to find the value for UCERF3-ETAS testing.
  • Could possibly find out what aspects of U3 are superfluous and could simplify the model for use in other locales.
  • Slip-rate data are a special strength of California data sets
  • Important to compare against physics-based earthquake simulations such as RSQsim to evaluate certain assumptions in the model.
  • UCERF3 drivers of CSEP2
    • Datasets used in prospective testing must be versioned and archived. Should include analyzed datasets and raw catalogs
    • Testing on simulated event catalogs
  • Benefits of UCERF3-TP to the USGS
    • Scientific value
    • Software infrastructure
List of Possible milestones
  • CSEP1.0. What products would be useful?
    • Do we need to keep operationalizing CSEP1.0?
    • New models necessary?
      • [Max] CSEP1 big achievement/success was incorporating new models.
      • [Ned] Should support new models but need to be more selective.
      • [Max] Wants to publish the CA 1-day forecast results.
      • [Morgan] Value in providing CSEP1.0 data set publicly.
      • [Mike B.] Need some clear scientific findings and to be provocative about this. Need to find models that can be rejected as not worth pursuing. Data need to be curated in a way that allows scientists to access and work with them. Would build community.
      • [Peter] Need products that are digestible by the public.
  • Simulation based testing
    • Methods:
      • ETAS, U3-ETAS,
    • Need: process for defining timelines and which models we would be evaluating.
  • Event-triggered/sequence specific testing
  • Comparative valuations
    • Turing tests
    • Verification
    • Inter-comparison of models
  • Modeling catalog completeness
  • Modeling epistemic uncertainties: Important, but IT challenges.
  • Fault/cell participation
  • [Ned] Testing usefulness!
  • Rupture association problem
  • Fault characteristics
  • UCERF3-TD elastic-rebound testing
  • [Peter] Declustering work could help with the USGS. Lots of ambiguity on different declustering models.
  • [Phil] CSEP should expose its methods so users can leverage the algorithms
  • [Mike B.] Valuation question most important. Should physics-based simulations be used for hazard?
  • [Kevin] Errors need to be propagated moving forward!
  • [Phil] Need to figure out how the data will be provided to the scientist.
  • [All] Web based interface would be valuable.
Notes about SRL special issue
  • Targeting publication for July/August issue of SRL. Have some time to develop the webpage.
  • Expect press material surrounding the SRL release, so need to be prepared with digestible figures.