CSEP Powell Center 2018
CSEP2 Challenges
- Testing fault- and simulation-based models
- Care about low-probability events
- Should we be testing something other than nucleation?
- Is UCERF3-ETAS more valuable given the alternatives?
- Epistemic uncertainties
Day 1
- Reasenberg & Jones for USGS OAF
  - Rate of ≥M aftershocks at time t after a mainshock of given magnitude (see the sketch after this list)
  - Improvements made to the Reasenberg & Jones model to update the generic parameters for California
  - Aftershock forecasts for Mw ≥ 5 using the improved R&J model
  - Automating aftershock forecasts for the US (in progress, with code-development challenges)
  - Moving past R&J in favor of ETAS, but R&J could be useful for UCERF3-ETAS testing
  - Testability challenges:
    - Overlapping, non-independent forecasts
    - EQ probability distribution not necessarily Poissonian
    - Temporal forecasts with poorly defined spatial area
    - R&J not great with substantial triggering (e.g., swarms)
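The R&J model referenced above is compact enough to sketch. This is a minimal illustration of the rate law λ(t, M) = 10^(a + b(Mm − M)) · (t + c)^(−p); the parameter values are generic-California-style numbers quoted for illustration only, not the updated parameters discussed in the talk:

```python
import numpy as np

# Reasenberg & Jones (1989): rate of aftershocks with magnitude >= m_min
# at time t (days) after a mainshock of magnitude m_main.
# Illustrative generic-California-style parameters, not operational values.
A, B, C, P = -1.67, 0.91, 0.05, 1.08

def rj_rate(t, m_min, m_main):
    return 10.0 ** (A + B * (m_main - m_min)) * (t + C) ** (-P)

def rj_expected_count(t1, t2, m_min, m_main, n=100_000):
    """Expected number of M >= m_min aftershocks in [t1, t2] days,
    by simple numerical integration of the rate."""
    t, dt = np.linspace(t1, t2, n, retstep=True)
    return float(np.sum(rj_rate(t, m_min, m_main)) * dt)

# Probability of >= 1 M5+ aftershock in the week after an M6 mainshock.
# The Poisson conversion in the last line is exactly the assumption the
# testability bullets above push against.
n_exp = rj_expected_count(0.01, 7.0, 5.0, 6.0)
print(f"expected count {n_exp:.2f}, P(>=1) = {1.0 - np.exp(-n_exp):.2f}")
```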
- Update on ETAS Forecasting
- GUI interface to compute manual forecasts for external uses.
- AIC prefers ETAS, however more complicated models not favored over simple 3 param model
- Performs better than R&J
- Issues with "supercriticality"
- Could solve by fitting mainshock separately
- For global problem (and local): estimating magnitude of completeness and b-value
- Need to limit supercriticality before OEF can be given to non-experts
- "Similarity forecast" can be implemented as mask/failsafe to reduce surprises
- Defined as having "similar number of earthquakes in binned magnitudes"
- ETAS has ½ the surprise rate of R&J
- Or could be included in ensemble
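A hedged sketch of the pieces behind the supercriticality discussion: the standard temporal ETAS conditional intensity, λ(t) = μ + Σ k0·10^(α(mi − mc))·(t − ti + c)^(−p), plus a branching-ratio check. Every numeric value here is invented for illustration:

```python
import numpy as np

# Common temporal ETAS parameterization; all values are illustrative.
MU, K0, ALPHA, C, P, M_C = 0.1, 0.008, 1.0, 0.01, 1.2, 3.0

def etas_intensity(t, event_times, event_mags):
    """Conditional intensity lambda(t) given the events before t."""
    past = event_times < t
    dt = t - event_times[past]
    return MU + np.sum(
        K0 * 10.0 ** (ALPHA * (event_mags[past] - M_C)) * (dt + C) ** (-P))

def branching_ratio(b=1.0, m_max=8.0, n=20_000):
    """Mean number of direct aftershocks per event. Values >= 1 mean a
    simulated sequence never dies out -- the 'supercriticality' issue."""
    time_integral = C ** (1.0 - P) / (P - 1.0)  # integral of (t+c)^-p, p > 1
    # average productivity over a truncated Gutenberg-Richter distribution
    m, dm = np.linspace(M_C, m_max, n, retstep=True)
    pdf = b * np.log(10) * 10.0 ** (-b * (m - M_C))
    pdf /= np.sum(pdf) * dm  # renormalize for the truncation at m_max
    mean_prod = np.sum(K0 * 10.0 ** (ALPHA * (m - M_C)) * pdf) * dm
    return mean_prod * time_integral

print(f"branching ratio: {branching_ratio():.2f}")
```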
- Spatial ETAS
- ETAS type models can zero in on aftershock hot-spots
- Using spatial omori type
- Need some spatial kernel
- Moving from spatial rates to hazards
- Couple forecasts with GMPE to produce ground motions
- MMI regression-based models
- Testability
- Challenges associated with incorporating hazard, because it eliminates some granularity in the forecast model
- Worried about Type II error
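For the "need some spatial kernel" point, a minimal sketch of one common choice: an isotropic power-law kernel f(r) ∝ (r² + d²)^(−q), normalized to integrate to 1 over the plane. The values of d and q are illustrative assumptions, not fitted parameters:

```python
import numpy as np

# Isotropic power-law aftershock kernel, normalized over the plane:
# f(r) = (q - 1)/pi * d**(2*(q - 1)) * (r**2 + d**2)**(-q)
D_KM, Q = 1.0, 1.8  # illustrative kernel scale (km) and decay exponent

def spatial_kernel(r_km):
    return (Q - 1.0) / np.pi * D_KM ** (2.0 * (Q - 1.0)) \
        * (r_km ** 2 + D_KM ** 2) ** (-Q)

# Quick normalization check by integrating f(r) * 2*pi*r out to 500 km:
r, dr = np.linspace(0.0, 500.0, 200_000, retstep=True)
print(np.sum(spatial_kernel(r) * 2.0 * np.pi * r) * dr)  # ~1.0
```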
- Time-Dependent Background seismicity
- Particularly useful for earthquake swarms where background seismicity differs from 'normal' rate
- Could determine rate from previous swarms
- Potential issues:
- Swarm duration
- Considerable variability in swarm durations
- Solving using "life expectancy" table, but limited data in southern California
- Likely need some physical constraints on distribution functions
- Using STETAS to use standard catalog without needing declustering
- Hydro mechanical models for stressing-rate can be used for induced seismicity
- rate-and-state framework
- Testing strategy:
- Given a swarm; how long should we provide forecasts?
Day 2
- UCERF3
  - Three models
    - Time-independent
      - Fault-based approach that splits faults into subsections
      - Rate of rupture computed from the Grand Inversion (see pub for details)
      - Add gridded off-fault seismicity
      - Logic tree used to capture the epistemic uncertainty
      - Fault participation most important (i.e., what is the probability of a particular fault hosting an eq ≥ Mw?)
    - Time-dependent
      - Based on Reid renewal statistics
      - Additional logic-tree branches added
    - ETAS
      - Ignoring faults gives rise to a discrepancy between ETAS and elastic-rebound-type models
      - Combines UCERF3-TD with an ETAS model and produces synthetic catalogs
      - Issues:
        - Variability of the MFD throughout CA
        - GR not consistent with data
        - Main question: what is the conditional probability of observing a large eq given an observed small eq?
        - Determined that elastic rebound is necessary
        - Rate of small events not always consistent with the rate of expected aftershocks
      - Operationalizable, but needs significant resources
        - Major question: does it have value?
        - HayWired scenario recently published in SRL
        - Shows value if interested in severe shaking
        - Faults important for low-probability, high ground motions
      - Testing UCERF3-ETAS
        - Fault participation, not nucleation
        - Logic-tree branches
        - Elastic rebound/aperiodicity
        - Characteristic behavior near faults
        - Retrospective testing
        - Aleatory variability and sequence-specific ETAS parameters
- RSQSim Rate-State Earthquake Simulator
  - Physics-based forecasting model based on rate-and-state (R&S) friction
  - Using RSQSim ruptures in hazard assessments
    - Need to create UCERF3-style ruptures
    - Do RSQSim ruptures pass the UCERF3 plausibility criteria?
      - Surprisingly, not all RSQSim ruptures passed the Coulomb criterion; ~17.5% did not pass
      - Multi-fault ruptures tend to agree between UCERF3 and RSQSim
  - RSQSim agrees well with UCERF3 without specific tuning of recurrence intervals
    - Also: repeat times and short-period ground motions
    - But starts to disagree at longer spectral periods
  - Interesting conditional probabilities (see the sketch after this list):
    - What is the probability of having two Mw 7 events on the Mojave within 1 week?
      - 4.5% in UCERF3 and 5.6% in RSQSim
  - Could look at two-point statistics: pairwise differences between centroids
  - CSEP2 needs to be able to compare synthetic catalogs coming from RSQSim, i.e., the ability to handle non-standard catalog sources
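A sketch of how the quoted conditional probability could be estimated directly from a long synthetic catalog. The catalog layout (times in days, magnitudes, a per-event fault-section label) is a simplifying assumption for illustration, not the actual RSQSim output format:

```python
import numpy as np

def prob_pair_within_window(times_days, mags, sections,
                            m_min=7.0, window_days=7.0, section="mojave"):
    """Fraction of M >= m_min events on the given fault section that are
    followed by another qualifying event within window_days -- a simple
    estimator of the conditional pair probability quoted above."""
    sel = (mags >= m_min) & (sections == section)
    t = np.sort(times_days[sel])
    if len(t) < 2:
        return 0.0
    followed = np.diff(t) <= window_days
    return followed.sum() / len(t)
```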
- Objectives & Challenges of Model Testing
  - Not enough data
  - Expand data sources in space and time
    - e.g., incorporate South America
    - retrospective testing experiments
    - time dependence
    - extend authoritative data sets
  - Issue 2: primitive models based on point-based models (i.e., hypocentral/nucleation)
    - Need: simulation-, fault-, and physics-based models
    - 3D models
    - Need to account for non-Poissonian & correlation structures
    - [DJ] Expand the UCERF3 approach to new locations to understand principal components
  - Issue 3: need to build complete probability models accounting for ontological error (unknown unknowns)
    - How to go from a logic tree to a continuous pdf?
    - Need to consider correlations within the tree
  - Issue 4: testing fault-based models
    - Lots of work to be done
    - 'Turing'-style testing can help… Page (2018)
- Turing Tests of UCERF3
  - Properly accounting for spatial diffusivity
  - Inter-sequence aftershock productivity
  - Foreshock and aftershock productivity as a function of differential magnitude
  - Nearest-neighbor separations (see the sketch after this list)
  - Analysis of clusters
  - Paleo hiatus
  - [NF] What are the possible explanations of the hiatus?
  - Supercycle: extreme clustering over extreme periods
  - CSEP testing should be more visual; include this in CSEP2
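For the nearest-neighbor-separations item, a hedged sketch in the spirit of a Zaliapin-style space-time-magnitude metric. The b-value, fractal dimension, and minimum-distance floor below are typical literature choices used only for illustration:

```python
import numpy as np

# Typical literature choices, used only for illustration.
B, DF, R_MIN_KM = 1.0, 1.6, 0.1

def nn_separations(t_days, x_km, y_km, mags):
    """Nearest-neighbor space-time-magnitude distance for each event
    (events must be sorted by time; the first gets NaN). Compare the
    histogram from a synthetic catalog against the observed catalog --
    the visual, Turing-style check called for above."""
    n = len(t_days)
    eta = np.full(n, np.nan)
    for j in range(1, n):
        dt = t_days[j] - t_days[:j]                      # >= 0 for parents
        dr = np.hypot(x_km[j] - x_km[:j], y_km[j] - y_km[:j])
        eta[j] = np.min(dt * np.maximum(dr, R_MIN_KM) ** DF
                        * 10.0 ** (-B * mags[:j]))
    return eta
```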
- Comparing R&J with ETAS
  - R&J -> ETAS:
    - Secondary sequences
    - Faster adaptation
    - Spatial forecasts
    - Better estimates of the range of outcomes
  - CSEP 1-day forecasts begin at the start of the day; lose some power
  - Challenges for USGS testing:
    - Overlapping windows
    - Updating forecasts within a window
    - New RJ89 method no longer Poissonian
    - ETAS forecasts are not Poissonian
    - All violate CSEP testing methods
  - Likelihood based on the Poisson distribution using standard statistical tests
  - CSEP strategy:
    - Poisson numbers based on RJ forecasts
  - Strategy:
    1. Want the distribution of events in a window that starts at t with duration d
    2. Use R&J and ETAS to simulate "real" observations
    3. Fit R&J to the ETAS model: the fit is the mode of the individual ETAS runs
  - Using overlapping windows, CSEP assumptions on independence fail; solved by incorporating R&J simulations, forecasts, and observations
  - ETAS observations fail because they are not Poisson distributed
  - R&J fails when considering a large-magnitude mainshock
  - Solution: all ETAS, all the time
  - Takeaway: forecasts and observations must be consistent
  - Conclusions:
    - Non-Poissonian behavior
    - Simulation-based forecasts could address some issues
    - Will also handle overlapping time windows
    - RJ will fail assuming that the world is like ETAS
  - CSEP must take the forecasts in as simulations in order to test
- Moving past Poisson
  - Poisson likelihood does not allow for clustering
  - Three ways to eliminate this:
    - Adjusted likelihood simulations (in other words, remove the likelihood)
    - Normal approximation
    - K-S test (could work with Turing-style tests too)
  - N-test could be fixed by using a negative binomial if the dispersion is supplied (see the sketch after this list)
  - Accommodating simulation-based models is the better solution
  - Simulations can preserve space-time clustering
  - (CSEP needs to separate forecasting from modelling)
  - Consistency tests of simulation-based forecasts
    - General approach is to compare a statistic computed from the simulated catalogs with the same statistic from the observed catalog
    - For example, the inter-event time distribution
    - P-values should be uniform on [0,1]
  - Interested in improving models: looking at information gain
    - No obvious way to transparently estimate it without gridding
    - Standard CSEP information gain is not restricted to Poisson
  - CSEP needs to be able to retroactively evaluate new tests; in other words, become a testing center
  - Moving past parametric distribution functions in favor of non-parametric, simulation-based models
  - CSEP could support individual testing; this will be more straightforward with agreed-upon simulated-catalog formats
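Two of the fixes above are easy to sketch: the catalog-based consistency test (compare an observed statistic against its distribution over simulated catalogs; the quantile scores should be uniform on [0, 1]) and the negative-binomial replacement for the Poisson N-test. The function names and the choice of example statistic are illustrative:

```python
import numpy as np
from scipy import stats

def consistency_quantile(observed_catalog, simulated_catalogs, statistic):
    """Quantile of the observed statistic within its simulated distribution.
    Over many tests these scores should be uniform on [0, 1] if the
    forecast is consistent with the data."""
    sims = np.array([statistic(c) for c in simulated_catalogs])
    return float(np.mean(sims <= statistic(observed_catalog)))

# Example statistic: median inter-event time (catalog = array of event times).
median_dt = lambda times: np.median(np.diff(np.sort(times)))

def n_test_negbinom(n_obs, mean, variance):
    """N-test tail probability using a negative binomial instead of a
    Poisson, for a forecast that supplies a dispersion (variance > mean).
    scipy's nbinom(n, p) has mean n*(1-p)/p, so p = mean/variance."""
    p = mean / variance
    n = mean * p / (1.0 - p)
    return min(stats.nbinom.cdf(n_obs, n, p),       # too few events?
               stats.nbinom.sf(n_obs - 1, n, p))    # too many events?
```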
- Current CSEP Testing Approaches
  - Two approaches:
    - Establishing discrepancies/agreement with observations
      - E.g., number of earthquakes
      - Likelihood
    - Comparing against other models
      - How much better or worse does one model do?
  - Installed methods:
    - Number test: compares the observed number of epicenters with the forecast number in each bin
    - Likelihood test: based on the RELM setup using mainshock and mainshock+aftershock classes, with mainshocks declustered using the Reasenberg algorithm. Assumes that the model is the data-generating process.
    - Conditional likelihood test: set the simulated number equal to the observed number, and place simulated eqs in bins according to relative rates
    - Space test: collapses the forecast onto the spatial domain. Integrate over magnitude and set simulated = observed. Use relative rates. Calculate simulated LL scores.
      - One of the more interesting tests
    - Magnitude test:
      - Same as the S-test, but integrating over space
      - Not particularly powerful; could use a more powerful K-S test
    - Information gain per earthquake: is the "rate-corrected" information gain significantly greater than 0? (see the sketch after this section)
      - Paired t-test (T-test)
        - Differences must be approximately independent
        - If differences are not iid normal, CLT!
      - Wilcoxon signed-rank test (W-test)
        - Less powerful
        - Requires symmetric data
      - Differences are proportional to error bounds, i.e., a large difference -> large error bounds
      - Error bounds only apply to forecast pairs
  - Residuals-based:
    - Residual: difference between local forecast and observation
    - Raw residual: bin-wise difference between the observed number and the forecast
    - Pearson residuals: normalized cell-wise difference between rate and observed number
    - Deviance residuals: difference between (point-process) log-likelihood scores
  - Hit & miss tests:
    - Receiver operating characteristic
    - Molchan error diagram
    - Area skill score
  - Goal is to evaluate different aspects of the forecasting model
  - Interpreting results requires going back to the models. A shortcoming of CSEP results is that there is not enough scientific discussion of the evaluations within the context of the models.
  - 10 years of data collected by the testing centers at SCEC and GNS Science. Need more results.
    - 1-day forecasting for California
    - Over 200 eqs for New Zealand; lots of science to be done here. Kaikoura and Christchurch…
    - The curated dataset is a valuable resource for the scientific community.
  - Next steps:
    - Ensemble modeling
      - Marzocchi et al., 2012
      - BMA: averaging based on the previously best-performing model, which makes it better for selecting models
      - Using additive or multiplicative models for combining models
    - Simulation-based forecasts
      - See the previous talks from Morgan Page and David Rhoades
      - NSIM: number of target eqs
      - Earthquake rate distribution
      - Inter-event time distribution
      - Inter-event distance distribution
    - External forecasts and predictions
      - QuakeFinder-type predictions
      - No implemented evaluation method
      - Critical for real-time forecasts and predictions generated externally to the CSEP platform
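A minimal sketch of the information-gain comparison described above, in the spirit of the paired T-test of Rhoades et al. (2011). The input layout (forecast rates in the bins containing the observed events, plus each model's total expected count) is an assumption for illustration:

```python
import numpy as np
from scipy import stats

def information_gain_ttest(rates_a, rates_b, n_hat_a, n_hat_b):
    """Information gain per earthquake of model A over model B and the
    paired t-test p-value (H0: no gain). rates_a/rates_b are the models'
    forecast rates in the bins containing the observed earthquakes;
    n_hat_a/n_hat_b are the models' total expected event counts."""
    n_obs = len(rates_a)
    # per-event log-rate differences, corrected for the total-rate difference
    diffs = np.log(rates_a) - np.log(rates_b) - (n_hat_a - n_hat_b) / n_obs
    t_stat, p_value = stats.ttest_1samp(diffs, 0.0)
    return diffs.mean(), p_value
```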
- Testing Fault-Based Models
  - Association problem: mapping an eq onto the UCERF3 fault model
  - Need to understand the stopping probabilities associated with stopping between fault segments
  - Proposed procedure:
    1. Separate a linear fault into sections
    2. For each section: estimate the nucleation rate for eqs of interest
    3. Estimate the conditional probabilities of an earthquake stopping
    4. Evaluate the frequency of eqs for each pair of sections
  - Could rely on aftershocks to determine the extent of the rupture plane, or maybe a finite-fault inversion
  - Fault participation is the most important thing to test for fault-based models
  - A null hypothesis can be established using the following assumptions (see the sketch after this list):
    - Known magnitude distribution
    - Known scaling between Mw and length
    - Uniform distribution of rupture locations on the fault
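The null hypothesis above is simple enough to simulate. A sketch under the stated assumptions: truncated Gutenberg-Richter magnitudes, a magnitude-length scaling (the constants below are illustrative Wells-and-Coppersmith-style values), and uniform rupture placement along a linear fault. The output is a per-section participation probability to compare against a fault-based model:

```python
import numpy as np

rng = np.random.default_rng(0)

FAULT_KM, N_SECTIONS, B = 300.0, 30, 1.0  # illustrative fault geometry

def simulate_participation(n_ruptures=100_000, m_min=6.0, m_max=8.0):
    # truncated Gutenberg-Richter magnitudes via inverse-CDF sampling
    u = rng.random(n_ruptures)
    m = m_min - np.log10(1.0 - u * (1.0 - 10.0 ** (-B * (m_max - m_min)))) / B
    # illustrative magnitude-to-length scaling (km), capped at the fault length
    length = np.minimum(10.0 ** (-2.44 + 0.59 * m), FAULT_KM)
    # uniform placement of the rupture along the fault
    start = rng.random(n_ruptures) * (FAULT_KM - length)
    edges = np.linspace(0.0, FAULT_KM, N_SECTIONS + 1)
    # section i participates if the rupture overlaps [edges[i], edges[i+1]]
    hits = (start[:, None] < edges[None, 1:]) & \
           ((start + length)[:, None] > edges[None, :-1])
    return hits.mean(axis=0)  # participation probability per section

print(simulate_participation().round(3))  # central sections participate most
```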
- Considering Epistemic Uncertainty
  - Aleatory variability: inherent complexity or randomness in some physical process
  - Epistemic uncertainty: comes from our lack of knowledge about the process
  - Exchangeable events allow testing of Bayesian models in a frequentist framework
  - Modifying the experimental concept allows for ontological testing of exchangeable sequences
  - Hierarchy of uncertainty necessary for testing:
    - Aleatory variability -> frequentist methods
    - Epistemic uncertainty -> Bayesian methods
    - Ontological error -> rejection of the 'ontological' null hypothesis
      - States that the true hazard is a realization of the extended experts' distribution (EED)
      - Rejection of this null hypothesis implies ontological error
    - Ontological tests require an 'experimental concept' that conditions the aleatory variability of the natural system
- Ensemble Modeling and Hybrid Models
  - Definition: inferring the extended experts' distribution from the sample provided by any set of models that sample the epistemic uncertainty
  - Combining models allows the ensemble to perform only slightly worse than the best-performing model. Useful when not sure what the 'correct' model is.
  - Two main features:
    - Describes epistemic uncertainty
    - Significantly increases the skill of the forecast
  - Hybrid models to increase information gain (see the sketch after this list):
    - Additive hybrids
      - Best-fitting linear combination of models
    - Maximum hybrids
    - Multiplicative hybrids
      - Exploit independent information, e.g., GPS and smoothed seismicity
  - Form hybrids for better information gain
  - Does not require a choice of best model, but leverages all models
  - Could be a target for CSEP to help build hybrid models -> improve collaborations
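A minimal sketch of the additive and multiplicative hybrid constructions described above, for gridded rate forecasts. In practice the weights and exponents would be fit by maximizing likelihood on a training catalog, so the parameters here are free inputs:

```python
import numpy as np

def additive_hybrid(forecasts, weights):
    """Convex combination of rate forecasts (1-D arrays of expected
    counts per space-magnitude bin)."""
    w = np.asarray(weights, dtype=float)
    w /= w.sum()
    return sum(wi * f for wi, f in zip(w, forecasts))

def multiplicative_hybrid(baseline, covariate, a, b, eps=1e-12):
    """Modulate a baseline rate by an independent covariate model
    (e.g., smoothed seismicity modulated by a GPS strain-rate map),
    then rescale to preserve the baseline's total expected count."""
    raw = np.exp(a) * baseline * np.maximum(covariate, eps) ** b
    return raw * baseline.sum() / raw.sum()
```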
- Event-Based Testing
  - Could move to a solution where you can make updates to the forecast during the forecast period. Likely more important for long-term models.
  - Might want to update the forecast both when an event happens and when an event doesn't happen.
Day 3
- Milestones for UCERF3 Testing Program (U3-TP)
  - Goals:
    - Verify
    - Validate
    - Valuate
  - Milestones:
    - Develop infrastructure
    - Retrospective testing of UCERF3
    - Prospective testing of UCERF3
    - Comparatively evaluate U3 against empirical models and physics-based models
  - 5 types of testing:
    - Exploratory testing: Turing
    - Comparative: T- and W-tests
    - Mean-hazard testing: null-hypothesis significance testing
    - Ontological testing: including epistemic uncertainty
    - Sequence-specific testing: testing U3-ETAS against observed aftershock sequences
  - Guiding principle: all OEF models should be under continual prospective testing; put ETAS under operational testing. Need to find the value for UCERF3-ETAS testing.
  - Could possibly find out what aspects of U3 are superfluous and simplify the model for use in other locales.
  - Slip-rate data are special to California data sets
  - Important to compare against physics-based earthquake simulators such as RSQSim to evaluate certain assumptions in the model.
  - UCERF3 drivers of CSEP2:
    - Datasets used in prospective testing must be versioned and archived. Should include analyzed datasets and raw catalogs.
    - Testing on simulated event catalogs
  - Benefits of UCERF3-TP to the USGS:
    - Scientific value
    - Software infrastructure
- List of Possible Milestones
  - CSEP1.0: What products would be useful?
    - Do we need to keep operationalizing CSEP1.0?
    - New models necessary?
      - [Max] CSEP1's big achievement/success was incorporating new models.
      - [Ned] Should support new models, but need to be more selective.
      - [Max] Wants to publish the CA 1-day forecast results.
      - [Morgan] Value in providing the CSEP1.0 data set publicly.
      - [Mike B.] Need some clear scientific findings, and be provocative about this. Need to find models that can be rejected and are not worth pursuing. Data need to be curated in a way that allows scientists to access and work with them. Would build community.
      - [Peter] Need products that are digestible by the public.
  - Simulation-based testing
    - Methods:
      - ETAS, U3-ETAS, …
    - Need: a process for defining timelines, and which models we would be evaluating.
  - Event-triggered/sequence-specific testing
  - Comparative valuations
    - Turing tests
    - Verification
    - Inter-comparison of models
  - Modeling catalog completeness
  - Modeling epistemic uncertainties: important, but IT challenges.
  - Fault/cell participation
  - [Ned] Testing usefulness!
  - Rupture association problem
  - Fault characteristics
  - UCERF3-TD elastic-rebound testing
  - [Peter] Declustering work could help the USGS. Lots of ambiguity among different declustering models.
  - [Phil] CSEP should expose its methods so users can leverage the algorithms.
  - [Mike B.] The valuation question is most important. Should physics-based simulations be used for hazard?
  - [Kevin] Errors need to be propagated moving forward!
  - [Phil] Need to figure out how the data will be provided to scientists.
  - [All] A web-based interface would be valuable.
- Notes about the SRL special issue
  - Targeting publication in the July/August issue of SRL; have some time to develop the webpage.
  - Expect press material surrounding the SRL release, so need to be prepared with digestible figures.