CSEP Powell Center 2018

CSEP2 Challenges

  • Testing fault and simulation-based models
  • Care about low prob. events
  • Should we be testing something other than nucleation?
  • Is UCERF3-ETAS more valuable given the alternatives?
  • Epistemic Uncertainties

Day 1

Reasenberg & Jones for USGS OAF
  • Rate of aftershocks ≥ M at time t after a mainshock of given magnitude (see the sketch after this list)
  • Improvements made to the Reasenberg & Jones model to update generic parameters for California
  • Aftershock forecast for Mw ≥ 5 using improved R&J model
  • Automating aftershock forecasts for the US (in progress w/ code development challenges)
  • Moving past R&J in favor of ETAS, but R&J could still be useful for UCERF3-ETAS testing
  • Testability challenges:
    • Overlapping, non-independent forecasts
    • EQ prob. dist. not necessarily Poissonian
    • Temporal forecasts with poorly defined spatial area
    • R&J not great with substantial triggering (e.g., swarms)
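
As a worked example of the R&J rate above, a minimal Python sketch. The parameter values (a ≈ -1.67, b ≈ 0.91, c ≈ 0.05 days, p ≈ 1.08) are the commonly quoted Reasenberg & Jones (1989) generic California values, not the updated ones discussed above; treat them as placeholders.

<syntaxhighlight lang="python">
def rj_rate(t_days, m, mainshock_mag, a=-1.67, b=0.91, c=0.05, p=1.08):
    """Reasenberg & Jones (1989) rate of aftershocks with magnitude >= m,
    in events/day, at t_days after a mainshock of magnitude mainshock_mag:
    lambda(t, M) = 10**(a + b*(Mm - M)) * (t + c)**(-p)."""
    return 10.0 ** (a + b * (mainshock_mag - m)) * (t_days + c) ** (-p)

def rj_expected_count(t1, t2, m, mainshock_mag, a=-1.67, b=0.91, c=0.05, p=1.08):
    """Expected number of aftershocks >= m in the window [t1, t2] days,
    from the closed-form integral of the modified Omori decay (p != 1)."""
    k = 10.0 ** (a + b * (mainshock_mag - m))
    return k * ((t2 + c) ** (1.0 - p) - (t1 + c) ** (1.0 - p)) / (1.0 - p)

# Example: expected number of M >= 5 aftershocks in the week after an Mw 7.
print(rj_expected_count(0.0, 7.0, 5.0, 7.0))
</syntaxhighlight>
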
Update on ETAS Forecasting
  • GUI interface to compute manual forecasts for external uses.
  • AIC prefers ETAS; more complicated models are not favored over the simple 3-parameter model (see the sketch after this list)
  • Performs better than R&J
  • Issues with "supercriticality"
    • Could solve by fitting mainshock separately
  • For global problem (and local): estimating magnitude of completeness and b-value
  • Need to limit supercriticality before OEF can be given to non-experts
  • "Similarity forecast" can be implemented as mask/failsafe to reduce surprises
    • Defined as having "similar number of earthquakes in binned magnitudes"
    • ETAS has ½ the surprise rate of R&J
    • Or could be included in ensemble
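
For reference, a minimal temporal-ETAS intensity sketch. The "3-parameter" fit above presumably corresponds to (k, c, p) with the remaining terms fixed, but that mapping, and all names here, are illustrative. Supercriticality corresponds to each event triggering, on average, one or more direct aftershocks, so simulated sequences never die out.

<syntaxhighlight lang="python">
import numpy as np

def etas_intensity(t, event_times, event_mags, mu, k, c, p, alpha, m_c):
    """Temporal ETAS conditional intensity at time t (events per unit time):
    lambda(t) = mu + sum_{t_i < t} k * 10**(alpha*(M_i - m_c)) * (t - t_i + c)**(-p).
    mu is the background rate; the sum is the triggered (aftershock) part."""
    event_times = np.asarray(event_times)
    event_mags = np.asarray(event_mags)
    past = event_times < t
    dt = t - event_times[past]
    trig = k * 10.0 ** (alpha * (event_mags[past] - m_c)) * (dt + c) ** (-p)
    return mu + trig.sum()
</syntaxhighlight>
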
Spatial ETAS
  • ETAS type models can zero in on aftershock hot-spots
    • Using spatial Omori-type decay
    • Need some spatial kernel
  • Moving from spatial rates to hazards
    • Couple forecasts with GMPE to produce ground motions
    • MMI regression-based models
  • Testability
    • Challenges associated with incorporating hazard, because it eliminates some granularity in the forecast model
    • Worried about Type II error
Time-Dependent Background seismicity
  • Particularly useful for earthquake swarms where background seismicity differs from 'normal' rate
    • Could determine rate from previous swarms
    • Potential issues:
      • Swarm duration
      • Considerable variability in swarm durations
      • Solving using "life expectancy" table, but limited data in southern California
      • Likely need some physical constraints on distribution functions
  • STETAS allows using the standard catalog without needing declustering
    • Hydro mechanical models for stressing-rate can be used for induced seismicity
    • rate-and-state framework
  • Testing strategy:
    • Given a swarm; how long should we provide forecasts?

Day 2

UCERF3
  • Three models
    • Time-independent
      • Fault-based approach that splits faults into subsections
      • Rate of rupture computed from Grand Inversion (see pub for details)
      • Add gridded off-fault seismicity
      • Logic tree used to capture the epistemic uncertainty
      • Fault participation most important (i.e., what is the prob. of a particular fault hosting an eq above a given Mw)
    • Time-dependent
      • Based on Reid renewal statistics
      • Additional logic tree branches added
    • ETAS
      • Ignoring faults gives rise to discrepancy between ETAS and elastic rebound type models
      • Combines UCERF3-TD with an ETAS model and produces synthetic catalog
      • Issues:
        • Variability of MFD throughout CA
        • GR not consistent with data
        • Main question: what is the conditional prob of observing large eq given an observed small eq?
        • Determined that elastic rebound necessary
        • Rate of small events not always consistent with rate of expected aftershocks
      • Operationalizable, but needs significant resources
        • Major question: Does it have value?
        • HayWired scenario recently published in SRL
        • Shows value if interested in severe shaking.
        • Faults important for low prob high ground motions.
      • Testing UCERF3-ETAS
        • Fault participation, not nucleation
        • Logic-tree branches
        • Elastic rebound/aperiodicity
        • Characteristic behavior near faults
        • Retrospective testing
        • Aleatory variability and sequence-specific ETAS parameters
RSQSim Rate-State Earthquake Simulator
  • Physics-based forecasting model based on rate-and-state (R&S) friction
  • Using RSQSim ruptures in hazard assessments
    • Need to create UCERF3-style ruptures
    • Do RSQSim ruptures pass UCERF3 plausibility criteria?
      • Surprisingly, not all RSQSim ruptures pass the Coulomb criterion: ~17.5% did not pass
      • Multi-fault ruptures tend to agree between UCERF3 and RSQSim
  • RSQSim agrees well with UCERF3 without specific tuning of recurrence intervals
    • Also: repeat times and short period ground motions
    • but starts to disagree at longer spectral periods
  • Interesting conditional probs (see the sketch after this list):
    • What is the prob. of having two Mw 7 earthquakes on the Mojave section within 1 week?
      • 4.5% in UCERF3 and 5.6% in RSQSim
  • Could look at two-point statistics: pairwise differences between rupture centroids
  • CSEP2 needs to be able to compare synthetic catalogs coming from RSQSim, i.e., the ability to handle non-standard catalog sources.
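
A sketch of how a conditional probability like the Mojave pair above can be read off a long synthetic catalog. The catalog arrays and the restriction to one fault section are placeholders; the real RSQSim/UCERF3 comparison requires rupture-to-section association first.

<syntaxhighlight lang="python">
import numpy as np

def prob_pair_within_window(times_days, mags, m_min=7.0, window_days=7.0):
    """Empirical probability, given an event with M >= m_min, that the
    next M >= m_min event follows within window_days, estimated from one
    long synthetic catalog already filtered to the region of interest
    (e.g., Mojave-section ruptures only)."""
    t = np.sort(np.asarray(times_days)[np.asarray(mags) >= m_min])
    if t.size < 2:
        return 0.0
    return float(np.mean(np.diff(t) <= window_days))
</syntaxhighlight>
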
Objectives & Challenges of Model Testing
  • Not enough data
  • Expand data sources in space and time
    • e.g., incorporate South America
    • retrospective testing experiments
    • time-dependence
    • extend authoritative data sets
  • Issue 2: primitive, point-based models (i.e., hypocentral/nucleation)
    • need: simulation- , fault-, and physics-based models
    • 3d models
    • Need to account for non-Poissonian & correlation structures
    • [DJ] Expand UCERF3 approach to new locations to understand principal components
  • Issue 3: need to build complete prob. models accounting for ontological error (unknown unknowns)
    • How to go from logic tree to continuous pdf?
    • Need to consider correlations within tree
  • Issue 4: testing fault-based models
    • Lots of work to be done
    • 'Turing' style testing can help… Page (2018)
Turing Tests of UCERF3
  • properly accounting for spatial diffusivity
  • Inter-sequence aftershock productivity
  • Foreshock and aftershock productivity as function of differential magnitude
  • Nearest neighbor separations
  • Analysis of clusters
  • Paleo hiatus
  • [NF] What are the possible explanations of the hiatus?
  • Super cycle: extreme clustering over extreme period
  • CSEP testing should be more visual; such visual testing should be included in CSEP2
Comparing R&J with ETAS
  • R&J -> ETAS
    • Secondary sequences
    • Faster adaptation
    • Spatial forecasts
    • Better estimates of the range of outcomes
  • CSEP 1-day forecasts begin at the start of the day; lose some power
  • Challenges for USGS testing:
    • Overlapping windows
    • Update forecasts within window
    • New RJ89 method no longer Poissonian
    • ETAS forecasts are not Poissonian
    • All violate CSEP testing methods
  • Likelihood based on Poisson distribution using standard statistical test
  • CSEP strategy:
    • Poisson numbers based on RJ forecasts
  • Strategy:
  1. Want dist. of events in a window that starts at t with duration d
  2. Using R&J and ETAS to simulate "real" observations
  3. Fit R&J to ETAS model: fit is the mode of the individual ETAS runs
  • With overlapping windows, CSEP assumptions of independence fail; solved by incorporating R&J simulations, forecasts, and observations
  • ETAS observations fail because they are not Poisson distributed
  • R&J fails when considering large-magnitude mainshocks
  • Solution: all ETAS all the time
  • Takeaway: forecasts and observations must be consistent
  • Conclusion
    • Non-Poissonian behavior
    • Simulation based forecasts could address some issues
    • Will also handle overlapping time windows
    • RJ will fail assuming that the world is like ETAS
  • CSEP must take the forecasts in as simulations in order to test (see the sketch below)
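
One reading of "take the forecasts in as simulations": test observed counts directly against the distribution of simulated counts, with no Poisson assumption. A minimal sketch; names are illustrative, not a specific CSEP implementation.

<syntaxhighlight lang="python">
import numpy as np

def simulation_n_test(observed_count, simulated_counts, alpha=0.05):
    """Number test against a forecast supplied as simulated catalogs.
    delta1: fraction of simulations with at least the observed count
    (small delta1 suggests the forecast under-predicts); delta2: fraction
    with at most the observed count (over-prediction). Two-sided test."""
    sims = np.asarray(simulated_counts)
    delta1 = float(np.mean(sims >= observed_count))
    delta2 = float(np.mean(sims <= observed_count))
    reject = (delta1 < alpha / 2.0) or (delta2 < alpha / 2.0)
    return delta1, delta2, reject
</syntaxhighlight>
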
Moving past Poisson
  • Poisson likelihood does not allow for clustering
  • Three ways to eliminate it:
    • Adjusted likelihood simulations (in other words, remove likelihood)
    • Normal approximation
    • K-S test (could work with Turing style tests too)
  • N-test could be fixed by using a negative binomial if the dispersion is supplied (see the sketch after this list)
  • Accommodating simulation-based models is a better solution
  • Simulations can preserve space-time clustering
  • (CSEP needs to separate forecasting from modelling)
  • Consistency tests of simulation-based tests
    • General approach is to compare statistic computed from simulated catalog with same statistics from observed catalog
    • For example, inter-event time distribution
    • P-values should be uniform on [0,1]
  • Interested in improving models: look at information gain
    • Not obvious way to transparently estimate without gridding
    • Standard CSEP information not restricted to Poisson
  • CSEP needs to be able to retroactively evaluate new tests, in other words become a testing center.
  • Moving past parametric distribution functions in favor of non-parametric simulation-based models
  • CSEP could support individual testing, will be more straightforward with agreed upon simulated catalog formats
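
A sketch of the negative-binomial fix to the N-test mentioned above, assuming the forecast supplies a mean and a variance (the dispersion); the conversion p = mean/var, n = mean^2/(var - mean) is the standard mean/variance parameterization of the negative binomial.

<syntaxhighlight lang="python">
from scipy.stats import nbinom

def nbinom_n_test(observed_count, forecast_mean, forecast_var, alpha=0.05):
    """N-test with a negative binomial in place of the Poisson, usable
    once the forecast supplies a dispersion (variance > mean)."""
    assert forecast_var > forecast_mean, "negative binomial needs overdispersion"
    p = forecast_mean / forecast_var
    n = forecast_mean ** 2 / (forecast_var - forecast_mean)
    delta1 = 1.0 - nbinom.cdf(observed_count - 1, n, p)  # P(N >= observed)
    delta2 = nbinom.cdf(observed_count, n, p)            # P(N <= observed)
    reject = (delta1 < alpha / 2.0) or (delta2 < alpha / 2.0)
    return delta1, delta2, reject
</syntaxhighlight>
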
Current CSEP Testing Approaches
  • Two approaches
    • Establishing discrepancies/agreement with observations
      • E.g., number of earthquakes
      • Likelihood
    • Comparing against other models
      • How much better or worse does one model do
  • Installed methods
    • Number test: compares forecast and observed numbers of epicenters per bin
    • Likelihood test: based on RELM model setup using mainshock and mainshock+aftershock classes, mainshocks declustered using Reasenberg. Assume that the model is the data-generating process.
    • Conditional likelihood test: set simulated = observed, and place sim eqs in bins according to relative rates
    • Space test: collapses forecast into spatial domain. Integrate over magnitude and set simulated = observed. Use relative rates. Calculate simulated LL scores
      • One of the more interesting tests
    • Magnitude test:
      • Same as S test but integrating over space.
      • Not particularly powerful, could be using a more powerful KS test.
    • Information gain per earthquake: is the "rate-corrected" information gain significantly greater than 0? (See the sketch after this list.)
      • Paired t-test (T-test)
        • Differences must be approximately independent
        • If differences are not iid normal, CLT!
      • Wilcoxon signed rank test (W-test)
        • Less powerful
        • Requires symmetric data
      • Differences are proportional to error bounds, i.e., large difference -> large error bounds
      • Error bounds only apply to forecast pairs
  • Residuals based:
    • Residual: difference between local forecast and observation
    • Raw residual: bin-wise difference between observed # and forecast
    • Pearson residuals: normalized cell-wise difference between rate and observed number
    • Deviance residuals: difference between (point-process) log-likelihood scores.
  • Hit&miss tests
    • Receiver-operating characteristic
    • Molchan error diagram
    • Area-skill-score
  • Goal is to evaluate different aspects of the forecasting model
  • Interpreting results requires going back to the models. A shortcoming of CSEP results is that there is not enough scientific discussion of the evaluations within the context of the models.
  • 10 years of data collected by testing centers at SCEC and GNS Science. Need more results.
    • 1 day forecasting for California.
    • Over 200 eqs for New Zealand, lots of science to be done here. Kaikoura and Christchurch…
    • Curated dataset valuable resource for the scientific community.
  • Next steps:
    • Ensemble modeling
      • Marzocchi et al., 2012
      • BMA: averaging based on the previously best-performing model, which makes it better for selecting models
      • Using additive or multiplicative models for combining models
    • Simulation-based forecasts
      • See previous lecture from Morgan Page and David Rhoades
      • NSIM: number of target eqs
      • Earthquake rate distribution
      • Inter-event time distribution
      • Inter-event distance distribution
    • External forecasts and predictions
      • Quakefinder type predictions.
      • No implemented evaluation method.
      • Critical for real-time forecast and predictions that are generated externally to CSEP platform.
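
A sketch of the information-gain-per-earthquake comparison above, with the paired T-test and the W-test as backup, following the Rhoades et al.-style rate correction. The per-event log-likelihood inputs and total-rate bookkeeping are assumptions for illustration, not a specific CSEP implementation.

<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import ttest_1samp, wilcoxon

def information_gain_tests(ll_a, ll_b, total_rate_a, total_rate_b):
    """Rate-corrected information gain per earthquake of model A over B.
    ll_a, ll_b: per-event log-likelihoods under each forecast; the
    correction subtracts the difference in total forecast rates per event.
    T-test needs approximately independent differences (CLT if not
    normal); W-test is less powerful and requires symmetric differences."""
    diffs = np.asarray(ll_a) - np.asarray(ll_b)
    correction = (total_rate_a - total_rate_b) / diffs.size
    ign = diffs.mean() - correction
    t_p = ttest_1samp(diffs, popmean=correction).pvalue
    w_p = wilcoxon(diffs - correction).pvalue
    return ign, t_p, w_p
</syntaxhighlight>
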
Testing Fault Based Models
  • Association problem: mapping an eq to the UCERF3 fault model
  • Need to understand the probabilities associated with ruptures stopping between fault segments
  • Proposed Procedure:
  1. Separate linear fault into sections
  2. For each section: estimate nucleation rate for eqs of interest
  3. Estimate conditional probabilities of earthquake stopping
  4. Evaluate frequency of eqs for each pair of sections
  • Could rely on aftershocks to determine the extent of the rupture plane or maybe a finite-fault inversion
  • Fault participation is the most important thing to test for fault-based models.
  • Null hypothesis can be established using the following assumptions (see the sketch after this list):
    • Known magnitude distribution
    • Known scaling between Mw and length
    • Uniform distribution of rupture locations on fault
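
A Monte Carlo sketch of that null hypothesis: sample a magnitude, convert it to a rupture length with a scaling law, place the rupture uniformly on the fault, and record which sections participate. The magnitude-to-length constants are illustrative (Wells & Coppersmith-style), and all inputs are placeholders.

<syntaxhighlight lang="python">
import numpy as np

def null_section_participation(fault_km, section_edges_km, mag_samples,
                               n_sims=10000, seed=0):
    """Null-model section participation rates under the assumptions above:
    known magnitude distribution (mag_samples), known Mw-to-length scaling,
    and uniform rupture placement along a linear fault."""
    rng = np.random.default_rng(seed)
    edges = np.asarray(section_edges_km, dtype=float)
    hits = np.zeros(edges.size - 1)
    for _ in range(n_sims):
        m = rng.choice(mag_samples)                          # assumed mag dist.
        length = min(10.0 ** (-2.44 + 0.59 * m), fault_km)   # illustrative scaling (km)
        start = rng.uniform(0.0, fault_km - length)          # uniform placement
        end = start + length
        hits += (edges[:-1] < end) & (edges[1:] > start)     # sections touched
    return hits / n_sims
</syntaxhighlight>
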
Considering Epistemic Uncertainty
  • Aleatory variability: inherent complexity or randomness in some physical process
  • Epistemic uncertainty: comes from our lack of knowledge about the process
  • Exchangeable events allow testing of Bayesian models in a frequentist framework
  • Modifying the experimental concept allows ontological testing of exchangeable sequences
  • Hierarchy of uncertainty necessary for testing
    • Aleatory variability -> frequentist
    • Epistemic uncertainty -> Bayesian methods
    • Ontological error -> rejection of 'ontological' null hypothesis
      • States that the true hazard is a realization of the extended expert's distribution (EED)
      • Rejection of this null hypothesis implies ontological error
    • Ontological tests require an 'experimental concept' that conditions the aleatory variability of the natural system.
Ensemble Modeling and Hybrid Models
  • Definition: inferring the extended expert's distribution from the sample provided by any set of models that sample the epistemic uncertainty
  • Combining models allows the ensemble to perform only slightly worse than the best-performing model. Useful when it is not clear which model is 'correct'.
  • Two main features:
    • Describe epistemic uncertainty
    • Significantly increases the skill of the forecast
  • Hybrid models to increase information gain (see the sketch after this list)
    • Additive hybrid
      • Best fitting linear combination of models
    • Maximum hybrid
    • Multiplicative hybrids
      • Exploit independent information, e.g., GPS and smoothed seismicity
  • Form hybrids for better information gain
  • Does not require a choice of best model but leverages all models
  • Could be a target for CSEP to help generate hybrid models -> improve collaborations
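
A minimal sketch of the additive and multiplicative hybrids over gridded rate forecasts. In practice the weights are fitted (e.g., by maximizing retrospective likelihood), so the fixed weights here are placeholders.

<syntaxhighlight lang="python">
import numpy as np

def additive_hybrid(rate_grids, weights):
    """Additive hybrid: convex combination sum_i w_i * f_i of gridded
    rate forecasts (weights normalized to sum to 1)."""
    w = np.asarray(weights, dtype=float)
    w /= w.sum()
    return sum(wi * np.asarray(g, dtype=float) for wi, g in zip(w, rate_grids))

def multiplicative_hybrid(rate_grids, weights, total_rate):
    """Multiplicative hybrid: cell-wise product of forecasts raised to
    their weights, renormalized to a chosen total rate. Exploits
    independent information (e.g., GPS strain and smoothed seismicity).
    Assumes strictly positive rates in every cell."""
    log_mix = sum(wi * np.log(np.asarray(g, dtype=float))
                  for wi, g in zip(weights, rate_grids))
    mix = np.exp(log_mix)
    return mix * (total_rate / mix.sum())
</syntaxhighlight>
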
Event-based testing
  • Could move to a solution where updates can be made to the forecast during the forecast period. Likely more important for long-term models
  • Might want to update forecast when event happens and when event doesn't happen.

Day 3

Milestones for UCERF3 Testing Program (U3-TP)
  1. Goals:
    • Verify
    • Validate
    • Valuate
  2. Milestones
    • Develop infrastructure
    • Retrospective testing of UCERF3
    • Prospective testing of UCERF3
    • Comparatively evaluate U3 against empirical models and physics-based models
  3. 5 types of testing
    • Exploratory testing: Turing
    • Comparative testing: T- and W-tests
    • Mean-Hazard testing: null hypothesis significance testing
    • Ontological testing: including epistemic uncertainty
    • Sequence-specific testing: testing U3-ETAS against observed aftershock sequences
  • Guiding principle: all OEF models should be under continual prospective testing; put ETAS under operational testing. Need to find the value for UCERF3-ETAS testing.
  • Could possibly find out what aspects of U3 are superfluous and could simplify the model for use in other locales.
  • Slip-rate data are a special strength of California data sets
  • Important to compare against physics-based earthquake simulations such as RSQsim to evaluate certain assumptions in the model.
  • UCERF3 drivers of CSEP2
    • Datasets used in prospective testing must be versioned and archived. Should include analyzed datasets and raw catalogs
    • Testing on simulated event catalogs
  • Benefits of UCERF3-TP to the USGS
    • Scientific value
    • Software infrastructure
List of Possible milestones
  • CSEP1.0. What products would be useful?
    • Do we need to keep operationalizing CSEP1.0?
    • New models necessary?
      • [Max] CSEP1 big achievement/success was incorporating new models.
      • [Ned] Should support new models but need to be more selective.
      • [Max] Wants to publish the CA 1-day forecast results.
      • [Morgan] Value in providing CSEP1.0 data set publicly.
      • [Mike B.] Need some clear scientific findings and to be provocative about this. Need to find models that can be rejected as not worth pursuing. Data need to be curated in a way that allows scientists to access and work with them. Would build community.
      • [Peter] Need products that are digestible by the public.
  • Simulation based testing
    • Methods:
      • ETAS, U3-ETAS,
    • Need: process for defining timelines and which models we would be evaluating.
  • Event-triggered/sequence specific testing
  • Comparative valuations
    • Turing tests
    • Verification
    • Inter-comparison of models
  • Modeling catalog completeness
  • Modeling epistemic uncertainties: Important, but IT challenges.
  • Fault/cell participation
  • [Ned] Testing usefulness!
  • Rupture association problem
  • Fault characteristics
  • UCERF3-TD elastic-rebound testing
  • [Peter] Declustering work could help with the USGS. Lots of ambiguity on different declustering models.
  • [Phil] CSEP should expose its methods so users can leverage the algorithms
  • [Mike B.] Valuation question most important. Should physics-based simulations be used for hazard?
  • [Kevin] Errors need to be propagated moving forward!
  • [Phil] Need to figure out how the data will be provided to the scientist.
  • [All] Web based interface would be valuable.
Notes about SRL special issue
  • Targeting publication for July/August issue of SRL. Have some time to develop the webpage.
  • Expect press material surrounding the SRL release, so need to be prepared with digestible figures.