CSEP2 Storing Stochastic Event Sets

From SCECpedia

Revision as of 01:18, 26 October 2018

Introduction

This page documents the work on handling stochastic event sets within CSEP2.

To maintain interoperability between models and the international testing centers, we must adopt a standard format for stochastic event sets that contains the necessary and sufficient information to perform CSEP evaluations of the forecasts.

The most straightforward approach would be to continue with a catalog format supported by CSEP1. The likely candidate for the CSEP2 catalog format is the so-called ZMAP format, which consists of the following ten columns:

  1. Longitude [deg]
  2. Latitude [deg]
  3. Decimal year (e.g., 2005.5 for July 1st, 2005)
  4. Month
  5. Day
  6. Magnitude
  7. Depth [km]
  8. Hour
  9. Minute
  10. Second
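
As a rough illustration of the column layout above, the sketch below reads a ZMAP catalog into a pandas DataFrame. It assumes the catalog is stored as whitespace-delimited ASCII text with one event per line. The column names, the `read_zmap` helper, and the two sample events are hypothetical and chosen only for this example; the ZMAP format itself fixes only the column order.

```python
import io

import pandas as pd

# Illustrative column names (an assumption); ZMAP only fixes the order.
ZMAP_COLUMNS = [
    "longitude", "latitude", "decimal_year", "month", "day",
    "magnitude", "depth", "hour", "minute", "second",
]


def read_zmap(source):
    """Read a whitespace-delimited ZMAP catalog into a DataFrame.

    `source` may be a file path or any file-like object.
    """
    return pd.read_csv(source, sep=r"\s+", header=None, names=ZMAP_COLUMNS)


# Two made-up events, used only to exercise the reader.
sample = io.StringIO(
    "-117.50 34.20 2005.5000 7 1 4.5 10.0 12 30 15.0\n"
    "-116.80 33.90 2005.5027 7 2 3.2 5.5 3 45 2.5\n"
)
catalog = read_zmap(sample)
print(catalog[["longitude", "latitude", "magnitude", "depth"]])
```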

File Formats

These fields can be stored on disk in a number of different file formats. However, we will focus on a binary representation that aims to reduce the total file size.
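
One possible binary layout, shown here only as a sketch, is a fixed-size record per event written with NumPy. The `EVENT_DTYPE` record definition, the choice of field widths, and the file name are assumptions made for illustration and are not a format adopted by CSEP2.

```python
import os
import tempfile

import numpy as np

# Hypothetical packed record layout (an assumption, not a CSEP2 standard).
# Calendar fields fit in one unsigned byte each; the decimal year is kept
# as a 64-bit float because 32 bits cannot resolve sub-hour differences.
EVENT_DTYPE = np.dtype([
    ("longitude", "<f4"), ("latitude", "<f4"), ("decimal_year", "<f8"),
    ("month", "u1"), ("day", "u1"), ("magnitude", "<f4"), ("depth", "<f4"),
    ("hour", "u1"), ("minute", "u1"), ("second", "<f4"),
])

# Two made-up events in ZMAP column order.
events = np.array(
    [
        (-117.5, 34.2, 2005.5, 7, 1, 4.5, 10.0, 12, 30, 15.0),
        (-116.8, 33.9, 2005.5027, 7, 2, 3.2, 5.5, 3, 45, 2.5),
    ],
    dtype=EVENT_DTYPE,
)

# Write the raw records to disk and read them back.
path = os.path.join(tempfile.mkdtemp(), "catalog.bin")
events.tofile(path)
restored = np.fromfile(path, dtype=EVENT_DTYPE)

print("bytes per event:", EVENT_DTYPE.itemsize)
print("file size in bytes:", os.path.getsize(path))
```

A raw file written this way carries no header, so the reader must already know the record layout. Self-describing containers trade a little extra size for that metadata.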

Internally, a stochastic event set will be represented as a collection of pandas DataFrames. Each column of a DataFrame will hold one of the 10 fields of the ZMAP format.

If the machines have sufficient memory, the data frames could be merged into a single large data structure that supports SQL-like queries. Note: pandas DataFrames can also read from and write to an SQL database directly, which leaves open the possibility of storing simulation results in a database in the future.
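
The merge and query steps described above might look like the following sketch. It assumes that each DataFrame in the collection holds one simulated catalog. The `catalog_id` and `event_id` labels, the `events` table name, and the sample values are hypothetical, and only three of the ten ZMAP columns are included to keep the example short. An in-memory SQLite database stands in for whatever SQL backend might be used later.

```python
import sqlite3

import pandas as pd

# Two tiny made-up catalogs standing in for simulated catalogs.
cat_a = pd.DataFrame({
    "longitude": [-117.5, -116.8],
    "latitude": [34.2, 33.9],
    "magnitude": [4.5, 3.2],
})
cat_b = pd.DataFrame({
    "longitude": [-118.1],
    "latitude": [34.0],
    "magnitude": [5.1],
})

# Merge into one structure; `keys` adds an outer index level that
# records which catalog each event came from.
event_set = pd.concat(
    [cat_a, cat_b], keys=[0, 1], names=["catalog_id", "event_id"]
)

# SQL-like query done in memory: events with magnitude >= 4.0,
# then a count of those events per catalog.
large = event_set.query("magnitude >= 4.0")
counts = large.groupby(level="catalog_id").size()
print(counts)

# The same data pushed into an SQL database and queried there.
conn = sqlite3.connect(":memory:")
event_set.reset_index().to_sql("events", conn, index=False)
from_db = pd.read_sql_query(
    "SELECT catalog_id, COUNT(*) AS n FROM events "
    "WHERE magnitude >= 4.0 GROUP BY catalog_id ORDER BY catalog_id",
    conn,
)
conn.close()
print(from_db)
```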