Difference between revisions of "2016 CyberShake database migration"

Revision as of 19:49, 14 July 2016

Overview of CyberShake data products

CyberShake is a multi-layered seismic hazard model. The CyberShake system is designed with two primary interfaces to external programs, both of which operate through a MySQL database. The MySQL database schema is maintained in the CyberShake SVN repository. A recent version is posted. CyberShake input interfaceAt this point, OpenSHA p

To clarify terminology:

"Input data": Rupture data, ERF-related data, sites data. This data is shared between studies.
"Run data": What parameters are used with each run, timestamps, systems, study membership. A run is only part of a single study.
"Output data": Peak amplitudes data

Goals of DB Migration

Improve performance of production CyberShake runs by improving write performance of the CyberShake database. Our first-order attempt to improve write performance will be to separate production data from completed studies.
Provide improved read performance for users of CyberShake databases.
Build CyberShake data access mechanisms and infrastructure that will support the planned UGMS CyberShake MCER web site.

Status of DB resources following migration

Swapped hardware between moment and focal
On read-only server, 2 databases: 1 with Study 15.4, and 1 with Study 15.12 data.
On production server, 1 database with all input data, the runs and output data for Study 15.12 and 15.4, and the runs and output data for runs which are not associated with any study.
After the above is complete, migrate older studies to alternative format and delete from production server.

Detailed Procedure for CyberShake DB Migration

Run mysqldump on entire DB on focal. Generate dumpfiles for all the input data, each study's output and runs data, and the runs and output data which is not part of any study.
Delete database on moment.
Reconfigure DB on moment (single file per table, etc.)
Load Study 15.12, 15.4, non-study data into DB on moment using the InnoDB engine.
Confirm the reload into moment was successful.
Convert older study runs, output data, and all input data from MySQL dump file into SQLite format. Create a different DB for each study.
Confirm the reloads into SQLite format were successful.
Delete database on focal.
Load input data, Study 15.12 runs+output data, and Study 15.4 runs+output data onto focal for read-only access, using the MyISAM engine. Each study is in a separate database.
Swap names of focal and moment so we don't have to change all our scripts.
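
The dump step above can be sketched as follows. This is a minimal illustration, not the script actually used: the table names, the `Run_ID` column, the database name, and the dump file names are all assumptions, and the real names live in the CyberShake schema. The sketch only builds `mysqldump` argument lists (one for the input data, one per study, one for non-study runs) and does not execute them.

```python
# Hypothetical table groupings; substitute the real names from the CyberShake schema.
INPUT_TABLES = ["Ruptures", "ERF_IDs", "CyberShake_Sites"]
RUN_TABLE = "CyberShake_Runs"
OUTPUT_TABLE = "PeakAmplitudes"


def dump_command(db, tables, outfile, where=None):
    """Build (but do not run) a mysqldump argument list for the given tables."""
    cmd = ["mysqldump", f"--result-file={outfile}"]
    if where:
        # --where applies the same row filter to every table in this dump.
        cmd.append(f"--where={where}")
    cmd.append(db)
    cmd.extend(tables)
    return cmd


def id_filter(column, ids):
    """Render an SQL condition selecting rows whose column value is in ids."""
    if not ids:
        raise ValueError("at least one id is required")
    id_list = ",".join(str(int(i)) for i in sorted(ids))
    return f"{column} IN ({id_list})"


def plan_dumps(db, runs_by_group):
    """Return {dumpfile: argv}: one input dump plus one runs+output dump per group.

    runs_by_group maps a group label (a study, or the non-study runs) to its
    run IDs. Both the run table and the output table are assumed to carry a
    Run_ID column, so a single filter selects the matching rows in each.
    """
    plan = {"input.sql": dump_command(db, INPUT_TABLES, "input.sql")}
    for group, run_ids in runs_by_group.items():
        outfile = f"{group}.sql"
        plan[outfile] = dump_command(
            db, [RUN_TABLE, OUTPUT_TABLE], outfile,
            where=id_filter("Run_ID", run_ids),
        )
    return plan
```

Each argument list could then be passed to `subprocess.run` once the table names and run IDs have been checked against the live database.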

Since the input data is much smaller (~100x) than the output data, we will keep a full copy of it with each study. Identifying which subset of the input data applies to a given study would be much more time-intensive, and the extra space needed to keep it all is trivial. However, for each study, we will only keep the runs data for runs which are associated with that study.
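
The per-study layout described above can be illustrated with a small SQLite sketch. The three-table schema and its column names are simplified assumptions, not the real CyberShake schema, and an in-memory SQLite database stands in for the MySQL source. The sketch copies every input row but only the runs and peak amplitudes belonging to one study, then uses row counts as a basic check that the load succeeded.

```python
import sqlite3

# Simplified, hypothetical schema: one input table, one run table, one output table.
SCHEMA = """
    CREATE TABLE sites (site_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE runs (run_id INTEGER PRIMARY KEY, site_id INTEGER, study_id INTEGER);
    CREATE TABLE peak_amplitudes (run_id INTEGER, value REAL);
"""


def make_source():
    """Build a tiny stand-in for the production database (illustrative data only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO sites VALUES (?, ?)",
                     [(1, "A"), (2, "B"), (3, "C")])
    # Runs 1 and 2 belong to study 4, run 3 to study 12, run 4 to no study.
    conn.executemany("INSERT INTO runs VALUES (?, ?, ?)",
                     [(1, 1, 4), (2, 2, 4), (3, 1, 12), (4, 3, None)])
    conn.executemany("INSERT INTO peak_amplitudes VALUES (?, ?)",
                     [(1, 0.1), (1, 0.2), (2, 0.3), (3, 0.4), (4, 0.5)])
    conn.commit()
    return conn


def build_study_db(source, study_id):
    """Copy all input rows, but only this study's runs and output, into a new DB."""
    target = sqlite3.connect(":memory:")
    target.executescript(SCHEMA)
    # Input data: full copy, no per-study filtering.
    target.executemany("INSERT INTO sites VALUES (?, ?)",
                       source.execute("SELECT site_id, name FROM sites"))
    # Run data: only runs that are members of this study.
    target.executemany(
        "INSERT INTO runs VALUES (?, ?, ?)",
        source.execute(
            "SELECT run_id, site_id, study_id FROM runs WHERE study_id = ?",
            (study_id,)))
    # Output data: only amplitudes produced by those runs.
    target.executemany(
        "INSERT INTO peak_amplitudes VALUES (?, ?)",
        source.execute(
            "SELECT p.run_id, p.value FROM peak_amplitudes p "
            "JOIN runs r ON r.run_id = p.run_id WHERE r.study_id = ?",
            (study_id,)))
    target.commit()
    return target


def count(conn, table):
    """Row count for a table; the table name must come from trusted code."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
```

Comparing `count` results between the source (filtered by study) and the new database is one simple way to confirm a reload. A stricter check, such as per-table checksums, would be a reasonable extension.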

@@ Line 1: / Line 1: @@
+== Overview of CyberShake data products ==
+CyberShake is a multi-layered seismic hazard model. The CyberShake system is designed with two primary interfaces to external programs, both of which operate through a MySQL database. The MySQL database schema is maintained in the CyberShake SVN repository. A recent version is posted. CyberShake input interfaceAt this point, OpenSHA p
 To clarify terminology:
+*"Input data": Rupture data, ERF-related data, sites data.  This data is shared between studies.
-"Input data": Rupture data, ERF-related data, sites data.  This data is shared between studies.
+*"Run data": What parameters are used with each run, timestamps, systems, study membership.  A run is only part of a single study.
+*"Output data": Peak amplitudes data
-"Run data": What parameters are used with each run, timestamps, systems, study membership.  A run is only part of a single study.
-"Output data": Peak amplitudes data
 === Goals of DB Migration ===
+*Improve performance of production CyberShake runs by improving write performance of the CyberShake database. Our first-order attempt to improve write performance will be to separate production data from completed studies.
-*Provide improved read performance for users of CyberShake data
+*Provide improved read performance for users of CyberShake databases.
-*Separate production data from data from completed studies
+*Build CyberShake data access mechanisms and infrastructure that will support planned UGMS CyberShake MCER web site
-*Permit easy extension to support UGMS web site
 === Status of DB resources following migration ===
 *Swapped hardware between moment and focal
 *On read-only server, 2 databases: 1 with Study 15.4, and 1 with Study 15.12 data.
