2016 CyberShake database migration

To clarify terminology:

*"Input data": Rupture data, ERF-related data, and sites data.  This data is shared between studies.
*"Runs data": The parameters used with each run, timestamps, systems, and study membership.  A run is part of only a single study.
*"Output data": Peak amplitude data.
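
The mapping below is only a sketch of how these categories line up with database tables; the table names are examples and may not match the production schema exactly.

<pre>
# Illustrative grouping of the three data categories by table.
# Table names are examples only; the real schema may differ.
DATA_CATEGORIES = {
    "input":  ["CyberShake_Sites", "Ruptures", "Rupture_Variations"],  # shared between studies
    "runs":   ["CyberShake_Runs"],   # per-run parameters, timestamps, systems, study membership
    "output": ["PeakAmplitudes"],    # bulk peak amplitude results
}
</pre>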
 
=== Goals of DB Migration ===
 
*Provide improved read performance for users of CyberShake data
*Separate production data from data from completed studies
*Permit easy extension to support the UGMS web site
 
=== Status of DB resources following migration ===

*Swapped hardware between moment and focal
*On read-only server, multiple DBs with Study 15.4 and Study 15.12 data
*On production server, keep Study 15.12 and Study 15.4
*Migrate older studies to an alternative format, then delete them from the production server.
  
=== Detailed Procedure for CyberShake DB Migration ===
  
# Run mysqldump on the entire DB on focal.  Generate dumpfiles for all the input data, for each study's output and runs data, and for the runs and output data that are not part of any study (a rough dump-driver sketch follows this list).
# Delete the database on moment.
# Reconfigure the DB on moment (single file per table, etc.; a configuration check sketch follows this list).
# Load all files into the DB on moment using the InnoDB engine; confirm they are present.
# Delete the database on focal.
# Load the input data, Study 15.12, and Study 15.4 onto focal for read-only access, using the MyISAM engine (an engine-conversion sketch follows this list).
# Swap the names of focal and moment so we don't have to change all our scripts.
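
Below is a minimal sketch of the step 1 dump driver.  The hostname, account, database name, and table lists are placeholders, and the per-study runs/output dumps (which would need something like mysqldump's --where option, or a per-study table selection) are omitted.

<pre>
"""Sketch of step 1: one dumpfile per data category."""
import subprocess

DB_HOST = "focal"        # current production server, before the name swap
DB_NAME = "CyberShake"   # placeholder database name
DB_USER = "cybershk"     # placeholder account; mysqldump will prompt for a password

# Placeholder table lists, one dumpfile per category.
DUMPS = {
    "input_data.sql":  ["CyberShake_Sites", "Ruptures", "Rupture_Variations"],
    "runs_data.sql":   ["CyberShake_Runs"],
    "output_data.sql": ["PeakAmplitudes"],
}

for outfile, tables in DUMPS.items():
    cmd = (["mysqldump",
            "--host=" + DB_HOST,
            "--user=" + DB_USER,
            "-p",                      # prompt for the password
            "--single-transaction",    # consistent snapshot where the engine supports it
            "--result-file=" + outfile,
            DB_NAME] + tables)
    print("Running: " + " ".join(cmd))
    subprocess.check_call(cmd)
</pre>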
  
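"Single file per table" in step 3 presumably refers to InnoDB's innodb_file_per_table option.  A quick check that the setting took effect after the restart might look like the following; the hostname and account are placeholders.

<pre>
"""Sanity check for step 3: confirm innodb_file_per_table is ON on moment."""
import subprocess

# The relevant my.cnf lines on moment would be something like:
#   [mysqld]
#   innodb_file_per_table = 1

out = subprocess.check_output(
    ["mysql", "--host=moment", "--user=cybershk", "-p", "--batch",
     "--execute=SHOW VARIABLES LIKE 'innodb_file_per_table'"])
print(out.decode())
# Expect a row like: innodb_file_per_table  ON
</pre>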
 
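Steps 4 and 6 load the same dumps under different storage engines.  One possible approach is to load the unmodified dumps on moment (InnoDB) and stream-rewrite the ENGINE clause before loading the read-only copies on focal as MyISAM; the file names below are placeholders.

<pre>
"""One possible way to load a mysqldump file under a different storage engine."""
import re
import sys

ENGINE_RE = re.compile(r"ENGINE=\w+", re.IGNORECASE)

def convert_engine(in_path, out_path, new_engine="MyISAM"):
    # Rewrite ENGINE=<whatever> in CREATE TABLE statements to the new engine.
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            dst.write(ENGINE_RE.sub("ENGINE=" + new_engine, line))

if __name__ == "__main__":
    # e.g.: python convert_engine.py input_data.sql input_data_myisam.sql
    convert_engine(sys.argv[1], sys.argv[2])
</pre>

The converted file can then be loaded with the mysql client (e.g. mysql --host=focal --user=cybershk -p CyberShake < input_data_myisam.sql, with the hostname, account, and database name again placeholders).
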
Once this is done, we can finalize the alternative format we want.  We may want to test SQLite to see if it can handle a large study.
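
A rough sketch of the kind of SQLite test we might run: bulk-insert synthetic peak-amplitude-sized rows and time a representative query, scaling the row count up toward a study-sized table.  The schema and row counts are placeholders; a real test would load an actual study's output data.

<pre>
"""Toy SQLite scalability probe; schema and row counts are placeholders."""
import sqlite3
import time

conn = sqlite3.connect("study_test.sqlite")
conn.execute("""CREATE TABLE IF NOT EXISTS PeakAmplitudes (
                  Run_ID INTEGER, Source_ID INTEGER, Rupture_ID INTEGER,
                  Rup_Var_ID INTEGER, IM_Type TEXT, IM_Value REAL)""")

# Load synthetic rows in batches; scale the loop count up to see where
# (or whether) load and query times break down.
start = time.time()
batch = [(1, 0, 0, i, "SA_3sec", 0.1) for i in range(100000)]
for _ in range(10):   # 1 million rows in this toy run
    conn.executemany("INSERT INTO PeakAmplitudes VALUES (?,?,?,?,?,?)", batch)
conn.commit()
print("load time: %.1f s" % (time.time() - start))

# Time a representative read: all amplitudes for one run and one intensity measure.
start = time.time()
rows = conn.execute("SELECT IM_Value FROM PeakAmplitudes "
                    "WHERE Run_ID=? AND IM_Type=?", (1, "SA_3sec")).fetchall()
print("query time: %.2f s for %d rows" % (time.time() - start, len(rows)))
</pre>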
 
Since the runs data and input data are much smaller (roughly 100x) than the output data, we will keep a full copy of them along with the output data for each study.  It would be much more time-intensive to identify which subset of the runs and input data applies to just that study, and the extra space needed to keep all of it is trivial.
