2016 CyberShake database migration

Latest revision as of 22:45, 23 January 2018

Overview of CyberShake data products

CyberShake is a multi-layered seismic hazard model. The CyberShake system is designed with two primary interfaces to external programs, both of which operate through a MySQL database. The MySQL database schema is maintained in the CyberShake SVN repository, and a recent version is posted online. CyberShake has an input interface for UCERF rupture forecasts. OpenSHA uses this interface to populate the CyberShake database with fault models and rupture probabilities. Internal CyberShake programs populate additional DB tables with information about sites, velocity models, and other hazard-model-specific information. The CyberShake system then runs on HPC systems, with much of the processing conducted using scientific workflow tools.

The end point of CyberShake workflow processing is a set of output database tables on the CyberShake production database server. These tables represent the CyberShake output interface, which provides study information, site information, hazard curves, and peak amplitude information. Associated output data not currently in the database includes SRF files and seismogram files. These output external data are accessed through file-based lookups, often depending on metadata encoded in the output file names.
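As a sketch of what a lookup against the output interface might look like, the following uses Python's built-in sqlite3 module with a simplified PeakAmplitudes table. Run_ID and IM_Type_ID are column names mentioned later on this page; IM_Value and the sample rows are illustrative assumptions, not the real schema.

```python
import sqlite3

# Minimal sketch of the CyberShake output interface, using a simplified
# PeakAmplitudes table. Run_ID and IM_Type_ID appear elsewhere on this page;
# IM_Value is a hypothetical stand-in for the amplitude column.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("""CREATE TABLE PeakAmplitudes (
    Run_ID     INTEGER,
    IM_Type_ID INTEGER,
    IM_Value   REAL
)""")
cur.executemany(
    "INSERT INTO PeakAmplitudes VALUES (?, ?, ?)",
    [(1000, 21, 0.35), (1000, 21, 0.41), (1001, 21, 0.28)],  # made-up rows
)

# The kind of count query that the (Run_ID, IM_Type_ID) index proposed in
# the migration plan is intended to speed up:
cur.execute(
    "SELECT COUNT(*) FROM PeakAmplitudes WHERE Run_ID = ? AND IM_Type_ID = ?",
    (1000, 21),
)
print(cur.fetchone()[0])  # number of amplitude rows for run 1000, IM type 21
```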

To clarify terminology:

  • "Input data": Rupture data, ERF-related data, sites data. This data is shared between studies.
  • "Run data": What parameters are used with each run, timestamps, systems, study membership. A run is only part of a single study.
  • "Output DB data": Peak amplitudes data, points on hazard curves, frequencies for which hazard curves are calculated.
  • "Output External data": SRF files, seismogram files.

Current CyberShake Database server hardware and software

  • Production CyberShake DB version (focal): Server version: 5.1.73 Source distribution
  • Production CyberShake Computer Hardware Specs: 128 GB RAM, 24 cores
  • Read Only CyberShake DB version (moment): Server version: 10.0.23-MariaDB MariaDB Server
  • Read only Computer Hardware Spec: 48 GB RAM, 24 cores

Goals of DB Migration

  • Improve performance of production CyberShake runs. Indications are that database write performance is slowing down our CyberShake production runs.
  • Separate production data from completed studies to reduce the possibility that new production runs will affect completed studies.
  • Provide improved read performance for users of CyberShake external output interface, such as programs that plot CyberShake hazard maps, hazard curves, and peak amplitudes.
  • Build CyberShake data access mechanisms and infrastructure that will support the planned UGMS CyberShake MCER web tools.

Status of DB resources following migration

  • Production CyberShake database server software upgraded to the current version of MariaDB (10.1.16) on FC 24.
  • Production CyberShake database server running on upgraded computer hardware.
  • Read-only MySQL database server running on adequate computer hardware and storage; upgraded to MariaDB 10.1.16.
  • CyberShake production server contains 1 database with all input data, the runs and output data for Study 15.12, and the runs and output data for runs which are not associated with any study.
  • CyberShake read-only database server contains 2 databases: one with Study 15.4 data and one with Study 15.12 data.
  • 5 SQLite v3+ databases, each containing all the input data, plus the output data and runs data for one of the pre-Study 15.4 CyberShake studies.

Detailed Procedure for CyberShake DB Migration

  1. Back up the read-only and production databases.
  2. Upgrade CyberShake Production DB server OS (FC 24) and MariaDB (10.1.16) software.
  3. Reconfigure DB on CyberShake Production DB Server:
    • innodb_file_per_table = 1
    • innodb_buffer_pool_size = 112G
    • innodb_buffer_pool_instances = 8
    • query_cache_type = 0
    • tmp_table_size = 32M
    • max_heap_table_size = 32M
    • innodb_log_file_size = 2G
  4. Check current size of production CyberShake DB (focal).
  5. Check storage space on the CyberShake production server to confirm that the file system has room for the dump files.
  6. Run mysqldump on CyberShake DB on production server using (row, buffering) options. Use mysqldump default output format.
    1. The proposed mysqldump commands will use the mysqldump 'where' syntax for selecting and dumping the data that is associated with a study.
    2. Although we are dumping the entire CyberShake DB, we will generate individual dumpfiles for
      1. all the input data,
      2. each study's output and runs data, and
      3. the runs and output data which is not part of any study.
    3. Create dump files on a USB3 drive for easier migration.
  7. Load Study 15.12 and non-study data into CyberShake Production Server using the InnoDB engine.
  8. Add a new index on (Run_ID, IM_Type_ID) to the PeakAmplitudes table on the production server, to speed up count queries.
  9. Confirm the reload on the production server has the correct number of rows in each table, as compared to the focal DB.
  10. Confirm the reload into the CyberShake Production Server was successful by running workflows from Blue Waters and populating the production server with new hazard curves.
  11. Convert older study runs, output data, and all input data from MySQL dump file into SQLite format.
    1. Conversion done with SQLite insert scripts at https://github.com/dumblob/mysql2sqlite.
    2. MySQL date/time fields will be converted to integer fields.
  12. For each study, load all the input data, the runs data for that study, and the output data for that study into a DB.
  13. Confirm the reloads into SQLite format were successful:
    1. Check the row count on all tables.
    2. Ask Kevin to create a hazard curve using both the focal DB and the sqlite DB and compare results.
  14. Drop database on focal.
  15. Update focal's software and configuration to be read-only database server, using the InnoDB engine. We originally planned to use the MyISAM engine because the read performance can be better, but Oracle is actually phasing out MyISAM and InnoDB performance continues to improve, so we will use InnoDB.
  16. Load input data, Study 15.12 runs+output data, Study 15.4 runs+output data, AR* tables, and curves for all runs onto focal for read-only access.
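The per-study dump commands proposed in step 6 could be generated along the lines of the sketch below. The --where, --single-transaction, and --result-file flags are standard mysqldump options, but the database name, table name, and Study_ID values here are hypothetical, not the commands actually used.

```python
# Sketch of generating the per-study mysqldump commands from step 6.
# All names and Study_ID values below are illustrative assumptions.
studies = {"study_15_4": 4, "study_15_12": 7}  # hypothetical Study_ID values

def dump_command(dbname, table, study_id, outfile):
    """Build a mysqldump argv that dumps only the rows tied to one study."""
    return [
        "mysqldump", "--single-transaction", dbname, table,
        "--where", f"Study_ID = {study_id}",   # mysqldump's row-filter option
        f"--result-file={outfile}",
    ]

for name, study_id in studies.items():
    cmd = dump_command("CyberShake", "CyberShake_Runs", study_id,
                       f"{name}_runs.sql")
    print(" ".join(cmd))
```

In practice each argv would be handed to subprocess.run() (or written to a shell script on the USB3 drive mentioned in step 6.3) rather than just printed.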

Since the input data is much smaller (~100x) than the output data, we will keep a full copy of it with each study. It would be much more time-intensive to identify which subset of input data applies just to one study, and the extra space needed to keep it all is trivial. However, for each study, we will only keep the runs data for runs which are associated with that study.
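The packaging rule above (full input data in every study database, but only that study's runs) can be sketched with Python's built-in sqlite3 module. The simplified Ruptures/Runs tables and Study_ID values are illustrative assumptions, not the real CyberShake schema.

```python
import sqlite3

# Sketch of the per-study packaging rule: copy ALL input data into each
# study DB, but only that study's rows of the runs table.
# Tables, columns, and Study_ID values are simplified assumptions.
src = sqlite3.connect(":memory:")
src.executescript("""
CREATE TABLE Ruptures (Rupture_ID INTEGER, Mag REAL);   -- input data (shared)
CREATE TABLE Runs (Run_ID INTEGER, Study_ID INTEGER);   -- run data (per study)
INSERT INTO Ruptures VALUES (1, 6.5), (2, 7.1);
INSERT INTO Runs VALUES (100, 4), (101, 4), (200, 7);
""")

def build_study_db(source, study_id):
    """Create a study DB with all input data but only this study's runs."""
    study = sqlite3.connect(":memory:")
    study.executescript("""
    CREATE TABLE Ruptures (Rupture_ID INTEGER, Mag REAL);
    CREATE TABLE Runs (Run_ID INTEGER, Study_ID INTEGER);
    """)
    # Full copy of the (small) input data:
    study.executemany("INSERT INTO Ruptures VALUES (?, ?)",
                      source.execute("SELECT * FROM Ruptures"))
    # Only the runs belonging to this study:
    study.executemany("INSERT INTO Runs VALUES (?, ?)",
                      source.execute("SELECT * FROM Runs WHERE Study_ID = ?",
                                     (study_id,)))
    study.commit()
    return study

db = build_study_db(src, 4)
print(db.execute("SELECT COUNT(*) FROM Ruptures").fetchone()[0])  # all input rows
print(db.execute("SELECT COUNT(*) FROM Runs").fetchone()[0])      # study-4 runs only
```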