CyberShake Study Database Archiving
From SCECpedia
Revision as of 02:37, 6 October 2017
The following procedure should be used to archive a study from either the production or the data access database to disk, to free up room in the database.
First, dump the old database contents into a new directory:
- Determine the Study ID you want to archive by looking at the Studies table in the DB:
    select * from Studies;
- Dump the PeakAmplitudes:
    mysqldump --lock-all-tables --where 'Run_ID in (select R.Run_ID from CyberShake_Runs R where R.Study_ID=<study id to archive>)' CyberShake PeakAmplitudes > peak_amps.sql
- Dump Hazard_Datasets:
    mysqldump --lock-all-tables --where 'Study_ID=<study id to archive>' CyberShake Hazard_Datasets > hazard_datasets.sql
- Dump Hazard_Curves:
    mysqldump --lock-all-tables --where 'Hazard_Dataset_ID in (select D.Hazard_Dataset_ID from Hazard_Datasets D where D.Study_ID=<study id to archive>)' CyberShake Hazard_Curves > hazard_curves.sql
- Dump Hazard_Curve_Points. Note that the subquery must join Hazard_Curves to Hazard_Datasets, or it selects every curve in the database:
    mysqldump --lock-all-tables --where 'Hazard_Curve_ID in (select C.Hazard_Curve_ID from Hazard_Curves C, Hazard_Datasets D where C.Hazard_Dataset_ID=D.Hazard_Dataset_ID and D.Study_ID=<study id to archive>)' CyberShake Hazard_Curve_Points > hazard_curve_points.sql
- Dump CyberShake_Runs:
    mysqldump --lock-all-tables --where 'Study_ID=<study id to archive>' CyberShake CyberShake_Runs > runs.sql
- Dump all of the input tables -- they are small enough that we don't mind capturing data which wasn't directly used in this study:
    mysqldump CyberShake AR_Hazard_Curve_Points AR_Hazard_Curves AR_Hazard_Datasets Atten_Rel_Metadata Atten_Rels CyberShake_Site_Regions CyberShake_Site_Ruptures CyberShake_Site_Types CyberShake_Sites ERF_IDs ERF_Metadata ERF_Probability_Models IM_Types Points Rupture_Variation_Probability_Modifier Rupture_Variation_Scenario_IDs Rupture_Variation_Scenario_Metadata Rupture_Variations Ruptures SGT_Variation_IDs SGT_Variation_Metadata Studies Time_Spans Velocity_Models > input_tables.sql
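The dump sequence above can be sketched as a small helper that prints the six mysqldump commands for a chosen study ID, so the full sequence can be reviewed before running it. The helper name and the print-then-pipe approach are our own sketch, not part of the standard procedure; table and file names are from the steps above.

```shell
# Sketch: print the six mysqldump commands for a given study ID.
# Review the output, then pipe it to sh to execute.
print_dump_cmds() {
  sid=$1
  cat <<EOF
mysqldump --lock-all-tables --where 'Run_ID in (select R.Run_ID from CyberShake_Runs R where R.Study_ID=$sid)' CyberShake PeakAmplitudes > peak_amps.sql
mysqldump --lock-all-tables --where 'Study_ID=$sid' CyberShake Hazard_Datasets > hazard_datasets.sql
mysqldump --lock-all-tables --where 'Hazard_Dataset_ID in (select D.Hazard_Dataset_ID from Hazard_Datasets D where D.Study_ID=$sid)' CyberShake Hazard_Curves > hazard_curves.sql
mysqldump --lock-all-tables --where 'Hazard_Curve_ID in (select C.Hazard_Curve_ID from Hazard_Curves C, Hazard_Datasets D where C.Hazard_Dataset_ID=D.Hazard_Dataset_ID and D.Study_ID=$sid)' CyberShake Hazard_Curve_Points > hazard_curve_points.sql
mysqldump --lock-all-tables --where 'Study_ID=$sid' CyberShake CyberShake_Runs > runs.sql
mysqldump CyberShake AR_Hazard_Curve_Points AR_Hazard_Curves AR_Hazard_Datasets Atten_Rel_Metadata Atten_Rels CyberShake_Site_Regions CyberShake_Site_Ruptures CyberShake_Site_Types CyberShake_Sites ERF_IDs ERF_Metadata ERF_Probability_Models IM_Types Points Rupture_Variation_Probability_Modifier Rupture_Variation_Scenario_IDs Rupture_Variation_Scenario_Metadata Rupture_Variations Ruptures SGT_Variation_IDs SGT_Variation_Metadata Studies Time_Spans Velocity_Models > input_tables.sql
EOF
}

print_dump_cmds 5     # example study ID; use the ID from the Studies table
```

To execute instead of review, pipe the output to a shell, e.g. `print_dump_cmds 5 | sh`.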
Next, convert the SQL dumps into SQLite format using mysql2sqlite:
- For each of the 6 dump files, run:
    mysql2sqlite <SQL dump file> > <SQLite dump file>
  For example: ./mysql2sqlite peak_amps.sql > peak_amps.sqlite
  For large tables this may take an hour or two.
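All six conversions can be done in one loop rather than one at a time; this sketch assumes the mysql2sqlite script sits in the current directory, as in the example above.

```shell
# Convert every MySQL dump in the current directory to SQLite format.
# ${dump%.sql} strips the .sql suffix: peak_amps.sql -> peak_amps.sqlite.
if [ -x ./mysql2sqlite ]; then
  for dump in *.sql; do
    [ -e "$dump" ] || continue          # no .sql files present: nothing to do
    ./mysql2sqlite "$dump" > "${dump%.sql}.sqlite"
  done
fi
```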
Create a SQLite database and import the tables. Use sqlite3 3.7.11 or later; earlier versions will produce an error when reading the dump files.
- Create a database for this study:
    sqlite3 <study name>
- For each dump file, run the following command:
    .read <path/to/dump/file.sqlite>
Run a few queries against both the original database and the new SQLite database to check that the counts match.
- Check the number of PeakAmplitudes:
    select count(*) from PeakAmplitudes;
- Check the number of rupture variations for each site:
    select count(*) from CyberShake_Runs R, CyberShake_Site_Ruptures SR, Rupture_Variations V where R.Site_ID=SR.CS_Site_ID and SR.Source_ID=V.Source_ID and SR.Rupture_ID=V.Rupture_ID and R.ERF_ID=V.ERF_ID and R.ERF_ID=SR.ERF_ID and R.Study_ID=<study id to archive> and V.Rup_Var_Scenario_ID=R.Rup_Var_Scenario_ID;
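The import and count-check steps can be rehearsed end to end. The snippet below is a self-contained demonstration using a tiny stand-in dump: the column names and the database file name are invented for illustration, and real dumps come from mysql2sqlite.

```shell
# Demonstration of the import-and-verify steps on a throwaway database.
# Table columns here are illustrative stand-ins, not the real schema.
command -v sqlite3 >/dev/null || exit 0       # skip if sqlite3 is unavailable
cd "$(mktemp -d)"
cat > peak_amps.sqlite <<'EOF'
CREATE TABLE PeakAmplitudes (Run_ID INTEGER, IM_Type_ID INTEGER, IM_Value REAL);
INSERT INTO PeakAmplitudes VALUES (1234, 21, 0.31);
INSERT INTO PeakAmplitudes VALUES (1234, 21, 0.27);
EOF
sqlite3 example_study.db ".read peak_amps.sqlite"                  # import the dump
sqlite3 example_study.db "select count(*) from PeakAmplitudes;"    # prints 2
```

The same count query run against the MySQL source should return the same number before the MySQL copy is dropped.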
Move the files to the archive location on scec-04.
- Tar up the sqlite files:
    tar czvf <study_name>.tgz *.sqlite
- Copy the tgz file to the study sqlite archive location on scec-04:
    cp <study_name>.tgz /home/scec-04/tera3d/CyberShake/sqlite_studies/sqlite_dumps
- Extract the database to /home/scec-04/tera3d/CyberShake/sqlite_studies:
    tar xzvf sqlite_dumps/<study_name>.tgz <study_name>.sqlite
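The tar/copy/extract sequence can be rehearsed on throwaway files before touching the archive. In this sketch the study name is a placeholder and a temp directory stands in for the scec-04 paths.

```shell
# Rehearse the archive steps in a temp directory.
cd "$(mktemp -d)"
mkdir sqlite_dumps                    # stands in for .../sqlite_studies/sqlite_dumps
touch example_study.sqlite            # stands in for the real sqlite database file
tar czf example_study.tgz *.sqlite    # tar up the sqlite files
cp example_study.tgz sqlite_dumps/    # copy to the archive location
rm example_study.sqlite
tar xzf sqlite_dumps/example_study.tgz example_study.sqlite   # extract the database
```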