Difference between revisions of "GetFile"

From SCECpedia
(Create GetFile entry)
 
(Rewrite GetFile entry in terms of OpenSHA problem)
 
== What is GetFile? ==
GetFile is a framework for keeping versioned files up to date. It can download the latest changes, validate downloads, and roll back to earlier versions.
This general-purpose file updating tool is designed to be integrated into Java applications.
 
GetFile will be actively maintained with bug fixes and API-compatible changes. Because GetFile can update itself, new releases are deployed seamlessly.
 
GetFile is currently code-complete and has decent test coverage. I plan to continue writing thorough tests and developing new features.
 
  
== Why did we build this? ==
GetFile was created to enable us to host hazard models separately from version control.
Models were previously stored directly with the OpenSHA code on GitHub, but GitHub's 100 MB file size limit is making this unsustainable.
 
In addition to enabling the hosting of larger files, GetFile allows us to keep client applications up to date with the latest scientific models.
 
  
== How will it be used? ==
OpenSHA does not yet have GetFile integration, but it will soon use GetFile to retrieve the UCERF3 and NSHM23 models.
GetFile may also find use in other projects across SCEC, like UCVM.
 
  
 
== Docs and Code ==
 
The source code and detailed usage and setup documentation: https://github.com/abhatthal/getfile
 
Demo applications using the GetFile lib: https://github.com/abhatthal/getfile-demo

Latest revision as of 21:51, 21 November 2024

OpenSHA Problem

OpenSHA loads fault section data and fault system rupture sets to compute earthquake rupture forecasts. The geospatial data for several models is stored directly with the OpenSHA code that operates on it. Models for seismic hazard analysis under the OpenSHA framework are becoming progressively larger. Unfortunately, GitHub limits files to 100 MB, which is too small for the new 2023 US National Seismic Hazard Model (NSHM23).

Current Solution

Smaller models can continue to be hosted on GitHub with the OpenSHA code, but UCERF3 has been moved to a server on the USC campus. Currently, OpenSHA downloads the model from the ASB "cheesegrater" server. This solution is not scalable, and it can leave a client with a partial or otherwise corrupted download of the UCERF3 model. These older servers are going to be decommissioned soon, and we need to transition to a better long-term solution.

Proposed Solution

GetFile is a more robust solution for hosting hazard models for use in OpenSHA. It will be used to download and validate the UCERF3 and NSHM23 models. It may also see use with other models and in other SCEC projects that need to download and validate files, such as UCVM. Scientific models can be stored on USC CARC and downloaded via the GetFile framework. GetFile provides data validation, rollback to older model snapshots, and automatic updates of the GetFile framework itself, so that new features and bug fixes are deployed seamlessly.
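
The validate-then-install and rollback ideas described above can be illustrated with a minimal Java sketch. This is not GetFile's API: the class `ValidatedInstall`, its methods, the `.bak` and `.part` suffixes, and the choice of SHA-256 as the checksum are all assumptions made for illustration. See the GetFile repository for the actual interface.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper for illustration only; not part of GetFile. */
public class ValidatedInstall {

    /** Returns the lowercase hex SHA-256 digest of the given bytes. */
    static String sha256Hex(byte[] data) {
        try {
            byte[] hash = MessageDigest.getInstance("SHA-256").digest(data);
            StringBuilder hex = new StringBuilder();
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Sibling path used to keep the previous version (assumed naming). */
    static Path backupOf(Path target) {
        return target.resolveSibling(target.getFileName() + ".bak");
    }

    /**
     * Installs downloaded bytes at target only if their checksum matches.
     * On mismatch (partial or corrupted download) the target is left untouched.
     */
    static boolean install(Path target, byte[] downloaded, String expectedSha256) {
        if (!sha256Hex(downloaded).equalsIgnoreCase(expectedSha256)) {
            return false;
        }
        try {
            Files.createDirectories(target.toAbsolutePath().getParent());
            if (Files.exists(target)) {
                // Keep the current version so it can be restored later.
                Files.copy(target, backupOf(target), StandardCopyOption.REPLACE_EXISTING);
            }
            // Stage next to the target, then move into place.
            Path staged = target.resolveSibling(target.getFileName() + ".part");
            Files.write(staged, downloaded);
            Files.move(staged, target, StandardCopyOption.REPLACE_EXISTING);
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Restores the previous version if a backup exists. */
    static boolean rollback(Path target) {
        Path backup = backupOf(target);
        if (!Files.exists(backup)) {
            return false;
        }
        try {
            Files.move(backup, target, StandardCopyOption.REPLACE_EXISTING);
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads a file as UTF-8 text, wrapping I/O errors as unchecked. */
    static String read(Path path) {
        try {
            return new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch, a caller would fetch the file bytes and the expected checksum from the server, call `install`, and treat a `false` return as a failed download to retry; `rollback` restores the single most recent backup. A real deployment would likely stream large models to disk rather than hold them in memory.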

Docs and Code

The source code and detailed usage and setup documentation: https://github.com/abhatthal/getfile

Demo applications using the GetFile library: https://github.com/abhatthal/getfile-demo