Collaborative Research: Mining Seismic Wavefields
Investigators: Gregory C. Beroza (Stanford); Co-PIs: Philip J. Maechling (USC), Yehuda Ben-Zion (USC), Thomas H. Jordan (USC), Egill Hauksson (Caltech), Zhigang Peng (Georgia Tech)
Overview. A working group of the Southern California Earthquake Center (SCEC) proposes to develop and deploy cyberinfrastructure for mining seismic wavefields through data-intensive computing techniques to extend similarity search for earthquake detection to massive data sets. Similarity search has been used to understand the mechanics of tectonic tremor, transform our understanding of the depth-dependence of faulting, illuminate diffusion within aftershock seismicity, and reveal new insights into induced earthquakes. These results were achieved with modest data volumes – from ~10 seismic stations spanning ~10 km – yet they increased the number of detected earthquakes by a factor of 10 to 100. This geoinformatics project will develop the cyberinfrastructure required to enable high-sensitivity studies of earthquake processes through the discovery of previously undetected seismic events within massive data volumes.
Intellectual Merit. Standard practice treats seismic signals individually, and detects an earthquake when multiple impulsive arrivals consistent with a source within the Earth are associated across a network. This approach dates from an era when analog data, limited telecommunications, the high cost of sensors, and lack of computing prevented network-based detection. The premise of this proposal is that continuous and/or densely recorded data coupled with high-performance computing and improved scalable algorithms enable a network-based approach that will greatly enhance the detection of currently unreported weak and unusual events.
Earth structure is effectively constant over extended observational periods, such that proximal earthquake sources generate similar signals. Event detection based on similarity has led to many fundamental discoveries; however, most similarity-based detection methods require prior knowledge of the source waveform. Searching for signals with unknown signatures based on pair-wise or multiple matches has seen some success; however, naïve implementations scale quadratically with the duration of the record, such that problems of great interest – decades of data recorded on hundreds to thousands of channels – are beyond even the most capable computers. For dense networks, the availability of waveform data motivates alternative detection schemes, such as reverse-time imaging, but 4D imaging of continuous data also leads to computationally daunting problems.
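The quadratic scaling noted above can be made concrete with a short calculation. The sketch below counts the distinct window pairs that an exhaustive pair-wise search must compare on a single channel. The function name `window_pairs`, the one-second window spacing, and the record durations are illustrative assumptions, not parameters of the proposed work.

```python
def window_pairs(duration_s, lag_s):
    """Count distinct pairs of windows on one channel.

    duration_s: length of the continuous record, in seconds (assumed value).
    lag_s: spacing between successive window start times, in seconds (assumed value).
    """
    n_windows = int(duration_s // lag_s)
    # An exhaustive search compares every window with every other window once.
    return n_windows * (n_windows - 1) // 2


WEEK_S = 7 * 24 * 3600

one_week = window_pairs(WEEK_S, lag_s=1.0)
ten_weeks = window_pairs(10 * WEEK_S, lag_s=1.0)

# A record ten times longer needs roughly one hundred times as many comparisons.
print(one_week, ten_weeks, ten_weeks / one_week)
```

Under these assumptions the comparison count grows with the square of the record length, before any multiplication by the number of channels.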
We propose to use highly efficient techniques from data mining to implement scalable search for similar seismic signals. Multiple facets of mining massive waveform data are challenging. For spatially sparse recording, the technical challenges to be addressed by the proposed research are to improve similarity-preserving compression, to use the improved compressed representations to search efficiently for repeating signals over a network, and to improve post-processing of search output to isolate signals of seismological interest. For spatially dense recording, we propose to extend recently developed wavefield techniques to similarity across channels within dense networks and to enable similarity search across elastic wavefields in four dimensions.
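As an illustration of how similarity-preserving compression can replace exhaustive comparison, the following minimal sketch uses sign random projections, one common form of locality-sensitive hashing. It is an assumed stand-in chosen for brevity, not the specific method proposed here. The names `make_fingerprint`, `candidate_pairs`, and `demo`, the bit and table counts, and the synthetic wavelet are all hypothetical.

```python
import math
import random
from collections import defaultdict
from itertools import combinations


def make_fingerprint(window, projections):
    """Compress a waveform window to one sign bit per random projection."""
    return tuple(
        1 if sum(p * x for p, x in zip(proj, window)) >= 0.0 else 0
        for proj in projections
    )


def candidate_pairs(windows, n_bits=16, n_tables=8, seed=0):
    """Return index pairs of windows that share a fingerprint in any hash table."""
    rng = random.Random(seed)
    length = len(windows[0])
    pairs = set()
    for _ in range(n_tables):
        # Each table draws its own set of Gaussian random projection vectors.
        projections = [
            [rng.gauss(0.0, 1.0) for _ in range(length)] for _ in range(n_bits)
        ]
        buckets = defaultdict(list)
        for idx, window in enumerate(windows):
            buckets[make_fingerprint(window, projections)].append(idx)
        # Only windows that land in the same bucket are ever compared.
        for members in buckets.values():
            pairs.update(combinations(members, 2))
    return pairs


def demo():
    """Hide one repeated synthetic wavelet among noise windows and search for it."""
    rng = random.Random(1)
    wavelet = [math.sin(0.3 * k) * math.exp(-0.05 * k) for k in range(64)]
    windows = [[rng.gauss(0.0, 1.0) for _ in range(64)] for _ in range(10)]
    windows[3] = wavelet
    windows[7] = [0.5 * x for x in wavelet]  # same shape, half the amplitude
    return candidate_pairs(windows)


if __name__ == "__main__":
    print(sorted(demo()))
```

Because only the sign of each projection is kept, the fingerprint is insensitive to amplitude, so the two copies of the wavelet fall into the same bucket in every table. The cost of building the tables grows linearly with the number of windows, and candidate pairs would still need verification (for example by cross-correlation) in a post-processing step.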
Broader impacts. The proposed work has the potential for exceptionally broad impact on all disciplines that use earthquake data by providing vastly more detailed seismic catalogs. Seismicity induced by human activities is an emerging problem that adversely affects energy options for the 21st century, including shale gas development, enhanced geothermal energy, and carbon sequestration. A more complete view of seismicity related to these activities is essential to managing the risks they pose. The project will create exceptional training opportunities at four institutions for a postdoc and for graduate students, including several female students, from the geosciences and computational sciences. The complementary nature of the proposed approaches will allow us to compare their output and assess their performance. Data-intensive computing approaches have had limited impact in seismology. Cheap, capable sensor technology is poised to increase data rates dramatically, and earthquake seismology is under-prepared for this. The proposed work would change that by introducing state-of-the-art data-mining techniques to earthquake seismology.