
Weather Forecasting: Data Capacitor LEADs to Faster-Than-Real-Time Weather Forecasting 
Data Capacitor Principal Investigator: Craig A. Stewart 
Funded by National Science Foundation grant number CNS-0521433 
LEAD Principal Investigator: Beth Plale 
Funded by National Science Foundation grant number AGS-0331480 

Weather happens in real time. But thanks to scientists and researchers from two NSF-funded projects, forecasting the weather can now happen even faster. The Linked Environments for Atmospheric Discovery (LEAD) project is a multidisciplinary collaboration that has built and deployed cyberinfrastructure for users ranging from high school students learning about weather to seasoned researchers studying the dynamics of severe storms. LEAD users can search, assimilate, data-mine, and visualize real-time observational data streaming from sources including radars, satellites, mesonets, and weather balloons. Users can also run weather forecasting workflows that ingest these data and track storms faster than real time.

From early on, LEAD data movement struggled with reliability: the file transfer nodes were being overwhelmed, and the middleware did not scale to match. LEAD worked with technologists at Indiana University to incorporate a TeraGrid resource, the NSF-funded Data Capacitor, into the LEAD infrastructure by mounting its Lustre-WAN file system on the gateway hosting and data ingest servers. The Data Capacitor is used to rapidly and seamlessly move and temporarily store very large data sets. Integrating it has given LEAD direct access to 42 terabytes of Data Capacitor space as if it were local disk, cut workflow execution times by a factor of two, and significantly increased reliability. More importantly, because the Data Capacitor is mounted directly on the ingest servers and is accessible to TeraGrid and IU’s BigRed compute nodes, data no longer need to be transferred between them: applications read the data directly, as if from local disk.

The Data Capacitor also hosted the pre-final project data before they were archived on the IU HPSS system. In addition, the Data Capacitor is mounted directly on the LEAD OpenDAP servers, so outputs can be visualized without any transfers. Overall, LEAD has eliminated three sets of data transfers: ingest to compute, compute to storage, and storage to visualization. Eliminating these transfers has contributed directly to significantly increasing LEAD’s workflow scalability and to yielding faster-than-real-time weather forecasts.

Data Capacitor Information

http://portal.leadproject.org/