From early on, LEAD data movement struggled with reliability problems: the file transfer nodes were overwhelmed and the middleware did not scale accordingly. LEAD worked with technologists at Indiana University to incorporate a TeraGrid resource, the NSF-funded Data Capacitor, into the LEAD infrastructure by mounting its Lustre-WAN file system on the gateway hosting and data ingest servers. The Data Capacitor is used to rapidly and seamlessly move and temporarily store very large data sets. This integration has given LEAD direct access to 42 terabytes of Data Capacitor space as local disk, has reduced workflow execution times by a factor of two, and has significantly increased reliability. More importantly, because the Data Capacitor is mounted directly on the ingest servers and is accessible to TeraGrid and IU’s BigRed compute nodes, explicit data transfers are no longer needed: applications read the data directly, as if from local disk.
The Data Capacitor also hosted the pre-final project data before it was archived on the IU HPSS system. In addition, the Data Capacitor is mounted directly on the LEAD OpenDAP servers, so outputs can be visualized without any transfers. Overall, LEAD has eliminated three sets of data transfers: ingest to compute, compute to storage, and storage to visualization. Eliminating these transfers has directly contributed to significantly increasing LEAD’s workflow scalability and to yielding faster-than-real-time weather forecasts.
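The effect of a shared mount on the three pipeline stages can be illustrated with a minimal Python sketch. It is not LEAD code: the function names, directory layout, and file names are hypothetical, a temporary directory stands in for the Data Capacitor mount point, and the "model" is a placeholder. The only point it makes is that when every stage sees the same file system, each stage hands the next one a path rather than a copy of the data.

```python
import tempfile
from pathlib import Path


def ingest(shared: Path, name: str, payload: bytes) -> Path:
    """Ingest stage: write incoming data onto the shared file system."""
    path = shared / "ingest" / name
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(payload)
    return path


def compute(shared: Path, input_path: Path) -> Path:
    """Compute stage: read the input in place, write output to the same mount."""
    data = input_path.read_bytes()
    out = shared / "output" / (input_path.stem + ".result")
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_bytes(data.upper())  # placeholder for a real forecast model
    return out


def visualize(output_path: Path) -> str:
    """Visualization stage: open the output directly by path."""
    return output_path.read_bytes().decode("ascii")


def run_pipeline(shared: Path) -> str:
    """Chain the stages; only paths move between them, never the data."""
    raw = ingest(shared, "obs.dat", b"radar sample")
    result = compute(shared, raw)
    return visualize(result)


if __name__ == "__main__":
    # A temporary directory stands in for the shared mount (an assumption).
    with tempfile.TemporaryDirectory() as mount:
        print(run_pipeline(Path(mount)))
```

In a transfer-based design, each arrow between stages would instead be a copy between hosts, which is where the reliability and scaling problems described above arose.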