Research Technologies Presentations



Community Grids Lab: Virtual Block Store System

The rapid development and deployment of cloud computing systems has created a need for a standalone block storage system that can provide flexible in-line and off-line block storage services to the virtual machine instances and virtual clusters maintained by cloud management software. This talk presents the Virtual Block Store (VBS) System, a standalone block storage system developed by the Community Grids Lab of Indiana University. We have built a prototype of VBS based on LVM, iSCSI, and the Xen hypervisor. It provides basic block storage services such as creating and destroying logical volumes and snapshots, and attaching a volume to, or detaching it from, a running Xen DomU instance. The concept and functional interfaces of VBS are based on the Amazon Elastic Block Store (EBS) service; moreover, VBS can be used independently and directly with an existing volume server and Xen nodes, and can be easily extended to support other VM management systems and integrated into various cloud computing systems.
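As a rough illustration of the operations named above (create and destroy volumes, take snapshots, attach and detach), the sketch below models them as an in-memory state machine. It is not the VBS interface: the class and method names are hypothetical, nothing here touches LVM, iSCSI, or Xen, and the rule that a volume is attached to at most one instance at a time is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Volume:
    """Hypothetical record for one logical volume (not the VBS metadata schema)."""
    size_gb: int
    attached_to: Optional[str] = None          # instance name, or None if detached
    snapshots: List[str] = field(default_factory=list)


class BlockStoreModel:
    """Toy in-memory model of the block storage operations listed in the abstract."""

    def __init__(self) -> None:
        self.volumes: Dict[str, Volume] = {}

    def create_volume(self, name: str, size_gb: int) -> None:
        if name in self.volumes:
            raise ValueError(f"volume {name!r} already exists")
        self.volumes[name] = Volume(size_gb)

    def delete_volume(self, name: str) -> None:
        # Assumption: an attached volume must be detached before it is destroyed.
        if self.volumes[name].attached_to is not None:
            raise RuntimeError("detach the volume before deleting it")
        del self.volumes[name]

    def create_snapshot(self, name: str, snapshot: str) -> None:
        self.volumes[name].snapshots.append(snapshot)

    def attach(self, name: str, instance: str) -> None:
        vol = self.volumes[name]
        # Assumption: one volume serves at most one instance at a time.
        if vol.attached_to is not None:
            raise RuntimeError(f"already attached to {vol.attached_to}")
        vol.attached_to = instance

    def detach(self, name: str) -> None:
        self.volumes[name].attached_to = None
```

A caller would create a volume, attach it to an instance, and detach it again before deleting it; the model rejects a delete while the volume is still attached.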

In this talk we will present various aspects of the design and implementation of VBS, including its web service architecture, functionality and workflows, metadata maintenance, and integration with Nimbus. Based on this prototype, we will discuss the challenges expected in the future development of VBS, including scalability, consistency, and reliability, and outline directions for our future research, such as support for other VM management systems and integration with other cloud computing environments such as Eucalyptus.

What: Research Technologies Roundtable
When: Thursday, August 27th, 12:30-1:30pm
Where: IMU Maple Room (IU Bloomington); ICTC 497 (IUPUI)

Live URL: http://tinyurl.com/n976xc
Archive URL: http://tinyurl.com/lwt94q


Developing Web Applications with DivRep Framework - July 23


The DivRep framework is a simple Java servlet-based web framework that allows you to create an AJAX application without having to write JavaScript. With DivRep, you can write both the server-side and the client-side code in plain Java, without any special recompilation of the client-side code to JavaScript.

At this month's Research Technologies Roundtable, Arvind Gopu and Soichi Hayashi, UITS Open Science Grid, will cover the following topics:

  • Forms/Validation
  • Security
  • Real-World Applications
  • Current Issues and Future Plans


Batch Queueing Systems on BigRed and Quarry - May 28th, 2009


Maximize your computing productivity and help to improve Research Technologies' systems utilization by attending this discussion on resource management (batch queueing) systems. At this month's Research Technologies Roundtable, George Turner, UITS High Performance Systems, will cover the following topics:

  • BigRed & Quarry: similarities and differences
  • Queue structures and how they relate to infrastructure
  • Resource managers versus schedulers: LoadLeveler, TORQUE/PBS, Moab
  • Running interactively versus mass production batch processing
  • Scheduling strategies: fairshare, backfill and other dispatch tactics
  • Queue packing: I just wanna run!
  • Packed Queue: when will I run?
  • Debugging: it won't run!
  • Help: it is available
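To make the backfill idea in the list above concrete, here is a minimal sketch of a backfill decision. It is a deliberately simplified version of the technique and not how LoadLeveler, TORQUE/PBS, or Moab implement it: the `Job` fields and the `pick_backfill` function are hypothetical, and the rule used is the conservative one (a waiting job may jump ahead only if it fits in the idle nodes and its requested walltime ends before the top-priority job's reservation begins).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Job:
    name: str
    nodes: int      # number of nodes requested
    walltime: int   # requested run time, in minutes


def pick_backfill(queue: List[Job], free_nodes: int, head_start_in: int) -> List[Job]:
    """Return the waiting jobs that can start now without delaying the head job.

    queue[0] is the highest-priority job; it cannot start yet and holds a
    reservation that begins in `head_start_in` minutes. Jobs behind it are
    considered in queue order.
    """
    started = []
    for job in queue[1:]:
        fits_now = job.nodes <= free_nodes
        ends_in_time = job.walltime <= head_start_in
        if fits_now and ends_in_time:
            started.append(job)
            free_nodes -= job.nodes   # those nodes are no longer idle
    return started


queue = [
    Job("big", nodes=64, walltime=600),   # head job, waiting for nodes
    Job("small", nodes=4, walltime=30),
    Job("long", nodes=4, walltime=240),
    Job("wide", nodes=32, walltime=20),
]
print([j.name for j in pick_backfill(queue, free_nodes=8, head_start_in=60)])
# prints ['small']
```

In this example only the small job is backfilled: the long job would still be running when the head job's reservation starts, and the wide job needs more nodes than are idle. This is one reason accurate walltime requests help a job start sooner.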


NITRD Reauthorization - current status - April 29th, 2009


NITRD (Networking and Information Technology Research and Development) is the act that provides overarching coordination of several federal agencies' efforts in information technology research and development. NITRD is currently up for renewal. The Science and Technology Committee of the U.S. House of Representatives began this process with a hearing on 31 July 2008. Also during July, CASC (the Coalition for Academic Scientific Computing) and the Educause Campus CyberInfrastructure committee held a workshop on integrating university and federally funded cyberinfrastructure systems.

Dr. Craig Stewart will relay the input gathered from CASC and Educause about the current status of national academic cyberinfrastructure, along with information on the development of the NITRD reauthorization legislation. Particular focus will be given to the TeraGrid and the Open Science Grid (IU is a participant in both projects), as well as the potential future role of Science Gateways, cloud computing, and distributed file system approaches. The NITRD draft legislation will soon be open for public comment, so this talk should be useful background for those who wish to comment on it.

Download NITRD - A Thumbnail History and Overview
Download NITRD Agenda -- Themes, Logistics, February 24-26th

Using Matlab's Distributed Computing Server - April 17th


While Matlab was not initially conceived as a parallel programming tool, many users want to parallelize bits and pieces of Matlab code. The Matlab Distributed Computing Server (DCS) can help parallelize Matlab code. The talk will cover:
  • DCS availability and how to configure user settings
  • Prototyping parallelized code before submitting it to a job manager
  • What sorts of improvements can be seen with the DCS
  • Limitations and license restrictions of the DCS
Jefferson Davis, from the Research Technologies Stat/Math Center, will lead this discussion/presentation.

Security Best Practices - February 19th

Keith Lehigh, Research Technologies Security Engineer, will discuss security best practices for servers, desktops, and mobile platforms. He will examine the common security threats facing users and systems, and what happens when a system is compromised. Keith will also cover methods for the secure use of passphraseless ssh keys when they are needed for unattended remote access.

February 19th Presentation Video


Parallel Data Mining on Multicore Clusters - December 3rd

At the December Roundtable, Judy Qiu will discuss parallel data mining on multicore clusters. Judy received her PhD in Computer Science from Syracuse University in 2005, with a dissertation on "Message-based MVC Architecture for Distributed and Desktop Applications". She works for UITS Research Technologies and is currently doing research in collaboration with PTL on multicore algorithms, software, and performance.

A multicore CPU combines two or more cores (independent microprocessors) in a single chip. In the future, CPUs will have hundreds or thousands of cores. This will increase computing power for both research and commercial applications, but it will undoubtedly also present significant software challenges for parallel applications. The SALSA project aims to develop a novel hybrid model of parallel computing, involving workflow and mashups linking high performance parallel modules implemented on multicore clusters. We believe this model will span both science and commercial uses of multicore systems.

We present a suite of parallel data mining algorithms for applications including GIS, cheminformatics, bioinformatics, particle physics, and the structure of the World Wide Web. New parallel algorithms cover clustering and mixture models with built-in annealing to improve convergence and robustness. In addition, we have developed parallel methods for mapping high-dimensional spaces to a smaller number of dimensions for easier visualization and analytic processing. We are comparing PCA (Principal Component Analysis), GTM (Generative Topographic Mapping), and MDS (Multidimensional Scaling). PDC (Pairwise Data Clustering) has been implemented for major new applications, including the search for gene families in a collection of a million sequences.

December 3rd Presentation Video


Web Services Resource Framework

This presentation will introduce Web Services through simple examples, and then present their "extension," the Web Services Resource Framework, which allows web resources to "maintain state" by storing data that is accessible via operations standardized within the framework.

October 29th Presentation Video


Research Technologies Overview

Whether you're new to IU or returning, the Research Technologies (RT) division of UITS welcomes you. RT maintains some of the most powerful supercomputers in the world, as part of a comprehensive strategy which includes computers, data storage systems, data collections, instruments and sensor networks, and technical support.

As we head into fall, we'll take an updated look at Research Technologies' research compute systems which include the Big Red and Quarry supercomputers, High Performance Storage Services, the Data Capacitor, and the TeraGrid.

Please join us on Thursday, September 25th for an overview of these systems and an opportunity to speak with the system administrators and project managers.

September 25th Presentation Video


Cloud Computing and Virtualization

Cloud computing, virtualization, Amazon EC2, and Google Apps are some of today's hottest technologies. In this month's Roundtable we will discuss cloud computing, virtualization, and one of IU's initiatives in this area, the Quarry Gateway Hosting Service. The Quarry Gateway Hosting Service provides a web hosting environment for TeraGrid science gateways.

August 28th Presentation Video


The Indiana Spatial Data Portal and Service

The Indiana Spatial Data Portal and Service provides access to terabytes of Indiana imagery, including aerial photos, topographic maps, and digital elevation data. These imagery archives and publishing services support the Indiana Map, a single statewide map for Indiana.

Please join Geographic Information Systems (GIS) staff from UITS, the Indiana Geological Survey, and the Polis Center in exploring the IU community's geospatial imagery needs. Roundtable discussions will cover the current database and online services, projects which rely upon these resources, and upcoming statewide imagery and services.

July 31st Presentation Video


Help design a system to manage Big Red job submissions

Some researchers need to run thousands of jobs on Big Red. For example, a recent project to annotate the Maize genome required running 25,000 jobs, a difficult number to manage.

This month's Research Technologies Roundtable is an opportunity for you to provide input regarding the design of a UITS facility that will allow users to manage thousands of jobs.

Dick Repasky, Matt Allen, and Yu Ma will present the design goals of the project and outline plans for how users could specify, process, and monitor a pool of thousands of jobs.

June 26th Presentation Video


May 29, 2008 Science Gateways

After more than a decade of development, tools for Web-based access to computing resources and data archives are now very mature. This month's Roundtable will include discussion of these Science Gateways and the Grid middleware that they access, architecture and standards used by the science portal community, component-based Web portals, Web Services, and workflow (or service orchestration) tools. Also discussed will be Web 2.0 and Cloud Computing approaches to resource and data access, and these tools' eventual merger into Science Gateways and portals.

May 29th Presentation Video


April 24, 2008 Star-P

Star-P is a software platform that links high performance computational resources with programming languages such as Matlab and Python. Star-P extends and parallelizes these high level languages, converting their code so they can engage multiple processors simultaneously. This month's Roundtable will include discussions of how Star-P extends Matlab, optimizing Matlab code by incorporating Star-P data types, and Star-P availability on the IU campuses.

Dr. Stuart Broson, a senior development engineer at Interactive Supercomputing (a Star-P vendor), will also provide information on Star-P support for Python, plans for Star-P support for R, and the under-the-hood details of Star-P.

April 24th Presentation Video


March 27, 2008 Optimizing source code

Why optimize code that already works? This month's Research Technologies Roundtable centers on the time-versus-gain trade-offs of optimizing source code. Discussion will also include case examples from Big Red users, as well as simple tricks and possible pitfalls related to coding.

Optimizing Code Powerpoint Slides


February 28, 2008 Grid Tools

Accessing today's computer systems requires new paradigms beyond simply logging in and using the computer. The power of modern computing technology lies in its diverse and distributed nature. Tying together this vast cyberinfrastructure are the grid tools. Research Technologies will offer a brief introduction to these grid tools to stimulate discussion about ways to use these tools to increase our users' research productivity.

Grid Tools Powerpoint Slides


January 31, 2008 Meet the High Performance Applications Group

The High Performance Applications (HPA) group (formerly HPC) will present an overview of its activities and welcome questions from the audience. The mission of the HPA group is to help promote scholarly research through the use of high performance computing and communication environments. The HPA group works closely with the High Performance Systems (HPS) group to fulfill this mission. A sampling of HPA activities includes user support for IU faculty, staff, and students who want to get started using our supercomputers; longer-term 1-on-1 consulting; support for our NSF TeraGrid users; benchmarking new or upgraded systems; and (in the future) developing services that hide the complexity of using high performance applications. Please visit us at http://rtinfo.uits.iu.edu/hpa/ .

HPA Powerpoint Slides


November 1, 2007 Experiences implementing Big Red - a 30.72 TFLOPS IBM BladeCenter Cluster

IU's Big Red, a 20.4 TFLOPS IBM e1350 BladeCenter cluster, was ordered on 7 April 2006; on 28 June 2006, it appeared in the 27th Top500 list as the 23rd fastest supercomputer in the world. We have now expanded this system to 30.7 TFLOPS. In this talk, Dr. Craig Stewart, Associate Dean, Research Technologies, will discuss the basic architecture of Big Red, its implementation, and management systems. In addition, he will describe the performance characteristics of the system in some detail. Even prior to the upgrade, Big Red was one of the largest supercomputers integrated into the US TeraGrid. We will discuss the challenges of supporting and using a very large Linux cluster based on IBM's Power architecture within the context of the TeraGrid, which is heavily dominated by Linux clusters running Intel instruction sets. Dr. Stewart will also describe some of the science results obtained with Big Red, including weather prediction and protein structure prediction.


September 26, 2007 Research Data Complex (RDC) Migration

For our September presentation, we'll discuss the recent Research Data Complex (RDC) migration to a new platform. The RDC is dedicated to faculty, graduate students with a faculty sponsor, and staff who need research databases. The RDC infrastructure primarily supports Oracle database applications and is well suited for data-intensive applications. The RDC staff in High Performance Systems provides database hosting, administration, and consulting services to assist researchers with gaining access to data, managing data, determining the right data-related tools and processes, and database design and implementation. Our presentation will explore the benefits that the new hardware and Oracle database support bring to IU researchers.

RDC Migration Powerpoint Slides


August 28, 2007 The Research File System (RFS)

Kurt Seiffert, manager of the UITS Research Storage group, will talk about the Research File System (RFS). Based on OpenAFS, the Research File System allows researchers to store and access files from a wide range of platforms, from desktop systems to supercomputers. RFS was recently upgraded to increase both its storage capacity and the number of simultaneous users it supports. This talk will discuss these enhancements and the various uses of RFS.

Research File Systems Powerpoint Slides


July 25, 2007 IU's Participation in the TeraGrid

For our July presentation, we'll review IU's participation in the TeraGrid, the NSF's flagship effort to build a national cyberinfrastructure to support scientific research. This will include an overview of the project and progress to date, and a look at IU's unique contribution.

The TeraGrid provides academic researchers nationwide with access to some of the most powerful computing, storage, and network resources available, and continues to develop tools to integrate these resources and make them easier to use. Our presentation will look at how innovations under development at IU and other TeraGrid partner sites are transforming the way big science is done.

IU & the TeraGrid Powerpoint Slides


June 27, 2007 Research Technologies Systems Update

As we head into midsummer, we'll take an updated look at Research Technologies' research compute systems and where we are headed through the second half of 2007. By the time we meet in June, the Indiana Economic Development Corporation's expansion to Big Red will have occurred, the Libra expansion will be close to complete, the migration of the Research Database Complex will be well underway, and the new Intel X86_64 cluster should be newly delivered and awaiting installation.

It's important that we hear from the users of Research Technologies' compute systems in these round table discussions. As a result of January's meeting, we were able to justify a new X86_64 system. Your input is important and we encourage you to continue to participate in these valuable discussions.

Research Systems Update Powerpoint Slides


May 30, 2007 -- UITS Advanced Visualization Lab

Do you have a project that might benefit from visualization but you don't know where to start? Staff from the UITS Advanced Visualization Lab will describe the many advanced visualization technologies available to the IU community. The discussion will include the existing hardware in the IT building on the IUPUI campus as well as the new display opening soon in Lindley Hall on the IUB campus. Bring your ideas and let's talk vis!

AVL Systems Powerpoint Slides


April 25, 2007 -- Fetal Alcohol Spectrum Disorder

Fetal Alcohol Spectrum Disorder (FASD) is a term used to describe the range of disabilities caused by prenatal exposure to alcohol. The mission of the Collaborative Initiative on FASDs (CIFASD) is to inform and develop effective interventions and treatment approaches for FASD through multidisciplinary research involving basic, behavioral, and clinical investigators and projects.

The Scientific Data Services group contributes to this initiative by providing custom software, support, and a Central Repository for research data. Michel Tavares, FASD Technical Architect, will introduce the FASD project, describe the different technologies that support it, and discuss how similar projects could benefit from this approach.

FASD Powerpoint Slides


March 28th -- IU's Data Capacitor

Indiana University's Data Capacitor is a 535-terabyte filesystem capable of receiving data at an aggregate rate exceeding 14.5 gigabytes per second. Architected for short- to mid-term storage of large data sets, the Data Capacitor goes into production at the beginning of April. Stephen Simms, Data Capacitor project manager, will introduce the Data Capacitor, explain the technology that supports it, and discuss its many uses.
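To put the two figures above in proportion, the short calculation below estimates how long it would take to fill the whole filesystem at the quoted aggregate rate. It is a back-of-the-envelope sketch: the function name is hypothetical, decimal units (1 terabyte = 1000 gigabytes) are an assumption, and because the quoted rate is a floor ("exceeding"), the result is an upper bound for sustained writes at that rate.

```python
def hours_to_fill(capacity_tb: float, rate_gb_per_s: float) -> float:
    """Hours needed to write `capacity_tb` terabytes at `rate_gb_per_s` GB/s.

    Assumes decimal units: 1 TB = 1000 GB.
    """
    capacity_gb = capacity_tb * 1000
    seconds = capacity_gb / rate_gb_per_s
    return seconds / 3600


# Figures from the announcement: 535 TB capacity, 14.5 GB/s aggregate rate.
print(f"{hours_to_fill(535, 14.5):.1f} hours")
# prints 10.2 hours
```

In other words, at the stated rate the entire filesystem could in principle be filled in roughly ten hours.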

Data Capacitor Powerpoint Slides


February 28th -- What you need to know about Massive Data Storage Service

Learn about recent changes and enhancements made to the Massive Data Storage Service (MDSS) at this month's meeting. Hear about access methods that have been added or modified, and changes needed to access the new version of MDSS.

MDSS Powerpoint Slides


January 31 -- AVIDD's Pending Retirement

The topic for January will be AVIDD's pending retirement and options for migrating to new systems, including local and TeraGrid migration options. Bring your lunch and your concerns about migrating to the newer, higher performance platforms.

AVIDD's Retirement Powerpoint Slides