5th Petascale Data Storage Workshop

held in conjunction with
Supercomputing '10

Chair: Carlos Maltzahn, UCSC

Monday, November 15, 2010
8:55 a.m. - 5:45 p.m.
Rooms 280 & 281, New Orleans Convention Center, New Orleans, LA

IEEE Digital Library Proceedings


workshop abstract

Peta- and exascale computing infrastructures make unprecedented demands on information storage capacity, performance, concurrency, reliability, availability, and manageability. This one-day workshop focuses on the data storage problems and emerging solutions found in peta- and exascale scientific computing environments, with special attention to issues in which community collaboration can be crucial: problem identification, workload capture, solution interoperability, standards with community buy-in, and shared tools. This workshop seeks contributions on relevant topics, including but not limited to: performance and benchmarking results and tools, failure tolerance problems and solutions, APIs for high performance features, parallel file systems, high bandwidth storage architectures, wide area file systems, metadata intensive workloads, autonomics for HPC storage, virtualization for storage systems, archival storage advances, and resource management innovations.

All papers appear in the IEEE Digital Library.


AGENDA

8:55am - 9:00am
Welcome - Garth Gibson, CMU; Carlos Maltzahn, UCSC
9:00am - 9:45am
Keynote Speaker - John Shalf (LBNL/NERSC)
Exascale Computing Hardware Challenges
Abstract and Speaker Bio | Slides
9:45am - 10:15am
POSTER SESSION 1 - List of participants and links to posters
10:15am - 11:45am
SESSION 1: Keeping Data
Chair: Galen Shipman, Oak Ridge National Laboratory
 

Self-Adjusting Two-Failure Tolerant Disk Arrays
I. Corderi, Universidad Católica del Uruguay; T. J. Schwarz, Universidad Católica del Uruguay; A. Amer, Santa Clara University; D. D. E. Long, UC Santa Cruz; J.-F. Pâris, University of Houston
Speaker: Darrell Long
Paper | Slides

Using a Shared Storage Class Memory Device to Improve the Reliability of RAID Arrays
S. Chaarawi, University of Houston; J.-F. Pâris, University of Houston; A. Amer, Santa Clara University; T. J. Schwarz, Universidad Católica del Uruguay; D. D. E. Long, UC Santa Cruz
Speaker: Jehan-François Pâris
Paper | Slides

Semantic Data Placement for Power Management in Archival Storage
A. Wildani, E. Miller, UC Santa Cruz
Speaker: Ethan Miller
Paper | Slides

11:45am - 1:15pm
Lunch
1:15pm - 2:45pm
SESSION 2: Accessing Data
Chair: Ron A. Oldfield, Sandia National Laboratories
 

Workload Characterization of a Leadership Class Storage Cluster
Y. Kim, R. Gunasekaran, G. Shipman, D. Dillow, Z. Zhang, B. Settlemyer, Oak Ridge National Laboratory
Speaker: Youngjae Kim
Paper | Slides

Performance Analysis of Commodity and Enterprise Class Flash Devices
N. Master, M. Andrews, J. Hick, S. Canon, N. Wright, NERSC, Lawrence Berkeley National Lab
Speaker: Nicholas J. Wright
Paper | Slides

Virtualization-based Bandwidth Management for Parallel Storage Systems
Y. Xu, Florida International University; L. Wang, Florida International University; D. Clavijo, Florida International University; Y. Liu, University of Florida; R. Figueiredo, University of Florida; M. Zhao, Florida International University
Speaker: Yiqi Xu
Paper | Slides

2:45pm - 3:15pm
POSTER SESSION 2 - List of participants and links to posters
3:15pm - 4:45pm
SESSION 3: Moving Data
Chair: Dean Hildebrand, IBM Almaden Research Center
 

Extracting Information ASAP!
H. Abbasi, Georgia Institute of Technology; G. Eisenhauer, Georgia Institute of Technology; S. Klasky, Oak Ridge National Laboratory; K. Schwan, Georgia Institute of Technology; M. Wolf, Georgia Institute of Technology
Speaker: Hasan Abbasi
Paper

Collective Prefetching for Parallel I/O Systems
Y. Chen, P. Roth, Oak Ridge National Laboratory
Speaker: Yong Chen
Paper | Slides

Towards Parallel Access of Multi-dimensional, Multi-resolution Scientific Data
S. Kumar, V. Pascucci, University of Utah; V. Vishwanath, P. Carns, R. Latham, T. Peterka, M. Papka, R. Ross, Argonne National Laboratory
Speaker: Sidharth Kumar
Paper | Slides

4:45pm - 5:15pm
Short Announcements

A New Community Resource for Experiments at Scale: PRObE
Garth Gibson, Carnegie Mellon University; Gary Grider, Los Alamos National Laboratory; Katharine Chartrand, New Mexico Consortium; Andree Jacobson, New Mexico Consortium
Speaker: Garth Gibson
Slides

Town Hall Meeting (resulting notes follow)

- Attendees were happy to have a keynote; thought talk lengths were appropriate; were pleased with the web site, especially its simplicity.

- Requests for the future include making printed agendas available to attendees; drawing I/O-intensive users to the workshop, possibly through the keynote; and updating the SC11 web site with the PDSW11 agenda as it becomes available.

- Specific to SC11, we should be involved in the SC11 theme on "Data Intensive Computing", perhaps with a Data Intensive theme ourselves, and with connections to the other theme events.

- With respect to the aging name "Petascale", we will relabel PDSW as the Parallel Data Storage Workshop, so we can have the 6th PDSW at SC11.

- We should note and track the plan for SC to become/join a SIG in ACM, IEEE or SIAM, so we can have an any-time-in-the-year sponsor, not just in November.

5:15pm - 5:45pm
POSTER SESSION 3 - List of participants and links to posters


COMMITTEE:

Carlos Maltzahn, University of California, Santa Cruz
John Bent, Los Alamos National Laboratory
Galen Shipman, Oak Ridge National Laboratory
Sage Weil, DreamHost
Roger Haskin, IBM Almaden Research Center
Rob Ross, Argonne National Laboratory
Brent Welch, Panasas
Karsten Schwan, Georgia Institute of Technology
Ron A. Oldfield, Sandia National Laboratories
Dean Hildebrand, IBM Almaden Research Center
Yong Chen, Oak Ridge National Laboratory
Peter Braam, Xyratex

STEERING COMMITTEE:

Phil Roth, Oak Ridge National Laboratory
Evan Felix, Pacific Northwest National Laboratory
Peter Honeyman, University of Michigan
Scott Brandt, University of California, Santa Cruz
Gary Grider, Los Alamos National Laboratory
Garth Gibson, Carnegie Mellon University
Darrell Long, University of California, Santa Cruz
John Shalf, National Energy Research Scientific Computing Center
Bill Kramer, National Center for Supercomputing Applications/University of Illinois Urbana-Champaign
Lee Ward, Sandia National Laboratories