9th Parallel Data Storage Workshop

held in conjunction with SC14

Program Chairs:
Xiaosong Ma, Qatar Computing Research Institute; North Carolina State University
Dries Kimpe, Argonne National Laboratory

General Chair:
Dean Hildebrand, IBM, USA

Sunday, November 16, 2014
Rooms 271 & 272, Ernest N. Morial Convention Center
New Orleans, LA

SC14 Workshop Web Page

Work-in-progress submissions - now closed



Peta- and exascale computing infrastructures make unprecedented demands on storage capacity, performance, concurrency, reliability, availability, and manageability. This one-day workshop focuses on the data storage and management problems and emerging solutions found in peta- and exascale scientific computing environments, with special attention to issues in which community collaboration can be crucial for problem identification, workload capture, solution interoperability, standards with community buy-in, and shared tools.

Addressing storage media ranging from tape, HDD, and SSD, to emerging storage devices like NVRAM, the workshop seeks contributions on relevant topics, including but not limited to:

  • performance and benchmarking
  • failure tolerance problems and solutions
  • APIs for high performance features
  • parallel file systems
  • high bandwidth storage architectures
  • support for high velocity or complex data
  • metadata intensive workloads
  • autonomics for HPC storage
  • virtualization for storage systems
  • archival storage advances
  • resource management innovations
  • incorporation of emerging storage technologies
  • workload studies from production systems


8:55am - 9:00am Welcome & Introduction
9:00am – 10:00am Keynote Speaker: Sage Weil, Red Hat
10:00am - 10:30am Morning Break
10:30am – 12:00pm SESSION 1: HPC I/O
Chair: Carlos Maltzahn, University of California, Santa Cruz
  BatchFS: Scaling the File System Control Plane with Client-Funded Metadata Servers
*Qing Zheng, Carnegie Mellon University
Kai Ren, Carnegie Mellon University
Garth Gibson, Carnegie Mellon University
Paper | Slides
  Using Property Graphs for Rich Metadata Management in HPC Systems
*Dong Dai, Texas Tech University
Robert B. Ross, Argonne National Laboratory
Philip Carns, Argonne National Laboratory
Dries Kimpe, Argonne National Laboratory
Yong Chen, Texas Tech University
Paper | Slides
  Evaluating Lustre's Metadata Server on a Multi-Socket Platform
*Konstantinos Chasapis, University of Hamburg
Manuel F. Dolz, University of Hamburg
Michael Kuhn, University of Hamburg
Thomas Ludwig, University of Hamburg
Paper | Slides
12:00pm - 1:30pm Lunch (not provided)
1:30pm – 2:30pm SESSION 2: Distributed I/O Systems
Chair: Ron Oldfield, Sandia National Laboratories
  Alleviating I/O Interference via Caching and Rate-Controlled Prefetching without Degrading Migration Performance
*Morgan Stuart, Virginia Commonwealth University
Tao Lu, Virginia Commonwealth University
Xubin He, Virginia Commonwealth University
Paper | Slides
  VSFS: A Searchable Distributed File System
Lei Xu, Cloudera
Ziling Huang, NetApp
Hong Jiang, University of Nebraska-Lincoln
*Lei Tian, University of Nebraska-Lincoln
David Swanson, University of Nebraska-Lincoln
Paper | Slides
2:30pm – 3:00pm WIP SESSION 1 -- Chair: Garth Gibson, Carnegie Mellon

1. Recent Progress in Tuning Performance of Large-scale I/O with Parallel HDF5
, M. Scot Breitenfeld, Kalyana Chadalavada, Robert Sisneros, Suren Byna, Quincey Koziol, Neil Fortner, Prabhat and Venkat Vishwanath.
2. Achieving up to Zero Communication Delay in BSP-based Graph Processing via Vertex Categorization, Xuhong Zhang, Ruijun Wang and Jun Wang.
3. Wireless Network as a Multicasting Channel for MPI IO, Wenguang Chen, Wei Xue, Jidong Zhai and Weimin Zheng.
4. SideIO: A Sided I/O System Framework for Hybrid Scientific Workflow, Dan Huang, Jiangling Yin, Jun Wang and Qing Liu.
5. Performance Improvement of Gfarm Using InfiniBand RDMA, Shin Sasaki, Ryo Matsumiya, Kazushi Takahashi and Yoshihiro Oyama.
3:00pm - 3:30pm Afternoon Break
3:30pm – 5:00pm SESSION 3: I/O Design and Evaluation Tools
Chair: Rob Ross, Argonne National Laboratory
  Automatic Generation of I/O Kernels for HPC Applications
Babak Behzad, University of Illinois at Urbana-Champaign
Hoang-Vu Dang, University of Illinois at Urbana-Champaign
Farah Hariri, University of Illinois at Urbana-Champaign
Weizhe Zhang, Harbin Institute of Technology
Marc Snir, University of Illinois at Urbana-Champaign and Argonne National Laboratory
Paper | Slides
  HPIS3: Towards a High-performance Simulator for Hybrid Parallel I/O and Storage Systems
*Bo Feng, Illinois Institute of Technology
Ning Liu, Illinois Institute of Technology
Shuibing He, Illinois Institute of Technology
Xian-He Sun, Illinois Institute of Technology
Paper | Slides
  Feign: In-Silico Laboratory for Researching I/O Strategies (Using the Flexible Event Imitation Engine (Feign) to Alter Application I/O)
Jakob Lüttgau, Universität Hamburg
Julian M. Kunkel, DKRZ
*Michaela Zimmer, Universität Hamburg
Paper | Slides
5:00pm - 5:30pm WIP SESSION 2 -- Chair: Garth Gibson, Carnegie Mellon

1. Opass: Analysis and Optimization of Parallel Data Access on Distributed File Systems, Jiangling Yin, Jun Wang and Tyler Lukasiewicz.
2. Predicting Performance of Non-Contiguous I/O with Machine Learning, Julian Kunkel, Eugen Betke and Michaela Zimmer.
3. ifarm: Implementing Inline Deduplication to a Distributed File System, Ryo Matsumiya, Shin Sasaki, Kazushi Takahashi and Yoshihiro Oyama.
4. Investigation into RAID Front External Journaling with SSD, Benjamin Young.
5. Mig-drive -- a High Performance HDF5 Driver with Meta-data Migrate, Liqiang Cao and Weichao Shen.


CALL FOR PAPERS POSTER - download, print, and hang one up at your office / department!

The Parallel Data Storage Workshop selects short papers through a competitive peer-review process. Submit a previously unpublished short paper of up to 5 pages (excluding references), set in a font no smaller than 10 point, as a PDF file as instructed on the workshop web site. Submitted papers will be reviewed under the supervision of the workshop program committee. Submissions should indicate authors and affiliations. Final papers must not be longer than 5 pages (excluding references). Selected papers and associated talk slides will be made available on the workshop web site; the papers will also be published in the digital library of the IEEE or ACM.


Paper submissions due: 9pm PDT, Saturday, August 30, 2014 - closed
Notification to authors: Tuesday, September 30, 2014
Camera-ready due: Tuesday, October 14, 2014
Slides due: Saturday, Nov. 15, 2014, 5:00 pm PDT, BEFORE the workshop - please email them to Joan


Work-in-progress (WIP) Submissions -- now closed

There will also be a WIP session at the workshop, where presenters give brief 5-minute talks on ongoing work that offers fresh problems or solutions but may not yet be mature or complete enough for a paper submission. A 1-page abstract is required.



Please be aware that all workshop attendees, both speakers and participants, must pay the SC14 registration fee. Workshops are no longer included in the Technical Program registration. With a paid Technical Program registration, the workshop fee is $50 for Members/Non-Members and $25 for Students. A workshop-only registration is available for $200 for Members/Non-Members and $100 for Students.

To attend the workshop, please register through the Supercomputing '14 registration page. Registration opens in July.

Program Committee:

Andre Brinkmann, Johannes Gutenberg University, Germany
Garth Gibson, Carnegie Mellon University and Panasas Inc., USA
Dean Hildebrand, IBM, USA - General Chair
Hong Jiang, University of Nebraska, USA
Youngjae Kim, Oak Ridge National Laboratory, USA
Dries Kimpe (Chair), Argonne National Laboratory, USA
Xiaosong Ma (Chair), Qatar Computing Research Institute, Qatar and North Carolina State University
Carlos Maltzahn, University of California, Santa Cruz, USA
Ron Oldfield, Sandia National Laboratories, USA
Narasimha Reddy, Texas A&M University, USA
Robert Ross, Argonne National Laboratory, USA
Karsten Schwan, Georgia Tech, USA
Matt Tolentino, Intel, USA
Jin Xiong, ICT, Chinese Academy of Sciences, China
Sudharshan Vazhkudai, Oak Ridge National Laboratory, USA


Steering Committee:

John Bent, EMC
Scott Brandt, University of California, Santa Cruz
Evan Felix, Pacific Northwest National Laboratory
Garth Gibson, Carnegie Mellon University and Panasas Inc.
Gary Grider, Los Alamos National Laboratory
Dean Hildebrand, IBM
Peter Honeyman, University of Michigan
Darrell Long, University of California, Santa Cruz
Carlos Maltzahn, University of California, Santa Cruz
Rob Ross, Argonne National Laboratory
Phil Roth, Oak Ridge National Laboratory
Karsten Schwan, Georgia Tech
John Shalf, National Energy Research Scientific Computing Center, Lawrence Berkeley National Laboratory
Lee Ward, Sandia National Laboratories