PDSW 2019:

4th International Parallel Data Systems Workshop


Monday, November 18, 2019

Colorado Convention Center
Denver, CO

Time: 9:00 am - 5:30 pm

Location: Room 601

SC Workshop page

Program Co-Chairs:

Lawrence Berkeley National Laboratory

Argonne National Laboratory

Publicity Chair:


Web & Publications Chair:

Carnegie Mellon University
General Chair:

New York University,
Courant Institute of Mathematical Sciences Center for Data Science

Reproducibility Co-Chairs:

University of California, Santa Cruz

University of California, Santa Cruz

Work-In-Progress Chair:

Sandia National Laboratories


Keynote Speaker

Keynote speaker to be announced in June 2019.


Information on scheduling will be added here as the event approaches.


(Find the complete proposal outlining the merger between PDSW and DISCS here.)

We are pleased to announce that the 4th International Parallel Data Systems Workshop (PDSW’19) will be hosted at SC19: The International Conference for High Performance Computing, Networking, Storage and Analysis. The objective of this one-day workshop is to promote and stimulate researchers’ interactions in order to address some of the most critical challenges facing scientific data storage, management, devices, and processing infrastructure, for both traditional compute-intensive simulations and data-intensive high performance computing solutions. Special attention will be given to issues in which community collaboration is crucial for problem identification, workload capture, solution interoperability, standards with community buy-in, and shared tools.

Many scientific problem domains continue to be extremely data intensive. Traditional high performance computing (HPC) systems and the programming models for using them such as MPI were designed from a compute-centric perspective with an emphasis on achieving high floating point computation rates. But processing, memory, and storage technologies have not kept pace and there is a widening performance gap between computation and the data management infrastructure. Hence data management has become the performance bottleneck for a significant number of applications targeting HPC systems. Concurrently, there are increasing challenges in meeting the growing demand for analyzing experimental and observational data. In many cases, this is leading new communities to look towards HPC platforms. In addition, the broader computing space has seen a revolution in new tools and frameworks to support Big Data analysis and machine learning.

There is a growing need for convergence between these two worlds. Consequently, the U.S. Congressional Office of Management and Budget has informed the U.S. Department of Energy that new machines beyond the first exascale machines must address both traditional simulation workloads as well as data intensive applications. This coming convergence prompted the integration of the PDSW and DISCS workshops into a single entity to address the common challenges.

The scope of PDSW is summarized as:

  • Scalable storage architectures, archival storage, storage virtualization, emerging storage devices and techniques
  • Performance benchmarking, resource management, and workload studies from production systems, including both traditional HPC and data-intensive workloads
  • Programmability, APIs, and fault tolerance of storage systems
  • Parallel file systems, metadata management, and complex data management, object and key-value storage, and other emerging data storage/retrieval techniques
  • Programming models and frameworks for data intensive computing, including extensions to traditional programming models, nontraditional programming models, asynchronous multi-task programming models, and data-intensive programming models
  • Techniques for data integrity, availability, and reliability
  • Productivity tools for data intensive computing, data mining and knowledge discovery
  • Application or optimization of emerging “big data” frameworks towards scientific computing and analysis
  • Techniques and architectures to enable cloud and container-based models for scientific computing and analysis
  • Techniques for integrating compute into a complex memory and storage hierarchy facilitating in situ and in transit data processing
  • Data filtering/compressing/reduction techniques that maintain sufficient scientific validity for large scale compute-intensive workloads
  • Tools and techniques for managing data movement among compute and data intensive components both solely within the computational infrastructure as well as incorporating the memory/storage hierarchy




This year, we are soliciting two categories of papers, regular papers and reproducibility study papers. Both will be evaluated by a competitive peer review process under the supervision of the workshop program committee. Selected papers and associated talk slides will be made available on the workshop web site. The papers will also be published in the digital libraries of the IEEE and ACM.

Regular Paper Submissions


We invite regular papers, which may optionally include Experimental Details Appendices as described here for SC. Authors are encouraged to include reproducibility information for automated validation of experimental results; submissions that include experimental appendices and/or such reproducibility information (described below) will be given special consideration by the Program Committee. A description of the infrastructure available for automated validation and the criteria used in validation are available here. Accepted submissions passing automated validation will earn the Results Replicated badge in the ACM DL in accordance with ACM’s artifact evaluation policy.


Submissions deadline: Paper (in pdf format) due Sep. 1, 2019, 11:59 PM AoE
Submissions website: https://submissions.supercomputing.org/
Notification: September 29, 2019
Camera ready and copyright forms due: October 11, 2019, 11:59 PM AoE
Slides due before workshop: November 10, 2019 to jdigney@cs.cmu.edu
* Submissions must be in the IEEE conference format


Reproducibility Study Submissions

We also call for reproducibility studies that, for the first time, reproduce experiments from papers previously published in PDSW or in other peer-reviewed conferences with similar topics of interest (see reproducibility study instructions). Reproducibility study submissions are selected by the same peer-reviewed competitive process as regular papers, except that these submissions must pass automated validation of experimental results (see artifact evaluation criteria). Accepted submissions passing automated validation will earn the ACM “Results Replicated” badge and, if the work under study was successfully reproduced, the associated paper will earn the ACM “Results Reproduced” badge in the ACM DL in accordance with ACM’s artifact review and badging policy.


Submissions deadline: Paper (in pdf format) due Sep. 1, 2019, 11:59 PM AoE
Submissions website: https://submissions.supercomputing.org/
Notification: September 29, 2019
Camera ready and copyright forms due: October 6, 2019, 11:59 PM AoE
Slides due before workshop: November 10, 2019 to jdigney@cs.cmu.edu

* Submissions must be in the IEEE conference format. Details on reproducibility criteria are here.

Guidelines for Regular Papers and Reproducibility Study Papers

Papers must be at least 6 pages long and no more than 10 pages long (including appendices and references, but not including the optional reproducibility appendix). Download the IEEE conference paper template.


Work-in-progress (WIP) Submissions

There will be a WIP session in which presenters give brief five-minute talks on their ongoing work, featuring fresh problems and solutions. WIP content is typically material that is not yet mature or complete enough for a full paper submission. A one-page abstract is required.

Please email your submission to:

WIP Submission Deadline: Nov. 3, 2019, 11:59 PM AoE
WIP Notification: November 10, 2019


Please be aware that all workshop attendees, both speakers and participants, must pay the SC19 registration fee. Workshops are no longer included as part of the technical program registration.

To attend the workshop, please register through the Supercomputing '19 registration page. Registration opens July 11, 2019.


Program Committee:

  • Yong Chen, Texas Tech University
  • Yue Cheng, George Mason University
  • Jason Cope, DDN Storage
  • Stratos Efstathiadis, New York University
  • Lisa Gerhardt, Lawrence Berkeley National Laboratory
  • Elsa J. Gonsiorowski, Lawrence Livermore National Laboratory
  • Jian Huang, University of Illinois
  • Shadi Ibrahim, French Institute for Research in Computer Science and Automation (INRIA)
  • Sidharth Kumar, University of Alabama
  • Julian Kunkel, University of Reading
  • Johann Lombardi, Intel Corporation
  • Xiaoyi Lu, Ohio State University
  • Pierre Matri, Argonne National Laboratory
  • Ron Oldfield, Sandia National Laboratories
  • Sangmi Pallickara, Colorado State University
  • Vasily Tarasov, IBM
  • Osamu Tatebe, University of Tsukuba
  • Gala Yadgar, Technion - Israel Institute of Technology
  • Amelie Chi Zhou, Shenzhen University


Steering Committee:

  • John Bent, Seagate
  • Ali R. Butt, Virginia Tech
  • Shane Canon, Lawrence Berkeley National Laboratory
  • Raghunath Raja Chandrasekar, Amazon Web Services
  • Yong Chen, Texas Tech University
  • Evan J. Felix, Pacific Northwest National Laboratory
  • Gary Grider, Los Alamos National Laboratory
  • William D. Gropp, University of Illinois at Urbana-Champaign
  • Dean Hildebrand, Google
  • Dries Kimpe, KCG, USA
  • Jay Lofstead, Sandia National Laboratories
  • Xiaosong Ma, Qatar Computing Research Institute, Qatar
  • Carlos Maltzahn, University of California, Santa Cruz
  • Suzanne McIntosh, New York University
  • Kathryn Mohror, Lawrence Livermore National Laboratory
  • Robert Ross, Argonne National Laboratory
  • Philip C. Roth, Oak Ridge National Laboratory
  • John Shalf, Lawrence Berkeley National Laboratory
  • Xian-He Sun, Illinois Institute of Technology
  • Rajeev Thakur, Argonne National Laboratory
  • Lee Ward, Sandia National Laboratories
  • Brent Welch, Google