pdsw 2019:

4th International Parallel Data Systems Workshop


HELD IN CONJUNCTION WITH SC19: THE INTERNATIONAL CONFERENCE FOR HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS


Monday, November 18, 2019

Colorado Convention Center
Denver, CO

Time: 9:00am - 5:30 pm

Location: Room 601

SC Workshop page


Program Co-Chairs:

Lawrence Berkeley National Laboratory


Argonne National Laboratory

Publicity Chair:

EPCC

Web & Publications Chair:

Carnegie Mellon University

General Chair:

New York University,
Courant Institute of Mathematical Sciences Center for Data Science

Reproducibility Co-Chairs:

University of California, Santa Cruz


University of California, Santa Cruz

Work-In-Progress Chair:

Sandia National Laboratory



keynote speaker

PDSW'19 is proud to announce that Haoyuan (H.Y.) Li of Alluxio will be our keynote speaker. He will discuss data orchestration for AI, Big Data, and Cloud. Please watch for further details here.

Haoyuan (H.Y.) Li is the Founder, Chairman, and CTO of Alluxio. He holds a PhD in computer science from UC Berkeley’s AMPLab, where he co-created the Alluxio (formerly Tachyon) open source data orchestration system, co-created Apache Spark Streaming, and became an Apache Spark founding committer. He also holds an MS from Cornell University and a BS from Peking University, both in computer science.


agenda


Information on scheduling will be added here as the event approaches.


WORKSHOP ABSTRACT


We are pleased to announce the 4th International Parallel Data Systems Workshop (PDSW’19). PDSW'19 will be hosted in conjunction with SC19: The International Conference for High Performance Computing, Networking, Storage and Analysis.

Efficient data storage and data management are crucial to scientific productivity in both traditional simulation-oriented HPC environments and Big Data analysis environments. The challenge is compounded by the growing volume of experimental and observational data, the widening gap between the performance of computational hardware and storage hardware, and the emergence of new data-driven algorithms in machine learning.

The goal of this workshop is to facilitate research that addresses the most critical challenges in scientific data storage and data processing. We therefore encourage the community to submit original manuscripts that:

  • introduce and evaluate novel algorithms or architectures,
  • inform the community of important scientific case studies or workloads, or
  • validate the reproducibility of previously published work

Special attention will be given to issues in which community collaboration is crucial for problem identification, workload capture, solution interoperability, standardization, and shared tools. We also strongly encourage papers to share complete experimental environment information (software version numbers, benchmark configurations, etc.) to facilitate collaboration.

Topics of interest include the following:

  • Scalable architectures for data storage, archival, and virtualization
  • Performance benchmarking, resource management, and workload studies
  • Programmability of storage systems
  • Parallel file systems, metadata management, and complex data management
  • Alternative data storage models, including object stores and key-value stores
  • Programming models and frameworks for data intensive computing
  • Techniques for data integrity, availability, reliability, and fault tolerance
  • Productivity tools for data intensive computing, data mining, and knowledge discovery
  • Application of emerging big data frameworks towards scientific computing and analysis
  • Enabling cloud and container-based models for scientific data analysis
  • Data filtering/compressing/reduction techniques
  • Tools and techniques for managing data movement among compute and data intensive components
  • Integrating computation into the memory and storage hierarchy to facilitate in-situ and in-transit data processing

CALL FOR PAPERS

 

CALL FOR PAPERS - now available


Regular paper SUBMISSIONS

 

All papers will be evaluated by a competitive peer review process under the supervision of the workshop program committee. Selected papers and associated talk slides will be made available on the workshop web site. The papers will also be published by the IEEE TCHPC.

Authors are also strongly encouraged to automate the reproducibility and validation of their experimental results. Submissions accompanied by URLs to resources that allow reviewers to repeat automatic validation will be given favorable consideration for the PDSW Best Paper award. The PDSW reproducibility initiative will do its best to provide infrastructure and resources to support automated reproducibility and validation. PDSW reviewers, while appreciative, might not be able to validate non-automated artifact descriptions and evaluations included in (optional) reproducibility appendices. Read detailed information on the PDSW reproducibility initiative (bit.ly/pdsw-automatic).

Submit a previously unpublished paper as a PDF file, and indicate authors and affiliations. Papers must be between 6 and 10 pages long, including references but not including optional reproducibility appendices. Papers must use the IEEE conference paper template available at: https://www.ieee.org/conferences/publishing/templates.html.

Deadlines

Submissions deadline: Paper (in PDF format) due Sep. 1, 2019, 11:59 PM AoE
Submissions website: https://submissions.supercomputing.org/
Notification: September 29, 2019
Camera ready and copyright forms due: October 11, 2019, 11:59 PM AoE
Slides due before workshop: November 10, 2019 to jdigney@cs.cmu.edu
* Submissions must be in the IEEE conference format


Work In Progress Session


There will be a WIP session where presenters give brief 5-minute talks on their ongoing work, with fresh problems/solutions. WIP content is typically material that may not be mature or complete enough for a full paper submission and will not be included in the proceedings. A one-page abstract is required.

Deadlines

Work in Progress (WIP) submissions due: Nov. 3, 2019, 11:59 PM AoE
WIP Notification: Nov. 10, 2019
Submissions website: https://submissions.supercomputing.org/


Workshop Registration

To attend the workshop, please register through the Supercomputing '19 registration page. Registration opens July 11, 2019.


PROGRAM COMMITTEE:

  • Yong Chen, Texas Tech University
  • Yue Cheng, George Mason University
  • Jason Cope, DDN Storage
  • Stratos Efstathiadis, New York University
  • Lisa Gerhardt, Lawrence Berkeley National Laboratory
  • Elsa J. Gonsiorowski, Lawrence Livermore National Laboratory
  • Jian Huang, University of Illinois
  • Shadi Ibrahim, French Institute for Research in Computer Science and Automation (INRIA)
  • Sidharth Kumar, University of Alabama
  • Julian Kunkel, University of Reading
  • Johann Lombardi, Intel Corporation
  • Xiaoyi Lu, Ohio State University
  • Pierre Matri, Argonne National Laboratory
  • Ron Oldfield, Sandia National Laboratories
  • Sangmi Pallickara, Colorado State University
  • Vasily Tarasov, IBM
  • Osamu Tatebe, University of Tsukuba
  • Gala Yadgar, Technion - Israel Institute of Technology
  • Amelie Chi Zhou, Shenzhen University

STEERING COMMITTEE:

  • John Bent, Seagate
  • Ali R. Butt, Virginia Tech
  • Shane Canon, Lawrence Berkeley National Laboratory
  • Raghunath Raja Chandrasekar, Amazon Web Services
  • Yong Chen, Texas Tech University
  • Evan J. Felix, Pacific Northwest National Laboratory
  • Gary Grider, Los Alamos National Laboratory
  • William D. Gropp, University of Illinois at Urbana-Champaign
  • Dean Hildebrand, Google
  • Dries Kimpe, 3 Red Partners
  • Jay Lofstead, Sandia National Laboratories
  • Xiaosong Ma, Qatar Computing Research Institute, Qatar
  • Carlos Maltzahn, University of California, Santa Cruz
  • Suzanne McIntosh, New York University
  • Kathryn Mohror, Lawrence Livermore National Laboratory
  • Robert Ross, Argonne National Laboratory
  • Philip C. Roth, Oak Ridge National Laboratory
  • John Shalf, Lawrence Berkeley National Laboratory
  • Xian-He Sun, Illinois Institute of Technology
  • Rajeev Thakur, Argonne National Laboratory
  • Lee Ward, Sandia National Laboratories
  • Brent Welch, Google