PDSW 2022:

7th International Parallel Data Systems Workshop


HELD IN CONJUNCTION WITH SC22: THE INTERNATIONAL CONFERENCE FOR HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS

In cooperation with: IEEE Computer Society


DATE: November 14, 2022
Kay Bailey Hutchison Convention Center, Dallas, TX

Time: 1:30 PM - 5:00 PM (CST)
Room C148


 

Program Co-Chairs:

Shenzhen University, China


Oak Ridge National Laboratory, USA

Reproducibility Co-Chairs:


University of California, Santa Cruz


Leiden University, Netherlands
General Chair:

RIKEN, Japan

Publicity Chair:

Lawrence Berkeley National Laboratory, USA

Web & Publications Chair:

Carnegie Mellon University

Submission deadline extended - August 20, 2022, 11:59 PM AoE
PDSW22 Reproducibility Addendum


WORKSHOP ABSTRACT


We are pleased to announce the 7th International Parallel Data Systems Workshop (PDSW’22). PDSW'22 will be hosted in conjunction with SC22: The International Conference for High Performance Computing, Networking, Storage and Analysis.

Efficient data storage and data management are crucial to scientific productivity in both traditional simulation-oriented HPC environments and Big Data analysis environments. These challenges are further exacerbated by the growing volume of experimental and observational data, the widening gap between the performance of computational hardware and storage hardware, and the emergence of new data-driven algorithms in machine learning. The goal of this workshop is to facilitate research that addresses the most critical challenges in scientific data storage and data processing. PDSW will continue to build on the successful tradition established by its predecessor workshops: the Petascale Data Storage Workshop (PDSW, 2006-2015) and the Data Intensive Scalable Computing Systems (DISCS, 2012-2015) workshop. These workshops were merged in 2016, and the resulting joint workshop has attracted up to 38 full paper submissions and 140 attendees per year from 2016 to 2021.

We encourage the community to submit original manuscripts that:

  • introduce and evaluate novel algorithms or architectures,
  • inform the community of important scientific case studies or workloads, or
  • validate the reproducibility of previously published work.

Special attention will be given to issues in which community collaboration is crucial for problem identification, workload capture, solution interoperability, standardization, and shared tools. We also strongly encourage papers to share complete experimental environment information (software version numbers, benchmark configurations, etc.) to facilitate collaboration.

Topics of interest include the following:

  • Scalable architectures for distributed data storage, archival, and virtualization
  • The application of new data processing models and algorithms towards scientific computing and analysis
  • Performance benchmarking, resource management, and workload studies
  • Enabling cloud and container-based models for scientific data analysis
  • Techniques for data integrity, availability, reliability, and fault tolerance
  • Programming models and big data frameworks for data intensive computing
  • Hybrid cloud/on-premise data processing
  • Cloud-specific data storage and transit costs and opportunities
  • Programmability of storage systems
  • Data filtering/compressing/reduction techniques
  • Parallel file systems, metadata management, and complex data management
  • Integrating computation into the memory and storage hierarchy to facilitate in-situ and in-transit data processing
  • Alternative data storage models, including object stores and key-value stores
  • Productivity tools for data intensive computing, data mining, and knowledge discovery
  • Tools and techniques for managing data movement among compute and data intensive components
  • Cross-cloud data management
  • Storage system optimization and data analytics with machine learning
  • Innovative techniques and performance evaluation for new memory and storage systems

CALL FOR PAPERS

 

Call for papers available now (pdf). [updated August 12, 2022]


REGULAR PAPER SUBMISSIONS

All papers will be evaluated by a competitive peer review process under the supervision of the workshop program committee. Selected papers and associated talk slides will be made available on the workshop web site. The papers will also be published by the IEEE Computer Society.

Authors of regular papers are strongly encouraged to submit Artifact Description (AD) Appendices that can help to reproduce and validate their experimental results. While the inclusion of the AD Appendices is optional for PDSW’22, submissions that are accompanied by AD Appendices will be given favorable consideration for the PDSW Best Paper award.

PDSW’22 follows the SC22 Reproducibility Initiative (see Addendum). For Artifact Description (AD) Appendices, PDSW'22 submissions will use the SC22 format. The AD should include a field for one or more links to data repositories (Zenodo, figshare, etc.) and code repositories (GitHub, GitLab, Bitbucket, etc.). For artifacts placed in the code repository, we encourage authors to follow the PDSW22 guidelines on how to structure the artifact, as this will make the work easier for the reviewing committee and for future readers of the paper.

Submit a previously unpublished paper as a PDF file, indicating authors and affiliations. Papers may be up to 5 pages, in no less than 10-point font, not counting references and optional reproducibility appendices. Papers must use the IEEE conference paper template.

Deadlines - Regular Papers and Reproducibility Study Papers

Submissions due: Aug. 20, 2022, 11:59 PM AoE
Submissions website: https://submissions.supercomputing.org/
Notification: Sep. 9, 2022
Copyright forms due: TBD
Slides due before workshop: TBD
Camera ready files due: Sep. 30, 2022, 11:59 PM AoE


Work In Progress (WIP) Session


There will be a WIP session where presenters give brief 5-minute talks on their ongoing work, presenting fresh problems and solutions. WIP content is typically material that is not yet mature or complete enough for a full paper submission, and it will not be included in the proceedings. A one-page abstract is required.

Deadlines - Work in Progress (WIP)

Work in Progress (WIP) submissions due: Sep. 16, 2022, 11:59 PM AoE
Notification: On or before Sep. 23, 2022
Submissions website: https://submissions.supercomputing.org/


Workshop Registration

Registration opens July 13, 2022. Details on registration pricing, as well as policies affecting registration changes and cancellations, will be available here on July 13.


PROGRAM COMMITTEE:

 

  • Jalil Boukhobza, University of Western Brittany, France
  • Suren Byna, Lawrence Berkeley National Laboratory
  • Yong Chen, Texas Tech University
  • Wei Der Chen, University of Edinburgh
  • Dong Dai, University of North Carolina at Charlotte
  • Matthieu Dorier, Argonne National Laboratory (ANL)
  • Bogdan Ghit, Databricks
  • Qian Gong, Oak Ridge National Laboratory
  • Luanzheng Guo, Pacific Northwest National Laboratory
  • Shadi Ibrahim, Inria
  • Tanzima Islam, Texas State University
  • Youngjae Kim, Sogang University
  • Johann Lombardi, Intel Corporation
  • Xiaoyi Lu, University of California, Merced
  • Xiaosong Ma, Qatar Computing Research Institute
  • Kathryn Mohror, Lawrence Livermore National Laboratory
  • Diana Moise, Hewlett Packard Enterprise
  • Sarah Neuwirth, Habilitation Candidate at Goethe University
  • M. Mustafa Rafique, Rochester Institute of Technology
  • Raghunath Raja Chandrasekar, Stealth Startup
  • Michael Schöttner, Duesseldorf University
  • Vasily Tarasov, IBM Corporation
  • Qing Zheng, Los Alamos National Lab

STEERING COMMITTEE:

  • John Bent, Cray
  • Ali R. Butt, Virginia Tech
  • Philip Carns, Argonne National Laboratory
  • Shane Canon, Lawrence Berkeley National Laboratory
  • Raghunath Raja Chandrasekar, Amazon Web Services
  • Yong Chen, Texas Tech University
  • Evan J. Felix, Pacific Northwest National Laboratory
  • Gary Grider, Los Alamos National Laboratory
  • William D. Gropp, University of Illinois at Urbana-Champaign
  • Dean Hildebrand, Google
  • Shadi Ibrahim, Inria, France
  • Dries Kimpe, KCG, USA
  • Glenn Lockwood, Lawrence Berkeley National Laboratory
  • Jay Lofstead, Sandia National Laboratories
  • Xiaosong Ma, Qatar Computing Research Institute, Qatar
  • Carlos Maltzahn, University of California, Santa Cruz
  • Suzanne McIntosh, New York University
  • Kathryn Mohror, Lawrence Livermore National Laboratory
  • Robert Ross, Argonne National Laboratory
  • Philip C. Roth, Oak Ridge National Laboratory
  • John Shalf, NERSC, Lawrence Berkeley National Laboratory
  • Xian-He Sun, Illinois Institute of Technology
  • Rajeev Thakur, Argonne National Laboratory
  • Lee Ward, Sandia National Laboratories
  • Brent Welch, Google