PDSW 2025:

10th International
Parallel Data Systems Workshop


Held in conjunction with SC25, Nov 16–21, 2025
St. Louis, MO • Monday, Nov 17, 2025
Room TBA
9:00 AM - 5:30 PM (EST)


Program Co-Chairs:


Illinois Institute of Technology, USA


Johannes Gutenberg University Mainz (JGU), Germany
General Chair:


The Ohio State University, USA


About the PDSW Workshop


Efficient storage, movement, and management of data are crucial to application performance and scientific productivity in both traditional simulation-oriented HPC environments and cloud, AI/ML, and big data analysis environments. This challenge is further exacerbated by the growing volume of experimental and observational data, the widening gap between the performance of computational hardware and storage hardware, and the emergence of new data-driven algorithms in machine learning. The goal of this workshop is to facilitate in-depth discussions of research and development that address the most critical challenges in large-scale data storage and data processing. PDSW will continue to build on the successful tradition established by its predecessor workshops: the Petascale Data Storage Workshop (PDSW, 2006-2015) and the Data Intensive Scalable Computing Systems workshop (DISCS, 2012-2015). These workshops were successfully combined in 2016, and the resulting joint workshop has attracted up to 45 full paper submissions and 195 attendees per year from 2016 to 2024.

The scope of PDSW includes the following and related topics:

  • Scalable architectures for distributed data storage, archival, and virtualization
  • The application of new data processing models and algorithms towards parallel computing and analysis
  • Performance benchmarking, resource management, and workload studies
  • Enabling cloud and container-based models for large-scale data analysis
  • Storage technologies for the emergence of new hardware and computing models
  • Techniques for data integrity, availability, reliability, and fault tolerance
  • Programming models and big data frameworks for data intensive computing
  • Hybrid cloud/on-premise data processing
  • Cloud-specific data storage and transit costs and opportunities
  • Programmability of storage systems
  • Data filtering, compression, and reduction techniques
  • Parallel file systems, metadata management, and complex data management
  • Integrating computation into the memory and storage hierarchy to facilitate in-situ and in-transit data processing
  • Alternative data storage models, including object stores and key-value stores
  • Productivity tools for data intensive computing, data mining, and knowledge discovery
  • Tools and techniques for managing data movement among compute and data intensive components
  • Cross-cloud data management
  • Storage system optimization and data analytics with machine learning
  • Data quality assessment and improvement
  • Innovative techniques and performance evaluation for new memory and storage systems