pdsw-DISCS 2022:

7th International Parallel Data Systems Workshop


Held in conjunction with SC22
Kay Bailey Hutchison Convention Center, Dallas, TX

November 14, 2022
1:30 PM - 5:00 PM (CST)


Program Co-Chairs:

Shenzhen University, China


Oak Ridge National Laboratory, USA
General Chair:

RIKEN, Japan



Arif Merchant, Google


Splinters – Distributed IO Sampling for Cloud Data Centers – Design and Applications

slides - coming soon

abstract: Splinters is a distributed system for sampling IO metadata in Google data centers. It has been deployed in production for several years, and is the main engine for the analysis of storage systems and workloads in Google. Given the scale of the storage infrastructure, reliably collecting and processing the IO samples is a complex problem, and we explain how we design around the various challenges. We show how the collected IO samples are used for ad-hoc queries and longitudinal analysis. We also outline several applications where we used the IO samples for the design and implementation of new systems.

bio: Arif Merchant is a Research Scientist with the Storage Analytics group at Google, where he studies interactions between components of the storage stack. Prior to this, he was with HP Labs, where he worked on storage QoS, distributed storage systems, and stochastic models of storage. He holds a B.Tech. degree from IIT Bombay and a Ph.D. in Computer Science from Stanford University. He is an ACM Distinguished Scientist.