PDSW-DISCS 2018:

3rd Joint International Workshop on Parallel Data Storage & Data Intensive Scalable Computing Systems


Held in conjunction with SC18

Monday, November 12, 2018
Dallas, TX


Program Co-Chairs:
Lawrence Livermore National Laboratory
Google

General Chair:
Google



PDSW-DISCS 2018 Accepted Technical Paper Deadlines


We invite regular papers, whose authors may optionally have their experimental results validated by providing reproducibility information. Papers that are successfully validated earn a badge in the ACM Digital Library in accordance with ACM's artifact evaluation policy.

Text formatting:

Papers must be between 8 and 12 pages long (including appendices and references). Download the IEEE conference paper template.

Please do not include page numbers in your paper!

Deadlines - PAPER SUBMISSION DEADLINES EXTENDED

Submission deadline: Papers (in PDF format) due September 9, 2018, 11:59 PM AoE
Submission website: https://submissions.supercomputing.org/
Notification: September 30, 2018
Camera-ready and copyright forms due: October 5, 2018
Slides due before the workshop: Friday, November 9, 2018 (send to jdigney@cs.cmu.edu)
* Submissions must be in the IEEE conference format

Copyright instructions

Instructions on submitting copyright will be sent to each contact author. It is extremely important that authors submit their copyright information as soon as possible after receiving that email, as the paper will be denied publication without it. Once you have submitted your copyright information, you will receive further instructions on how to include it in your final paper.


Camera-ready instructions


Copyright info: 

All final papers must include the appropriate copyright and bibliographic lines on their first page. The exact text varies depending on whether the authors are associated with a government or whether the paper's copyright is owned by a government. It is your responsibility to read the information that confirms your copyright and to make sure the text is correct; the confirmation from IEEE will specify exactly what to include, and the full text must be included exactly as shown. All of this information must appear in the lower left-hand corner of the first page of your PDF file. Note that the templates already account for this, and the email will give you the exact block of text to copy into each type of template (MS Word or LaTeX).

Previously copyrighted material: 

Make sure your manuscript contains no previously copyrighted material (whether text, images, or tables) unless it is properly cited. NOTE: This guideline also applies to materials from your own previous papers; the only difference between re-using your own work and someone else's is that you don't need to put quotation marks around text that you wrote, but it still needs to be cited every time it appears. Violation of these guidelines will result in your paper being withdrawn from the conference proceedings.


Instructions for PDSW-DISCS Reproducibility Studies

We call for reproducibility studies that reproduce, for the first time, experiments from papers previously published at PDSW-DISCS or at other peer-reviewed conferences with similar topics of interest. Reproducibility study submissions are selected through the same competitive peer-review process as regular papers. In addition, these submissions undergo validation of the reproduced experiment and must include reproducibility information that can be evaluated by a publicly available automation service.

(The following has been adapted from the ISSTA'18 CFP.) A reproducibility study must go beyond simply re-implementing an algorithm and/or re-running the artifacts provided by the original paper. It should, at the very least, apply the approach to new, significantly broadened inputs. A reproducibility study should clearly report the results that the authors were able to reproduce as well as the aspects of the work that were irreproducible. In the latter case, authors are encouraged to communicate or collaborate with the original paper's authors to determine the cause of any observed discrepancies and, if possible, address them (e.g., through minor implementation changes).

In particular, reproducibility studies should follow the ACM guidelines on reproducibility (different team, different experimental setup) with respect to the experiment under study: “The measurement can be obtained with stated precision by a different team, a different measuring system, in a different location on multiple trials. For computational experiments, this means that an independent group can obtain the same result using artifacts which they develop completely independently.” [ACM Artifact Review and Badging Policy]

This means that it is also insufficient to focus on repeatability (i.e., same experiment) alone. Reproducibility Studies will be evaluated according to the following standards:

  • Depth and breadth of experiments

  • Clarity of writing

  • Appropriateness of conclusions

  • Amount of useful, actionable insights

  • Availability of artifacts that pass the automated testing procedure

In particular, we require reproducibility studies to pass our automated testing procedure (see here). Accepted papers will earn the prestigious ACM Results Replicated badge and, if the work under study was successfully reproduced, the associated paper will earn the ACM Results Reproduced badge. (End of adaptation from the ISSTA'18 CFP.)


PDSW-DISCS Artifact Evaluation Criteria


Infrastructure

We have an instance of Jenkins running at http://ci.falsifiable.us, maintained by members of the Systems Research Lab (SRL) at UC Santa Cruz. Detailed instructions on how to create an account on this service and how to use it are available here (including instructions on how to self-host it). This service allows researchers and students to easily automate the execution and validation of experimentation pipelines. Users of this service follow a convention for structuring their experiment repositories, which allows the service to be domain-agnostic. The service reports the status of an experimentation pipeline (fail, success, or validated).
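To make the repository convention concrete, the following is a minimal sketch (not the service's actual interface) of a pipeline driver that runs an experiment's stages in order and reports one of the three statuses above. The stage names (setup.sh, run.sh, validate.sh) and the directory layout are assumptions chosen for illustration only.

import subprocess
import sys
from pathlib import Path

# Assumed convention (illustrative only): each pipeline is a directory
# containing executable stage scripts that are run in this order.
STAGES = ["setup.sh", "run.sh", "validate.sh"]

def run_pipeline(pipeline_dir: str) -> str:
    """Run each stage in order; return 'fail', 'success', or 'validated'."""
    pipeline = Path(pipeline_dir)
    for stage in STAGES:
        script = pipeline / stage
        if not script.exists():
            continue  # a pipeline may omit optional stages
        result = subprocess.run(["bash", str(script)], cwd=pipeline)
        if result.returncode != 0:
            return "fail"
    # Report 'validated' only if a validation stage was present and succeeded.
    return "validated" if (pipeline / "validate.sh").exists() else "success"

if __name__ == "__main__":
    print(run_pipeline(sys.argv[1] if len(sys.argv) > 1 else "."))

A CI service following this kind of convention only needs to invoke the driver on every commit and record the reported status; the experiment-specific logic stays inside the stage scripts.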

NOTE: Using our server is not obligatory. However, we require authors to provide a URL to the continuous-integration service they use (e.g., TravisCI, GitLab CI, CircleCI, Jenkins, etc.) so that reviewers can validate the repeatability of the submission using that service.

Evaluation Criteria

In order to be considered for an ACM badge, the pipelines associated with a submission must be in a healthy (runnable) state and must automate the following:

  • Code and data dependencies. Code must reside in a version control system (e.g., GitHub, GitLab, etc.). If datasets are used, they should reside in a dataset management system (datapackage, Git LFS, Dataverse, etc.). The experimentation pipelines must obtain the code and data from these services on every execution.

  • Setup. The pipeline should build and deploy the code under test. For example, if a pipeline uses containers or VMs to package its code, the pipeline should build the container/VM images prior to executing them. The goal is to verify that all code and third-party dependencies, as well as the instructions for building the software, are available at the time the pipeline runs.

  • Resource allocation. If a pipeline requires a cluster or custom hardware to reproduce results, resource allocation must be done as part of the execution of the pipeline. This allocation can be static or dynamic. For example, if an experiment runs on custom hardware, the pipeline can statically allocate the machines where the code under study runs (i.e., hardcode the IPs/hostnames of, say, GPU/FPGA nodes). Alternatively, a pipeline can dynamically allocate nodes (using infrastructure-automation tools) on CloudLab, Chameleon, Grid'5000, SLURM, Terraform (EC2, GCE, etc.), and so on.

  • Validation. Scripts must verify that the output corroborates the claims made in the article. For example, the pipeline might check that the throughput of a system is within an expected confidence interval (e.g., defined with respect to a baseline obtained at runtime); see the sketch after this list.
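As a hedged illustration of the validation item above, the sketch below checks whether an observed throughput falls within a confidence interval built around a baseline measured during the same run. The file paths, input format, and 95% z-interval are hypothetical choices for illustration and would need to be adapted to the actual experiment and the claims being validated.

import statistics
import sys

# Hypothetical validation step: compare the throughput of the system under
# test against a baseline obtained at runtime. Paths and formats are assumed.

def load_samples(path):
    """Read one throughput sample (e.g., MB/s) per line."""
    with open(path) as f:
        return [float(line) for line in f if line.strip()]

def within_interval(baseline, observed, z=1.96):
    """True if 'observed' lies within mean(baseline) +/- z * standard error."""
    mean = statistics.mean(baseline)
    stderr = statistics.stdev(baseline) / (len(baseline) ** 0.5)
    return (mean - z * stderr) <= observed <= (mean + z * stderr)

if __name__ == "__main__":
    baseline = load_samples("results/baseline_throughput.txt")    # assumed path
    observed = statistics.mean(load_samples("results/observed_throughput.txt"))
    if within_interval(baseline, observed):
        print("VALIDATED: observed throughput within the expected interval")
        sys.exit(0)
    print("FAIL: observed throughput outside the expected interval")
    sys.exit(1)

Returning a nonzero exit code on failure lets the CI service mark the pipeline as failed rather than validated, which is what reviewers will look for.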

A list of example pipelines meeting the above criteria, along with many more, is available at this page.

Questions & Answers

Since the reproducibility submission process is new, we expect quite a few questions. To make answering them more efficient, please use our Gitter room to browse existing answers or to post your question.