What do we need to make reproducibility for HPC experiments practical?
Come debate with authors and reviewers of artifact evaluation initiatives at HPC conferences, community practitioners, and infrastructure providers!
Special keynotes from Dr. Torsten Hoefler and Dr. Kate Keahey.
Questions? Contact event coordinator Marc Richardson (mtrichardson@uchicago.edu)
Note: This workshop is not an official workshop of the SC24 conference. Separate registration is required.
About the Workshop
Reproducibility in High-Performance Computing (HPC) and systems research presents unique challenges. The requirements for specialized hardware, scale, and deep reconfigurability often make experiments extremely difficult to reproduce. The diverse nature of HPC research further complicates matters, with some experiments being relatively straightforward and low-cost to replicate, while others remain practically infeasible. Despite these challenges, the potential benefits of reproducibility in HPC are immense. Examining experiments from various angles can yield significant insights, fostering collaboration by allowing researchers to explore each other's results not just through reading, but through hands-on experimentation.
Our workshop, colocated in Atlanta with the premier annual conference showcasing the latest innovations in supercomputing technology, aims to advance the concept of practical reproducibility in HPC - a practice where reproducing results becomes a mainstream method of scientific exploration. We will provide a forum for debate on the tools, services, and approaches that best support reproducibility in HPC and systems science, concluding with a comprehensive report that captures the community's collective knowledge and recommendations for advancing practical reproducibility in HPC and systems research.
This workshop is supported by the Chameleon project, a cutting-edge cloud platform designed for computer science research. Chameleon has been instrumental as a platform for reproducibility in major conferences, including most recently serving as the default platform for the SC24 Reproducibility Initiative as well as supporting others like ICPE, ACM CSS, EuroSys, FAST, OSDI/ATC, and more.
We are excited to announce our featured keynotes from distinguished speakers including Torsten Hoefler, Professor of Computer Science at ETH Zurich, and Kate Keahey, Senior Computer Scientist at Argonne National Laboratory and PI of Chameleon. These talks will provide valuable insights into the state-of-the-art in HPC reproducibility and future directions.
Workshop Objectives
- Bring together authors and reviewers participating in reproducibility initiatives associated with HPC and systems conferences, particularly those leveraging the Chameleon platform
- Share experiences and discuss challenges in implementing reproducibility in HPC environments
- Propose and evaluate features for platforms, tools, and services that would facilitate easier reproducibility
- Explore innovative solutions for reproducible HPC experiments and enabling platforms
- Foster community practices that integrate reproducibility into mainstream research and education
- Establish a repository of exemplar reproducibility artifacts in HPC
Who Should Attend
- HPC researchers and practitioners
- Participants in SC's Reproducibility Initiative and similar programs
- Educators interested in reproducibility in HPC education
- Students and early-career researchers in HPC
- Tool and platform developers focused on reproducibility
- Anyone interested in advancing reproducibility in computational science
Join us for this full-day workshop as we work together to bridge the gap between theoretical reproducibility and its practical application in HPC research.
Prior Chameleon User Meetings
Call for Presentations
As in previous Chameleon User Meetings, the organizers will reimburse travel expenses of up to $1,500 for the presenting authors of the top 10 selected presentation abstracts (one author per abstract). Please see the Call for Presentations below for details.
Important Dates and Actions
Submission due date: NOW: October 21, 2024 at 11:59 PM Anywhere on Earth (AoE)
Acceptance notification date: NOW: October 26, 2024
Send submissions to: presentations@chameleoncloud.org
Presentation Proposal Guidelines
Presentation proposals should be in PDF format, no longer than 2 pages, and include the following:
- Project Description: (One paragraph or less)
A brief but clear description of the Computer Science experiment you are packaging or reproducing: What research problem does it address? (What is the hypothesis? What is the challenge/trade-off? Why is it significant?) How does the experiment capture this research problem?
- Reproducibility Approach: (One paragraph or less)
What is the description of the experiment you are either packaging or reproducing? What resources does it need and how many? How do those resources need to be configured? What are the experimental environments in which the experiment should run? How are they configured? What does the experiment body consist of? How long and in what manner is the experiment run? What types of data does it produce and how much? How is the data analyzed? What platforms or tools did you use in your experiment?
- Lessons Learned:
Good and bad: What were the most important obstacles to either packaging for reproducibility or reproducing the experiment you encountered – and how did you overcome them? Were they on the infrastructure level (hardware availability), configuration level (how the hardware is configured, i.e., “creating an MPI cluster”), executing the experimental workflow, managing or analyzing data? What strategies were effective? How did you decide (or what guidelines did you give to reviewers) as to when the experiment is reproduced? What suggestions or wishes do you have on how support for reproducibility should improve? Specifically, what should improve in (1) the ability to set up an experimental environment, (2) the ability to manage the experiment body, and (3) data analysis? Which of those desired capabilities are the most important ones? Any procedural insights on how to organize reproducibility initiatives?
- Impact and Future Directions: (One paragraph or less)
How has prioritizing reproducibility influenced your HPC research or applications? How can the HPC community better integrate reproducibility into mainstream research and education?
Selection Criteria
Presentations will be selected based on their relevance to HPC-specific reproducibility challenges, their potential to foster discussion, and the insights they offer.
We particularly encourage submissions that:
- Have an insightful, detailed, and well-communicated "lessons learned" section
- Have a good discussion/analysis of specific reproducibility challenges in HPC
- Share experiences from HPC-focused reproducibility initiatives
Travel Support
For the top 10 selected abstracts, we will reimburse travel expenses of up to $1,500 for the presenting authors (one per abstract). Submitting the presentation proposal doubles as a travel support application.
Workshop Outcomes
This workshop aims to produce a report capturing the community's collective knowledge and recommendations for advancing practical reproducibility in HPC. Your presentations and participation will directly contribute to this valuable resource.
If you have any questions, please contact us at contact@chameleoncloud.org or via the Chameleon users list.
Workshop Agenda
Schedule Overview
Times are in local Atlanta time.
Introduction/Morning Keynote (9:00 AM - 10:00 AM)
Welcome Reception and Breakfast
Introduction/Welcoming Remarks
Kate Keahey
Keynote Address: Adaptable Infrastructures for Reproducible Science - The Chameleon 4 Approach
Kate Keahey, Senior Computer Scientist, Argonne National Laboratory
The landscape of computer science research is evolving at an unprecedented pace, with innovations in AI, data science, edge computing, and beyond. These advancements demand a flexible, powerful infrastructure capable of supporting a wide array of experiments while facilitating reproducibility and collaboration. Dr. Keahey will unveil how Chameleon 4 extends its deeply reconfigurable edge-to-cloud architecture to support emerging research needs, describing enhanced virtualization capabilities, expanded edge computing functionalities, and advanced mechanisms for sharing digital artifacts.
Morning Break (10:00 AM - 10:30 AM)
Morning Presentations - Session 1 (10:30 AM - 12:00 PM)
Each presenter will have approximately 15 minutes for their presentation, followed by 2-3 minutes for questions.
Artifact Evaluations as Authors and Reviewers: Lessons, Questions, and Frustrations
Quentin Guilloteau, University of Basel
Insights and recommendations from extensive experience with artifact evaluation processes across major conferences.
Packaging a Testbed for Reproducibility Workflows
Sam Grayson, University of Illinois Urbana-Champaign
An exploration of reproducible package management approaches for workflow systems, with insights on shared system considerations and incremental computation strategies.
Reproducing C++ Multicore and GPU Benchmark Results on Chameleon Cloud
Ruben Laso, University of Vienna
A detailed examination of STL implementation reproducibility across various compilers and architectures, featuring practical demonstrations using Chameleon Cloud.
Assessing Visualization Reproducibility in HPC
Triveni Gurram and David Koop, Northern Illinois University
Novel approaches for evaluating and ensuring reproducibility in scientific visualizations, with emphasis on meaningful difference detection and validation methods.
Performance and Power Optimization Strategies Concerning HPC Nodes
Akhilesh Raj, Vanderbilt University
Strategies for real-time monitoring and optimization of power usage in HPC systems using machine learning approaches.
Morning Panel Discussion (12:00 PM - 12:30 PM)
Featuring morning session 1 presenters
Lunch Break (12:30 PM - 1:30 PM)
Complimentary lunch served in the main event room.
Afternoon Keynote (1:30 PM - 2:30 PM)
Reproducing Performance - The Good, the Bad, and the Ugly
Torsten Hoefler, Professor, ETH Zürich
While containers and Jupyter notebooks are useful tools for reproducing computational results, performance reproducibility presents unique challenges. Dr. Hoefler will outline techniques to facilitate performance reproducibility across various settings, addressing challenges with system-specific results and configurations. The talk will provide guidelines for reproducible science in performance benchmarking, including considerations for performance-accuracy tradeoffs in data science and artificial intelligence contexts.
Afternoon Break (2:30 PM - 3:00 PM)
Afternoon Presentations - Session 2 (3:00 PM - 4:30 PM)
Each presenter will have approximately 15 minutes for their presentation, followed by 2-3 minutes for questions.
FAIR Assessment of Cloud-based Experiments
Tanu Malik, DePaul University
An analysis of 100+ Chameleon cloud experiments across different categories (tutorials, research, bug reproduction, and coursework) found that while most were reproducible on Chameleon's infrastructure, additional effort was needed to reproduce them on other public cloud platforms.
AutoAppendix: Towards One-Click Reproducibility of Computational Artifacts Using Chameleon Cloud
Klaus Kraßnitzer, IST Austria
A survey of reproducibility in SC24 submissions, presenting guidelines and best practices for artifact evaluation using Chameleon Cloud.
Understanding Scalability Bugs in Large-Scale Software Systems
Bogdan Stoica, University of Chicago
Analysis of reproducibility challenges in scalability bugs, examining root causes and proposing improved testing methodologies.
Hierarchical Federated Learning Based Smart Home System Using Chameleon Testbed
Kevin Kostage, Florida Gulf Coast University
Implementation of privacy-preserving learning systems using Chameleon testbed's infrastructure.
Energy Consumption as a Metric for HPC Workload Reproducibility
Adithya Raman, University at Buffalo
Development of hardware-agnostic energy profiling approaches for HPC workloads.
Afternoon Panel Discussion (4:30 PM - 5:00 PM)
Featuring afternoon session 2 presenters
Concluding Remarks & Happy Hour 🍺 (5:00 PM - 6:30 PM)
Concluding Summary & Remarks
Rooftop Happy Hour
Keynotes
The workshop will feature keynotes on the state of reproducibility in HPC from our distinguished speakers. Additionally, in panel discussions, authors and reviewers who have participated in reproducibility initiatives at HPC and systems conferences, such as SC, will share their experiences in creating, evaluating, and ranking HPC artifacts to support reproducibility.
Torsten Hoefler
Keynote: Reproducing Performance - The Good, the Bad, and the Ugly
Containers and Jupyter notebooks are useful tools for reproducing computational results of any packaged application. However, if the execution performance or efficiency is the scientific result, matters are more complex. It may not be sufficient to package codes in containers. In fact, containers may disturb the performance results and reproducibility. We outline a set of techniques to facilitate performance reproducibility in various settings. Some performance results may be linked to specific computer architectures or even specific system configurations that may not be accessible to other researchers or even the original team after a software update. We outline techniques to help researchers interpret results on the original system even if it is practically impossible to reproduce the original results. We discuss such techniques both in the context of pure performance and in the context of the emerging field of data science and artificial intelligence that often allows for a performance-accuracy tradeoff. All in all, our work provides a set of guidelines to follow to support reproducible science of performance and benchmarking.
Kate Keahey
Keynote: Adaptable Infrastructures for Reproducible Science - The Chameleon 4 Approach
The landscape of computer science research is evolving at an unprecedented pace, with innovations in AI, data science, edge computing, and beyond. These advancements demand a flexible, powerful infrastructure capable of supporting a wide array of experiments while facilitating reproducibility and collaboration. Chameleon 4, the latest iteration of the NSF-funded testbed, rises to meet these challenges. In this keynote, Dr. Keahey will unveil how Chameleon 4 extends its deeply reconfigurable edge-to-cloud architecture to support emerging research needs. She will describe the testbed's enhanced virtualization capabilities, expanded edge computing functionalities, and advanced mechanisms for sharing digital artifacts. Dr. Keahey will illustrate how these features, combined with Chameleon's existing bare-metal reconfigurability and diverse hardware offerings, create an unparalleled platform for reproducible science. The talk will explore Chameleon 4's approach to federation, enabling seamless integration with other research infrastructures, and discuss how the platform's adaptability ensures it can evolve alongside the ever-changing frontiers of computer science research. Through real-world examples and future roadmaps, attendees will gain insight into how Chameleon 4 is poised to accelerate innovation and foster a more open, collaborative scientific community in the realm of HPC and beyond.
See You in Atlanta!
Address
330 Marietta St NW
Atlanta, GA 30313
The workshop will be held at Terminus 330, a state-of-the-art venue in the heart of Atlanta, GA, conveniently located right around the corner from the Georgia World Congress Center where the biggest annual conference in supercomputing is being held November 17-22, 2024.
Lodging
Our workshop is conveniently located in the heart of downtown Atlanta. There are many high-quality hotels and lodgings in the area. If you are also in town for the big HPC conference, we recommend checking out their resources for more options.
Travel
Getting to Atlanta
Atlanta is easily accessible by air and ground transportation:
- By Air: Hartsfield-Jackson Atlanta International Airport (ATL) is the primary airport serving Atlanta.
- By Car: Atlanta is intersected by several major interstate highways, making it easily accessible by car from many parts of the United States.
- By Train: Amtrak's Crescent line serves Atlanta, connecting it to major cities in the Northeast and New Orleans.
Getting to the Venue
Terminus 330 is located in downtown Atlanta. Here are some options for getting to the venue:
- From the Airport: Take the MARTA train from the airport to downtown Atlanta. The venue is a short walk or ride-share trip from several MARTA stations.
- By Public Transit: Atlanta's MARTA system provides bus and rail service throughout the city. Check the MARTA website for routes and schedules.
- By Car: If driving, there are several parking options near the venue. We recommend checking online parking apps for the best rates.
- Ride-sharing: Services like Uber and Lyft are widely available in Atlanta and can be a convenient option for getting around the city.
For more information on getting around Atlanta, visit the official Atlanta tourism website.
Registration Details
Early Bird Registration (NOW UNTIL Oct. 25, 2024): $20
Regular Registration (Oct. 26 - Nov. 18, 2024): $100
All registration fees will cover event-related costs. Limited sponsorships are available to cover costs for presenters. Eligible registrations will receive a code.
Note: Seats are limited. Register early to secure your spot and take advantage of the early bird discount!
Contact Us
For any questions or feedback, send us an email at contact@chameleoncloud.org or reach out to one of our event coordinators:
Name: Marc Richardson
Email: mtrichardson@uchicago.edu
Name: Roberto Vale
Email: rvale@uchicago.edu
Planning Committee
Meet the dedicated team assisting the Community Workshop on Practical Reproducibility in HPC.
Brian Kocoloski
USC Information Sciences Institute
Fraida Fund
New York University
Sascha Hunold
Vienna University of Technology
Tanu Malik
DePaul University
Rafa Tolosana Calasanz
University of Zaragoza
Our Projects
About the REPETO Project
The REPETO (pronounced to rhyme with Geppetto) project is an NSF-funded research coordination network (RCN) that promotes the concept of practical reproducibility. This practice aims to package experiments in a way that allows cost-effective repetition, making them as accessible for exploring research as reading papers is today.
Key aspects of the REPETO project include:
- Focus on experiments in the computer science community
- Understanding the cost/benefit equation of reproducibility for these experiments
- Identifying factors that make reproducibility difficult or infeasible
- Fostering community practices to integrate reproducibility into mainstream research and education activities in computer science
About the Chameleon Testbed
Chameleon is an NSF-funded, large-scale, deeply programmable experimental platform for Computer Science systems research. It provides a configurable environment that can support a wide range of experimental needs, from bare metal reconfiguration to support for reproducible software environments. Chameleon allows researchers to explore transformative concepts in cloud computing, distributed computing, networking, and machine learning, enabling them to experiment with novel cloud architectures and pursue new applications of cloud computing. With its unique features and commitment to openness, Chameleon plays a crucial role in advancing computer science research and education.