Welcome to the Community Workshop on Practical Reproducibility in HPC
Update - Oct. 1, 2024: Early Bird Registration Extended
Due to popular demand, we have extended the early bird registration deadline to October 25, 2024. Register now to take advantage of the discounted rate!
Update - Sept. 25, 2024: Call for Presentations
We are now accepting presentation proposals for the workshop. See the Call for Presentations below for details.
Reproducibility in High-Performance Computing (HPC) and systems research presents unique challenges. The requirements for specialized hardware, scale, and deep reconfigurability often make experiments extremely difficult to reproduce. The diverse nature of HPC research further complicates matters: some experiments are relatively straightforward and low-cost to replicate, while others remain practically infeasible. Despite these challenges, the potential benefits of reproducibility in HPC are immense. Examining experiments from various angles can yield significant insights, fostering collaboration by allowing researchers to explore each other's results not just through reading, but through hands-on experimentation.
Event Info
Date: Monday, November 18, 2024
Location: Terminus 330, Atlanta, GA
Time: 8:30 AM - 6:30 PM ET
Our workshop, colocated in Atlanta with the premier annual conference showcasing the latest innovations in supercomputing technology, aims to advance the concept of practical reproducibility in HPC - a practice where reproducing results becomes a mainstream method of scientific exploration. We will provide a forum for debate on the tools, services, and approaches that best support reproducibility in HPC and systems science, concluding with a comprehensive report that captures the community's collective knowledge and recommendations for advancing practical reproducibility in HPC and systems research.
This workshop is supported by the Chameleon project, a cutting-edge cloud platform designed for computer science research. Chameleon has been instrumental as a platform for reproducibility in major conferences, including most recently serving as the default platform for the SC24 Reproducibility Initiative as well as supporting others like ICPE, ACM CSS, EuroSys, FAST, OSDI/ATC, and more.
We are excited to announce our featured keynotes from distinguished speakers including Torsten Hoefler, Professor of Computer Science at ETH Zurich, and Kate Keahey, Senior Computer Scientist at Argonne National Laboratory and PI of Chameleon. These talks will provide valuable insights into the state-of-the-art in HPC reproducibility and future directions.
Workshop Objectives
- Bring together authors and reviewers participating in reproducibility initiatives associated with HPC and systems conferences, particularly those leveraging the Chameleon platform
- Share experiences and discuss challenges in implementing reproducibility in HPC environments
- Propose and evaluate features for platforms, tools, and services that would facilitate easier reproducibility
- Explore innovative solutions for reproducible HPC experiments and enabling platforms
- Foster community practices that integrate reproducibility into mainstream research and education
- Establish a repository of exemplar reproducibility artifacts in HPC
Who Should Attend
- HPC researchers and practitioners
- Participants in SC's Reproducibility Initiative and similar programs
- Educators interested in reproducibility in HPC education
- Students and early-career researchers in HPC
- Tool and platform developers focused on reproducibility
- Anyone interested in advancing reproducibility in computational science
Join us for this full-day workshop as we work together to bridge the gap between theoretical reproducibility and its practical application in HPC research.
Call for Presentations
As in previous Chameleon User Meetings, the organizers will reimburse travel expenses of up to $1,500 for the presenting authors of the top 10 selected presentation abstracts (one author per abstract). Please see the Call for Presentations below for details.
Important Dates and Actions
Submission due date: October 18, 2024
Acceptance notification date: October 21, 2024
Send submissions to: presentations@chameleoncloud.org
Presentation Proposal Guidelines
Presentation proposals should be in PDF format, no longer than 2 pages, and include the following:
- Project Description: (One paragraph or less)
A brief but clear description of the Computer Science experiment you are packaging or reproducing: What research problem does it address? (What is the hypothesis? What is the challenge/trade-off? Why is it significant?) How does the experiment capture this research problem?
- Reproducibility Approach: What is the description of the experiment you are either packaging or reproducing? (One paragraph or less)
What resources does it need and how many? How do those resources need to be configured? What are the experimental environments in which the experiment should run? How are they configured? What does the experiment body consist of? How long and in what manner is the experiment run? What types of data does it produce and how much? How is the data analyzed? What platforms or tools did you use in your experiment?
- Lessons Learned:
Good and bad: What were the most important obstacles you encountered in either packaging for reproducibility or reproducing the experiment, and how did you overcome them? Were they on the infrastructure level (hardware availability), the configuration level (how the hardware is configured, e.g., “creating an MPI cluster”), or in executing the experimental workflow or managing and analyzing data? What strategies were effective? How did you decide (or what guidelines did you give to reviewers) as to when the experiment is reproduced? What suggestions or wishes do you have on how support for reproducibility should improve? Specifically, what should improve in (1) the ability to set up an experimental environment, (2) the ability to manage the experiment body, and (3) data analysis? Which of those desired capabilities are the most important ones? Any procedural insights on how to organize reproducibility initiatives?
- Impact and Future Directions: (One paragraph or less)
How has prioritizing reproducibility influenced your HPC research or applications? How can the HPC community better integrate reproducibility into mainstream research and education?
Selection Criteria
Presentations will be selected based on their relevance to HPC-specific reproducibility challenges, the potential to foster discussion, and the insights they offer. We especially encourage submissions with detailed lessons learned, discussion of reproducibility challenges, and experience in reproducibility initiatives.
We particularly encourage submissions that:
- Have an insightful, detailed, and well-communicated "lessons learned" section
- Have a good discussion/analysis of specific reproducibility challenges in HPC
- Share experiences from HPC-focused reproducibility initiatives
Travel Support
For the top 10 selected abstracts, we will reimburse travel expenses of up to $1,500 for the presenting authors (one per abstract). Submitting a presentation proposal doubles as a travel support application.
Workshop Outcomes
This workshop aims to produce a report capturing the community's collective knowledge and recommendations for advancing practical reproducibility in HPC. Your presentations and participation will directly contribute to this valuable resource.
If you have any questions, please contact us at contact@chameleoncloud.org or via the Chameleon users list.
Coming Soon!
The Community Workshop on Practical Reproducibility in HPC will occur on Nov. 18, 2024 in Atlanta, GA - the same week and location as this year's premier high-performance computing conference.
Doors will open at 8:30 AM and the event will start at 9 AM. All attendees are invited to join us for a rooftop happy hour at the end of the event (around 5 PM). We will send out announcements and update this page when we have a detailed agenda to share.
Keynotes
The workshop will feature keynotes on the state of reproducibility in HPC from our distinguished speakers. Additionally, panel discussions with authors and reviewers who have participated in reproducibility initiatives at HPC and systems conferences, such as SC, will share their experiences in creating, evaluating, and ranking HPC artifacts to support reproducibility.
Torsten Hoefler
Keynote: Reproducing Performance - The Good, the Bad, and the Ugly
Containers and Jupyter notebooks are useful tools for reproducing computational results of any packaged application. However, if the execution performance or efficiency is the science result, matters are more complex. It may not be sufficient to package codes in containers. In fact, containers may disturb the performance results and reproducibility. We outline a set of techniques to facilitate performance reproducibility in various settings. Some performance results may be linked to specific computer architectures or even specific system configurations that may not be accessible to other researchers or even the original team after a software update. We outline techniques to help researchers interpret results on the original system even if it is practically impossible to reproduce the original results. We discuss such techniques both in the context of pure performance and in the context of the emerging field of data science and artificial intelligence, which often allows for a performance-accuracy tradeoff. All in all, our work provides a set of guidelines to follow to support reproducible science of performance and benchmarking.
Kate Keahey
Keynote: Adaptable Infrastructures for Reproducible Science - The Chameleon 4 Approach
The landscape of computer science research is evolving at an unprecedented pace, with innovations in AI, data science, edge computing, and beyond. These advancements demand a flexible, powerful infrastructure capable of supporting a wide array of experiments while facilitating reproducibility and collaboration. Chameleon 4, the latest iteration of the NSF-funded testbed, rises to meet these challenges. In this keynote, Dr. Keahey will unveil how Chameleon 4 extends its deeply reconfigurable edge-to-cloud architecture to support emerging research needs. She will describe the testbed's enhanced virtualization capabilities, expanded edge computing functionalities, and advanced mechanisms for sharing digital artifacts. Dr. Keahey will illustrate how these features, combined with Chameleon's existing bare-metal reconfigurability and diverse hardware offerings, create an unparalleled platform for reproducible science. The talk will explore Chameleon 4's approach to federation, enabling seamless integration with other research infrastructures, and discuss how the platform's adaptability ensures it can evolve alongside the ever-changing frontiers of computer science research. Through real-world examples and future roadmaps, attendees will gain insight into how Chameleon 4 is poised to accelerate innovation and foster a more open, collaborative scientific community in the realm of HPC and beyond.
See You in Atlanta!
Address
330 Marietta St NW
Atlanta, GA 30313
The workshop will be held at Terminus 330, a state-of-the-art venue in the heart of Atlanta, GA, conveniently located right around the corner from the Georgia World Congress Center where the biggest annual conference in supercomputing is being held November 17-22, 2024.
Lodging
Our workshop is conveniently located in the heart of downtown Atlanta. There are many high-quality hotels and lodgings in the area. If you are also in town for the big HPC conference, we recommend checking out their resources for more options.
Travel
Getting to Atlanta
Atlanta is easily accessible by air and ground transportation:
- By Air: Hartsfield-Jackson Atlanta International Airport (ATL) is the primary airport serving Atlanta.
- By Car: Atlanta is intersected by several major interstate highways, making it easily accessible by car from many parts of the United States.
- By Train: Amtrak's Crescent line serves Atlanta, connecting it to major cities in the Northeast and New Orleans.
Getting to the Venue
Terminus 330 is located in downtown Atlanta. Here are some options for getting to the venue:
- From the Airport: Take the MARTA train from the airport to downtown Atlanta. The venue is a short walk or ride-share trip from several MARTA stations.
- By Public Transit: Atlanta's MARTA system provides bus and rail service throughout the city. Check the MARTA website for routes and schedules.
- By Car: If driving, there are several parking options near the venue. We recommend checking online parking apps for the best rates.
- Ride-sharing: Services like Uber and Lyft are widely available in Atlanta and can be a convenient option for getting around the city.
For more information on getting around Atlanta, visit the official Atlanta tourism website.
Our Projects
About the REPETO Project
The REPETO (pronounced to rhyme with Geppetto) project is an NSF-funded research coordination network (RCN) that promotes the concept of practical reproducibility. This practice aims to package experiments in a way that allows cost-effective repetition, making them as accessible for exploring research as reading papers is today.
Key aspects of the REPETO project include:
- Focus on experiments in the computer science community
- Understanding the cost/benefit equation of reproducibility for these experiments
- Identifying factors that make reproducibility difficult or infeasible
- Fostering community practices to integrate reproducibility into mainstream research and education activities in computer science
About the Chameleon Testbed
Chameleon is an NSF-funded, large-scale, deeply programmable experimental platform for Computer Science systems research. It provides a configurable environment that can support a wide range of experimental needs, from bare metal reconfiguration to support for reproducible software environments. Chameleon allows researchers to explore transformative concepts in cloud computing, distributed computing, networking, and machine learning, enabling them to experiment with novel cloud architectures and pursue new applications of cloud computing. With its unique features and commitment to openness, Chameleon plays a crucial role in advancing computer science research and education.
Registration Details
Early Bird Registration (NOW UNTIL Oct. 25, 2024): $20
Regular Registration (Oct. 26 - Nov. 18, 2024): $100
All registration fees will cover event-related costs. Limited sponsorships are available to cover costs for presenters. Eligible registrations will receive a code.
Note: Seats are limited. Register early to secure your spot and take advantage of the early bird discount!
Contact Information
For any questions or feedback, send us an email at contact@chameleoncloud.org or reach out to one of our event coordinators:
Name: Marc Richardson
Email: mtrichardson@uchicago.edu
Name: Roberto Vale
Email: rvale@uchicago.edu