AWS Parallel Computing Service Released
September 04, 2024

Amazon Web Services, Inc. (AWS), an Amazon.com, Inc. company, announced the general availability of AWS Parallel Computing Service, a new managed service that helps customers easily set up and manage high performance computing (HPC) clusters so they can run scientific and engineering workloads at virtually any scale on AWS.

The service makes it easy for system administrators to build clusters using Amazon Elastic Compute Cloud (Amazon EC2) instances, low-latency networking, and storage optimized for HPC workloads. With AWS Parallel Computing Service, scientists and engineers can quickly scale simulations to validate models and designs, while system administrators and integrators can build and maintain HPC clusters on AWS using Slurm, the most popular open-source HPC workload manager. The service accelerates innovation in areas such as drug discovery, genomic analysis, engineering design, weather forecasting, and scientific and engineering modeling.

AWS has a history of innovation in supporting HPC workloads. That history includes releases like the open-source cluster orchestration toolkit AWS ParallelCluster, the fully managed batch computing service AWS Batch, the low-latency network interconnect Elastic Fabric Adapter, Amazon FSx for Lustre high performance storage, and dedicated AMD, Intel, and Graviton-based HPC compute instances, the latter delivering up to 65% better price-performance over comparable compute-optimized x86-based instances. Thousands of customers from a wide range of industries have migrated their HPC workloads to AWS to fast-track drug discovery, uncover genomic insights, maximize energy resources, and spin up supercomputers with millions of cores. Today, AWS continues that innovation in HPC by releasing a fully managed and comprehensive HPC service, which removes the undifferentiated heavy lifting of creating and managing HPC clusters.

AWS Parallel Computing Service is a new managed service that helps customers easily set up and manage HPC clusters so they can run scientific and engineering workloads at virtually any scale on AWS. System administrators can use familiar tools, including the AWS Management Console, CLI, and SDK, to deploy a managed Slurm environment. The service builds on open-source foundations that customers already know and have experience with, and delivers a managed Slurm experience with the reliability and availability of AWS. It significantly reduces the operational burden of managing a cluster and regularly delivers new capabilities and fixes through managed service updates with minimal to no downtime, eliminating the need to apply manual patches or rebuild clusters to receive feature updates. Highly available APIs also help developers and ISVs create end-to-end HPC solutions on top of AWS, so they can focus on providing value-added features to their users and customers instead of managing infrastructure. AWS Parallel Computing Service enables customers of all sizes (e.g., startups, enterprises, or national labs) to easily create and manage HPC clusters with the scalability, reliability, and security of AWS. Scientists and engineers using Slurm can migrate their existing on-premises workflows to AWS without re-architecting them, gaining access to cloud infrastructure that scales automatically. And administrators who want to remove capacity or capability constraints for their end users can spin up clusters in minutes instead of months to run the simulations that address the world's most challenging problems.
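The same managed Slurm environment can also be created programmatically via the CLI mentioned above. As a hedged sketch (the command shapes below assume the AWS CLI's `pcs` namespace; the cluster name, scheduler version, size, and network IDs are all illustrative placeholders, so consult the AWS CLI reference for the authoritative syntax), creating a small cluster might look like:

```shell
# Create a managed Slurm cluster via the AWS CLI (illustrative values only).
# The subnet and security group IDs are placeholders for your own VPC resources.
aws pcs create-cluster \
  --cluster-name demo-cluster \
  --scheduler type=SLURM,version=23.11 \
  --size SMALL \
  --networking subnetIds=subnet-0123456789abcdef0,securityGroupIds=sg-0123456789abcdef0

# Check cluster status; compute node groups and queues are attached
# once the cluster reaches an active state.
aws pcs get-cluster --cluster-identifier demo-cluster
```

Because cluster creation requires live AWS credentials and provisions billable infrastructure, this is a command sketch rather than a runnable sample.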

“Developing a cure for a catastrophic disease, designing novel materials, advancing renewable energy, and revolutionizing transportation are problems that we just can’t afford to have waiting in a queue,” said Ian Colle, director, advanced compute and simulation at AWS. “Managing HPC workloads, particularly the most complex and challenging extreme-scale workloads, is extraordinarily difficult. Our aim is that every scientist and engineer using AWS Parallel Computing Service, regardless of organization size, is the most productive person in their field because they have the same top-tier HPC capabilities as large enterprises to solve the world’s toughest challenges, any time they need to, and at any scale.”

To get started, system administrators use the AWS Management Console to securely spin up a Slurm cluster and run jobs in just a few clicks, rather than orchestrating everything manually as they do today. With CloudFormation support coming soon, customers will be able to build and deploy HPC clusters using infrastructure as code. AWS Parallel Computing Service is now available in the following Regions: US East (Ohio), US East (N. Virginia), US West (Oregon), Europe (Frankfurt), Europe (Stockholm), Europe (Ireland), Asia Pacific (Sydney), Asia Pacific (Singapore), and Asia Pacific (Tokyo).
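Once a cluster is running, scientists interact with it through standard Slurm tooling, which is why existing on-premises job scripts can carry over without re-architecting. A minimal Slurm batch script uses standard Slurm directives (the job name, node counts, and file names here are illustrative):

```shell
#!/bin/bash
# hello.sbatch -- minimal Slurm batch job
#SBATCH --job-name=hello-hpc      # job name shown in the queue
#SBATCH --nodes=2                 # request two compute nodes
#SBATCH --ntasks-per-node=1       # run one task per node
#SBATCH --output=hello_%j.out     # %j expands to the Slurm job ID

# Launch one copy of `hostname` on each allocated node.
srun hostname
```

The script is submitted with `sbatch hello.sbatch` and monitored with `squeue`, exactly as on an on-premises Slurm cluster; it is shown as a job-script fragment since it requires a live Slurm scheduler to execute.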

Marvel Fusion is a Germany-based fusion energy startup pursuing the creation of unlimited zero-emission energy. “We are excited that AWS Parallel Computing Service will deliver highly available and easy-to-upgrade HPC cluster management capabilities,” said Moritz von der Linden, CEO of Marvel Fusion. “It will empower our scientists and IT staff to take advantage of the latest AWS Parallel Computing Service capabilities in hours, instead of the weeks of planning and overhead previously needed.”

Maxar Intelligence provides secure, precise geospatial intelligence, enabling government and commercial customers to monitor, understand, and navigate our changing planet. “As a long-time user of AWS HPC solutions, we were excited to test the service-driven approach from AWS Parallel Computing Service,” said Travis Hartman, director of Weather and Climate at Maxar Intelligence. “We found great potential for AWS Parallel Computing Service to bring better cluster visibility, compute provisioning, and service integration to Maxar Intelligence’s WeatherDesk platform, which would enable the team to make their time-sensitive HPC clusters more resilient and easier to manage.”

RONIN is an Australia-based software company whose flagship HPC service provides a simple, intuitive web interface for researchers and scientists from leading academic and research institutions to easily run HPC simulations on AWS. "Democratizing HPC in the cloud by simplifying the user experience for researchers is our key mission," said Nathan Albrighton, CEO and founder of RONIN. "The introduction of AWS Parallel Computing Service greatly simplifies our ability to build and operate HPC environments using APIs and elevates the HPC capabilities we offer to our customers."

The U.S. Department of Energy’s National Renewable Energy Laboratory (NREL) is a leading institution focused on research, innovation, and strategic partnerships to deliver solutions for a clean energy economy. "The pursuit of scientific discovery comes with significant overhead associated with maintaining high performance computing infrastructure," said Michael Bartlett, cloud architect in the Advanced Computing Operations Group at NREL. "AWS Parallel Computing Service has the potential to improve our research efficiency by reducing this overhead with its automated update and observability management features. In particular, new capabilities for automatic scaling and handling high-throughput computing tasks will allow us to efficiently process large datasets and complex simulations, ensuring that our scientists can prioritize solving high-priority problems."
