Life Sciences

HPC Lens for the AWS Well-Architected Framework

Issue link: https://read.uberflip.com/i/1187300


Amazon Web Services – HPC Lens, AWS Well-Architected Framework, Page 39

opportunity to reduce costs by increasing and decreasing resource capacity on an as-needed basis. For example, a low-level run-rate HPC capacity can be provisioned and reserved upfront so as to benefit from higher discounts, while burst requirements may be provisioned with spot or on-demand pricing and brought online, only as needed.

• Optimize infrastructure costs for specific jobs: Many HPC workloads are part of a data processing pipeline that includes the data transfer, pre-processing, computational calculations, post-processing, data transfer, and storage steps. In the cloud, rather than use a large and expensive server for all tasks, the computing platform is optimized at each step. For example, if a single step in a pipeline requires a large amount of memory, you only need to pay for a more expensive large memory server for the memory-intensive application, while all other steps can run well on smaller and cheaper servers. Costs are reduced by optimizing infrastructure for each task at each step of a workload.

• Burst workloads in the most efficient way: With cloud, savings are often obtained for HPC workloads by bursting horizontally. When bursting horizontally, many jobs or iterations of an entire workload are run simultaneously for less total elapsed time. Depending on the application, horizontal scaling can potentially be cost neutral while offering indirect cost savings by delivering results in a fraction of the time.

• Make use of spot pricing: Amazon EC2 Spot Instances offer spare compute capacity in AWS at steep discounts compared to On-Demand instances. However, Spot Instances can be interrupted when EC2 needs to reclaim the capacity. Spot Instances are frequently the most cost-effective resource for flexible or fault-tolerant workloads. The intermittent and bursty nature of HPC workloads makes them very well suited to Spot Instances (as opposed to, for example, an uninterruptible workload, such as database hosting). The risk of Spot Instance interruption can be minimized by working with the Spot Advisor, and the interruption impact can be potentially mitigated by changing the default interruption behavior and using Spot Fleet to manage your Spot Instances. The need to occasionally restart a workload is often easily offset by the cost savings of Spot Instances.

• Assess the tradeoff of cost versus time: Tightly coupled, massively parallel workloads are often able to run on a wide range of core counts.
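The arithmetic behind the horizontal-bursting and Spot Instance points above can be sketched in a few lines of Python. This is a minimal illustration and not an AWS pricing tool: every price, discount, and rerun fraction below is a hypothetical placeholder, the function names are invented for this example, and the model assumes perfectly divisible, independent jobs with no scaling overhead.

```python
# Illustrative cost model only. All prices, discounts, and rerun
# fractions are hypothetical values, not actual AWS pricing.

def run_cost(instances: int, hours_per_instance: float, hourly_price: float) -> float:
    """Total cost of running `instances` machines for `hours_per_instance` each."""
    return instances * hours_per_instance * hourly_price


def horizontal_burst(total_instance_hours: float, instances: int, hourly_price: float):
    """Spread a fixed amount of independent work evenly across `instances`.

    Returns (elapsed_hours, total_cost). Assumes the work divides evenly
    and that adding instances introduces no overhead.
    """
    elapsed = total_instance_hours / instances
    return elapsed, run_cost(instances, elapsed, hourly_price)


def spot_cost_with_restarts(base_hours: float, on_demand_price: float,
                            spot_discount: float, rerun_fraction: float) -> float:
    """Cost on Spot when interruptions force a fraction of the work to be redone.

    `spot_discount` is the assumed discount relative to On-Demand (0.5 = 50% off).
    `rerun_fraction` is the assumed extra work caused by restarts (0.25 = 25% more).
    """
    spot_price = on_demand_price * (1.0 - spot_discount)
    return base_hours * (1.0 + rerun_fraction) * spot_price


if __name__ == "__main__":
    # Hypothetical workload: 100 instance-hours at a made-up $1.00 per hour.
    for n in (1, 50):
        elapsed, cost = horizontal_burst(100.0, n, 1.0)
        print(f"{n:>3} instances: {elapsed:6.1f} h elapsed, ${cost:.2f}")

    on_demand = run_cost(1, 100.0, 1.0)
    spot = spot_cost_with_restarts(100.0, 1.0, spot_discount=0.5, rerun_fraction=0.25)
    print(f"On-Demand: ${on_demand:.2f}, Spot with reruns: ${spot:.2f}")
```

Under these assumptions the total cost is the same whether the work runs on one instance or fifty, while the elapsed time shrinks in proportion to the instance count. In the Spot case, the workload stays cheaper than On-Demand as long as the rerun overhead is smaller than the savings from the discount. Real workloads will deviate from both results depending on scaling efficiency and actual interruption rates.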
