Life Sciences

Whitepaper: Genomics Data Transfer, Analytics, and Machine Learning using AWS Services

Issue link: https://read.uberflip.com/i/1358110


Performing tertiary analysis with machine learning using Amazon SageMaker

You can use AWS Glue and Amazon SageMaker to create machine learning (ML) training sets, train models, and generate ML predictions from genomics data. We'll provide recommendations and a reference architecture for creating training sets, training models, and generating predictions using AWS Glue, Amazon SageMaker, and Amazon SageMaker Jupyter notebooks.

Recommendations

When creating ML training sets, training ML models, and generating predictions, consider the following recommendations to optimize performance and cost.

Use the AWS Glue extract, transform, and load (ETL) service to build your training sets: AWS Glue is a fully managed ETL service that makes it easy to extract data from object files in Amazon Simple Storage Service (Amazon S3), transform the dataset by adding the features needed for training machine learning models, and write the resulting data to an Amazon S3 bucket.

Use Amazon SageMaker Autopilot to quickly generate model-building pipelines: Amazon SageMaker Autopilot automatically trains and tunes the best machine learning models for classification or regression based on your data, while allowing you to maintain full control and visibility. SageMaker Autopilot analyzes the data and produces an execution plan that includes feature engineering, algorithm identification, and hyperparameter optimization to select the best model, and then deploys the model to an endpoint. The process of generating the models is transparent: the execution plan is made available through automatically generated notebooks that users can review and modify.

Use Amazon SageMaker hosting services to host your machine learning models: Amazon SageMaker provides model hosting services for model deployment, along with an HTTPS endpoint where your machine learning model is available to provide inferences.

Use Amazon SageMaker notebook instances to create notebooks, generate predictions, and test your deployed machine learning models: Use Jupyter notebooks in your notebook instance to prepare and process data, write code to train models, deploy models to Amazon SageMaker hosting, and test or validate your models.
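
The recommendations above can be sketched from a notebook with the AWS SDK for Python (boto3). This is a minimal illustration rather than the whitepaper's reference implementation: every job name, S3 URI, role ARN, column name, and endpoint name passed in is a hypothetical placeholder, the Glue job script is assumed to exist already and to read the `--source_uri` and `--training_uri` arguments (names invented here), and the hosted model is assumed to accept `text/csv` input. The problem type is left unset so that Autopilot infers it.

```python
"""Sketch of the calls behind the recommendations (all names are placeholders)."""


def glue_job_run_request(job_name, source_uri, training_uri):
    """Arguments for glue.start_job_run; the Glue job script is assumed to exist."""
    return {
        "JobName": job_name,
        # Glue hands these to the job script as "--key" style parameters.
        "Arguments": {"--source_uri": source_uri, "--training_uri": training_uri},
    }


def autopilot_job_request(job_name, training_uri, target_column, output_uri, role_arn):
    """Arguments for sagemaker.create_auto_ml_job; problem type is left to Autopilot."""
    return {
        "AutoMLJobName": job_name,
        "InputDataConfig": [
            {
                "DataSource": {
                    "S3DataSource": {"S3DataType": "S3Prefix", "S3Uri": training_uri}
                },
                # Column in the training set that the model should predict.
                "TargetAttributeName": target_column,
            }
        ],
        "OutputDataConfig": {"S3OutputPath": output_uri},
        "RoleArn": role_arn,
    }


def to_csv_row(features):
    """Serialize one feature vector as a CSV line for a text/csv endpoint."""
    return ",".join(str(value) for value in features)


def predict(endpoint_name, features):
    """Send one record to a hosted model over its HTTPS endpoint."""
    import boto3  # imported lazily so the request builders work without AWS access

    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="text/csv",
        Body=to_csv_row(features),
    )
    return response["Body"].read().decode("utf-8")
```

In a notebook instance with suitable IAM permissions, the two request builders could be passed to `boto3.client("glue").start_job_run(**request)` and `boto3.client("sagemaker").create_auto_ml_job(**request)`, and `predict` could then be used to test the deployed endpoint. Keeping the request dictionaries separate from the network calls makes them easy to inspect before anything is launched.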
