Life Sciences

Whitepaper: Genomics Data Transfer, Analytics, and Machine Learning using AWS Services

Issue link: https://read.uberflip.com/i/1358110


10. An additional step runs an AWS Glue workflow that converts the VCF to Apache Parquet, writes the Parquet files to a data lake bucket in Amazon S3, and updates the AWS Glue Data Catalog.

11. A bioinformatics scientist works with the data in the Amazon S3 data lake using Amazon Athena via a Jupyter notebook, the Amazon Athena console, the AWS CLI, or an API. Jupyter notebooks can be launched from either Amazon SageMaker or AWS Glue. You can also use Amazon SageMaker to train machine learning models or run inference using data in your data lake.
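As an illustration of the API route for querying the data lake with Amazon Athena, the following is a minimal sketch rather than the whitepaper's own implementation. It assumes a boto3 Athena client is passed in, and it uses only the `start_query_execution`, `get_query_execution`, and `get_query_results` calls. The database name `genomics_db`, the table name `variants`, and the S3 output location in the usage comment are hypothetical placeholders, not names from the whitepaper.

```python
import time


def run_athena_query(client, sql, database, output_location, poll_seconds=1.0):
    """Submit a query to Athena, wait for it to finish, and return the first
    page of results as a list of rows (each row a list of strings or None).

    `client` is expected to behave like boto3.client("athena").
    """
    started = client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]

    # Poll until Athena reports a terminal state.
    while True:
        status = client.get_query_execution(QueryExecutionId=query_id)
        state = status["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_seconds)

    if state != "SUCCEEDED":
        raise RuntimeError(f"Athena query {query_id} ended in state {state}")

    # Only the first page of results is fetched here; for SELECT queries
    # the first returned row holds the column names.
    result = client.get_query_results(QueryExecutionId=query_id)
    return [
        [col.get("VarCharValue") for col in row["Data"]]
        for row in result["ResultSet"]["Rows"]
    ]


# Hypothetical usage (requires AWS credentials and an existing catalog table):
#
#   import boto3
#   athena = boto3.client("athena")
#   rows = run_athena_query(
#       athena,
#       "SELECT * FROM variants LIMIT 10",       # hypothetical table name
#       database="genomics_db",                  # hypothetical database name
#       output_location="s3://example-bucket/athena-results/",  # placeholder
#   )
```

The same function could be called from a Jupyter notebook launched from Amazon SageMaker or AWS Glue. Passing the client in as an argument keeps the polling logic independent of how credentials and regions are configured.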
