Life Sciences

AWS genomics guide

Issue link: https://read.uberflip.com/i/1182530

• The Cancer Genome Atlas: Raw and processed genomic, transcriptomic, and epigenomic data from The Cancer Genome Atlas (TCGA), available to qualified researchers via the Cancer Genomics Cloud
• 1000 Genomes Project: A detailed map of human variation
• Genome in a Bottle (GIAB): Several reference genomes to enable translation of whole human genome sequencing to clinical practice
• 3000 Rice Genome on AWS: Genome sequence of 3,024 rice varieties

The public datasets are hosted in two possible formats: Amazon Elastic Block Store (Amazon EBS) snapshots and/or Amazon Simple Storage Service (Amazon S3) buckets.

To access a public dataset hosted in Amazon S3: You can make simple HTTP requests, use the AWS Command Line Tools and SDKs (Ruby, Java, Python, .NET, PHP, etc.), download the data using Amazon EC2, or use Hadoop to process the data with Amazon EMR.

To access a dataset hosted as an Amazon EBS snapshot: Sign up for an AWS account, launch an Amazon EC2 instance, and create an Amazon EBS volume using the Snapshot ID listed in one of the links above.

If you have any questions or want to participate in our Public Datasets community, please email us at opendata@amazon.com.

Conclusion

We hope this guide was informative and helpful. Producing valuable genomic data requires careful consideration at several stages, from acquisition to storage, then compute, and finally distribution. AWS offers a wide range of capabilities that can be leveraged at each stage to help researchers unlock the potential of genomic data.
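As an illustration of the "simple HTTP requests" route to a public dataset in Amazon S3 described above, the following is a minimal sketch that uses only the Python standard library. It assumes the bucket allows anonymous reads and has a name without dots, so that the virtual-hosted-style endpoint works. The helper names (public_object_url, list_url, parse_keys, fetch_first_bytes) and the bucket name used in the usage note are hypothetical and do not come from the guide; the real bucket name for a given dataset should be taken from that dataset's own listing.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote
from urllib.request import Request, urlopen

# XML namespace used by Amazon S3 bucket-listing responses.
S3_NS = "{http://s3.amazonaws.com/doc/2006-03-01/}"


def public_object_url(bucket, key):
    """Build an HTTPS URL for an object in a publicly readable bucket."""
    # quote() keeps "/" unescaped, so key prefixes stay readable.
    return f"https://{bucket}.s3.amazonaws.com/{quote(key)}"


def list_url(bucket, prefix="", max_keys=10):
    """Build the URL of a ListObjectsV2 request for keys under a prefix."""
    return (
        f"https://{bucket}.s3.amazonaws.com/"
        f"?list-type=2&prefix={quote(prefix, safe='')}&max-keys={max_keys}"
    )


def parse_keys(listing_xml):
    """Extract object keys from a bucket-listing XML document."""
    root = ET.fromstring(listing_xml)
    return [element.text for element in root.iter(f"{S3_NS}Key")]


def fetch_first_bytes(bucket, key, nbytes=1024):
    """Download only the first nbytes of an object via an HTTP Range request.

    Requires network access and a bucket that permits anonymous reads.
    """
    request = Request(
        public_object_url(bucket, key),
        headers={"Range": f"bytes=0-{nbytes - 1}"},
    )
    with urlopen(request) as response:
        return response.read()
```

With a placeholder bucket name such as example-public-bucket, a listing could be retrieved with urlopen(list_url("example-public-bucket", "some/prefix/")) and passed to parse_keys, and fetch_first_bytes could then be used to peek at a large file before downloading it in full. For sustained or large transfers, the AWS Command Line Tools or SDKs mentioned above are usually the more practical choice.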
