Bio-Linux and other bioinformatics tools available for EC2, Amazon's Elastic Compute Cloud, were recently highlighted on the Amazon Web Services (AWS) blog. Customized Amazon Machine Images (AMIs) package the data sets and tools needed for these specialized tasks and allow them to be deployed rapidly over the web. Because AMIs can be saved, reproducing past results is simpler, and because they can also be shared, the computational environment of a particular analysis can be replicated easily both inside and outside your organization.
The Bio-Linux AMI from the J. Craig Venter Institute is an Ubuntu-based AMI that includes open source and NEBC tools, with all their dependencies, for genome analysis. There is also an AMI for Bioperl-max, which has production packages for BioPerl and a set of open source bioinformatics tools including BLAST and R.
For proteomics analysis, the Medical College of Wisconsin has made available an AMI for VIPDAC, the Virtual Proteomics Data Analysis Cluster. Using VIPDAC, researchers were able to analyze a protein for $0.84 per run on AWS, and they have made this tool available to anyone with an AWS account and a web browser.
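To put the per-run figure in context, here is a minimal Python sketch of the arithmetic for budgeting a batch of analyses. It assumes a flat price of $0.84 per run, which is only the number quoted above. Real charges will vary with instance type, run time, and data size, and the function names are hypothetical helpers rather than part of VIPDAC or any AWS tool.

```python
from decimal import Decimal

# Per-run figure quoted for VIPDAC on AWS; treated here as a flat rate (an assumption).
COST_PER_RUN_USD = Decimal("0.84")


def estimate_cost(runs: int, cost_per_run: Decimal = COST_PER_RUN_USD) -> Decimal:
    """Return the estimated total cost in USD for a number of analysis runs."""
    if runs < 0:
        raise ValueError("runs must be non-negative")
    return (cost_per_run * runs).quantize(Decimal("0.01"))


def runs_within_budget(budget_usd: Decimal, cost_per_run: Decimal = COST_PER_RUN_USD) -> int:
    """Return how many whole runs fit inside a budget at the flat per-run rate."""
    return int(budget_usd // cost_per_run)


if __name__ == "__main__":
    print(estimate_cost(100))                 # estimated cost of 100 runs
    print(runs_within_budget(Decimal("100")))  # whole runs affordable with $100
```

Using `Decimal` instead of floats keeps the currency arithmetic exact, which matters once the run count gets large.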
The massively scalable computational and storage infrastructure of the Amazon cloud makes it an attractive platform for large-scale scientific analysis. As research facilities and small labs look for ways to scale their data storage and computational capabilities to handle the demands of ever more resource-intensive analysis, many will continue to explore public cloud computing as a possible augmentation to, or even replacement for, a dedicated internal data center. From free hosting of large public data sets to usage grants for educational and research institutions, Amazon is clearly opening the door wide to welcome the bioinformatics community.