Box Zones – Giving Enterprises Control Over Data Location Using AWS

by Jeff Barr | on 12 APR 2016 | in Amazon S3, Customer Success

Our friends over at Box provide secure content management, collaboration, and file sharing for over half of the companies on the Fortune 500 list.

Box has succeeded by paying attention to the needs of enterprise customers. For example, last year I wrote about Box Enterprise Key Management (EKM), a flexible, no-compromises encryption system that gives Box customers control over their encryption keys. This feature has evolved into Box KeySafe, which allows even the smallest IT shops to use encryption to protect their proprietary documents. Other Box features such as Box Capture (mobile phone integration with business processing) and Box Governance (control and compliance) are also purpose-built to meet the business needs of enterprises.

We are happy to play a strong supporting role in today’s launch of Box Zones. This new feature uses Amazon S3 to provide Box customers with a choice of four storage locations (Germany, Ireland, Singapore, and Tokyo). Box customers can decide where to store their data while still taking advantage of other Box features such as watermarking, fine-grained control over permissions, commenting, and file preview.

To learn more about this new feature, sign up for the Box Zones webinar.

Congratulations to Box, and thank you for using AWS!

— Jeff;

AWS Week in Review – April 4, 2016

by Jeff Barr | on 11 APR 2016 | in Week in Review

Let’s take a quick look at what happened in AWS-land last week:

Monday April 4	We announced Two New Predefined Commands and an Open Source Agent for the EC2 Run Command. We announced Release 4.5.0 of Amazon EMR, with Spark 1.6.1, Updated Hadoop and Presto, and Support for S3 SSE-KMS. The AWS Compute Blog talked about Building a Dynamic DNS for Route 53 Using CloudWatch and Lambda. The AWSome Blog talked about Amazon Kinesis Streams. BotMetric shared a Beautiful Representation of Your AWS Infrastructure. A three-part series from CloudCheckr talked about Making AWS Security Simple (Part 1, Part 2, Part 3). Evident talked about Implementing the Top Ten AWS Security Best Practices. Cloudonaut discussed Event Driven Security Automation on AWS.
Tuesday April 5	We announced the Amazon-ECS Optimized Amazon Linux AMI. We announced Data Shuffling for Amazon Machine Learning. We announced that Amazon API Gateway Can Now Import Swagger Definitions. We announced Metric Based Health Checks, DNS Failover for Private Hosted Zones, and Configurable Health Check Locations for Route 53. My colleagues blogged about Building Bridges for Better Cancer Treatment with the Fred Hutchinson Cancer Research Center. The AWS Compute Blog discussed Indexing Amazon DynamoDB Content with Amazon Elasticsearch Service Using AWS Lambda. The AWS Windows and .NET Developer Blog shared an update on the AWS SDK for .NET Version 2 Status. The AWS Government, Education, & Nonprofits Blog announced that AWS Has Signed a CJIS Memorandum with the State of Minnesota. The AWS Partner Network Blog showed you how to Automatically Delete Terminated Instances in Chef Server with AWS Lambda and CloudWatch Events. Spotinst talked about Elastigroup & CloudFormation. Cloud Technology Partners shared 5 Tips for Landing a Well-Paid Cloud Job. DZone Cloud Zone discussed the AWS ZSH Helper. Yarden Eitan wrote a Guide to Moving a Flask Backend from Heroku to AWS.
Wednesday April 6	We announced that Amazon CloudWatch Logs Are Now Available in the AWS China (Beijing) Region. We announced that AWS Config Rules Is Now Available in 4 New Regions: US West (Oregon), EU (Ireland), EU (Frankfurt) and Asia Pacific (Tokyo). We announced that Amazon Redshift is Now Available in the China (Beijing) Region. A guest post from LiveOps Cloud talked about Tapping the Billion Dollar Call-Center Market on AWS. The AWS Compute Blog announced some Amazon API Gateway Mapping Improvements. The AWS Government, Education, & Nonprofits Blog talked about Powering Smart Cities Through Connected Devices. Stelligent discussed Security in the Continuous Delivery Pipeline. CloudCheckr talked about Optimizing Amazon DynamoDB with CloudCheckr. Cloud Academy continued to talk about Automating a Standalone SharePoint Farm with CFN-INIT. Trek10 unveiled AWSume – AWS Assume Made Awesome. Dius wrote about Building an Auto-Scaling R Cluster Using CfnCluster. TeamCity launched an AWS CodeDeploy Running Plugin.
Thursday April 7	We launched an AWS Training Update – Revised AWS Technical Essentials and Architecting on AWS Courses. We announced Two New Deployment Policies and an Amazon Linux AMI 2016.03 Update for Elastic Beanstalk. We announced that AWS IoT is Now Available in the EU (Frankfurt) Region. The AWS Big Data Blog showed you how to Encrypt Your Amazon Redshift Loads with Amazon S3 and AWS KMS. The AWS Compute Blog announced that the Node.js 4.3.2 Runtime is Now Available on AWS Lambda. The AWS Security Blog announced Simplified Configuration of Trust Relationships in the AWS Directory Service Console. The AWS Security Blog answered some Frequently Asked Questions About Compliance in the AWS Cloud. BotMetric talked about the Evolution of Agile in Enterprises Through the AWS Cloud. N2W Software showed you how to Create Consistent Backups Using the XFS File System and EBS Snapshots. 2nd Watch wondered Why Buy Amazon Web Services Through a Partner? Cloud Academy started a series that will explore the Components of Amazon Serverless Architecture with Amazon API Gateway. Cloudonaut launched New CloudFormation templates for NAT Gateway, Static Websites, and Security. They also launched New Online Training: Automatic AWS With CloudFormation. Flux7 talked about Updates to AWS CloudFormation: Change Sets Help to Get to the Root Cause of Stack Update Failures.
Friday April 8	We announced that CloudWatch Events Now Supports Lambda Function Versions and Aliases. We announced that Amazon RDS for PostgreSQL Now Supports PostgreSQL 9.5 With Version 9.5.2, as Well as Minor Versions 9.4.7 and 9.3.1. Mark Litwintschik launched a 50-Node Presto Cluster on Amazon EMR. Sam Bashton wrote about ELK on ARK (processing 1 TB of logs on AWS per day). Batchly showed you how to Run Your EMR Workloads with 80% Savings. Michael de Silva shared a Sneak Peek of a Ruby Gem to Provision Amazon Web Services.

Upcoming Events

AWS Partner Webinars – April.
April 29 – Live Event (Singapore) – AWS Partner Summit.
AWS Zombie Microservices Roadshow.

Help Wanted

AWS Careers.

Stay tuned for next week! In the meantime, follow me on Twitter and subscribe to the RSS feed.

— Jeff;

AWS Week in Review – March 28, 2016

by Jeff Barr | on 11 APR 2016 | in Week in Review

Let’s take a quick look at what happened in AWS-land last week:

Monday March 28	The AWS Security Blog showed you How to Easily Identify Your Federated Users by Using AWS CloudTrail. BotMetric continued their series of posts on AWS Security Best Practices with two new posts: Data Security and Detective Services. Evident discussed Programmatic Security Automation for AWS. Cloud Academy shared some tips designed to help you to Pass the AWS DevOps Pro Exam. Cloudonaut explored Amazon S3 as an Object Store. Gorillastack explained Why You Should Move Compute to Amazon EC2 Spot Instances. The Register reviewed Amazon WorkSpaces Two Years On.
Tuesday March 29	We launched Change Sets for AWS CloudFormation. We announced that Amazon Redshift Now Supports Using IAM Roles With COPY and UNLOAD Commands. We announced that AWS WAF Now Supports a Cross-Site Scripting (XSS) Match Condition. A guest post explained how the Experiment that Discovered the Higgs Boson Uses AWS to Probe Nature. The Amazon Mobile App Distribution Blog announced a New Alexa Skills Kit Template – A Step-by-step Guide to Build a Fact Skill. The AWS Big Data Blog discussed Crunching Statistics at Scale with SparkR on Amazon EMR. The AWS Compute Blog discussed Cloudmicro for AWS: Speeding up Serverless Development at the Coca-Cola Company. The AWS Partner Network Blog introduced Amazon VPC for On-Premises Network Engineers (this is Part 1 of a series). The AWS Security Blog showed How to Detect and Automatically Revoke Unintended IAM Access with Amazon CloudWatch Events. The AWS Startup Collection showed how to Trigger Lambda Functions via Text Messages. Stelligent continued their series on serverless delivery with two new posts: Bootstrapping the Pipeline and Orchestrating the Pipeline. RightScale explained how to Understand Your Effective Cloud Costs with Markups and Markdowns. Sungard talked about Lambda Formation: Rocket Fuel for AWS CloudFormation.
Wednesday March 30	We announced that CloudWatch Events now Supports Amazon SQS Queue Targets. We added Four New Checks to AWS Trusted Advisor. The Amazon GameDev Blog shared some Reflection on GDC 2016. The ABC Developer Blog talked about Configuration Management for both Onsite and in the Cloud. CloudNative introduced Yeobot to manage multiple AWS accounts. N2W Software compared AWS Backup vs. Traditional Backup: 6 Fundamental Differences. Netflix talked about Global Cloud – Active-Active and Beyond. Cloudability compared t2.large to m4.large EC2 Instances. Skeddly described a Near-Zero-Cost Insurance Policy Against Unused EC2 Instances. Dome9 advised you to Lift and Shift to the Public Cloud with Security in Mind.
Thursday March 31	We announced that Amazon Aurora is Now Available in the Asia Pacific (Seoul) Region. We announced that AWS Config is Now Available in the Asia Pacific (Seoul) Region. We announced that EC2 Container Registry is Now Available in the EU (Ireland) Region. We announced that Amazon RDS Now Has Multi-AZ support for SQL Server in the Asia Pacific (Seoul) Region. We announced Automatic Retargeting of Auto Scaling Spot Instances. We announced Updated CloudFormation Support for S3, Lambda, and Amazon EMR. The Amazon Mobile App Distribution Blog announced New ASK Features: Standard Home Card with Image Support. The AWS DevOps Blog invited you to Explore Continuous Delivery in AWS With the Pipeline Starter Kit. The AWS Big Data Blog asked Will Spark Power the Data Behind Precision Medicine? The AWS JavaScript Blog announced Support for Promises in the AWS SDK for JavaScript. The AWS Government, Education, & Nonprofits Blog explained how Time to Science and Time to Results are Transforming Research in the Cloud. BotMetric discussed The Importance of Using AWS for Disaster Recovery.
Friday April 1	We announced that the RDS Console Now Supports Cluster View for Amazon Aurora. We announced that Amazon RDS Now Supports January PSU Patches, Improved Custom Oracle Directories, and Read Privileges Support. The AWS DevOps Blog showed you how to Color-Code Your AWS OpsWorks Stacks for Better Instance and Resource Tracking. The AWS Government, Education, & Nonprofits Blog described a Mission to Make the World a Safer Place Through Crowdsourced Intelligence. The AWS Startup Collection announced the Return of the London Loft.
Sunday April 3	Rowan Udell talked about Visualizing EC2 Security Groups.

Stay tuned for next week! In the meantime, follow me on Twitter and subscribe to the RSS feed.

— Jeff;

AWS Week in Review – March 21, 2016

by Jeff Barr | on 11 APR 2016 | in Week in Review

Let’s take a quick look at what happened in AWS-land last week:

Monday March 21	We announced New CloudWatch Metrics for Spot Fleets. We announced Retry Throttling for the AWS SDK for Java. We announced that it is Now Easier to Connect Amazon Machine Learning to Amazon Redshift via the AWS Management Console. We announced that AWS Storage Gateway Now Supports up to 1 PB Virtual Tape Libraries and up to 512 TB of Stored Volumes per Gateway. The AWS Government, Education, & Nonprofits Blog shared Lessons Learned When Going All-In on AWS. The AWS Mobile Development Blog talked about Setting up Parse Server and MongoDB on AWS Using CloudFormation. Stelligent showed you how to Create a Cross-Account Pipeline in AWS CloudFormation. RightScale showed you how to Manage Cloud Costs via a Single API. Mark Litwintschik explored A Billion Taxi Rides on Amazon EMR running Spark. Cloud Academy talked about AWS Certification Exams: What to Expect. Jordan Farrer decided that Lambda + Alexa = VoiceOps.
Tuesday March 22	We announced Additional Pricing Options for AWS Marketplace Products. We announced the AWS Encryption SDK. We announced that Spot Instances are Now Available in the Asia Pacific (Seoul) Region. We announced the Amazon Linux AMI 2016.03 is Now Available. We announced that you can now Pay a Flat Monthly Fee for Unlimited Testing of iOS and Android Apps with AWS Device Farm. The AWS Big Data Blog showed you how to Import Zeppelin Notes from GitHub or JSON in Zeppelin 0.5.6 on Amazon EMR. The AWS Security Blog showed you How to Use the New AWS Encryption SDK to Simplify Data Encryption and Improve Application Availability. The AWS Startup Collection talked about Adapting Deep Learning to Medicine with Behold.ai. 2nd Watch talked about how SCOR Velogica Moved to the AWS Cloud for Better Security & SOC2. Flux7 shared an AWS Autoscaling Case Study.
Wednesday March 23	We launched Windows Authentication for Amazon RDS for SQL Server. I wrote about Airbnb – Reinventing the Hospitality Industry on AWS. The AWS Government, Education, & Nonprofits Blog noted that NASA’s Data is in the Amazon Cloud and asked Is Yours? The AWS Partner Network Blog talked about Optimizing SaaS Tenant Workflows and Costs. TriNimbus talked about High-Caffeine Fun with AWS IoT: Introducing TriNimbus IoT Coffeebot. Cloudonaut explored Optional Parameters in CloudFormation. Dome9 shared 5 CSO Myths related to Migrating Enterprise Workloads to AWS.
Thursday March 24	We launched an ElastiCache for Redis Update that can Upgrade Engines and Scale Up. We announced Support for Multiple Trails in AWS GovCloud (US). We announced that the APN Partner Accreditation Courses are Now Available in More Languages. We announced that AWS Mobile Hub Now Supports Swift. The AWS Big Data Blog talked about Anomaly Detection Using PySpark, Hive, and Hue on Amazon EMR. N2W Software listed The 5 AWS Building Blocks for Your Cloud Startup. Powerupcloud talked about Deployment Automation Using AWS CodeDeploy.
Friday March 25	We announced that Amazon RDS for Postgres Now Supports Enhanced Monitoring and can Enforce SSL Connections. We announced that Amazon RDS for MariaDB is Now Covered by the RDS SLA. The AWS Java Blog talked about Testing Lambda Function Using the AWS Toolkit for Eclipse.
Saturday March 26	Powerupcloud talked about Automating Couchbase Backups to S3 and Automating RDS Snapshots with AWS Lambda.

Stay tuned for next week! In the meantime, follow me on Twitter and subscribe to the RSS feed.

— Jeff;

Amazon Kinesis Agent Update – New Data Preprocessing Features

by Jeff Barr | on 11 APR 2016 | in Amazon Kinesis

My colleague Ray Zhu wrote the guest post below to introduce you to some new data preprocessing features for the Amazon Kinesis Agent.

— Jeff;

Amazon Kinesis Agent is a stand-alone Java software application that provides an easy and reliable way to send data to Amazon Kinesis Streams and Amazon Kinesis Firehose. The agent monitors a set of files for new data and then sends it to Kinesis Streams or Kinesis Firehose continuously. It handles file rotation, checkpointing, and retries after failures. It also supports Amazon CloudWatch so that you can closely monitor and troubleshoot the data flow from the agent.

Data Preprocessing with Kinesis Agent
Today we are adding data preprocessing capabilities to the agent so that your data can be well formatted before it is sent to Kinesis Streams or Kinesis Firehose. The agent currently supports the three processing options listed below. Because the agent is open source, you can further develop and extend these processing options.

SINGLELINE – This option converts a multi-line record to a single line record by removing newline characters, and leading and trailing spaces.

CSVTOJSON – This option converts a record from delimiter separated format to JSON format.

LOGTOJSON – This option converts a record from several commonly used log formats to JSON format. Currently supported log formats are Apache Common Log, Apache Combined Log, Apache Error Log, and RFC3164 (syslog).
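To make the three options more concrete, here is a rough Python sketch of the kind of transformation each one describes. This is not the agent's own code (the agent is written in Java); the function names, the `field_names` parameter, the naive delimiter split, and the regular expression are all assumptions made for illustration. The JSON keys in the log example simply mirror the column names of the Redshift table used later in this post.

```python
import json
import re
from typing import Sequence


def single_line(record: str) -> str:
    """SINGLELINE-style step: drop newlines plus leading/trailing spaces of each line.

    Assumption: the pieces are joined with nothing in between, which is a
    literal reading of the description above.
    """
    return "".join(part.strip() for part in record.splitlines())


def csv_to_json(record: str, field_names: Sequence[str], delimiter: str = ",") -> str:
    """CSVTOJSON-style step: pair each delimited value with a caller-supplied name.

    Assumption: a plain split, with no handling of quoted delimiters.
    """
    values = record.rstrip("\n").split(delimiter)
    return json.dumps(dict(zip(field_names, values)))


# Hypothetical pattern for the Apache Combined Log format.
COMBINED_LOG = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<authuser>\S+) '
    r'\[(?P<datetime>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<response>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def combined_log_to_json(line: str) -> str:
    """LOGTOJSON-style step for one Combined Log line."""
    match = COMBINED_LOG.match(line)
    if match is None:
        raise ValueError("line is not in Apache Combined Log format")
    fields = match.groupdict()
    fields["response"] = int(fields["response"])
    # A "-" in the size field means no body was sent.
    fields["bytes"] = None if fields["bytes"] == "-" else int(fields["bytes"])
    return json.dumps(fields)


# Made-up sample line, for illustration only.
SAMPLE_LINE = (
    '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
    '"GET /apache_pb.gif HTTP/1.0" 200 2326 '
    '"http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"'
)

if __name__ == "__main__":
    print(single_line("  first part\n  second part  \n"))
    print(csv_to_json("1,alice,42", ["id", "name", "score"]))
    print(combined_log_to_json(SAMPLE_LINE))
```

The real agent may treat edge cases (quoted fields, malformed lines, whitespace between joined lines) differently, so treat this only as a mental model of what arrives at the stream.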

Analyze Apache Tomcat Access Log in Near Real-Time
Let’s look at an example of analyzing Tomcat access logs in near real-time using Kinesis Agent’s preprocessing feature, Amazon Kinesis Firehose, and Amazon Redshift. Here’s the overall flow:

First I need to create a table in my Redshift cluster to store the Tomcat access log. The following SQL statement is used to create the table:

CREATE TABLE logs(
host VARCHAR(40),
ident VARCHAR(25),
authuser VARCHAR(25),
datetime VARCHAR(60),
request VARCHAR(2048),
response SMALLINT NOT NULL,
bytes INTEGER,
referer VARCHAR(2048),
agent VARCHAR(256));

Then I need to create a Kinesis Firehose delivery stream that continuously delivers data to the Redshift table created above:

Now I’ve set up my Redshift table and Firehose delivery stream. Next I need to install the Kinesis Agent on my Tomcat server to monitor my Tomcat access log files and continuously send the log data to my delivery stream. Here is a screenshot of the raw Tomcat access log:

In the agent configuration, I use the LOGTOJSON processing option to convert raw Tomcat access log data to JSON format before sending the data to my delivery stream. Here’s how I set that up:

{
   "cloudwatch.emitMetrics":true,
   "flows":[
      {
         "filePattern":"/data/access.log*",
         "deliveryStream":"access_log_stream",
         "initialPosition":"START_OF_FILE",
         "dataProcessingOptions":[
            {
               "optionName":"LOGTOJSON",
               "logFormat":"COMBINEDAPACHELOG"
            }
         ]
      }
   ]
}
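Before starting the agent, it can be handy to sanity-check a configuration like the one above. The following Python sketch is one possible way to do that; it is not part of the agent, and the `find_problems` helper and the specific rules it applies (for example, that a LOGTOJSON option should carry a `logFormat`) are assumptions based only on the keys that appear in this post. It only covers flows that target a Firehose delivery stream.

```python
import json

# The configuration from this post, reproduced as text.
AGENT_CONFIG_TEXT = """
{
   "cloudwatch.emitMetrics": true,
   "flows": [
      {
         "filePattern": "/data/access.log*",
         "deliveryStream": "access_log_stream",
         "initialPosition": "START_OF_FILE",
         "dataProcessingOptions": [
            {"optionName": "LOGTOJSON", "logFormat": "COMBINEDAPACHELOG"}
         ]
      }
   ]
}
"""

# The three processing options named in this post.
KNOWN_OPTIONS = {"SINGLELINE", "CSVTOJSON", "LOGTOJSON"}


def find_problems(config_text):
    """Return a list of human-readable problems found in the configuration."""
    config = json.loads(config_text)  # raises ValueError on malformed JSON
    problems = []
    for index, flow in enumerate(config.get("flows", [])):
        # Assumption: every flow here is a Firehose flow like the one shown.
        for key in ("filePattern", "deliveryStream"):
            if key not in flow:
                problems.append(f"flow {index}: missing {key}")
        for option in flow.get("dataProcessingOptions", []):
            name = option.get("optionName")
            if name not in KNOWN_OPTIONS:
                problems.append(f"flow {index}: unknown optionName {name!r}")
            elif name == "LOGTOJSON" and "logFormat" not in option:
                problems.append(f"flow {index}: LOGTOJSON without logFormat")
    return problems


if __name__ == "__main__":
    print(find_problems(AGENT_CONFIG_TEXT) or "no problems found")
```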

Everything is set up now, so let’s start the agent! After a minute or two, my Tomcat access log data shows up in my S3 bucket and Redshift table. Here is what the data looks like in my S3 bucket. Notice that the raw log data has been nicely formatted as JSON:

Here is what the data looks like in my Redshift table:

I can run SQL queries to analyze my Tomcat access log, or use the Business Intelligence tool of my choice to visualize the data:
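As an illustration of the kind of query that becomes possible, here is a small Python sketch that counts requests per HTTP response code. To keep it self-contained it uses an in-memory SQLite database as a stand-in for Redshift, with the same column layout as the table created earlier; the sample rows are invented, and the `top_status_codes` helper is hypothetical. The SELECT statement itself uses only basic SQL, so a similar query should work against the Redshift table.

```python
import sqlite3

# Invented rows for illustration; real data would arrive via Firehose.
SAMPLE_ROWS = [
    ("203.0.113.5", "-", "-", "11/Apr/2016:10:00:01 +0000",
     "GET /index.html HTTP/1.1", 200, 5120, "-", "ExampleBrowser/1.0"),
    ("203.0.113.9", "-", "-", "11/Apr/2016:10:00:02 +0000",
     "GET /about.html HTTP/1.1", 200, 2048, "-", "ExampleBrowser/1.0"),
    ("198.51.100.7", "-", "-", "11/Apr/2016:10:00:03 +0000",
     "GET /missing HTTP/1.1", 404, 312, "-", "ExampleBot/2.1"),
    ("198.51.100.8", "-", "-", "11/Apr/2016:10:00:04 +0000",
     "POST /orders HTTP/1.1", 500, 0, "-", "ExampleBrowser/1.0"),
]


def build_demo_database():
    """Create an in-memory table with the same columns as the Redshift table."""
    connection = sqlite3.connect(":memory:")
    connection.execute(
        """
        CREATE TABLE logs(
            host VARCHAR(40),
            ident VARCHAR(25),
            authuser VARCHAR(25),
            datetime VARCHAR(60),
            request VARCHAR(2048),
            response SMALLINT NOT NULL,
            bytes INTEGER,
            referer VARCHAR(2048),
            agent VARCHAR(256))
        """
    )
    connection.executemany(
        "INSERT INTO logs VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)", SAMPLE_ROWS
    )
    return connection


def top_status_codes(connection):
    """Return (response code, hit count) pairs, most frequent first."""
    cursor = connection.execute(
        "SELECT response, COUNT(*) AS hits FROM logs "
        "GROUP BY response ORDER BY hits DESC, response ASC"
    )
    return cursor.fetchall()


if __name__ == "__main__":
    for code, hits in top_status_codes(build_demo_database()):
        print(code, hits)
```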

It took me less than an hour to set up the whole data pipeline. Now I can analyze and visualize access log data using my favorite Business Intelligence tool, only minutes after the data is generated on my Tomcat server!

Available Now
Kinesis Agent’s data preprocessing feature is available now and you can start using it today – visit the Amazon Kinesis Agent Repository! To learn more, read Use Agent to Preprocess Data in the Kinesis Firehose Developer Guide.

— Ray Zhu, Senior Product Manager

AWS Training Update – Revised AWS Technical Essentials and Architecting on AWS Courses

by Jeff Barr | on 07 APR 2016 | in Training and Certification

We continuously enhance our technical courses to stay current with the pace of AWS platform updates and incorporate student feedback. We have made substantial updates to our two most popular foundational training courses, AWS Technical Essentials and Architecting on AWS, to better provide students with actionable knowledge to get started creating solutions with AWS and a path to advanced learning.

AWS Technical Essentials – What’s New
This is a one-day course for solutions architects, developers, sysops administrators, and anyone who wants to get started using AWS. It covers the foundations of cloud computing, storage, and networking. It’s also used as the content for AWSome Days. The updated course now addresses 18 AWS services, with in-depth coverage of 10 core services: EC2, S3, EBS, IAM, Auto Scaling, ELB, RDS, DynamoDB, Auto Scaling, and CloudWatch. New, comprehensive hands-on lab exercises and instructor-led demonstrations help students learn how to get started creating real-world solutions on the AWS platform. The updated course also provides students with a clearer path to continue their education with more advanced courses such as Architecting on AWS and Systems Operations on AWS. You can read the course description to learn more.

Architecting on AWS – What’s New
This is a three-day course for solutions architects and solution design engineers. It aligns with the changes to AWS Technical Essentials, making the concepts learned in that course a prerequisite. The updated course now focuses on cloud best practices, architecture patterns, case studies, and other practical ways of thinking about how to architect infrastructure on AWS. Hands-on lab exercises walk you through how to build complete application environments on AWS using a variety of AWS services, including Amazon VPC, Amazon EC2, Amazon S3, AWS Lambda, and more. New content also addresses automating and de-coupling infrastructures using architectures less dependent on servers, troubleshooting commonly misconfigured architectures, and concepts from the Well-Architected Framework. Read the course description to learn more.

Accessing the Courses
These classes (and many more) are available through AWS and our Training Partners. You can find upcoming classes in our global training schedule or learn more at AWS Training.

— Jeff;

LiveOps Cloud – Tapping the Billion Dollar Call-Center Market on AWS

by Jeff Barr | on 06 APR 2016 | in Customer Success, Guest Post

LiveOps Cloud is ready to break open a huge untapped market. The company is a long-time solutions provider for the contact center industry, and just recently launched CxEngage, a new contact center-as-a-service platform built and run entirely on AWS. I asked Jeff Thompson, LiveOps Cloud’s CTO and SVP for engineering, to tell us a bit about their decision to launch this great new service.

— Jeff;

We like to say that LiveOps Cloud is a 16-year-old startup. We’re a new company carved out of LiveOps Inc., and our mission is to take the original company’s long history of providing contact center solutions into a new era of cloud-first convenience, performance, and lower costs.

The contact center business is huge, with estimates of at least 15 million seats worldwide that comprise a multi-billion-dollar market. But the industry is a late-comer to cloud computing, with only about 10 percent of contact center operations working in some capacity with cloud infrastructure and tools. So there still are a lot of legacy, on-premises call center systems in place—especially in traditional industries like banking and retail—that are quickly reaching their expiration date. These systems are inadequate for meeting the demands of today’s market, with companies having to hold down costs, provide ever-better performance and sophistication, and serve emerging markets.

Cloud Bake-Off
Our plan was to create a pure cloud contact center-as-a-service (CCaaS) that could deliver an always-on, secure, multi-tenant, and instantly scalable platform so businesses can deliver exceptional customer experiences anywhere, anytime. We anticipated that if done right, our CCaaS would take off in a true ‘hockey stick’ growth pattern. To get there, we needed to carefully consider what platform would make the most sense.

We held a bake-off that looked at a number of alternatives. Azure, Rackspace, Google, and AWS were in the cloud provider mix, and we also looked at using a colocated facility. That last one was the first to go. We knew from experience that running the platform out of a co-lo would simply not provide the redundancy and scalability we wanted to bake into the new platform. We ruled out Azure because we’re not a Microsoft shop, and were using a lot of non-Windows tools and Linux to create the platform. Rackspace has good IaaS, but their global reach was insufficient for our business goals. We also ruled out Google because they didn’t have the breadth of apps we felt were required to build our platform.

A Clear Winner
In our view, AWS was the clear winner. It delivers all the features and benefits we were seeking. It has an incredibly rich catalog of services, with new ones being released at a pace that competing cloud providers simply are not matching. We might not need them all now, but knowing those services are there, and that AWS is innovating and adding to them all the time, instills real confidence. We know that if we have some need or feature request in the future, chances are AWS already has a service that can address it. Good examples—and just a small portion of the AWS services that we use—are Amazon Redshift and Amazon Kinesis, two powerful data services that are essential to our platform, and Amazon Simple Queue Service (SQS), which drives messaging out to agent toolbars.

AWS also has broad global reach, which is critical to the CxEngage business model. The North American and Western European markets are certainly an important source of revenue. We also see great opportunity in emerging call center markets in places like China, India, and the Asia-Pacific region. AWS operates in 12 regions around the world. That means we can provide services in close physical proximity to new customers, which boosts performance by reducing latency. When a call comes in, the businesses using our platform don’t want lag times in the system. And in some cases, it helps when there are sovereignty issues related to keeping data within particular boundaries.

AWS also provides major benefits in terms of flexibility and financial performance. For example, we can carefully plan for specific Amazon EC2 instance types to match the performance needs of particular services in the CxEngage platform. Some may require more I/O, some more memory. We can pick exactly what we need and not overprovision, which helps us not only optimize for performance, but also meet our financial goals. That, and the pay-as-you-go model of AWS, has made AWS very popular with our finance department.

Simple, with No Drama
AWS also makes it easier to build the business. For example, the built-in support for PCI and HIPAA, and the compliance and regulatory standards included with AWS GovCloud (US), help us quickly overcome potential barriers to signing new and important customers. We can check off those boxes and keep moving.

We started the journey of building the next-generation solution for call centers in 2014. We placed our bet on AWS, and 18 months later when we launched CxEngage, all of our financial and performance predictions for the platform were borne out. Everything we thought would happen by using AWS happened. It was simple, with no drama. We’re looking at AWS as a partner that is fundamental to our business, and to our growth plan.

— Jeff Thompson, CTO and SVP, LiveOps Cloud Platform

Building Bridges for Better Cancer Treatment with the Fred Hutchinson Cancer Research Center

by Jeff Barr | on 05 APR 2016 | in Customer Success, Guest Post

My colleagues Jessica Beegle and Christopher Crosbie shared the inspiring story below!

— Jeff;

The science of cancer research is continually evolving to include new fields of study. Examples include development of chemotherapies, radiology amplified treatments, and epidemiology for identifying carcinogens. Pathology continues to help deepen the understanding of the disease’s manifestations.

The discipline of computer science is a relatively new entrant in the quest to understand, treat and cure cancer. Computer science is needed to decipher how certain variations in our DNA relate to cancer and what treatment paths have the greatest potential for success for each individual. This type of task is best suited for computer science because the study of DNA, typically referred to as genomics, requires significant big data processing capabilities. In fact, scientists predict genomics will generate more digital information than astronomy, YouTube, and Twitter by 2025.

Today, much of the software developed to collect, analyze and visualize this data is created in silos among different IT systems, research departments, health care institutions, and even nations. This separation greatly hinders the speed of scientific discovery.

Researchers at Fred Hutchinson Cancer Research Center in Seattle wanted to change this. Led by Eric Holland, M.D., Ph.D., director of the Human Biology Division and Solid Tumor Translational Research at Fred Hutch, the team developed Oncoscape, an open-source web application to apply and develop analysis tools for molecular and clinical data. Oncoscape enables researchers to discover new patterns and relationships, which further cancer research.

To utilize current technology in computer science, the Oncoscape team collaborated with GitHub and AWS. The goal was to leverage the code-sharing platform that GitHub provides with the cloud computing capabilities that AWS offers. According to Dr. Holland:

Hosting Oncoscape in the cloud makes it easy for our development team to make changes and redeploy the software in order to keep up with the needs of the research community. Knowing I can securely access the site from anywhere in the world allows me to show collaborators what is possible with data visualization and how we can use a common platform to work together in cancer research.

Robert McDermott, the IT Solutions Architect behind the AWS deployment of Oncoscape explains: “AWS is a very capable, reliable and flexible platform that allows us to quickly adapt to the needs of the project.” He cites maturity, reliability, breadth and depth of services and security as the key drivers for using AWS.

Oncoscape uses several AWS services including Amazon EC2, Elastic Load Balancing, Amazon CloudWatch, and Amazon S3. This approach makes it easy to distribute traffic across physical locations (Availability Zones), as well as quickly obtain actionable notifications in the event of a site issue. Amazon Route 53 has also proven useful for quickly making modifications to the development environment.

The diagram below depicts the full Oncoscape integration and deployment pipeline, including the merger points between GitHub, CircleCI, Docker Hub, Slack, and AWS.

To learn more about the collaboration behind the Oncoscape project please watch this video or visit the Oncoscape home page.

— Jessica Beegle (Global Healthcare & Life Sciences Ecosystem Leader) and Christopher Crosbie (Healthcare and Life Science Solution Architect)

Experiment that Discovered the Higgs Boson Uses AWS to Probe Nature

by Jeff Barr | on 30 MAR 2016 | in Amazon EC2, Case Studies, Guest Post

My colleague Sanjay Padhi is part of the AWS Scientific Computing team. He wrote the guest post below to share the story of how AWS provided computational resources that aided in an important scientific discovery.

— Jeff;

The Higgs boson (sometimes referred to as the God Particle), responsible for providing insight into the origin of mass, was discovered in 2012 by the world’s largest experiments, ATLAS and CMS, at the Large Hadron Collider (LHC) at CERN in Geneva, Switzerland. The theorists behind this discovery were awarded the 2013 Nobel Prize in Physics.

Deep underground on the border between France and Switzerland, the LHC is the world’s largest (17 miles in circumference) and highest-energy particle accelerator. It explores nature on smaller scales than any human invention has ever explored before.

From Experiment to Raw Data
The high energy particle collisions turn mass into energy, which then turns back into mass, creating new particles that are observed in the CMS detector. This detector is 69 feet long, 49 feet wide and 49 feet high, and sits in a cavern 328 feet underground near the village of Cessy in France. The raw data from the CMS is recorded every 25 nanoseconds at a rate of approximately 1 petabyte per second.
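To get a feel for those numbers, the short Python calculation below works out how many recordings per second a 25-nanosecond interval implies, and roughly how much raw data each recording would carry at 1 petabyte per second. It assumes a decimal petabyte (10^15 bytes), which the text does not specify, and it treats the quoted rate as exact even though it is described as approximate.

```python
# Figures quoted in the text; the decimal petabyte is an assumption.
BYTES_PER_PETABYTE = 10**15
RAW_RATE_BYTES_PER_SECOND = 1 * BYTES_PER_PETABYTE
RECORDING_INTERVAL_NS = 25
NS_PER_SECOND = 10**9

# One recording every 25 ns.
recordings_per_second = NS_PER_SECOND // RECORDING_INTERVAL_NS

# Raw data attributable to each recording at the quoted rate.
bytes_per_recording = RAW_RATE_BYTES_PER_SECOND // recordings_per_second

print(f"recordings per second: {recordings_per_second:,}")
print(f"bytes per recording:   {bytes_per_recording:,}")
```

Under these assumptions the result is 40 million recordings per second at about 25 megabytes each.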

After online and offline processing of the raw data at the CERN Tier 0 data center, the datasets are distributed to 7 large Tier 1 data centers across the world within 48 hours, ready for further processing and analysis by scientists (the CMS collaboration, one of the largest in the world, consists of more than 3,000 participating members from over 180 institutes and universities in 43 countries).

Processing at Fermilab
Fermilab is one of 16 National Laboratories operated by the United States Department of Energy. Located just outside Batavia, Illinois, Fermilab serves as one of the Tier 1 data centers for CERN’s CMS experiment.

With the increase in LHC collision energy last year, the demand for data assimilation, event simulations, and large-scale computing increased as well. With this increase came a desire to maximize cost efficiency by dynamically provisioning resources on an as-needed basis.

In order to address this issue, the Fermilab Scientific Computing Division launched the HEP (High Energy Physics) Cloud project in June of 2015. They planned to develop a virtual facility that would provide a common interface to access a variety of computing resources including commercial clouds. Using AWS, the HEP Cloud project successfully demonstrated the ability to add 58,000 cores elastically to their on-premises facility for the CMS experiment.

The image below depicts one of the simulations that was run on AWS. It shows how the collision of two protons creates energy that then becomes new particles.

The additional 58,000 cores represent a 4x increase in Fermilab’s computational capacity, all of which is dedicated to the CMS experiment in order to generate and reconstruct Monte Carlo simulation events. More than 500 million events were fully simulated in 10 days using 2.9 million jobs. Without help from AWS, this job would have taken 6 weeks to complete using the on-premises compute resources at Fermilab.
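As a quick sanity check, the short sketch below works through the arithmetic behind these figures. It uses only the numbers quoted above (treating "more than 500 million" as exactly 500 million); the variable names are illustrative, and the comparison of durations is a rough cross-check, not a description of how Fermilab measured capacity.

```python
# Figures quoted in the post; variable names are illustrative.
EVENTS = 500_000_000       # "more than 500 million events" (lower bound)
JOBS = 2_900_000           # "2.9 million jobs"
DAYS_WITH_AWS = 10         # duration of the run with the extra cores
DAYS_ON_PREM = 6 * 7       # "6 weeks" on the on-premises resources alone

events_per_job = EVENTS / JOBS
events_per_day = EVENTS / DAYS_WITH_AWS
jobs_per_day = JOBS / DAYS_WITH_AWS
speedup = DAYS_ON_PREM / DAYS_WITH_AWS

print(f"events per job : about {events_per_job:.0f}")
print(f"events per day : {events_per_day:,.0f}")
print(f"jobs per day   : {jobs_per_day:,.0f}")
print(f"time ratio     : {speedup:.1f}x (6 weeks vs. 10 days)")
```

The ratio of the two durations comes out at roughly 4x, which is in line with the capacity increase quoted above.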

This simulation was done in preparation for one of the major high energy physics international conferences, Rencontres de Moriond. Physicists across the world will use these simulations to probe nature in detail and will share their findings with their international colleagues during the conference.

Saving Money with HEP Cloud
The HEP Cloud project aims to minimize the costs of computation. The R&D and demonstration effort was supported by an award from the AWS Cloud Credit for Research.

HEP Cloud’s decision engine, the brain of the facility, has several duties. It oversees EC2 Spot Market price fluctuations using tools and techniques provided by Amazon’s Spot team, initializes Amazon EC2 instances using HTCondor, tracks the DNS names of the instances using Amazon Route 53, and makes use of AWS CloudFormation templates for infrastructure as code.
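To make the idea of a price-aware decision engine more concrete, here is a minimal sketch of one possible heuristic: fill a demand for cores from the cheapest Spot offers first, subject to a price ceiling. This is not HEP Cloud's actual algorithm, which the post does not describe. The SpotOffer type, the plan_purchases function, the prices, and the availability counts are all hypothetical.

```python
from typing import NamedTuple


class SpotOffer(NamedTuple):
    """One hypothetical Spot quote; not an AWS API type."""
    zone: str               # Availability Zone name
    instance_type: str
    cores: int              # vCPUs per instance
    price_per_hour: float   # made-up Spot price in USD
    available: int          # instances assumed obtainable at that price


def plan_purchases(offers, cores_needed, max_price_per_core_hour):
    """Greedy plan: take the cheapest price-per-core offers first.

    Returns (plan, unmet_cores), where plan is a list of (offer, count).
    """
    plan = []
    remaining = cores_needed
    ranked = sorted(offers, key=lambda o: o.price_per_hour / o.cores)
    for offer in ranked:
        if remaining <= 0:
            break
        if offer.price_per_hour / offer.cores > max_price_per_core_hour:
            break  # all later offers cost at least as much per core
        wanted = -(-remaining // offer.cores)   # ceiling division
        count = min(wanted, offer.available)
        if count > 0:
            plan.append((offer, count))
            remaining -= count * offer.cores
    return plan, max(remaining, 0)


# Hypothetical quotes for illustration only.
offers = [
    SpotOffer("us-east-1a", "c3.8xlarge", 32, 0.40, 100),
    SpotOffer("us-west-2b", "c3.8xlarge", 32, 0.32, 50),
    SpotOffer("us-west-2a", "m3.2xlarge", 8, 0.20, 500),
]
plan, unmet = plan_purchases(offers, cores_needed=4000,
                             max_price_per_core_hour=0.02)
for offer, count in plan:
    print(f"{count:>4} x {offer.instance_type} in {offer.zone}")
print("unmet cores:", unmet)
```

A production engine would also have to weigh interruption risk, data locality, and changing prices over time; the greedy pass above only illustrates the cost-ranking step.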

While on the road to success, the project team had to overcome several challenges, ranging from fine-tuning configurations to optimizing their use of Amazon S3 and other resources. For example, they devised a strategy to distribute the auxiliary data across multiple AWS Regions in order to minimize storage costs and data-access latency.

Automatic Scaling into AWS
The figure below shows elastic, automatic expansion of Fermilab’s Computing Facility into the AWS Cloud using Spot instances for CMS workflows. Monitoring of the resources was done using open source software provided by Grafana with custom modifications provided by the HEP Cloud.

Panagiotis Spentzouris (head of the Scientific Computing Division at Fermilab) told me:

Modern HEP experiments require massive computing resources in irregular cycles, so it is imperative for the success of our program that our computing facilities can rapidly expand and contract resources to match demand. Using commercial clouds is an important ingredient for achieving this goal, and our work with AWS on the CMS experiment’s workloads through HEPCloud was a great success in demonstrating the value of this approach.

I hope that you enjoyed this brief insight into the ways in which AWS is helping to explore the frontiers of physics!

— Sanjay Padhi, Ph.D, AWS Scientific Computing

New – Change Sets for AWS CloudFormation

by Jeff Barr | on 29 MAR 2016 | in AWS CloudFormation | Permalink | Comments

AWS CloudFormation lets you create, manage, and update a collection of AWS resources (a “stack”) in a controlled, predictable manner. Every day, customers use CloudFormation to perform hundreds of thousands of updates to the stacks that support their production workloads. They define an initial template and then revise it as their requirements change.

This model, commonly known as infrastructure as code, gives developers, architects, and operations teams detailed control of the provisioning and configuration of their AWS resources. This detailed level of control and accountability is one of the most visible benefits that you get when you use CloudFormation. However, there are several others that are less visible but equally important:

Consistency – The CloudFormation team works with the AWS teams to make sure that newly added resource models have consistent semantics for creating, updating, and deleting resources. They take care to account for retries, idempotency, and management of related resources such as KMS keys for encrypting EBS or RDS volumes.

Stability – In any distributed system, issues related to eventual consistency often arise and must be dealt with. CloudFormation is intimately aware of these issues and automatically waits for any necessary propagation to complete before proceeding. In many cases the CloudFormation team works with the service teams to ensure that their APIs and success signals are properly tuned for use with CloudFormation.

Uniformity – CloudFormation will choose between in-place updates and resource replacement when you make updates to your stacks.

All of this work takes time, and some of it cannot be completely tested until the relevant services have been launched or updated.

Improved Support for Updates
As I mentioned earlier, many AWS customers use CloudFormation to manage updates to their production stacks. They edit their existing template (or create a new one) and then use CloudFormation’s Update Stack operation to activate the changes.

Many of our customers have asked us for additional insight into the changes that CloudFormation is planning to perform when it updates a stack in accord with the more recent template and/or parameter values. They want to be able to preview the changes, verify that they are in line with their expectations, and proceed with the update.

In order to support this important CloudFormation use case, we are introducing the concept of a change set. You create a change set by submitting changes against the stack you want to update. CloudFormation compares the stack to the new template and/or parameter values and produces a change set that you can review and then choose to apply (execute).

Beyond offering more insight into potential changes, this new model also opens the door to additional control over updates. You can use IAM to control access to specific CloudFormation functions such as UpdateStack, CreateChangeSet, DescribeChangeSet, and ExecuteChangeSet. You could allow a large group of developers to create and preview change sets, and restrict execution to a smaller and more experienced group. With some additional automation, you could raise alerts or seek additional approvals for changes to key resources such as database servers or networks.
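The split between previewing and executing could be expressed as an IAM policy along the following lines. This is a sketch under assumptions: the helper name, the account ID, and the stack name are hypothetical, and a real policy would likely need further permissions (for example, to read the template or to describe stacks). The action names correspond to the CloudFormation functions listed above.

```python
import json


def change_set_reviewer_policy(stack_arn_pattern):
    """Build a policy document that allows creating and previewing
    change sets but explicitly denies executing them or updating the
    stack directly. (Hypothetical helper, for illustration only.)"""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowPreview",
                "Effect": "Allow",
                "Action": [
                    "cloudformation:CreateChangeSet",
                    "cloudformation:DescribeChangeSet",
                    "cloudformation:ListChangeSets",
                ],
                "Resource": stack_arn_pattern,
            },
            {
                "Sid": "DenyExecution",
                "Effect": "Deny",
                "Action": [
                    "cloudformation:ExecuteChangeSet",
                    "cloudformation:UpdateStack",
                ],
                "Resource": stack_arn_pattern,
            },
        ],
    }


# The account ID and stack name below are placeholders.
policy = change_set_reviewer_policy(
    "arn:aws:cloudformation:us-east-1:123456789012:stack/my-lamp-stack/*"
)
print(json.dumps(policy, indent=2))
```

The smaller, more experienced group would receive a separate policy that allows cloudformation:ExecuteChangeSet on the same stacks.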

Using Change Sets
Let’s walk through the steps involved in working with change sets. As usual, you can get to the same functions using the AWS Command Line Interface (CLI), AWS Tools for Windows PowerShell, and the CloudFormation API.

I started by creating a stack that runs a LAMP stack on a single EC2 instance. Here are the resources that it created:

Then I decided to step up to a more complex architecture. One of my colleagues shared a suitable template with me. Using the “trust but verify” model, I created a change set in order to see what would happen were I to use the template. I clicked on Create Change Set:

Then I uploaded the new template and assigned a name to the change set. If the template made use of parameters, I could have entered values for them at this point.

At this point I had the option to modify the existing tags and to add new ones. I also had the option to set up advanced options for the stack (none of these will apply until I actually execute the change set, of course):

After another click or two to confirm my intent, the console analyzed the template, checked the results against the stack, and displayed the list of changes:

At this point I can click on Execute to effect the changes. I can also leave the change set as-is, or create several others in order to explore some alternate paths forward. When I am ready to go, I can locate the change set and execute it:

CloudFormation springs to action and implements the changes per the change set:

A few minutes later my new stack configuration was in place and fully operational:

And there you have it! As I mentioned earlier, I can create and inspect multiple change sets before choosing the one that I would like to execute. When I do this, the other change sets are no longer meaningful and are discarded.
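The console steps above can also be sketched against the CloudFormation API. The functions below take a client object as a parameter (in practice something like boto3.client("cloudformation"), which is an assumption about your tooling) and combine the preview with the kind of approval gate mentioned earlier. The function names and the set of "key" resource types are hypothetical, and the sketch ignores pagination and capabilities such as CAPABILITY_IAM.

```python
# Resource types treated as "key resources" here; this set is an assumption.
KEY_RESOURCE_TYPES = {"AWS::RDS::DBInstance", "AWS::EC2::VPC", "AWS::EC2::Subnet"}


def preview_change_set(cfn, stack_name, change_set_name, template_body):
    """Create a change set and return the changes CloudFormation proposes."""
    cfn.create_change_set(
        StackName=stack_name,
        ChangeSetName=change_set_name,
        TemplateBody=template_body,
    )
    # Block until CloudFormation has finished computing the change set.
    cfn.get_waiter("change_set_create_complete").wait(
        StackName=stack_name, ChangeSetName=change_set_name
    )
    described = cfn.describe_change_set(
        StackName=stack_name, ChangeSetName=change_set_name
    )
    return described["Changes"]  # NextToken pagination is ignored here


def summarize(changes):
    """Return (report_lines, needs_approval) for a list of changes."""
    lines, needs_approval = [], False
    for change in changes:
        rc = change["ResourceChange"]
        flagged = rc["ResourceType"] in KEY_RESOURCE_TYPES
        needs_approval = needs_approval or flagged
        lines.append(
            f'{rc["Action"]:<7}{rc["LogicalResourceId"]} ({rc["ResourceType"]})'
            + (" [key resource]" if flagged else "")
        )
    return lines, needs_approval


def execute_if_safe(cfn, stack_name, change_set_name, changes, approved=False):
    """Execute the change set unless it touches a key resource unapproved."""
    _, needs_approval = summarize(changes)
    if needs_approval and not approved:
        return False
    cfn.execute_change_set(StackName=stack_name, ChangeSetName=change_set_name)
    return True
```

A typical flow would call preview_change_set, print the lines from summarize for review, and then call execute_if_safe once any required approval has been recorded.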

Managing Rollbacks
If a stack update fails, CloudFormation does its best to put things back the way they were before the update. The rollback operation can fail on occasion; in many cases this is due to a change that was made outside of CloudFormation’s purview. We recently launched a new option that gives you additional control over what happens next. To learn more about this option, read Continue Rolling Back an Update for AWS CloudFormation stacks in the UPDATE_ROLLBACK_FAILED state.
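Assuming the option referred to here corresponds to the ContinueUpdateRollback API action, a minimal sketch of using it from code might look like this. The function name is hypothetical, and the client is passed in as a parameter (for example, boto3.client("cloudformation")).

```python
def continue_rollback_if_needed(cfn, stack_name):
    """Resume a failed rollback; return True if a retry was requested."""
    stack = cfn.describe_stacks(StackName=stack_name)["Stacks"][0]
    if stack["StackStatus"] != "UPDATE_ROLLBACK_FAILED":
        return False  # nothing to continue
    # Fix whatever blocked the rollback (for example, a resource that was
    # changed outside of CloudFormation) before asking for a retry.
    cfn.continue_update_rollback(StackName=stack_name)
    return True
```

The status check keeps the call from being issued against stacks that are healthy or still in the middle of an update.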

Available Now
This functionality is available now and you can start using it today!

— Jeff;
