AWS Blog

Data Compression Improvements in Amazon Redshift Bring Compression Ratios Up to 4x

by Ana Visneski | in Amazon Redshift

Maor Kleider, Senior Product Manager with Amazon Redshift, wrote today’s guest post.

-Ana


Amazon Redshift is a fast, fully managed, petabyte-scale data warehousing service that makes it simple and cost-effective to analyze all of your data. Many of our customers, including Scholastic, King.com, Electronic Arts, TripAdvisor and Yelp, migrated to Amazon Redshift and achieved agility and faster time to insight, while dramatically reducing costs.

Columnar compression is an important technology in Amazon Redshift. It both helps reduce customer costs by increasing the effective storage capacity of our nodes and improves performance by reducing I/O needed to process SQL requests. Improving I/O efficiency is very important for data warehousing. Last year, our I/O enhancements doubled query throughput. Let’s talk about some of the new compression improvements we’ve recently added to Amazon Redshift.

First, in Build 1.0.1172, we added support for the Zstandard compression algorithm, which offers a good balance between a high compression ratio and speed. When applied to raw data in the standard 3 TB TPC-DS benchmark, Zstandard achieves a 65% reduction in disk space. Zstandard is broadly applicable. You can apply it to any of the following data types: SMALLINT, INTEGER, BIGINT, DECIMAL, REAL, DOUBLE PRECISION, BOOLEAN, CHAR, VARCHAR, DATE, TIMESTAMP and TIMESTAMPTZ.

Second, we’ve improved the automation of compression on tables created by the CREATE TABLE AS, CREATE TABLE or ALTER TABLE ADD COLUMN commands. Starting with Build 1.0.1161, Amazon Redshift automatically chooses a default compression for the columns created by those commands. Automated compression happens when we estimate that we can reduce disk space without degrading query performance. Our customers have seen up to 40% reduction in disk space.

Third, we’ve been optimizing our internal on-disk data structures. Our preview customers averaged a 7% reduction in disk space usage with this improvement. This feature is delivered starting with Build 1.0.1271.

Finally, we have enhanced the ANALYZE COMPRESSION command to estimate disk space reduction. You can now easily identify opportunities to further compress data and improve performance. Behind the scenes, we sample your data and suggest the most effective compression. You can then specify the recommended encodings or your preferred encodings based on your own evaluation.
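As a rough sketch of how the Zstandard encoding and the ANALYZE COMPRESSION command might be used together, the Python snippet below only builds the SQL text and does not connect to a cluster. The table and column names (`web_events` and its columns) are hypothetical and chosen purely for illustration.

```python
# Hypothetical table and columns, used only for illustration.
TABLE = "web_events"
COLUMNS = [
    ("event_id", "BIGINT"),
    ("event_time", "TIMESTAMP"),
    ("page_url", "VARCHAR(2048)"),
]


def create_table_sql(table, columns, encoding="ZSTD"):
    """Build a CREATE TABLE statement that applies one encoding to every column."""
    column_defs = ",\n".join(
        f"    {name} {sql_type} ENCODE {encoding}" for name, sql_type in columns
    )
    return f"CREATE TABLE {table} (\n{column_defs}\n);"


def analyze_compression_sql(table):
    """Build the statement that asks Redshift to sample a table and suggest encodings."""
    return f"ANALYZE COMPRESSION {table};"


if __name__ == "__main__":
    print(create_table_sql(TABLE, COLUMNS))
    print(analyze_compression_sql(TABLE))
```

Running the generated statements through your usual SQL client would create the table with Zstandard on every column and then report suggested encodings; whether Zstandard is the best choice for a given column depends on your data.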

“Before all the recent compression features, our largest table was over 7 TB. It’s now only 4.85 TB, which is an additional 30.7% reduction in disk space. This allows us to reduce our disk space by 4X in total and our effective cost to less than $250/TB/Year on an uncompressed data basis. We’re now able to analyze more data with Amazon Redshift, and our query performance has gotten even better.” Chuong Do, Director of Analytics, Coursera

Of course, the actual benefits you see on your clusters will depend upon your workload and your data. In combination, these improvements may reduce your data sets by up to 4x vs. the 3x most of our customers saw before.

You may have heard us talk about how an Amazon Redshift data warehouse can cost as little as $1,000 per terabyte per year. It is important to realize that we’re talking about compressed data in this number. After all, that’s what we store. Not all vendors do this: many compress your data under the covers but describe per-terabyte costs in terms of uncompressed data. That’s unfortunate, because a price quoted per uncompressed terabyte can look significantly lower than the same price quoted per compressed terabyte.
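The arithmetic behind these figures can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and the inputs are the rounded numbers quoted in this post (7 TB, 4.85 TB, $1,000 per compressed terabyte per year, 3x and 4x compression).

```python
def percent_reduction(before_tb, after_tb):
    """Percentage of disk space saved going from before_tb to after_tb."""
    return (before_tb - after_tb) / before_tb * 100


def cost_per_uncompressed_tb(cost_per_compressed_tb, compression_ratio):
    """Effective yearly cost for each terabyte of raw (uncompressed) data."""
    return cost_per_compressed_tb / compression_ratio


if __name__ == "__main__":
    # Coursera's largest table, from the quote above: 7 TB down to 4.85 TB.
    print(round(percent_reduction(7.0, 4.85), 1))        # 30.7

    # $1,000 per compressed TB per year at 3x and 4x compression.
    print(round(cost_per_uncompressed_tb(1000, 3), 2))   # 333.33
    print(cost_per_uncompressed_tb(1000, 4))             # 250.0
```

At a 4x ratio, $1,000 per compressed terabyte works out to $250 per uncompressed terabyte, which lines up with the figure in the customer quote.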

-Maor Kleider

New – Host-Based Routing Support for AWS Application Load Balancers

by Jeff Barr | in Amazon Elastic Load Balancer

Last year I told you about the new AWS Application Load Balancer (an important part of Elastic Load Balancing) and showed you how to set it up to route incoming HTTP and HTTPS traffic based on the path element of the URL in the request. This path-based routing allows you to route requests to, for example, /api to one set of servers (also known as target groups) and /mobile to another set. Segmenting your traffic in this way gives you the ability to control the processing environment for each category of requests. Perhaps /api requests are best processed on Compute Optimized instances, while /mobile requests are best handled by Memory Optimized instances.

Host-Based Routing & More Rules
Today we are giving you another routing option. You can now create Application Load Balancer rules that route incoming traffic based on the domain name specified in the Host header. Requests to api.example.com can be sent to one target group, requests to mobile.example.com to another, and all others (by way of a default rule) can be sent to a third. You can also create rules that combine host-based routing and path-based routing. This would allow you to route requests to api.example.com/production and api.example.com/sandbox to distinct target groups.
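To make the combination of host-based and path-based rules concrete, here is a small Python model of the routing decision. It is a simplified sketch, not the load balancer's implementation: it assumes rules are checked in order and the first match wins, it treats the path condition as a plain prefix, and the target group names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    target_group: str
    host: Optional[str] = None         # exact Host header to match, if any
    path_prefix: Optional[str] = None  # simplified path condition, if any

    def matches(self, host: str, path: str) -> bool:
        if self.host is not None and host.lower() != self.host:
            return False
        if self.path_prefix is not None and not path.startswith(self.path_prefix):
            return False
        return True


# Hypothetical target group names; rules are checked top to bottom.
RULES = [
    Rule("api-production", host="api.example.com", path_prefix="/production"),
    Rule("api-sandbox", host="api.example.com", path_prefix="/sandbox"),
    Rule("api-default", host="api.example.com"),
    Rule("mobile", host="mobile.example.com"),
]
DEFAULT_TARGET_GROUP = "web"


def route(host: str, path: str) -> str:
    """Return the target group of the first matching rule, else the default."""
    for rule in RULES:
        if rule.matches(host, path):
            return rule.target_group
    return DEFAULT_TARGET_GROUP


if __name__ == "__main__":
    print(route("api.example.com", "/production/orders"))  # api-production
    print(route("www.example.com", "/"))                   # web
```

Placing the more specific host-plus-path rules ahead of the host-only rule matters in this model, because a host-only rule listed first would capture every request for that host.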

In the past, some of our customers set up and ran a fleet of proxy servers and used them for host-based routing. With today’s launch, the proxy server fleet is no longer needed since the routing can be done using Application Load Balancer rules. Getting rid of this layer of processing will simplify your architecture and reduce operational overhead.

Application Load Balancer already provides several features that support container-based applications including port mapping, health checks, and service discovery. The ability to route on both host and path allows you to build and efficiently scale applications that are composed of multiple microservices running in individual Amazon EC2 Container Service containers. You can use host-based routing to further simplify your service discovery mechanism by aligning your service names and your container names.

As part of today’s launch we are raising the maximum number of rules per Application Load Balancer from 10 to 75, and also introducing a new rule editor. I’ll start with the following target groups:

The Load Balancing Console shows the listeners that are associated with my Application Load Balancer:

From there I simply click on View/edit rules to access the new rule editor:

I already have a default rule that forwards all requests to my web-target-production target:

I click on the Insert icon (the “+” sign) and then select a location. Rules are processed in the order that they are displayed:

I click on Insert Rule and define my new rule. Rules can reference a host, a path, or both. I’ll start with just a host:

I add two rules for host-based routing and the editor now looks like this:

If I want to route production and sandbox traffic to distinct targets, I can create new rules that reference the path. Here’s the first one:

With a few more clicks and a little typing I can create a powerful set of rules:

Rules that match the Host header can include up to three “*” (match 0 or more characters) or “?” (match 1 character) wildcards. Let’s say that I give each of my large customers a unique host name for tracking purposes. I can write rules that route all of the requests to the same target group, regardless of the final portion of the host name. Here’s a simple example:
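The wildcard semantics can also be sketched in Python. This is an illustrative model rather than the load balancer's own matcher: the host names are hypothetical, matching is assumed to be case-insensitive, and the limit of three is assumed to apply to "*" and "?" combined.

```python
import re

MAX_WILDCARDS = 3  # limit stated in the post (assumed to cover * and ? together)


def compile_host_pattern(pattern):
    """Translate a host pattern using * and ? into a compiled regular expression."""
    wildcards = pattern.count("*") + pattern.count("?")
    if wildcards > MAX_WILDCARDS:
        raise ValueError(f"at most {MAX_WILDCARDS} wildcards are allowed")
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")   # 0 or more characters
        elif ch == "?":
            parts.append(".")    # exactly 1 character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE)


def host_matches(pattern, host):
    """True when the whole host name matches the pattern."""
    return compile_host_pattern(pattern).fullmatch(host) is not None


if __name__ == "__main__":
    print(host_matches("*.customers.example.com", "acme.customers.example.com"))
    print(host_matches("app?.example.com", "app12.example.com"))
```

With a pattern such as `*.customers.example.com`, every per-customer host name resolves to the same rule and therefore the same target group.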

The pencil icon in the rule editor allows me to make changes to the rule sequence. I select rules, move them to a new position, and then save the updated sequence:

I can also edit existing rules or delete unneeded ones.

Available Now
This feature is available today in all 15 public AWS regions.

There is no extra charge for the first 10 rules (host-based, path-based, or both) evaluated by each load balancer. After that you will be charged based on the number of rule evaluations (this is a new dimension added to the Load Balancer Capacity Units (LCU) model that I described in an earlier post). Each LCU supports up to 1000 rule evaluations. We measure on all four dimensions of the LCU, but you are charged only for the dimension with the highest usage in the given hour. Rules that are configured but not processed will not be charged.
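The "highest dimension wins" model can be illustrated with a short Python sketch. The formula used for the rule-evaluation dimension (billable rules multiplied by the request rate) is an assumption made for this example, as are the dimension names and the usage figures; check the pricing page for the authoritative definition.

```python
FREE_RULES = 10                  # the first 10 rules are not charged
RULE_EVALUATIONS_PER_LCU = 1000  # each LCU supports up to 1000 rule evaluations


def rule_evaluation_lcus(rules_processed, requests_per_second):
    """LCUs consumed by the rule-evaluation dimension (assumed formula)."""
    billable_rules = max(rules_processed - FREE_RULES, 0)
    return billable_rules * requests_per_second / RULE_EVALUATIONS_PER_LCU


def billed_lcus(dimension_lcus):
    """Only the dimension with the highest usage is charged."""
    return max(dimension_lcus.values())


if __name__ == "__main__":
    # Hypothetical usage for one hour, expressed in LCUs per dimension.
    usage = {
        "new_connections": 0.4,
        "active_connections": 0.3,
        "bandwidth": 0.5,
        "rule_evaluations": rule_evaluation_lcus(60, 100),
    }
    print(billed_lcus(usage))  # 5.0
```

In this made-up hour the rule-evaluation dimension dominates, so it alone determines the charge; a load balancer that processes 10 or fewer rules contributes nothing on that dimension.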

Jeff;

 

Coming in 2018 – New AWS Region in Sweden

by Jeff Barr | in Announcements

Last year we launched new AWS Regions in Canada, India, Korea, the UK (London), and the United States (Ohio), and announced that new regions are coming to France (Paris) and China (Ningxia).

Today, I am happy to be able to tell you that we are planning to open up an AWS Region in Stockholm, Sweden in 2018. This region will give AWS partners and customers in Denmark, Finland, Iceland, Norway, and Sweden low-latency connectivity and the ability to run their workloads and store their data close to home.

The Nordics is well known for its vibrant startup community and highly innovative business climate. With successful global enterprises like ASSA ABLOY, IKEA, and Scania along with fast-growing startups like Bambora, Supercell, Tink, and Trustpilot, it comes as no surprise that Forbes ranks Sweden as the best country for business, with all the other Nordic countries in the top 10. Even better, the European Commission ranks Sweden as the most innovative country in the EU.

This will be the fifth AWS Region in Europe, joining four other Regions there: EU (Ireland), EU (London), EU (Frankfurt), and an additional Region in France expected to launch in the coming months. Together, these Regions will provide our customers with a total of 13 Availability Zones (AZs) and allow them to architect highly fault-tolerant applications while storing their data in the EU.

Today, our infrastructure comprises 42 Availability Zones across 16 geographic regions worldwide, with another three AWS Regions (and eight Availability Zones) in France, China and Sweden coming online throughout 2017 and 2018 (see the AWS Global Infrastructure page for more info).

We are looking forward to serving new and existing Nordic customers and working with partners across Europe. Of course, the new region will also be open to existing AWS customers who would like to process and store data in Sweden. Public sector organizations (government agencies, educational institutions, and nonprofits) in Sweden will be able to use this region to store sensitive data in-country (the AWS in the Public Sector page has plenty of success stories drawn from our worldwide customer base).

If you are a customer or a partner and have specific questions about this Region, you can contact our Nordic team.

Help Wanted
As part of our launch, we are hiring individual contributors and managers for IT support, electrical, logistics, and physical security positions. If you are interested in learning more, please contact awsjobs-sweden@amazon.com.

Jeff;

 

Welcome to the Newest AWS Community Heroes (Spring 2017)

by Ana Visneski | in AWS Community Heroes

We would like to extend a very warm welcome to the newest AWS Community Heroes:

AWS Community Heroes share their knowledge and demonstrate their enthusiasm for AWS in a plethora of ways. They go above and beyond to share AWS insights via social media, blog posts, open source projects, and through in-person events, user groups, and workshops.


Mark Nunnikhoven
Mark Nunnikhoven explores the impact of technology on individuals, organizations, and communities through the lens of privacy and security. Asking the question, “How can we better protect our information?” Mark studies the world of cybercrime to better understand the risks and threats to our digital world.

As the Vice President of Cloud Research at Trend Micro, a long-time Amazon Web Services Advanced Technology Partner and provider of security tools for the AWS Cloud, Mark uses that knowledge to help organizations around the world modernize their security practices by taking advantage of the power of the AWS Cloud.

With a strong focus on automation, he helps bridge the gap between DevOps and traditional security through his writing, speaking, teaching, and by engaging with the AWS community.

 

SangUk Park
SangUk Park is a Chief Solutions Architect at Megazone, which became Korea’s first AWS Partner in 2012 and is the only AWS Premier Consulting Partner to provide AWS support in Korean.

He served as a System Architect for KT’s public cloud and VDI design, and led the system operation of YDOnline and Nexon Japan, one of the leading online gaming companies. Certified both as an AWS Solutions Architect – Professional and AWS DevOps Engineer – Professional, SangUk has authored AWS books, including DevOps and AWS Cloud Design Patterns, and translated four books related to the AWS Cloud.

He’s been making efforts to revitalize the local AWS Korea User Group community as co-leader by presenting at AWS Korea User Group meetings and AWS Summits, and helping to establish small group gatherings such as the AWSKRUG System Engineers in Gangnam. Also, he has done many hands-on labs and has been running a booth as a leader of the user groups at AWS events to cultivate developers and system engineers.

SangUk maintains a close relationship with the Japanese AWS User Group (JAWS UG), using his excellent Japanese communication skills and experiences in Japan. He makes every effort to participate in events held between Japanese and Korean user groups as a facilitator and translator, and will promote cross-regional communications beyond APAC going forward.

 

James Hall
James Hall has been working in the digital sector for over a decade. He is the author of the popular jsPDF library, and is a founder/Director of Parallax, a digital agency in the UK. He’s worked as a software developer on a wide variety of projects, from LED billboards and car unlocking apps to large web applications and tools.

Parallax built an online recording studio for David Guetta and UEFA using serverless technology shortly after API Gateway was released. Since then they have consulted on various serverless projects and technologies. They run the AWS Meetup in Leeds, and help companies around the world build their businesses online. James has contributed to and promotes the Serverless Framework, which allows you to elegantly build web applications on top of Lambda and related services.

 

Drew Firment
Drew Firment works with business leaders and technology teams from organizations that seek to accelerate cloud adoption. He has over twenty years of experience leading large-scale technology programs, enterprise platforms, and cultural transformations in a fast-paced agile environment.

After migrating Capital One’s early adopters of AWS into production, his focus shifted toward accelerating a scalable and sustainable transition to cloud computing. Drew pioneered the intersection of strategy, governance, engineering, agile, and education to drive an enterprise-wide talent transformation. He founded Capital One’s cloud engineering college, and implemented an innovative outcome-based curriculum oriented towards learning communities. Several thousand employees have enrolled in his cloud-fluency program, enabling well over 1,000 AWS certifications since its inception.

Drew has earned all three of the AWS associate-level certifications, enjoys developing custom Amazon Alexa skills using AWS Lambda, and believes serverless is the future of cloud computing. He also serves as an advisory partner to A Cloud Guru and is editor-in-chief of their community-sourced publication.

Welcome
Please join me in welcoming our newest AWS Community Heroes!

-Ana

AWS Hot Startups – March 2017

by Ana Visneski | in Startups

As the madness of March rounds up, take a break from all the basketball and check out the cool startups Tina Barr brings you for this month!

-Ana


The arrival of spring brings five new startups this month:

  • Amino Apps – providing social networks for hundreds of thousands of communities.
  • Appboy – empowering brands to strengthen customer relationships.
  • Arterys – revolutionizing the medical imaging industry.
  • Protenus – protecting patient data for healthcare organizations.
  • Syapse – improving targeted cancer care with shared data from across the country.

In case you missed them, check out February’s hot startups here.

Amino Apps (New York, NY)
Amino Apps was founded on the belief that interest-based communities were underdeveloped and outdated, particularly when it came to mobile. CEO Ben Anderson and CTO Yin Wang created the app to give users access to hundreds of thousands of communities, each of them a complete social network dedicated to a single topic. Some of the largest communities have over 1 million members and are built around topics like popular TV shows, video games, sports, and an endless number of hobbies and other interests. Amino hosts communities from around the world and is currently available in six languages with many more on the way.

Navigating the Amino app is easy. Simply download the app (iOS or Android), sign up with a valid email address, choose a profile picture, and start exploring. Users can search for communities and join any that fit their interests. Each community has chatrooms, multimedia content, quizzes, and a seamless commenting system. If a community doesn’t exist yet, users can create it in minutes using the Amino Creator and Manager app (ACM). The largest user-generated communities are turned into their own apps, which gives communities their own piece of real estate on members’ phones, as well as in app stores.

Amino’s vast global network of hundreds of thousands of communities is run on AWS services. Every day users generate, share, and engage with an enormous amount of content across hundreds of mobile applications. By leveraging AWS services including Amazon EC2, Amazon RDS, Amazon S3, Amazon SQS, and Amazon CloudFront, Amino can continue to provide new features to their users while scaling their service capacity to keep up with user growth.

Interested in joining Amino? Check out their jobs page here.

Appboy (New York, NY)
In 2011, Bill Magnuson, Jon Hyman, and Mark Ghermezian saw a unique opportunity to strengthen and humanize relationships between brands and their customers through technology. The trio created Appboy to empower brands to build long-term relationships with their customers and today they are the leading lifecycle engagement platform for marketing, growth, and engagement teams. The team recognized that as rapid mobile growth became undeniable, many brands were becoming frustrated with the lack of compelling and seamless cross-channel experiences offered by existing marketing clouds. Many of today’s top mobile apps and enterprise companies trust Appboy to take their marketing to the next level. Appboy manages user profiles for nearly 700 million monthly active users, and is used to power more than 10 billion personalized messages monthly across a multitude of channels and devices.

Appboy creates a holistic user profile that offers a single view of each customer. That user profile in turn powers contextual cross-channel messaging, lifecycle engagement automation, and robust campaign insights and optimization opportunities. Appboy offers solutions that allow brands to create push notifications, targeted emails, in-app and in-browser messages, news feed cards, and webhooks to enhance the user experience and increase customer engagement. The company prides itself on its interoperability, connecting to a variety of complementary marketing tools and technologies so brands can build the perfect stack to enable their strategies and experiments in real time.

AWS makes it easy for Appboy to dynamically size all of their service components and automatically scale up and down as needed. They use an array of services including Elastic Load Balancing, AWS Lambda, Amazon CloudWatch, Auto Scaling groups, and Amazon S3 to help scale capacity and better deal with unpredictable customer loads.

To keep up with the latest marketing trends and tactics, visit the Appboy digital magazine, Relate. Appboy was also recently featured in the #StartupsOnAir video series where they gave insight into their AWS usage.

Arterys (San Francisco, CA)
Getting test results back from a physician can often be a time-consuming and tedious process. Clinicians typically employ a variety of techniques to manually measure medical images and then make their assessments. Arterys founders Fabien Beckers, John Axerio-Cilies, Albert Hsiao, and Shreyas Vasanawala realized that much more computation and advanced analytics were needed to harness all of the valuable information in medical images, especially those generated by MRI and CT scanners. Clinicians were often skipping measurements and making assessments based mostly on qualitative data. Their solution was to start a cloud/AI software company focused on accelerating data-driven medicine with advanced software products for post-processing of medical images.

Arterys’ products provide timely, accurate, and consistent quantification of images, improve speed to results, and improve the quality of the information offered to the treating physician. This allows for much better tracking of a patient’s condition, and thus better decisions about their care. Advanced analytics, such as deep learning and distributed cloud computing, are used to process images. The first Arterys product can contour cardiac anatomy as accurately as experts, but takes only 15-20 seconds instead of the 45-60 minutes required to do it manually. Their computing cloud platform is also fully HIPAA compliant.

Arterys relies on a variety of AWS services to process their medical images. Using deep learning and other advanced analytic tools, Arterys is able to render images without latency over a web browser using AWS G2 instances. They use Amazon EC2 extensively for all of their compute needs, including inference and rendering, and Amazon S3 is used to archive images that aren’t needed immediately, as well as manage costs. Arterys also employs Amazon Route 53, AWS CloudTrail, and Amazon EC2 Container Service.

Check out this quick video about the technology that Arterys is creating. They were also recently featured in the #StartupsOnAir video series and offered a quick demo of their product.

Protenus (Baltimore, MD)
Protenus founders Nick Culbertson and Robert Lord were medical students at Johns Hopkins Medical School when they saw first-hand how Electronic Health Record (EHR) systems could be used to improve patient care and share clinical data more efficiently. With increased efficiency came a huge issue – an onslaught of serious security and privacy concerns. Over the past two years, 140 million medical records have been breached, meaning that approximately 1 in 3 Americans have had their health data compromised. Health records contain a repository of sensitive information and a breach of that data can cause major havoc in a patient’s life – namely identity theft, prescription fraud, Medicare/Medicaid fraud, and improper performance of medical procedures. Using their experience and knowledge from former careers in the intelligence community and involvement in a leading hedge fund, Nick and Robert developed the prototype and algorithms that launched Protenus.

Today, Protenus offers a number of solutions that detect breaches and misuse of patient data for healthcare organizations nationwide. Using advanced analytics and AI, Protenus’ health data insights platform understands appropriate vs. inappropriate use of patient data in the EHR. It also protects privacy, aids compliance with HIPAA regulations, and ensures trust for patients and providers alike.

Protenus built and operates its SaaS offering atop Amazon EC2, where Dedicated Hosts and encrypted Amazon EBS volumes are used to ensure compliance with HIPAA regulations for the storage of Protected Health Information. They use Elastic Load Balancing and Amazon Route 53 for DNS, enabling unique, secure, client-specific access points to their Protenus instance.

To learn more about threats to patient data, read Hospitals’ Biggest Threat to Patient Data is Hiding in Plain Sight on the Protenus blog. Also be sure to check out their recent video in the #StartupsOnAir series for more insight into their product.

Syapse (Palo Alto, CA)
Syapse provides a comprehensive software solution that enables clinicians to treat patients with precision medicine for targeted cancer therapies: treatments that are designed and chosen using genetic or molecular profiling. Existing hospital IT doesn’t support the robust infrastructure and clinical workflows required to treat patients with precision medicine at scale, but Syapse centralizes and organizes patient data for clinicians at the point of care. Syapse offers a variety of solutions for oncologists that allow them to access the full scope of patient data longitudinally, view recommended treatments or clinical trials for similar patients, and track outcomes over time. These solutions are helping health systems across the country to improve patient outcomes by offering the most innovative care to cancer patients.

Leading health systems such as Stanford Health Care, Providence St. Joseph Health, and Intermountain Healthcare are using Syapse to improve patient outcomes, streamline clinical workflows, and scale their precision medicine programs. A group of experts known as the Molecular Tumor Board (MTB) reviews complex cases and evaluates patient data, documents notes, and disseminates treatment recommendations to the treating physician. Syapse also provides reports that give health system staff insight into their institution’s oncology care, which can be used toward quality improvement, business goals, and understanding variables in the oncology service line.

Syapse uses Amazon Virtual Private Cloud, Amazon EC2 Dedicated Instances, and Amazon Elastic Block Store to build a high-performance, scalable, and HIPAA-compliant data platform that enables health systems to make precision medicine part of routine cancer care for patients throughout the country.

Be sure to check out the Syapse blog to learn more, and also their recent video in the #StartupsOnAir video series, where they discuss their product, HIPAA compliance, and more about how they are using AWS.

Thank you for checking out another month of awesome hot startups!

-Tina Barr

 

New – AWS Resource Tagging API

by Jeff Barr | in Announcements, Developers

AWS customers frequently use tags to organize their Amazon EC2 instances, Amazon EBS volumes, Amazon S3 buckets, and other resources. Over the past couple of years we have been working to make tagging more useful and more powerful. For example, we have added support for tagging during Auto Scaling, the ability to use up to 50 tags per resource, console-based support for the creation of resources that share a common tag (also known as resource groups), and the option to use Config Rules to enforce the use of tags.

As customers grow to the point where they are managing thousands of resources, each with up to 50 tags, they have been looking to us for additional tooling and options to simplify their work. Today I am happy to announce that our new Resource Tagging API is now available. You can use these APIs from the AWS SDKs or via the AWS Command Line Interface (CLI). You now have programmatic access to the same resource group operations that had been accessible only from the AWS Management Console.

Recap: Console-Based Resource Group Operations
Before I get into the specifics of the new API functions, I thought you would appreciate a fresh look at the console-based grouping and tagging model. I already have the ability to find and then tag AWS resources using a search that spans one or more regions. For example, I can select a long list of regions and then search them for my EC2 instances like this:

After I locate and select all of the desired resources, I can add a new tag key by clicking Create a new tag key and entering the desired tag key:

Then I enter a value for each instance (the new ProjectCode column):

Then I can create a resource group that contains all of the resources that are tagged with P100:

After I have created the resource group, I can locate all of the resources by clicking on the Resource Groups menu:

To learn more about this feature, read Resource Groups and Tagging for AWS.

New API for Resource Tagging
The Resource Tagging API that we are announcing today gives you the power to tag, untag, and locate resources using tags, all from your own code. With these new API functions, you are now able to operate on multiple resource types with a single set of functions.

Here are the new functions:

TagResources – Add tags to up to 20 resources at a time.

UntagResources – Remove tags from up to 20 resources at a time.

GetResources – Get a list of resources, with optional filtering by tags and/or resource types.

GetTagKeys – Get a list of all of the unique tag keys used in your account.

GetTagValues – Get all tag values for a specified tag key.
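Because TagResources accepts at most 20 resources per call, code that tags a larger set needs to split the work into batches. The sketch below shows one way to do that in Python. The helper names are made up for this example, and `client` is assumed to be a boto3 `resourcegroupstaggingapi` client (or any object exposing a compatible `tag_resources` method).

```python
MAX_ARNS_PER_CALL = 20  # TagResources accepts up to 20 resources at a time


def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def tag_in_batches(client, resource_arns, tags):
    """Apply `tags` to every ARN, 20 at a time; return the resources that failed."""
    failed = {}
    for batch in chunked(resource_arns, MAX_ARNS_PER_CALL):
        response = client.tag_resources(ResourceARNList=batch, Tags=tags)
        # The response maps each ARN that could not be tagged to error details.
        failed.update(response.get("FailedResourcesMap", {}))
    return failed


# Usage (needs boto3, AWS credentials, and a configured region):
#   import boto3
#   client = boto3.client("resourcegroupstaggingapi")
#   tag_in_batches(client, my_arns, {"ProjectCode": "P100"})
```

Passing the client in as an argument keeps the helper easy to exercise with a stub, and the same batching approach would apply to UntagResources.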

These functions support the following AWS services and resource types:

AWS Service: Resource Types
Amazon CloudFront: Distribution.
Amazon EC2: AMI, Customer Gateway, DHCP Option, EBS Volume, Instance, Internet Gateway, Network ACL, Network Interface, Reserved Instance, Reserved Instance Listing, Route Table, Security Group – EC2 Classic, Security Group – VPC, Snapshot, Spot Batch, Spot Instance Request, Spot Instance, Subnet, Virtual Private Gateway, VPC, VPN Connection.
Amazon ElastiCache: Cluster, Snapshot.
Amazon Elastic File System: Filesystem.
Amazon Elasticsearch Service: Domain.
Amazon EMR: Cluster.
Amazon Glacier: Vault.
Amazon Inspector: Assessment.
Amazon Kinesis: Stream.
Amazon Machine Learning: Batch Prediction, Data Source, Evaluation, ML Model.
Amazon Redshift: Cluster.
Amazon Relational Database Service: DB Instance, DB Option Group, DB Parameter Group, DB Security Group, DB Snapshot, DB Subnet Group, Event Subscription, Read Replica, Reserved DB Instance.
Amazon Route 53: Domain, Health Check, Hosted Zone.
Amazon S3: Bucket.
Amazon WorkSpaces: WorkSpace.
AWS Certificate Manager: Certificate.
AWS CloudHSM: HSM.
AWS Directory Service: Directory.
AWS Storage Gateway: Gateway, Virtual Tape, Volume.
Elastic Load Balancing: Load Balancer, Target Group.

Things to Know
Here are a couple of things to keep in mind when you build code or write scripts that use the new API functions or the CLI equivalents:

Compatibility – The older, service-specific functions remain available and you can continue to use them.

Write Permission – The new tagging API adds another layer of permission on top of existing policies that are specific to a single AWS service. For example, you will need to have access to tag:TagResources and ec2:CreateTags in order to add a tag to an EC2 instance.

Read Permission – You will need to have access to tag:GetResources, tag:GetTagKeys, and tag:GetTagValues in order to call functions that access tags and tag values.

Pricing – There is no charge for the use of these functions or for tags.
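The read and write permissions described above could be granted with an IAM policy along the following lines. This is a sketch built as a Python dictionary so it can be printed as JSON: the statement IDs are made up, and `"Resource": "*"` is an assumption chosen for brevity that you would normally want to narrow.

```python
import json

# Illustrative policy only; Sid values and the wildcard resource are assumptions.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ReadTags",
            "Effect": "Allow",
            "Action": ["tag:GetResources", "tag:GetTagKeys", "tag:GetTagValues"],
            "Resource": "*",
        },
        {
            "Sid": "TagEc2Instances",
            "Effect": "Allow",
            # Both the tagging API action and the service-specific action are needed.
            "Action": ["tag:TagResources", "ec2:CreateTags"],
            "Resource": "*",
        },
    ],
}

if __name__ == "__main__":
    print(json.dumps(policy, indent=2))
```

Tagging other resource types through the new API would need the matching service-specific action added alongside `tag:TagResources`.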

Available Now
The new functions are supported by the latest versions of the AWS SDKs. You can use them to tag and access resources in all commercial AWS regions.

Jeff;

 

Amazon CloudWatch launches Alarms on Dashboards

by Tara Walker | in Amazon CloudWatch, Launch

Amazon CloudWatch is a service that gives customers the ability to monitor their applications, systems, and solutions running on Amazon Web Services by providing and collecting metrics, logs, and events about AWS resources in real time. CloudWatch automatically provides key resource measurements such as latency, error rates, and CPU usage, while also enabling monitoring of custom metrics via customer-supplied logs and system data.

Last November, Amazon CloudWatch added new Dashboard Widgets to provide additional data visualization options for all available metrics. In order to provide customers with even more insight into their solutions and resources running on AWS, CloudWatch has launched Alarms on Dashboards. With this alarms enhancement, customers can view alarms and metrics in the same dashboard widget, enabling them to perform data-driven troubleshooting and analysis.

CloudWatch dashboards are designed to provide better visibility when monitoring AWS resources across regions in a consolidated view. Since CloudWatch dashboards are highly customizable, users can create their own custom dashboards to graphically represent data for varying metrics such as utilization, performance, estimated billing, and now alarm conditions. An alarm tracks a single metric over time based on the value of the metric in relation to a specified threshold. When the alarm state changes, an action such as an Auto Scaling policy is executed or a notification is sent to Amazon SNS, among other options.

With the ability to add alarms to dashboards, CloudWatch users have another mechanism to proactively monitor and receive alerts about their AWS resources and applications across multiple regions. In addition, the metric data associated with an alarm, which has been added to a dashboard, can be charted and reviewed. Alarms have three possible states:

  • OK: The value of the alarm metric does not meet the threshold
  • INSUFFICIENT_DATA: The alarm has just started, or the metric does not yet have enough data to determine whether the alarm is in the OK state or the ALARM state
  • ALARM: The value of the alarm metric meets the threshold
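To make the three states concrete, here is a minimal sketch of how a single data point might be mapped to a state. It is a simplified model rather than CloudWatch's actual evaluation logic: the function name is hypothetical, and evaluation periods and missing-data settings are left out.

```javascript
// Simplified model of a single alarm evaluation. Real CloudWatch alarms also
// take evaluation periods and missing-data handling into account; both are
// omitted here for brevity.
function evaluateAlarmState(latestValue, threshold, comparison = 'GreaterThanOrEqualToThreshold') {
  // No usable data point yet: the state cannot be determined.
  if (typeof latestValue !== 'number' || Number.isNaN(latestValue)) {
    return 'INSUFFICIENT_DATA';
  }
  const checks = {
    GreaterThanThreshold: (v, t) => v > t,
    GreaterThanOrEqualToThreshold: (v, t) => v >= t,
    LessThanThreshold: (v, t) => v < t,
    LessThanOrEqualToThreshold: (v, t) => v <= t
  };
  if (!Object.prototype.hasOwnProperty.call(checks, comparison)) {
    throw new Error('Unsupported comparison: ' + comparison);
  }
  // The metric "meets the threshold" when the comparison holds.
  return checks[comparison](latestValue, threshold) ? 'ALARM' : 'OK';
}

console.log(evaluateAlarmState(92, 80));        // ALARM
console.log(evaluateAlarmState(35, 80));        // OK
console.log(evaluateAlarmState(undefined, 80)); // INSUFFICIENT_DATA
```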

When added to a dashboard, alarms are displayed in red when in the Alarm state, gray when in the Insufficient data state, and with no color fill when in the OK state. Alarms on dashboards are supported with the following widgets: Line, Number, and Stacked area.

  • Number widget: provides a quick view of the latest value of any desired metric. When used with an alarm, the widget shows the alarm state as a background color behind the latest metric value.
  • Line widget: visualizes the actual value of any collection of chosen metrics. When used with an alarm, it displays the alarm threshold and condition as a horizontal line, which makes it easy to see how far the metric is from the threshold.
  • Stacked area widget: visualizes the net total effect of any collection of chosen metrics. The widget stacks one metric on top of another to illustrate the distribution and contribution of each metric, and can optionally display contributions as percentages. When used with an alarm, it also displays the alarm threshold and condition as a horizontal line.

Currently, adding multiple metrics onto the same widget for an alarm is in the works and this feature is evolving based on customer feedback.

Adding Alarms on Dashboards

Let’s take a quick look at using Alarms on a CloudWatch dashboard. In the AWS Console, I will go to the CloudWatch service. Once in the CloudWatch console, I select Dashboards. I will click the Create dashboard button and create the CloudWatchBlog dashboard.

 

Upon creation of my CloudWatchBlog dashboard, a dialog box will open to allow me to add widgets to the dashboard. I will forego adding widgets for now since I want to focus on adding alarms on my dashboard. Therefore, I will hit the Cancel button here and go to the Alarms section of the CloudWatch console.

Once in the Alarms section of the CloudWatch console, you will see all of your alarms and the state of each of the alarms for the current region displayed.

As mentioned earlier, there are three alarm states, and as you can see in my console, alarms in all of the different states are displayed. If desired, you can adjust the filter on the console to display alarms by alarm state.

As an example, I am only interested in viewing the alarms with an alarm state of ALARM. Therefore, I will adjust the filter to show only the alarms in the current region with an alarm state of ALARM.

Now only the two alarms that have a current alarm state of ALARM are displayed. One of these alarms is for monitoring the provisioned write capacity units of an Amazon DynamoDB table, and the other is to monitor the CPU utilization of my active Amazon Elasticsearch instance.

Let’s examine the scenario in which I leverage my CloudWatchBlog dashboard as my troubleshooting mechanism for identifying and diagnosing issues with my Elasticsearch solution and its instances. I will first add the Amazon Elasticsearch CPU utilization alarm, ES Alarm, to my CloudWatchBlog dashboard. To add the alarm, I simply select the checkbox by the desired alarm, which in this case is ES Alarm. Then with the alarm selected, I click the Add to Dashboard button.

The Add to dashboard dialog box will open, allowing me to select my CloudWatchBlog dashboard. Additionally, I can select the widget type I would like to use for the display of my alarm. For the ES Alarm, I will choose the Line widget and complete the process of adding this alarm to my dashboard by clicking the Add to dashboard button.

Upon successfully adding ES Alarm to the CloudWatchBlog dashboard, you will see a confirmation notice displayed in the CloudWatch console.

If I then go to the Dashboard section of the console and select my CloudWatchBlog dashboard, I will see the line widget for my alarm, ES Alarm, on the dashboard. To ensure that my ES Alarm widget is a permanent part of the dashboard, I will click the Save dashboard button to preserve the addition of this widget on the dashboard.

As we discussed, one of the benefits of utilizing a CloudWatch dashboard is the ability to add several alarms from various regions onto a dashboard. Since my scenario is leveraging my dashboard as a troubleshooting mechanism for my Elasticsearch solution, I would like to have several alarms and metrics related to my solution displayed on the CloudWatchBlog dashboard. Given this, I will create another alarm for my Elasticsearch instance and add it to my dashboard.

I will first return to the Alarms section of the console and click the Create Alarm button.

The Create Alarm dialog box is displayed showing all of the current metrics available in this region. From the summary, I can quickly see that there are 21 metrics being tracked for Elasticsearch. I will click on the ES Metrics link to view the individual metrics that can be used to create my alarm.

I can review the individual metrics shown for my Elasticsearch instance, and choose which metric I want to base my new alarm on. In this case, I choose the WriteLatency metric by selecting the checkbox for this metric and then click the Next button.

 

The next screen is where I fill in all the details about my alarm: name, description, alarm threshold, time period, and alarm action. I will name my new alarm, ES Latency Alarm, and complete the rest of the aforementioned data fields. To complete the creation of my new alarm, I click the Create Alarm button.
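For readers who would rather script this step, here is a hedged sketch of the request parameters that CloudWatch's PutMetricAlarm API accepts for a comparable alarm. The threshold, period, statistic, account ID, and SNS topic below are placeholder assumptions, not the values I entered in the console, and the snippet only builds the parameter object rather than calling AWS.

```javascript
// Builds a parameter object for CloudWatch's PutMetricAlarm API.
// Threshold, period, and statistic are illustrative assumptions.
function buildLatencyAlarmParams(domainName, clientId, snsTopicArn) {
  return {
    AlarmName: 'ES Latency Alarm',
    AlarmDescription: 'Write latency on the Elasticsearch domain is too high',
    Namespace: 'AWS/ES',
    MetricName: 'WriteLatency',
    Dimensions: [
      { Name: 'DomainName', Value: domainName },
      { Name: 'ClientId', Value: clientId } // the AWS account ID
    ],
    Statistic: 'Average',
    Period: 300,              // seconds per evaluation period
    EvaluationPeriods: 1,
    Threshold: 0.5,           // assumed threshold, in seconds
    ComparisonOperator: 'GreaterThanOrEqualToThreshold',
    AlarmActions: [snsTopicArn]
  };
}

// Placeholder account ID and topic name.
const alarmParams = buildLatencyAlarmParams(
  'taratestdomain',
  '123456789012',
  'arn:aws:sns:us-east-2:123456789012:es-alerts'
);
console.log(JSON.stringify(alarmParams, null, 2));
```

An object shaped like this could then be handed to an SDK client's PutMetricAlarm call.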

I will see a confirmation message box at the top of the Alarms console upon successful completion of adding the alarm, and the status of the newly created alarm will be displayed in the alarms list.

Now I will add my ES Latency Alarm to my CloudWatchBlog dashboard. Again, I click on the checkbox by the alarm and then click the Add to Dashboard button.

This time when the Add to Dashboard dialog comes up, I will choose the Stacked area widget to display the ES Latency Alarm on my CloudWatchBlog dashboard. Clicking the Add to Dashboard button will complete the addition of my ES Latency Alarm widget to the dashboard.

Once back in the console, again I will see the confirmation noting the successful addition of the widget. I go to the Dashboards and click on the CloudWatchBlog dashboard and I can now view the two widgets in my dashboard. To include this widget in the dashboard permanently, I click the Save dashboard button.

The final thing to note about the new CloudWatch feature, Alarms on Dashboards, is that alarms and metrics from other regions can be added to the dashboard for a complete view for troubleshooting. Let’s add a metric to the dashboard with the alarms widget.

Within the console, I will move from my current region, US East (Ohio), to the US East (N. Virginia) region.

Now I will go to the Metrics section of the CloudWatch console. This section displays the metrics from services used in the US East (N. Virginia) region.

My Elasticsearch solution triggers Lambda functions to capture all of the EmployeeInfo DynamoDB database CRUD (Create, Read, Update, Delete) changes via DynamoDB streams and write those changes into my Elasticsearch domain, taratestdomain. Therefore, I will add metrics to my CloudWatchBlog dashboard to track table metrics from DynamoDB.

Specifically, I am going to add the EmployeeInfo database ProvisionedWriteCapacityUnits metric to my CloudWatchBlog dashboard.

Back again in the Add to Dashboard dialog, I will select my CloudWatchBlog dashboard and choose to display this metric using the Number widget.

Now, the ProvisionedWriteCapacityUnits metric from US East (N. Virginia) is displayed in the CloudWatchBlog dashboard in a Number widget, alongside the alarms from US East (Ohio). To make this update permanent in the dashboard, I will (you guessed it!) click the Save dashboard button.
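The same cross-region layout can also be described as data. The sketch below builds a dashboard body in the JSON shape that the CloudWatch PutDashboard API expects, as I understand it: one line widget annotated with an alarm in US East (Ohio) and one number widget for the DynamoDB metric in US East (N. Virginia). The alarm ARN, account ID, and widget sizes are placeholder assumptions.

```javascript
// Builds a CloudWatch dashboard body with widgets from two regions.
// Widget positions and sizes are arbitrary choices for illustration.
function buildDashboardBody(alarmArn) {
  return {
    widgets: [
      {
        // Line widget showing the alarm's metric and threshold.
        type: 'metric',
        x: 0, y: 0, width: 12, height: 6,
        properties: {
          title: 'ES Alarm',
          region: 'us-east-2',
          view: 'timeSeries',
          stacked: false,
          annotations: { alarms: [alarmArn] }
        }
      },
      {
        // Number widget showing the latest value of a metric in another region.
        type: 'metric',
        x: 12, y: 0, width: 6, height: 6,
        properties: {
          title: 'EmployeeInfo write capacity',
          region: 'us-east-1',
          view: 'singleValue',
          metrics: [
            ['AWS/DynamoDB', 'ProvisionedWriteCapacityUnits', 'TableName', 'EmployeeInfo']
          ]
        }
      }
    ]
  };
}

// Placeholder account ID in the alarm ARN.
const body = buildDashboardBody('arn:aws:cloudwatch:us-east-2:123456789012:alarm:ES Alarm');
console.log(JSON.stringify(body, null, 2));
```

The stringified body would be passed as the DashboardBody parameter, together with a DashboardName such as CloudWatchBlog.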

Summary

Getting started with alarms on dashboards is easy. You can use alarms on dashboards across regions for another means of proactively monitoring alarms, build troubleshooting playbooks, and view desired metrics. You can also choose the metric first in the Metric UI and then change the type of widget according to the visualization that fits the metric.

Alarms on Dashboards are supported on Line, Stacked Area, and Number widgets. In addition, you can use Text widgets next to alarms on a dashboard to add steps or observations on how to handle changes in the alarm state. To learn more about Amazon CloudWatch widgets and about the additional dashboard widgets, visit the Amazon CloudWatch documentation and the CloudWatch Getting Started guide.

 

Tara

Is it on AWS? Domain Identification Using AWS Lambda

by Jeff Barr | on | in Amazon API Gateway, AWS Lambda, Guest Post | | Comments

In the guest post below, my colleague Tim Bray explains how he built IsItOnAWS.com . Powered by the list of AWS IP address ranges and using a pair of AWS Lambda functions that Tim wrote, the site aims to tell you if your favorite website is running on AWS.

Jeff;


Is it on AWS?
I did some recreational programming over Christmas and ended up with a little Lambda function that amused me and maybe it’ll amuse you too. It tells you whether or not a given domain name (or IP address) (even IPv6!) is in the published list of AWS IP address ranges. You can try it out over at IsItOnAWS.com. Part of the construction involves one Lambda function creating another.

That list of ranges, given as IPv4 and IPv6 CIDRs wrapped in JSON, is here; the how-to documentation is here and there’s a Jeff Barr blog. Here are a few lines of the “IP-Ranges” JSON:

{
  "syncToken": "1486776130",
  "createDate": "2017-02-11-01-22-10",
  "prefixes": [
    {
      "ip_prefix": "13.32.0.0/15",
      "region": "GLOBAL",
      "service": "AMAZON"
    },
    ...
  "ipv6_prefixes": [
    {
      "ipv6_prefix": "2400:6500:0:7000::/56",
      "region": "ap-southeast-1",
      "service": "AMAZON"
    },

As soon as I saw it, I thought “I wonder if IsItOnAWS.com is available?” It was, and so I had to build this thing. I wanted it to be:

  1. Serverless (because that’s what the cool kids are doing),
  2. simple (because it’s a simple problem, look up a number in a range of numbers), and
  3. fast. Because well of course.

Database or Not?
The construction seemed pretty obvious: Simplify the IP-Ranges into a table, then look up addresses in it. So, where to put the table? I thought about Amazon DynamoDB, but it’s not obvious how best to search on what in effect is a numeric range. I thought about SQL databases, where it is obvious, but note #2 above. I thought about Redis or some such, but then you have to provision instances, see #1 above. I actually ended up stuck for a few days scratching my head over this one.

Then a question occurred to me: How big is that list of ranges? It turns out to have less than a thousand entries. So who needs a database anyhow? Let’s just sort that JSON into an array and binary-search it. OK then, where does the array go? Amazon S3 would be easy, but hey, look at #3 above; S3’s fast, but why would I want it in the loop for every request? So I decided to just generate a little file containing the ranges as an array literal, and include it right into the IsItOnAWS Lambda function. Which meant I’d have to rebuild and upload the function every time the IP addresses change.

It turns out that if you care about those addresses, you can subscribe to an Amazon Simple Notification Service (SNS) topic that will notify you whenever it changes (in my recent experience, once or twice a week). And you can hook your subscription up to a Lambda function. With that, I felt I’d found all the pieces anyone could need. There are two Lambda functions: the first, newranges.js, gets the change notifications, generates the JavaScript form of the IP-Ranges data, and uploads a second Lambda function, isitonaws.js, which includes that JavaScript. Vigilant readers will have deduced this is all with the Node runtime.
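Here is a rough sketch of the first thing a function like newranges.js has to do: work out where to fetch the new ranges from when the SNS notification arrives. It assumes the notification message is a JSON document with a url field, which matches the documented format as far as I know. The function name and the fallback behavior are my own inventions, not the code described in this post.

```javascript
// Assumed default location of the published ranges.
const DEFAULT_RANGES_URL = 'https://ip-ranges.amazonaws.com/ip-ranges.json';

// Extracts the ranges URL from an SNS-triggered Lambda event.
// Falls back to the default URL if the message cannot be parsed.
function rangesUrlFromEvent(event) {
  const record = event && event.Records && event.Records[0];
  if (!record || !record.Sns) {
    throw new Error('Not an SNS event');
  }
  try {
    const message = JSON.parse(record.Sns.Message);
    return message.url || DEFAULT_RANGES_URL;
  } catch (err) {
    return DEFAULT_RANGES_URL;
  }
}

// A trimmed-down sample event; real events carry many more fields.
const sampleEvent = {
  Records: [
    {
      Sns: {
        Message: JSON.stringify({
          synctoken: '1486776130',
          url: 'https://ip-ranges.amazonaws.com/ip-ranges.json'
        })
      }
    }
  ]
};
console.log(rangesUrlFromEvent(sampleEvent));
```

In a deployed function, this would be called at the top of the exported handler before the HTTP GET.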

The new-ranges function, your typical async/waterfall thing, is a little more complex than I’d expected going in.

Postmodern IP Addresses
Its first task is to fetch the IP-Ranges, a straightforward HTTP GET. Then you take that JSON and smooth it out to make it more searchable. Unsurprisingly, there are both IPv4 and IPv6 ranges, and to make things easy I wanted to mash ’em all together into a single array that I could search with simple string or numeric matching. And since IPv6 addresses are way too big for JavaScript numbers to hold, they needed to be strings.

It turns out the way the IPv4 space embeds into IPv6’s ("::ffff:0:0/96") is a little surprising. I’d always assumed it’d be like the BMP mapping into the low bits of Unicode. I idly wonder why it’s this way, but not enough to research it.

The code for crushing all those CIDRs together into a nice searchable array ended up being kind of brutish, but it gets the job done.
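For the curious, here is one way the flattening could look. This is a sketch and not the code described above: it leans on BigInt, which the Node runtime of the time did not have, and every function name is hypothetical. Each CIDR becomes a start/end pair of fixed-width hex strings, with IPv4 addresses mapped under ::ffff:0:0/96 so that plain string comparison orders everything correctly.

```javascript
// IPv4 addresses are embedded in the IPv6 space under ::ffff:0:0/96.
const V4_MAPPED_BASE = 0xffffn << 32n;

function ipv4ToBigInt(addr) {
  const parts = addr.split('.');
  if (parts.length !== 4 || parts.some((p) => !/^\d{1,3}$/.test(p) || Number(p) > 255)) {
    throw new Error('Bad IPv4 address: ' + addr);
  }
  return parts.reduce((acc, p) => (acc << 8n) | BigInt(Number(p)), 0n);
}

// Handles the "::" shorthand but not dotted-quad suffixes such as ::ffff:1.2.3.4.
function ipv6ToBigInt(addr) {
  const halves = addr.split('::');
  if (halves.length > 2) throw new Error('Bad IPv6 address: ' + addr);
  const head = halves[0] ? halves[0].split(':') : [];
  const tail = halves[1] ? halves[1].split(':') : [];
  const missing = 8 - head.length - tail.length;
  if (missing < 0 || (halves.length === 1 && missing !== 0)) {
    throw new Error('Bad IPv6 address: ' + addr);
  }
  const groups = head.concat(new Array(missing).fill('0'), tail);
  return groups.reduce((acc, g) => (acc << 16n) | BigInt(parseInt(g, 16)), 0n);
}

// 32 hex digits, zero-padded, so string order equals numeric order.
function toHex(n) {
  return n.toString(16).padStart(32, '0');
}

// Converts one CIDR into the first and last address it covers.
function cidrToRange(cidr) {
  const [addr, lenText] = cidr.split('/');
  const isV4 = addr.includes('.');
  const value = isV4 ? V4_MAPPED_BASE | ipv4ToBigInt(addr) : ipv6ToBigInt(addr);
  const prefixLen = Number(lenText) + (isV4 ? 96 : 0);
  if (!Number.isInteger(prefixLen) || prefixLen < 0 || prefixLen > 128) {
    throw new Error('Bad prefix length in ' + cidr);
  }
  const hostBits = BigInt(128 - prefixLen);
  const start = (value >> hostBits) << hostBits;
  const end = start | ((1n << hostBits) - 1n);
  return { start: toHex(start), end: toHex(end) };
}

// Flattens the parsed ip-ranges.json document into one sorted array.
function buildRangeTable(ipRanges) {
  const v4 = (ipRanges.prefixes || []).map((p) =>
    ({ ...cidrToRange(p.ip_prefix), region: p.region, service: p.service }));
  const v6 = (ipRanges.ipv6_prefixes || []).map((p) =>
    ({ ...cidrToRange(p.ipv6_prefix), region: p.region, service: p.service }));
  return v4.concat(v6).sort((a, b) => (a.start < b.start ? -1 : a.start > b.start ? 1 : 0));
}

console.log(buildRangeTable({
  prefixes: [{ ip_prefix: '13.32.0.0/15', region: 'GLOBAL', service: 'AMAZON' }],
  ipv6_prefixes: [{ ipv6_prefix: '2400:6500:0:7000::/56', region: 'ap-southeast-1', service: 'AMAZON' }]
}));
```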

Building Lambda in Lambda
Next, we need to construct the lambda that’s going to actually handle the IsItOnAWS request. This has to be a Zipfile, and NPM has tools to make those. Then it was a matter of jamming the zipped bytes into S3 and uploading them to make the new Lambda function.
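As an illustration of the two ends of that step, the sketch below renders a range table as the source text of a Node module, then builds the parameters for Lambda's UpdateFunctionCode call pointing at a zip in S3. The zipping itself is left out because it depends on which NPM package you pick. The function, bucket, and key names are placeholders, and whether the original used UpdateFunctionCode or a different call is an assumption on my part.

```javascript
// Hypothetical helper: renders the range table as the source text of a Node
// module, so the lookup function can simply require() it.
function renderRangesModule(ranges) {
  const rows = ranges.map((r) => '  ' + JSON.stringify(r));
  return (
    '// Generated from ip-ranges.json. Do not edit by hand.\n' +
    'module.exports = [\n' + rows.join(',\n') + '\n];\n'
  );
}

// Request parameters for Lambda's UpdateFunctionCode call, pointing at a zip
// that has already been uploaded to S3. All three values are placeholders.
function buildUpdateFunctionCodeParams(functionName, bucket, key) {
  return { FunctionName: functionName, S3Bucket: bucket, S3Key: key };
}

const source = renderRangesModule([
  {
    start: '00000000000000000000ffff0d200000',
    end: '00000000000000000000ffff0d21ffff',
    region: 'GLOBAL',
    service: 'AMAZON'
  }
]);
console.log(source);
console.log(buildUpdateFunctionCodeParams('isitonaws', 'my-build-bucket', 'isitonaws.zip'));
```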

The sharp-eyed will note that once I’d created the zip, I could have just uploaded it to Lambda directly. I used the S3 interim step because I wanted to be able to download the generated “ranges” data structure and actually look at it; at some point I may purify the flow.

The actual IsItOnAWS runtime is laughably simple, aside from a bit of work around hitting DNS to look up addresses for names, then mashing them into the same format we used for the ranges array. I didn’t do any HTML templating, just read it out of a file in the zip and replaced an invisible <div> with the results if there were any. Except for, I got to code up a binary search method, which only happens once a decade or so but makes me happy.
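A binary search of this kind might look like the sketch below. It is not the code from IsItOnAWS. It assumes the ranges are sorted by start, do not overlap, and store addresses as equal-length hex strings so that ordinary string comparison works; the function name and the sample data are made up for illustration.

```javascript
// Finds the range containing the address, or null if there is none.
// Assumes ranges are sorted by start, non-overlapping, and use
// equal-length hex strings for start, end, and address.
function findRange(ranges, address) {
  let lo = 0;
  let hi = ranges.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >>> 1;
    const range = ranges[mid];
    if (address < range.start) {
      hi = mid - 1;          // address lies before this range
    } else if (address > range.end) {
      lo = mid + 1;          // address lies after this range
    } else {
      return range;          // start <= address <= end
    }
  }
  return null;
}

const sampleRanges = [
  {
    start: '00000000000000000000ffff0d200000',
    end: '00000000000000000000ffff0d21ffff',
    region: 'GLOBAL'
  },
  {
    start: '24006500000070000000000000000000',
    end: '24006500000070ffffffffffffffffff',
    region: 'ap-southeast-1'
  }
];

// 13.32.0.1 mapped into the IPv6 space falls inside the first range.
console.log(findRange(sampleRanges, '00000000000000000000ffff0d200001'));
// 8.8.8.8 is not in either range.
console.log(findRange(sampleRanges, '00000000000000000000ffff08080808'));
```

The published list does contain overlapping entries, so a real table would need them merged or de-duplicated first for this lookup to be reliable.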

Putting the Pieces Together
Once I had all this code working, I wanted to connect it to the world, which meant using Amazon API Gateway. I’ve found this complex in the past, but this time around I plowed through Create an API with Lambda Proxy Integration through a Proxy Resource, and found it reasonably linear and surprise-free.

However, it’s mostly focused on constructing APIs (i.e. JSON in/out) as opposed to human experiences. It doesn’t actually say how to send HTML for a human to consume in a browser, but it’s not hard to figure out. Here’s how (from Node):

context.succeed({
  "statusCode": 200,
  "headers": { "Content-type": "text/html" },
  "body": "<html>Your HTML Here</html>"
});

Once I had everything hooked up to API Gateway, the last step was pointing isitonaws.com at it. And that’s why I wrote this code in December-January, but am blogging at you now. Back then, AWS Certificate Manager (ACM) certs couldn’t be used with API Gateway, and in 2017, life is just too short to go through the old-school ceremony for getting a cert approved and hooked up. ACM makes the cert process a real no-brainer. What with ACM and Let’s Encrypt loose in the wild, there’s really no excuse any more for having a non-HTTPS site. Both are excellent, but if you’re using AWS services like API Gateway and CloudFront like I am here, ACM is a smoother fit. Also it auto-renews, which you have to like.

So as of now, hooking up a domain name via HTTPS and CloudFront to your API Gateway API is dead easy; see Use Custom Domain Name as API Gateway API Host Name. Worked for me, first time, but something to watch out for (in March 2017, anyhow): When you get to the last step of connecting your ACM cert to your API, you get a little spinner that wiggles at you for several minutes while it hooks things up; this is apparently normal. Fortunately I got distracted and didn’t give up and refresh or cancel or anything, which might have screwed things up.

By the way, as a side-effect of using API Gateway, this is all running through CloudFront. So what with that, and not having a database, you’d expect it to be fast. And yep, it sure is, from here in Vancouver anyhow. Fast enough to not bother measuring.

I also subscribed my email to the “IP-Ranges changed” SNS topic, so every now and then I get an email telling me it’s changed, and I smile because I know that my Lambda wrote a new Lambda, all automatic, hands-off, clean, and fast.

Tim Bray, Senior Principal Engineer

 

AWS Global Summits are Coming!

by Ana Visneski | on | in Events | | Comments

One of the first things I got to do when I joined the AWS Blog team was to attend the summit in New York City last August. Meeting all of our customers, checking out Game Day, and getting to see the enthusiasm of the AWS community made me even more excited to be starting my adventure working on the blog with Jeff.

This year’s AWS Summit dates have been announced and whether you are new to the cloud or an experienced user, you can always learn something new at an AWS Summit. These free events, held around the world, are designed to educate you about the AWS platform. Our team has built a program that offers a multitude of learning opportunities covering a broad range of topics, and technical depth. Join us to develop the skills needed to design, deploy, and operate infrastructure and applications on AWS.

We have Summits taking place across North America, Latin America, Asia Pacific, Europe, the Middle East, Japan, and Greater China. To see the full list of cities and dates, check out the AWS Summits page.

Registration is now open for locations including San Francisco, Sydney, Singapore, Kuala Lumpur, Seoul, Manila, and Bangkok. You can also subscribe to the AWS Events RSS feed, follow @awscloud, and find us on Facebook.

And you never know, along with learning all sorts of new things at the summit, you just might run into me or Jeff and snag a blog sticker too!

-Ana

New – Tag EC2 Instances & EBS Volumes on Creation

by Jeff Barr | on | in Amazon EC2, Amazon Elastic Block Store | | Comments

Way back in 2010, we launched Resource Tagging for EC2 instances and other EC2 resources. Since that launch, we have raised the allowable number of tags per resource from 10 to 50, and we have made tags more useful with the introduction of resource groups and a tag editor. Our customers use tags to track ownership, drive their cost accounting processes, implement compliance protocols, and control access to resources via IAM policies.

The AWS tagging model provides separate functions for resource creation and resource tagging. While this is flexible and has worked well for many of our users, it does result in a small time window where the resources exist in an untagged state. Using two separate functions means that resource creation could succeed only for tagging to fail, again leaving resources in an untagged state.

Today we are making tagging more flexible and more useful, with four new features:

Tag on Creation – You can now specify tags for EC2 instances and EBS volumes as part of the API call that creates the resources.

Enforced Tag Usage – You can now write IAM policies that mandate the use of specific tags on EC2 instances or EBS volumes.

Resource-Level Permissions – By popular request, the CreateTags and DeleteTags functions now support IAM’s resource-level permissions.

Enforced Volume Encryption – You can now write IAM policies that mandate the use of encryption for newly created EBS volumes.

Tag on Creation
You now have the ability to specify tags for EC2 instances and EBS volumes as part of the API call that creates the resources (if the call creates both instances and volumes, you can specify distinct tags for the instance and for each volume). The resource creation and the tagging are performed atomically; both must succeed in order for the operation (RunInstances, CreateVolume, and other functions that create resources) to succeed. You no longer need to build tagging scripts that run after instances or volumes have been created.

Here’s how you specify tags when you launch an EC2 instance (the CostCenter and SaveSnapshotFlag tags are also set on any EBS volumes created when the instance is launched):
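In API terms, tag-on-create is expressed through the TagSpecifications parameter of RunInstances. The sketch below only builds the request parameters. The AMI ID, instance type, the Name tag, and all tag values are placeholder assumptions, and the helper names are hypothetical.

```javascript
// Converts a plain object into the Key/Value list that the EC2 API expects.
function toTagList(tags) {
  return Object.entries(tags).map(([Key, Value]) => ({ Key, Value }));
}

// Builds RunInstances parameters that tag the instance and its volumes
// as part of the same call that creates them.
function buildRunInstancesParams(imageId, instanceType, instanceTags, volumeTags) {
  return {
    ImageId: imageId,
    InstanceType: instanceType,
    MinCount: 1,
    MaxCount: 1,
    TagSpecifications: [
      { ResourceType: 'instance', Tags: toTagList(instanceTags) },
      { ResourceType: 'volume', Tags: toTagList(volumeTags) }
    ]
  };
}

// Placeholder AMI ID and tag values.
const runParams = buildRunInstancesParams(
  'ami-12345678',
  't2.micro',
  { Name: 'WebServer', CostCenter: '115', SaveSnapshotFlag: 'true' },
  { CostCenter: '115', SaveSnapshotFlag: 'true' }
);
console.log(JSON.stringify(runParams, null, 2));
```

Because creation and tagging are atomic, a request shaped like this either yields tagged resources or fails as a whole.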

To learn more, read Using Tags.

Resource-Level Permissions
CreateTags and DeleteTags now support IAM’s resource-level permissions, as requested by many customers. This gives you additional control over the tag keys and values on existing resources.

Also, RunInstances and CreateVolume now support additional resource-level permissions. This allows you to exercise control over the users and groups that can tag resources on creation.

To learn more, see Example Policies for Working with the AWS CLI or an AWS SDK.

Enforced Tag Usage
You can now write IAM policies that enforce the use of specific tags. For example, you could write a policy that blocks the deletion of tags named Owner or Account. Or, you could write a “Deny” policy that disallows the creation of new tags for specific existing resources. You could also use an IAM policy to enforce the use of Department and CostCenter tags to help you achieve more accurate cost allocation reporting. In order to implement stronger compliance and security policies, you could also restrict access to DeleteTags if the resource is not tagged with the user’s name. The ability to enforce tag usage gives you precise control over access to resources, ownership, and cost allocation.

Here’s a statement that requires the use of costcenter and stack tags (with values of “115” and “prod,” respectively) for all newly created volumes:

"Statement": [
    {
      "Sid": "AllowCreateTaggedVolumes",
      "Effect": "Allow",
      "Action": "ec2:CreateVolume",
      "Resource": "arn:aws:ec2:us-east-1:123456789012:volume/*",
      "Condition": {
        "StringEquals": {
          "aws:RequestTag/costcenter": "115",
          "aws:RequestTag/stack": "prod"
        },
        "ForAllValues:StringEquals": {
          "aws:TagKeys": ["costcenter","stack"]
        }
      }
    },
    {
      "Effect": "Allow",
      "Action": [
        "ec2:CreateTags"
      ],
      "Resource": "arn:aws:ec2:us-east-1:123456789012:volume/*",
      "Condition": {
        "StringEquals": {
          "ec2:CreateAction": "CreateVolume"
        }
      }
    }
  ]
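To make the condition logic concrete, here is a simplified model of what the first statement checks. It is not IAM's policy evaluator, just a sketch of the two conditions: both required tags must carry the expected values, and no tag keys outside the allowed set may appear in the request. The function and constant names are hypothetical.

```javascript
// The tag keys and values required by the policy statement above.
const REQUIRED_TAGS = { costcenter: '115', stack: 'prod' };

// Simplified model of the statement's two conditions. This is an
// illustration only and does not replicate IAM's evaluation engine.
function satisfiesTagPolicy(requestTags) {
  // StringEquals on aws:RequestTag/<key>: each required tag must be
  // present with exactly the expected value.
  const requiredMatch = Object.entries(REQUIRED_TAGS)
    .every(([key, value]) => requestTags[key] === value);
  // ForAllValues:StringEquals on aws:TagKeys: every tag key in the
  // request must be one of the allowed keys.
  const onlyAllowedKeys = Object.keys(requestTags)
    .every((key) => Object.prototype.hasOwnProperty.call(REQUIRED_TAGS, key));
  return requiredMatch && onlyAllowedKeys;
}

console.log(satisfiesTagPolicy({ costcenter: '115', stack: 'prod' }));               // true
console.log(satisfiesTagPolicy({ costcenter: '115', stack: 'test' }));               // false
console.log(satisfiesTagPolicy({ costcenter: '115', stack: 'prod', owner: 'me' }));  // false
console.log(satisfiesTagPolicy({}));                                                 // false
```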

Enforced Volume Encryption
Using the additional IAM resource-level permissions now supported by RunInstances and CreateVolume, you can now write IAM policies that mandate the use of encryption for any EBS boot or data volumes created. You can use this to comply with regulatory requirements, enforce enterprise security policies, and to protect your data in compliance with applicable auditing requirements.

Here’s a sample statement that you can incorporate into an IAM policy for RunInstances and CreateVolume to enforce EBS volume encryption:

"Statement": [
        {
            "Effect": "Deny",
            "Action": [
                "ec2:RunInstances",
                "ec2:CreateVolume"
            ],
            "Resource": [
                "arn:aws:ec2:*:*:volume/*"
            ],
            "Condition": {
                "Bool": {
                    "ec2:Encrypted": "false"
                }
            }
        }
]

To learn more and to see some sample policies, take a look at Example Policies for Working with the AWS CLI or an AWS SDK and IAM Policies for Amazon EC2.

Available Now
As you can see, the combination of tagging and the new resource-level permissions on the resource creation and tag manipulation functions gives you the ability to track and control access to your EC2 resources.

This new feature is available now in all regions except AWS GovCloud (US) and China (Beijing). You can start using it today from the AWS Management Console, AWS Command Line Interface (CLI), AWS Tools for Windows PowerShell, or the AWS APIs.

We are planning to add support for additional EC2 resource types over time; stay tuned for more information!

Jeff;