AWS Official Blog

New – Launch Amazon EMR Clusters in Private Subnets

by Jeff Barr | in Amazon EMR, Virtual Private Cloud

My colleague Jon Fritz wrote the guest post below to introduce you to an important new feature for Amazon EMR.

— Jeff;


Today we are announcing that Amazon EMR now supports launching clusters in Amazon Virtual Private Cloud (VPC) private subnets, allowing you to quickly, cost-effectively, and securely create fully configured clusters with Hadoop ecosystem applications, Spark, and Presto in the subnet of your choice. With Amazon EMR release 4.2.0 and later, you can launch your clusters in a private subnet with no public IP addresses or attached Internet gateway. You can create a private endpoint for Amazon S3 in your subnet to give your Amazon EMR cluster direct access to data in S3, and optionally create a Network Address Translation (NAT) instance for your cluster to interact with other AWS services, like Amazon DynamoDB and AWS Key Management Service (KMS). For more information on Amazon EMR in VPC, visit the Amazon EMR documentation.

Network Topology for Amazon EMR in a VPC Private Subnet
Before launching an Amazon EMR cluster in a VPC private subnet, please make sure that your EMR service role and EC2 instance profile have the required permissions, and that your subnet has a route to the S3 buckets required for your cluster’s initialization, either through an S3 endpoint in your VPC or through a NAT/proxy instance. The Amazon EMR documentation has more information about configuring your subnet.

You can use the new VPC Subnets page in the EMR Console to view the VPC subnets available for your clusters, and configure them by adding S3 endpoints and NAT instances.

A sample network topology for an Amazon EMR cluster in a VPC private subnet includes an S3 endpoint and a NAT instance. However, if you do not need to use your cluster with AWS services besides S3, you do not need a NAT instance to provide a route to those public endpoints.
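To make the launch requirements concrete, here is a minimal sketch of the parameters a cluster launch into a private subnet might carry. The helper name, subnet ID, cluster name, and instance types are placeholders I made up for illustration; the field names follow my understanding of the EMR RunJobFlow request shape, so please check them against the current API reference before relying on them.

```javascript
// Hypothetical helper: assembles RunJobFlow-style parameters for a cluster
// in a private subnet. Subnet ID, name, and instance types are placeholders.
function buildPrivateSubnetClusterParams(subnetId) {
  if (!/^subnet-[0-9a-f]+$/.test(subnetId)) {
    throw new Error('Expected a subnet ID such as subnet-0abc1234');
  }
  return {
    Name: 'private-subnet-cluster',
    ReleaseLabel: 'emr-4.2.0',          // private subnets need release 4.2.0 or later
    Applications: [{ Name: 'Hadoop' }, { Name: 'Spark' }],
    ServiceRole: 'EMR_DefaultRole',     // EMR service role (default name assumed)
    JobFlowRole: 'EMR_EC2_DefaultRole', // EC2 instance profile (default name assumed)
    Instances: {
      Ec2SubnetId: subnetId,            // the private subnet to launch into
      MasterInstanceType: 'm3.xlarge',
      SlaveInstanceType: 'm3.xlarge',
      InstanceCount: 3,
      KeepJobFlowAliveWhenNoSteps: true
    }
  };
}

console.log(JSON.stringify(buildPrivateSubnetClusterParams('subnet-0abc1234'), null, 2));
```

The object is only built and printed here; passing it to an SDK client is left out so that the sketch runs without credentials.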

Encryption at Rest for Amazon S3 (with EMRFS), HDFS, and Local Filesystem
A typical Hadoop or Spark workload on Amazon EMR utilizes Amazon S3 (using the EMR Filesystem – EMRFS) for input datasets/output results and two filesystems located on your cluster: the Hadoop Distributed Filesystem (HDFS) distributed across your cluster and the Local Filesystem on each instance. Amazon EMR makes it easy to enable encryption for each filesystem, and there are a variety of options depending on your requirements:

  1. Amazon S3 Using the EMR Filesystem (EMRFS) – EMRFS supports several Amazon S3 encryption options (using AES-256 encryption), allowing Hadoop and Spark on your cluster to efficiently and transparently process encrypted data in S3. EMRFS works seamlessly with objects encrypted by S3 Server-Side Encryption or S3 client-side encryption. When using S3 client-side encryption, you can use encryption keys stored in the AWS Key Management Service or in a custom key management system in AWS or on-premises.
  2. HDFS Transparent Encryption with Hadoop KMS – The Hadoop Key Management Server (KMS) can supply keys for HDFS Transparent Encryption, and it is installed on the master node of your EMR cluster with HDFS. Because encryption and decryption activities are carried out in the client, data is also encrypted in transit in HDFS.
  3. Local Filesystem on Each Node – The Hadoop MapReduce and Spark frameworks utilize the Local Filesystem on each slave instance for intermediate data throughout a workload. You can use a bootstrap action to encrypt the directories used for these intermediates on each node using LUKS.
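As an illustration of the first option, the sketch below builds a configuration object that turns on S3 Server-Side Encryption for EMRFS. It assumes the release 4.x configuration format (a list of classifications with properties) and assumes that the emrfs-site property is named fs.s3.enableServerSideEncryption; the helper name is hypothetical. Client-side encryption needs additional key provider settings that are not shown.

```javascript
// Hypothetical helper: EMR 4.x-style configuration enabling S3 server-side
// encryption for EMRFS. The property name is an assumption to verify.
function emrfsServerSideEncryptionConfig() {
  return [{
    Classification: 'emrfs-site',
    Properties: { 'fs.s3.enableServerSideEncryption': 'true' }
  }];
}

console.log(JSON.stringify(emrfsServerSideEncryptionConfig(), null, 2));
```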

Encryption in Transit for Hadoop MapReduce and Spark
Hadoop ecosystem applications installed on your Amazon EMR cluster typically have different mechanisms to encrypt data in transit:

  1. Hadoop MapReduce Shuffle – In a Hadoop MapReduce job, Hadoop will send data between nodes in your cluster in the shuffle phase, which occurs before the reduce phase of the job. You can use SSL to encrypt this process by enabling the Hadoop settings for Encrypted Shuffle and providing the required SSL certificates to each node.
  2. HDFS Rebalancing – HDFS rebalances by sending blocks between DataNode processes. However, if you use HDFS Transparent Encryption (see above), HDFS never holds unencrypted blocks and the blocks remain encrypted when moved between nodes.
  3. Spark Shuffle – Spark, like Hadoop MapReduce, also shuffles data between nodes at certain points during a job. Starting with Spark 1.4.0, you can encrypt data in this stage using SASL encryption.
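The shuffle settings above could be expressed as cluster configuration roughly like this. This is a sketch, assuming the release 4.x configuration format and the property names from the Hadoop Encrypted Shuffle and Spark security documentation (mapreduce.shuffle.ssl.enabled, spark.authenticate, spark.authenticate.enableSaslEncryption); the helper name is made up. It only flips the switches, so the SSL certificates and keystores still have to be provided to each node separately.

```javascript
// Hypothetical helper: configuration entries that switch on encrypted shuffle
// for Hadoop MapReduce (SSL) and Spark 1.4.0+ (SASL). Certificates and
// keystore settings are NOT covered by this sketch.
function encryptionInTransitConfig() {
  return [
    {
      Classification: 'mapred-site',
      Properties: { 'mapreduce.shuffle.ssl.enabled': 'true' }
    },
    {
      Classification: 'spark-defaults',
      Properties: {
        'spark.authenticate': 'true',
        'spark.authenticate.enableSaslEncryption': 'true'
      }
    }
  ];
}

console.log(JSON.stringify(encryptionInTransitConfig(), null, 2));
```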

IAM Users and Roles, and Auditing with AWS CloudTrail
You can use Identity and Access Management (IAM) users or federated users to call the Amazon EMR APIs, and limit the API calls that each user can make. Additionally, Amazon EMR requires clusters to be created with two IAM roles, an EMR service role and an EC2 instance profile, to limit the permissions of the EMR service and the EC2 instances in your cluster, respectively. EMR provides default roles using EMR Named Policies for automatic updates; however, you can also provide custom IAM roles for your cluster. Finally, you can audit the calls your account has made to the Amazon EMR API using AWS CloudTrail.

EC2 Security Groups and Optional SSH Access
Amazon EMR uses two security groups, one for the Master Instance Group and one for slave instance groups (Core and Task Instance Groups), to limit ingress and egress to the instances in your cluster. EMR provides two default security groups, but you can provide your own (assuming they have the necessary ports open for communication between the EMR service and the cluster) or add additional security groups to your cluster. In a private subnet, you can also specify the security group added to the ENI used by the EMR service to communicate with your cluster.

Also, you can optionally add an EC2 key pair to the Master Node of your cluster if you would like to SSH to that node. This allows you to directly interact with the Hadoop applications installed on your cluster, or access web-UIs for applications using a proxy without opening up ports in your Master Security Group.

Hadoop and Spark Authentication and Authorization
Because Amazon EMR installs open source Hadoop ecosystem applications on your cluster, you can also leverage existing security features in these products. You can enable Kerberos authentication for YARN, which will give user-level authentication for applications running on YARN (like Hadoop MapReduce and Spark). Also, you can enable table and SQL-level authorization for Hive using HiveServer2 features, and use LDAP integration to create and authenticate users in Hue.

Run your workloads securely on Amazon EMR
Earlier this year, Amazon EMR was added to the AWS Business Associates Agreement (BAA) for running workloads which process PII data (including eligibility for HIPAA workloads). Amazon EMR also has certification for PCI DSS Level 1, ISO 9001, ISO 27001, and ISO 27018.

Security is a top priority for us and our customers. We are continuously adding new security-related functionality and third-party compliance certifications to Amazon EMR in order to make it even easier to run secure workloads and configure security features in Hadoop, Spark, and Presto.

Jon Fritz, Senior Product Manager, Amazon EMR

PS – To learn more, read Securely Access Web Interfaces on Amazon EMR Launched in a Private Subnet on the AWS Big Data Blog.

AWS Podcasts – JAWS, Muterra, Betabrand, and DubSmash

by Jeff Barr | in AWS Podcast

Here’s the final batch of interviews that I recorded on Tuesday, September 1 at the AWS Loft in San Francisco as part of the Intel Startup Spotlight for the AWS Podcast. Episode 122 was recorded remotely (a first for me).

I spoke with representatives from JAWS, Muterra, Betabrand, and DubSmash. As usual, the “Episode” links go directly to the show notes (you can also visit the AWS Podcast page and subscribe to the feed).

Episode 119 – JAWS
For Episode 119, I spoke with Austen Collins, founder of the JAWS project (subsequently renamed to Serverless). JAWS optimizes all Lambda functions, with plans to grow beyond that capability in the future. I emailed Austen after finding him online, and we began an ongoing conversation about Lambda, JAWS, and API Gateway. We discussed the future of Lambda (“Lambda has the potential to be the focal point of AWS cloud”) and its best qualities, as well as the future of serverless models as a whole.

Episode 120 – Muterra
For Episode 120, I spoke with engineer-turned-entrepreneur Nick Badger about his newly launched startup, Muterra. Muterra launched the Muse protocol to protect personal digital autonomy with ubiquitous encryption and functions with the goal of interoperability. Nick is also developing Ethyr, a new private email system built on Muse. Nick and I talk about his journey from mechanical engineer to entrepreneur, his upcoming Kickstarter campaign, advice for future entrepreneurs, and the technology behind Muterra.

Episode 121 – Betabrand
For Episode 121, I interviewed Seamus James, software engineer at Betabrand. Betabrand is an online crowdfunding clothing community based in San Francisco. Customers (“fans”) co-design and crowdfund new products in just a matter of weeks, bringing new, fresh ideas to life almost every day. While the company is not 100% AWS-powered, Seamus and I met through a Reddit thread where he shared a story of how AWS’s scalability helped the ecommerce site survive a high-traffic “success disaster.”

Episode 122 – DubSmash
For Episode 122, I spoke with DubSmash co-founder and CTO Daniel Taschik about the social media and audio-dubbing app. DubSmash enables users to dub audio and video together on an easy-to-use social platform. Daniel explains how they came up with the idea for DubSmash, how people use the app, and how they’ve been able to expand their user network to more than 75 million users in 192 countries.  I enjoyed learning about how AWS has enabled the now-viral app to scale exponentially.

Special Thanks
As always, special thanks are due to my awesome colleagues on the AWS Podcast team.

Jeff;

EC2 Container Registry – Now Generally Available

by Jeff Barr | in EC2 Container Service

My colleague Andrew Thomas wrote the guest post below to introduce you to the new EC2 Container Registry!

— Jeff;


I am happy to announce that Amazon EC2 Container Registry (ECR) is now generally available!

Amazon ECR is a fully-managed Docker container registry that makes it easy for developers to store, manage, and deploy Docker container images. We pre-announced the service at AWS re:Invent and have been receiving a lot of interest and enthusiasm from developers ever since.

We built Amazon ECR because many of you told us that running your own private Docker image registry presented many challenges like managing the infrastructure and handling large scale deployments that involve pulling hundreds of images at once. Self-hosted solutions, you said, are especially hard when deploying container images to clusters that span two or more AWS regions. Additionally, you told us that you needed fine-grained access control to repositories/images without having to manage certificates or credentials.

Amazon ECR was designed to meet all of these needs and more. You do not need to install, operate, or scale your own container registry infrastructure. Amazon ECR hosts your images in a highly available and scalable architecture, allowing you to reliably deploy containers for your applications. Amazon ECR is also highly secure. Your images are transferred to the registry over HTTPS and automatically encrypted at rest in S3. You can configure policies to manage permissions and control access to your images using AWS Identity and Access Management (IAM) users and roles without having to manage credentials directly on your EC2 instances. This enables you to share images with specific users or even AWS accounts.

Amazon EC2 Container Registry also integrates with Amazon ECS and the Docker CLI, allowing you to simplify your development and production workflows. You can easily push your container images to Amazon ECR using the Docker CLI from your development machine, and Amazon ECS can pull them directly for production deployments.

Let’s take a look at how easy it is to store, manage, and deploy Docker containers with Amazon ECR and Amazon ECS.

Amazon ECR Console
The Amazon ECR Console simplifies the process of managing images and setting permissions on repositories. To access the console, simply navigate to the “Repositories” section in the Amazon ECS console. In this example I will push a simple PHP container image to Amazon ECR, configure permissions, and deploy the image to an Amazon ECS cluster.

After navigating to the Amazon ECR Console and selecting “Get Started”, I am presented with a simple wizard to create and configure my repository.

After entering the repository name, I see the repository endpoint URL that I will use to access Amazon ECR. By default I have access to this repository, so I don’t have to worry about permissions now and can set them later in the ECR console.

When I click Next step, I see the commands I need to run in my terminal to build my Docker image and push it to the repository I just created. I am using the Dockerfile from the ECS Docker basics tutorial. The commands that appear in the console require that I have the AWS Command Line Interface (CLI) and Docker CLI installed on my development machine (if you are using the Amazon Linux AMI and are reading this in 2015, you will need to install the CLI manually). Next, I copy and run each command to log in, tag the image with the ECR URI, and push the image to my repository.
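To show what the tag and push steps operate on, here is a small sketch that assembles a repository URI and the matching Docker commands. The helper names, account ID, and repository name are placeholders of mine, and the URI layout (account, region, repository) is my understanding of the ECR endpoint format rather than something stated in this post. The login command is omitted on purpose; use the one the console displays.

```javascript
// Hypothetical helper: builds an ECR image URI from its parts.
function ecrImageUri(accountId, region, repository, tag) {
  if (!/^\d{12}$/.test(accountId)) {
    throw new Error('AWS account IDs have 12 digits');
  }
  return `${accountId}.dkr.ecr.${region}.amazonaws.com/${repository}:${tag || 'latest'}`;
}

// Hypothetical helper: the build, tag, and push commands as plain strings.
function ecrPushCommands(accountId, region, repository) {
  const uri = ecrImageUri(accountId, region, repository, 'latest');
  return [
    `docker build -t ${repository} .`,
    `docker tag ${repository}:latest ${uri}`,
    `docker push ${uri}`
  ];
}

ecrPushCommands('123456789012', 'us-east-1', 'php-sample')
  .forEach((command) => console.log(command));
```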

After completing these steps, I click Done to navigate to the repository where I can manage my images.

Setting Permissions
Amazon ECR uses AWS Identity and Access Management to control and monitor who and what (e.g., EC2 instances) can access your container images. We built a permissions tool in the Amazon ECR Console to make it easier to create resource-based policies for your repositories.

To use the tool I click on the Permissions tab in the repository and select Add. I now see that the fields in the form correspond to an IAM statement within a policy document. After adding the statement ID, I select whether this policy should explicitly deny or allow access. Next I can set who this statement should apply to by either entering another AWS account number or selecting users and roles in the entities table.

After selecting the desired entities, I can then configure the actions that should apply to the statement. For convenience, I can use the toggles on the left to easily select the actions required for pull, push/pull, and administrative capabilities.
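The form fields map onto a policy statement along these lines. This is a sketch of what a pull-only statement for another account might look like; the helper name, statement ID, and account number are hypothetical, and the three actions are the ones I understand to be needed for pulling images.

```javascript
// Actions assumed to be sufficient for pulling images from a repository.
const ECR_PULL_ACTIONS = [
  'ecr:GetDownloadUrlForLayer',
  'ecr:BatchGetImage',
  'ecr:BatchCheckLayerAvailability'
];

// Hypothetical helper: a resource-based policy that lets another AWS account
// pull images from the repository.
function buildPullPolicy(statementId, accountId) {
  return {
    Version: '2008-10-17',
    Statement: [{
      Sid: statementId,                                   // the statement ID field
      Effect: 'Allow',                                    // allow rather than deny
      Principal: { AWS: `arn:aws:iam::${accountId}:root` }, // who it applies to
      Action: ECR_PULL_ACTIONS                            // the "pull" toggle
    }]
  };
}

console.log(JSON.stringify(buildPullPolicy('AllowPartnerPull', '123456789012'), null, 2));
```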

Integration With Amazon ECS
Once I’ve created the repository, pushed the image, and set permissions, I am ready to deploy the image to ECS.

Navigating to the Task Definitions section of the ECS console, I create a new Task Definition and specify the Amazon ECR repository in the Image field. Once I’ve configured the Task Definition, I can go to the Clusters section of the console and create a new service for my Task Definition. After creating the service, the ECS Agent will automatically pull down the image from ECR and start running it on an ECS cluster.
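For readers who prefer to see the Task Definition as data, the sketch below shows roughly where the ECR repository URI goes. The helper name, family and container names, memory size, and port mapping are placeholders, and the field names follow my understanding of the ECS task definition format.

```javascript
// Hypothetical helper: a minimal task definition whose container image
// points at an ECR repository URI.
function buildTaskDefinition(imageUri) {
  return {
    family: 'php-sample',
    containerDefinitions: [{
      name: 'php-sample',
      image: imageUri,   // the value entered in the Image field
      memory: 128,       // MiB reserved for the container (placeholder)
      essential: true,
      portMappings: [{ containerPort: 80, hostPort: 80 }]
    }]
  };
}

console.log(JSON.stringify(
  buildTaskDefinition('123456789012.dkr.ecr.us-east-1.amazonaws.com/php-sample:latest'),
  null,
  2
));
```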

Updated First-Run
We have also updated our Amazon ECS Getting Started Wizard to include the ability to push an image to Amazon ECR and deploy that image to ECS:

Partner Support for ECS
At re:Invent we announced partnerships with a number of CI/CD providers to help automate deploying containers on ECS. We are excited to announce today that our partners have added support for Amazon ECR, making it easy for developers to create and orchestrate a full, end-to-end container pipeline to automatically build, store, and deploy images on AWS. To get started, check out the solutions from our launch partners, who include Shippable, Codeship, Solano Labs, CloudBees, and CircleCI.

We are also excited to announce a partnership with TwistLock to provide vulnerability scanning of images stored within ECR. This makes it even easier for developers to evaluate potential security threats before pushing to Amazon ECR and allows developers to monitor their containers running in production. See the Container Partners Page for more information about our partnerships.

Launch Region
Effective today, Amazon ECR is available in US East (Northern Virginia) with more regions on the way soon!

Pricing
With Amazon ECR you only pay for the storage used by your images and data transfer from Amazon ECR to the internet or other regions. See the ECR Pricing page for more details.

Get Started Today
Check out our Getting Started with EC2 Container Registry page to start using Amazon ECR today!

Andrew Thomas, Senior Product Manager

AWS Week in Review – December 14, 2015

by Jeff Barr | in Week in Review

Let’s take a quick look at what happened in AWS-land last week:

Monday

December 14

Tuesday

December 15

Wednesday

December 16

Thursday

December 17

Friday

December 18

Saturday

December 19

New & Notable Open Source

  • git-secrets prevents you from committing passwords and other sensitive information to a Git repository.
  • ssh2 helps you to SSH to your EC2 instance.
  • smoketail is a library and utility for tailing AWS CloudWatch Logs.
  • aws-shell is an integrated shell for working with the AWS CLI.
  • ami-spec performs acceptance testing on AMIs.
  • 50 Shades of Lambda is a 50-chapter ebook full of AWS Lambda use cases.
  • ansible-lambda is a custom Ansible module for AWS Lambda support.
  • mozaws is an R script to start AWS clusters.
  • jcabi-dynamo is an object-oriented wrapper around the AWS SDK for Java.
  • botoform helps to manage AWS infrastructure using YAML templates.

Stay tuned for next week! In the meantime, follow me on Twitter and subscribe to the RSS feed.

Jeff;

AWS Week in Review – December 7, 2015

by Jeff Barr | in Week in Review

Let’s take a quick look at what happened in AWS-land last week:

Monday

December 7

Tuesday

December 8

Wednesday

December 9

Thursday

December 10

Friday

December 11

Saturday

December 12

Stay tuned for next week! In the meantime, follow me on Twitter and subscribe to the RSS feed.

Jeff;

New – Enhanced Monitoring for Amazon RDS (MySQL 5.6, MariaDB, and Aurora)

by Jeff Barr | in Amazon Aurora, Amazon RDS

Amazon Relational Database Service (RDS) makes it easy for you to set up, run, scale, and maintain a relational database. As is often the case with the high-level AWS model, we take care of all of the details in order to give you the time to focus on your application and your business.

Enhanced Monitoring
Advanced RDS users have asked us for more insight into the inner workings of the service and we are happy to oblige with a new Enhanced Monitoring feature!

After you enable this feature for a database instance, you get access to over 50 new CPU, memory, file system, and disk I/O metrics. You can enable these features on a per-instance basis, and you can choose the granularity (all the way down to 1 second). Here is the list of available metrics:

And here are some of the metrics for one of my database instances:

You can enable this feature for an existing database instance by selecting the instance in the RDS Console and then choosing Modify from the Instance Options menu:

Turn the feature on, pick an IAM role, select the desired granularity, check Apply Immediately, and then click on Continue.
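The same change can be described as a set of modification parameters. This is a minimal sketch, assuming the RDS ModifyDBInstance field names MonitoringInterval, MonitoringRoleArn, and ApplyImmediately, and assuming the allowed granularities listed in the code; the helper name, instance identifier, and role ARN are placeholders.

```javascript
// Granularities (in seconds) that I understand the API to accept; 0 turns
// Enhanced Monitoring off.
const MONITORING_INTERVALS = [0, 1, 5, 10, 15, 30, 60];

// Hypothetical helper: parameters for enabling Enhanced Monitoring on an
// existing database instance.
function buildEnhancedMonitoringParams(dbInstanceId, roleArn, intervalSeconds) {
  if (!MONITORING_INTERVALS.includes(intervalSeconds)) {
    throw new Error(`Interval must be one of ${MONITORING_INTERVALS.join(', ')}`);
  }
  return {
    DBInstanceIdentifier: dbInstanceId,
    MonitoringInterval: intervalSeconds, // the desired granularity
    MonitoringRoleArn: roleArn,          // the IAM role picked in the console
    ApplyImmediately: true               // matches the Apply Immediately checkbox
  };
}

console.log(buildEnhancedMonitoringParams(
  'my-db-instance',
  'arn:aws:iam::123456789012:role/rds-monitoring-role',
  1
));
```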

The Enhanced Metrics are ingested into CloudWatch Logs and can be published to Amazon CloudWatch. To do this you will need to set up a metrics extraction filter; read about Monitoring Logs to learn more. Once the metrics are stored in CloudWatch Logs, they can also be processed by third-party analysis and monitoring tools.

Available Now
The new Enhanced Metrics feature is available today in the US East (Northern Virginia), US West (Northern California), US West (Oregon), Europe (Ireland), Europe (Frankfurt), Asia Pacific (Singapore), Asia Pacific (Sydney), and Asia Pacific (Tokyo) regions. It works for MySQL 5.6, MariaDB, and Amazon Aurora, on all instance types except t1.micro and m1.small.

You will pay the usual ingestion and data transfer charges for CloudWatch Logs (see the CloudWatch Logs Pricing page for more info).

Jeff;

AWS Config Rules – Now Available in US East (Northern Virginia)

by Jeff Barr | in AWS Config

We announced AWS Config Rules (Dynamic Compliance Checking for Cloud Resources) at AWS re:Invent and made a preview available to interested customers.

As I noted at the time, you can use these rules to verify that existing and newly launched AWS resources conform to your organization’s security guidelines and best practices without having to spend time manually inspecting them. Instead, you define rules (AWS Lambda functions) that are run when resources are created or changed. The rule has access to the Configuration Item associated with the resource, and can also make calls to other AWS API functions as needed.

Available Now
Today we are making Config Rules accessible to all AWS customers, with initial availability in the US East (Northern Virginia) region and the ability to create up to 25 Config Rules per account.

We added some important features during the preview:

  • Support for additional IAM and EC2 resource types. You can now write rules that process IAM Users, Groups, and Roles, as well as customer-managed Policies. You can also write rules that process EC2 Dedicated Hosts.
  • Lambda Blueprints to help create custom rules. You now have access to blueprints for periodic rules and for rules that run in response to changes.
  • CloudFormation Support for Config and for Config Rules. You can now automate the setup of AWS Config for new accounts, and you can automate the creation and configuration of the Config Rules.

Several AWS partners are already making great use of Config Rules in production. For example, Alert Logic, CloudHealth Technologies and Trend Micro Deep Security are using Config Rules as integral parts of their respective flagship products.

Creating a Custom Rule
I will create a rule that inspects an IAM Policy named DBSuperUserPolicy. I want to make sure that the policy has only one user, DBSuperUser, attached to it. If this is the case, the Policy is compliant; otherwise, it is not.

I start by creating a Lambda function using one of the new blueprints. I open up the Lambda Console, click on Create a Lambda function, and locate the blueprint that I want to use. I enter “config” in the search box:

I want my rule to run every time the IAM Policy is changed, so I select the first blueprint, config-rule-change-triggered. Then I enter a name and description for the function:

The code provided in the blueprint includes a function named evaluateCompliance. This function is the heart of the rule; it is activated on each configuration change and always returns one of the following strings:

  • NOT_APPLICABLE – The rule does not apply to the resource or the resource type that it was given.
  • COMPLIANT – The rule is relevant to the resource, and the resource is configured in a compliant way.
  • NON_COMPLIANT – The rule is relevant to the resource, and the resource is not configured in a compliant way.
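To show where the returned string ends up, here is a rough sketch of the plumbing around an evaluation function. It assumes the event carries invokingEvent and ruleParameters as JSON strings plus a resultToken, and that the result is reported in the shape of a PutEvaluations request; the handler name and both injected functions are hypothetical, so that the flow can run without the AWS SDK. The actual blueprint code may differ.

```javascript
// Hypothetical wrapper: parses the Config event, runs the supplied
// evaluation function, and hands a PutEvaluations-style request to the
// supplied reporter (a stand-in for the real AWS Config call).
function handleConfigEvent(event, evaluate, putEvaluations) {
  const invokingEvent = JSON.parse(event.invokingEvent);
  const ruleParameters = JSON.parse(event.ruleParameters || '{}');
  const item = invokingEvent.configurationItem;

  const request = {
    Evaluations: [{
      ComplianceResourceType: item.resourceType,
      ComplianceResourceId: item.resourceId,
      ComplianceType: evaluate(item, ruleParameters), // one of the three strings
      OrderingTimestamp: item.configurationItemCaptureTime
    }],
    ResultToken: event.resultToken
  };
  putEvaluations(request);
  return request;
}

// Usage with made-up data and a trivial evaluation function.
const sampleConfigEvent = {
  invokingEvent: JSON.stringify({
    configurationItem: {
      resourceType: 'AWS::IAM::Policy',
      resourceId: 'ANPAEXAMPLE',
      configurationItemCaptureTime: '2015-12-18T00:00:00.000Z'
    }
  }),
  ruleParameters: JSON.stringify({ attachedIAMUser: 'DBSuperUser' }),
  resultToken: 'example-token'
};
handleConfigEvent(sampleConfigEvent, () => 'NOT_APPLICABLE', (req) => console.log(req));
```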

I replace the default implementation of the function with my own. It looks like this:

function evaluateCompliance(configurationItem, ruleParameters, context)
{
  if ((configurationItem.resourceType !== 'AWS::IAM::Policy') || 
      (configurationItem.resourceName !== 'DBSuperUserPolicy'))
    return 'NOT_APPLICABLE';
  if (configurationItem.relationships[0].resourceName === ruleParameters.attachedIAMUser)
  {
    if (configurationItem.relationships[1])
      return 'NON_COMPLIANT';
    else
      return 'COMPLIANT';
  }
  else
    return 'NON_COMPLIANT';
}

This rule is simple and powerful! Let’s go through it, line-by-line.

  • Lines 3 and 4 check to see if the rule was invoked on an IAM Policy named DBSuperUserPolicy; line 5 returns NOT_APPLICABLE if this is not the case.
  • Line 6 verifies that the first entity attached to the Policy is the IAM User named in the rule’s attachedIAMUser parameter; line 14 returns NON_COMPLIANT if this is not the case.
  • Line 8 verifies that there is no more than one User attached to the Policy; line 9 returns NON_COMPLIANT if this is not the case.
  • Line 11 returns COMPLIANT to indicate that the Policy has exactly one IAM User attached, and is therefore compliant.

As you can see, this simple, 15-line function is all that you need to enforce an organization-wide policy across all of your AWS resources.
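One caveat: as written, the function reads relationships[0] without checking that anything is attached, so a Policy with no attachments would raise an error instead of returning a result. A more defensive variant could look like the sketch below. The function name is mine, and the behavior for an unattached Policy (NON_COMPLIANT) is an assumption about what the rule should report.

```javascript
// Sketch of a defensive variant (not the blueprint code): same three
// outcomes, but tolerant of a Policy that has nothing attached.
function evaluateComplianceSafely(configurationItem, ruleParameters) {
  if (configurationItem.resourceType !== 'AWS::IAM::Policy' ||
      configurationItem.resourceName !== 'DBSuperUserPolicy') {
    return 'NOT_APPLICABLE';
  }
  // Treat a missing relationships list as "nothing attached".
  const attached = configurationItem.relationships || [];
  const onlyExpectedUser =
    attached.length === 1 &&
    attached[0].resourceName === ruleParameters.attachedIAMUser;
  return onlyExpectedUser ? 'COMPLIANT' : 'NON_COMPLIANT';
}

const unattachedPolicy = {
  resourceType: 'AWS::IAM::Policy',
  resourceName: 'DBSuperUserPolicy',
  relationships: []
};
console.log(evaluateComplianceSafely(unattachedPolicy, { attachedIAMUser: 'DBSuperUser' }));
```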

I also need to create an IAM Role so that the function can send the results to AWS Config:

If the function needs to access other AWS resources or APIs, I would amend the policy document accordingly. I select the role, confirm that I want to create the function, and I am already halfway there:

With the function in place, I need to arrange for Config Rules to call it as needed. I visit the Config Rules Console and click on Add Rule:

At this point I can choose one of the seven AWS managed rules, or I can click on Add custom rule (sounds good to me):

Now I name my rule, point it at my Lambda function (via its ARN), and indicate that I want it to be run when the configuration of an IAM Policy changes:
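Expressed as data, the rule definition might look roughly like this. The sketch assumes the PutConfigRule request shape (it mirrors the Scope and Source properties of the AWS::Config::ConfigRule resource type); the helper name, rule name, and Lambda ARN are placeholders. The attachedIAMUser parameter is the one the function reads from ruleParameters.

```javascript
// Hypothetical helper: a PutConfigRule-style request for a custom rule that
// runs whenever the configuration of an IAM Policy changes.
function buildConfigRuleParams(ruleName, lambdaArn, attachedIAMUser) {
  return {
    ConfigRule: {
      ConfigRuleName: ruleName,
      Scope: { ComplianceResourceTypes: ['AWS::IAM::Policy'] },
      Source: {
        Owner: 'CUSTOM_LAMBDA',
        SourceIdentifier: lambdaArn, // ARN of the Lambda function
        SourceDetails: [{
          EventSource: 'aws.config',
          MessageType: 'ConfigurationItemChangeNotification'
        }]
      },
      // Rule parameters travel as a JSON string.
      InputParameters: JSON.stringify({ attachedIAMUser: attachedIAMUser })
    }
  };
}

console.log(JSON.stringify(
  buildConfigRuleParams(
    'db-super-user-policy-check',
    'arn:aws:lambda:us-east-1:123456789012:function:DBSuperUserPolicyCheck',
    'DBSuperUser'
  ),
  null,
  2
));
```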

The rule will be evaluated within a few minutes; I can verify that my function has been invoked by taking a peek at the function’s Monitoring tab in the Lambda console:

From looking at these metrics I can see that the function has been run 7 times (I have that number of customer-managed IAM Policies), that each invocation lasts for less than 1 second, and that the invocations are not raising any errors. So far, so good!

Exercising the Rule
With the function running and attached to the rule, the next step is to make sure that it performs as expected. I create an IAM Policy named DBSuperUserPolicy and attach user jeff to it (this is not in compliance with my rule, and it should be marked as such):

After allowing some time for the rule to be run, I return to the Config Rules console and I can see that there’s an issue:

I can learn more with a click:

If I don’t recall making the change or don’t know who did it (I did, but I was tired and clearly made a mistake), I can click on the Config timeline to learn more:

After some investigation I realize that I was supposed to use DBSuperUser, and update my Policy accordingly:

I wait a few minutes and check the rule details again. I am good to go – my resource is now compliant with my policy:

My change is reflected in the timeline (as you can see, I actually do spend some time away from the keyboard; I created the policy last night and edited it this morning):

Finally, I can see that my Lambda function was invoked after I updated my policy:

CloudFormation Support
You can now use CloudFormation templates to create your Config Rules and your Lambda functions. Here’s how you would create a rule that references a function called VolumeAutoEnableIOComplianceCheck:

"ConfigRuleForVolumeAutoEnableIO": {
      "Type": "AWS::Config::ConfigRule",
      "Properties": {
        "ConfigRuleName": "ConfigRuleForVolumeAutoEnableIO",
        "Scope": {
          "ComplianceResourceId": {"Ref": "Ec2Volume"},
          "ComplianceResourceTypes": ["AWS::EC2::Volume"]
        },
        "Source": {
          "Owner": "CUSTOM_LAMBDA",
          "SourceDetails": [{
              "EventSource": "aws.config",
              "MessageType": "ConfigurationItemChangeNotification"
          }],
          "SourceIdentifier": {"Fn::GetAtt": ["VolumeAutoEnableIOComplianceCheck", "Arn"]}
        }
      },
      "DependsOn": "ConfigPermissionToCallLambda"
    },
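The DependsOn entry refers to a resource named ConfigPermissionToCallLambda that the excerpt does not show. Assuming it is a Lambda permission that allows AWS Config to invoke the function, it could be generated along these lines. The helper name is hypothetical, and the printed JSON would sit next to the rule in the template's Resources section.

```javascript
// Hypothetical helper: an AWS::Lambda::Permission resource that lets the
// AWS Config service invoke the rule's Lambda function.
function buildConfigLambdaPermission(functionLogicalId) {
  return {
    Type: 'AWS::Lambda::Permission',
    Properties: {
      FunctionName: { 'Fn::GetAtt': [functionLogicalId, 'Arn'] },
      Action: 'lambda:InvokeFunction',
      Principal: 'config.amazonaws.com'
    }
  };
}

console.log(JSON.stringify(
  { ConfigPermissionToCallLambda: buildConfigLambdaPermission('VolumeAutoEnableIOComplianceCheck') },
  null,
  2
));
```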

Partner Support
As I mentioned earlier, several AWS partners are already making great use of this feature. Here’s some more information on their offerings:

Alert Logic Cloud Insight allows customers to use AWS Config to configure checks for additional custom vulnerabilities. Learn more on the Alert Logic Partner Page.

CloudHealth Technologies stores AWS infrastructure configuration history and supports searching of changes by group, access to historical changes, and a history of asset configuration. Learn more on the CloudHealth Technologies Partner Page.

Trend Micro provides a comprehensive set of security controls. Learn more on the Trend Micro Partner Page.

Availability and Pricing
Config Rules is now available in the US East (Northern Virginia) region and you can start using it today. Now that it is generally available, you will be charged for usage as described on the Config Rules Pricing page.

Jeff;

AWS IoT – Now Generally Available

by Jeff Barr | in AWS IoT

A few months ago, I wrote about AWS IoT (see AWS IoT – Cloud Services for Connected Devices) and talked about how we are working to make sure that AWS is well-equipped to support many different types of IoT devices and applications. At that time we launched AWS IoT in beta form and invited interested developers to sign up and to start getting experience with the service.

We built AWS IoT because connected devices are proliferating. They are in your house, your car, your office, your school, and perhaps even in your body! Like some of our more advanced customers, we have been building systems around connected devices for quite some time. Our experience with Amazon Robotics, drones (Amazon Prime Air), the Amazon Echo, the Dash Button, and multiple generations of Kindles has given us a well-informed perspective on how to serve this really important emerging market. Behind the scenes, AWS services such as AWS Lambda, Amazon API Gateway, Amazon DynamoDB, Amazon Kinesis, Amazon Simple Storage Service (S3), and Amazon Redshift provide the responsive, highly scalable infrastructure needed to build a robust IoT application.

When we talked to our customers and to our own engineers, we learned quite a bit about the pain points that add complexity and development time to IoT applications. They told us that connecting devices to the cloud is overly complex due to the variety of SDKs and protocols that they need to support in a secure and scalable fashion. Making this even more difficult is the fact that many devices “feature” intermittent connectivity to the Internet, even as application logic shifts from the device to the cloud. Finally, the sheer volume of data generated by the sensors attached to the devices mandates a Big Data approach to storage, analytics, and visualization.

These are, to be sure, some steep requirements. As you can read in my post above, we have designed AWS IoT with all of them in mind.

Now Available
I am happy to be able to announce that the beta period is over and that AWS IoT is now generally available. Many AWS customers are already building apps and creating new businesses around IoT. Here are a couple of examples:

  • The Philips HealthSuite digital platform collects, analyzes, and stores 15 petabytes of patient data (case study).
  • Scout Alarm uses AWS IoT to support an ever-growing set of self-installed, wireless home security systems, allowing them to focus on the user experience instead of on the infrastructure.

During the beta, we added a pair of important features to AWS IoT:

IoT Use Cases
Earlier this week I sat down with a couple of members of the IoT team. They told me that our customers are planning to use AWS IoT to support many industries and use cases! Here’s a sampling:

  • Agriculture
  • Cars & trucks
  • Consumer devices
  • Gaming
  • Home automation
  • Logistics
  • Medical
  • Municipal infrastructure
  • Oil & gas
  • Robotics

Get Started Today
To learn more about AWS IoT and how you can put it to use in your environment, hop on over to the Getting Started page. You may also want to read the AWS IoT FAQs and study the AWS IoT Documentation.

Jeff;

New – AWS Cost and Usage Reports for Comprehensive and Customizable Reporting

by Jeff Barr | on | | Comments

Many of our customers have been asking us for data and tools to allow them to better understand and manage their AWS costs.

New Reports
Today we are introducing a set of new AWS Cost and Usage Reports that provide you with comprehensive data about products, pricing, and usage. The reports allow you to understand individual costs and to analyze them in greater detail. For example, you can view your EC2 costs by instance type and then drill down in order to understand usage by operating system, instance type, and purchase option (On-Demand, Reserved, or Spot).

The new reports are generated in CSV form and can be customized. You can select the data included in each report, decide whether you want it aggregated across an hour or a day, and then request delivery to one of your S3 buckets, with your choice of ZIP or GZIP compression. The data format is normalized so that each discrete cost component is presented in an exclusive column.
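Because each cost component sits in its own column, a report can be summarized with very little code. Here is a minimal Python sketch that totals one cost column grouped by another. The column names (product/instanceType and lineItem/UnblendedCost) and the helper names are assumptions made for illustration; check the header row of your own report and adjust them to match.

```python
import csv
import gzip
import io
from collections import defaultdict
from decimal import Decimal

# Assumed column names, used only for illustration. Replace them with the
# names that appear in the header row of your own report.
GROUP_COLUMN = "product/instanceType"
COST_COLUMN = "lineItem/UnblendedCost"


def cost_by_column(csv_file, group_column=GROUP_COLUMN, cost_column=COST_COLUMN):
    """Sum one cost column, grouped by the values found in another column."""
    totals = defaultdict(Decimal)  # Decimal() starts each total at 0
    for row in csv.DictReader(csv_file):
        group = row.get(group_column) or "(none)"
        totals[group] += Decimal(row.get(cost_column) or "0")
    return dict(totals)


def cost_by_column_gzip(path, **kwargs):
    """Same summary, reading a GZIP-compressed report from local disk."""
    with gzip.open(path, mode="rt", newline="") as handle:
        return cost_by_column(handle, **kwargs)


if __name__ == "__main__":
    # A tiny in-memory stand-in for a real report.
    sample = io.StringIO(
        "product/instanceType,lineItem/UnblendedCost\n"
        "m4.large,0.12\n"
        "c4.xlarge,0.22\n"
        "m4.large,0.12\n"
    )
    for instance_type, total in sorted(cost_by_column(sample).items()):
        print(instance_type, total)
```

Decimal is used instead of float so that small per-line amounts add up without rounding drift.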

You can easily upload the reports to Amazon Redshift and then run queries against the data using business intelligence and data visualization tools including Amazon QuickSight.

Creating a Report
To create a report, head on over to the AWS Management Console, and choose Billing & Cost Management from the menu in the top-right:

Then click on Reports in the left navigation:

Click on Create report to create your first report:

Enter a name for your report, pick a time unit, and decide whether you want to include Resource IDs (more detail and a bigger file) or not:

Now choose your delivery options: pick an S3 bucket (you’ll need to set the permissions per the sample policy), set a prefix if you’d like, and select the desired compression (GZIP or ZIP):

Click on Next, review your choices, and then create your report. It will become visible on the AWS Cost and Usage Reports page:

A fresh report will be delivered to the bucket within 24 hours. Additional reports will be provided every 24 hours (or less) thereafter.

From there you can transfer them to Redshift using an AWS Data Pipeline job or some code triggered by an AWS Lambda function, and then analyze them using the BI or data visualization tool of your choice.

Visualizing the Data
Here are some sample visualizations, courtesy of Amazon QuickSight. Looking at our EC2 spend by instance type gives an overall picture of our spending:

Viewing it over time shows that spending varies considerably from day to day:

Learn More
To learn more, read about Understanding Your Usage with Billing Reports.

Jeff;

 

AWS CloudTrail Update – Turn on in All Regions & Use Multiple Trails

by Jeff Barr | on | in AWS CloudTrail | | Comments

My colleague Sivakanth Mundru wrote the guest post below in order to share news of some important new features for AWS CloudTrail.

Jeff;

As many of you know, AWS CloudTrail provides visibility into API activity in your AWS account and enables you to answer important questions such as which user made an API call or which resources were acted upon in an API call. Today, we are happy to deliver two features that many of you asked for:

  1. The ability to turn on CloudTrail across all AWS regions.
  2. Support for multiple trails.

Turn on CloudTrail in All Regions
Until now, you had to turn on CloudTrail for each desired region. Many of you told us that this is time-consuming, and asked for the ability to turn on CloudTrail in all regions with a few clicks.

Starting immediately, you can simply specify that a trail will apply to all regions and CloudTrail will automatically create the same trail in each region, record and process log files in each region, and deliver log files from all regions to the S3 bucket or (optionally) the CloudWatch Logs log group you specified.

To be a bit more specific, “all” refers to the regions within a single AWS partition. The US East (Northern Virginia), US West (Northern California), US West (Oregon), Europe (Ireland), Europe (Frankfurt), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), and South America (Brazil) regions are all in the aws partition; the Beijing (China) region is in the aws-cn partition (read Amazon Resource Names (ARNs) and AWS Service Namespaces to learn more). The features described in this post apply to the aws partition.

Future Proof for New Regions
In addition to turning on CloudTrail for all existing regions, when AWS launches a new region, CloudTrail will create the trail in the new region and turn it on. As a result, you will receive log files containing API activity for your AWS account in the new region without taking any action.
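If you prefer to script this, the same setting can be sketched with the AWS SDK for Python (boto3), where the CreateTrail call accepts an IsMultiRegionTrail flag. The helper names, trail name, and bucket name below are hypothetical, and the bucket is assumed to already carry a policy that lets CloudTrail write to it. Treat this as an outline, not a complete deployment script.

```python
def all_regions_trail_request(name, bucket, log_group_arn=None, role_arn=None):
    """Build keyword arguments for CloudTrail's CreateTrail call.

    The CloudWatch Logs destination is optional and is only included when
    both the log group ARN and the role ARN are supplied.
    """
    request = {
        "Name": name,
        "S3BucketName": bucket,
        "IsMultiRegionTrail": True,  # apply the trail to all regions
    }
    if log_group_arn and role_arn:
        request["CloudWatchLogsLogGroupArn"] = log_group_arn
        request["CloudWatchLogsRoleArn"] = role_arn
    return request


def create_all_regions_trail(name, bucket, **optional):
    """Create the trail and start logging (requires boto3 and credentials)."""
    import boto3  # imported here so the helper above works without boto3

    client = boto3.client("cloudtrail")
    client.create_trail(**all_regions_trail_request(name, bucket, **optional))
    client.start_logging(Name=name)


if __name__ == "__main__":
    # Hypothetical names; printing the request makes no AWS calls.
    print(all_regions_trail_request("Allregionstrail", "my-cloudtrail-logs"))
```

Keeping the request builder separate from the call makes it easy to review the parameters before anything is created.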

Here’s how you turn on CloudTrail in all regions via the AWS Management Console:

Support for Multiple Trails
CloudTrail log files enable you to troubleshoot operational or security issues in your AWS account and help you demonstrate compliance with your internal policies or external standards. Different stakeholders have different needs. With support for multiple trails, different stakeholders in the company can create and manage their own trails for their own needs. For example:

  • A security administrator can create a trail that applies to all regions and encrypt the log files with one KMS key.
  • A developer can create a trail that applies to one region, for example Asia Pacific (Sydney), and configure CloudWatch alarms to receive notifications of specific API activity.
  • An IT auditor can create a trail that applies to one region, say Europe (Frankfurt), and configure log file integrity validation to positively assert that log files have not changed since CloudTrail delivered them to an S3 bucket.

Here’s what this would look like:

You can create up to 5 trails per region (a trail that applies to all regions exists in each region and is counted as 1 trail per region).
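To make the counting rule concrete, here is a small Python sketch that tallies trails per region for the three stakeholders described above. The tuple layout, the function names, and the shortened region list are assumptions made for illustration; this is not how CloudTrail itself tracks the limit.

```python
MAX_TRAILS_PER_REGION = 5  # limit stated in this post


def trails_per_region(trails, regions):
    """Count trails in each region.

    trails is a list of (name, home_region, applies_to_all_regions) tuples.
    A trail that applies to all regions counts once in every region.
    """
    counts = {region: 0 for region in regions}
    for _name, home_region, applies_to_all in trails:
        if applies_to_all:
            for region in regions:
                counts[region] += 1
        else:
            counts[home_region] = counts.get(home_region, 0) + 1
    return counts


def can_add_trail(trails, regions, home_region, applies_to_all):
    """Return True if one more trail would keep every region within the limit."""
    candidate = trails + [("candidate", home_region, applies_to_all)]
    counts = trails_per_region(candidate, regions)
    return all(count <= MAX_TRAILS_PER_REGION for count in counts.values())


if __name__ == "__main__":
    regions = ["us-east-1", "ap-southeast-2", "eu-central-1"]  # shortened list
    trails = [
        ("security", "us-east-1", True),        # security administrator
        ("developer", "ap-southeast-2", False),  # Asia Pacific (Sydney)
        ("auditor", "eu-central-1", False),      # Europe (Frankfurt)
    ]
    print(trails_per_region(trails, regions))
```

In this example the all-regions trail counts once everywhere, so Sydney and Frankfurt each hold two trails while US East holds one.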

As part of today’s launch we are announcing support for resource level permissions so that you can prescribe granular access control policies on which users can or cannot take particular actions on a given trail. For more details and sample policies, see the CloudTrail documentation.
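As a rough illustration of what a trail-level policy can look like, the Python sketch below builds an IAM policy document that denies a few management actions on one specific trail. The function name, account ID, and trail name are placeholders, and the choice of actions is an assumption for the example; the sample policies in the CloudTrail documentation remain the authoritative reference.

```python
import json


def trail_protection_policy(trail_arn):
    """Build an IAM policy that denies changes to a single trail."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Deny",
                "Action": [
                    "cloudtrail:StopLogging",
                    "cloudtrail:UpdateTrail",
                    "cloudtrail:DeleteTrail",
                ],
                # Scoping to one trail ARN is the resource-level part.
                "Resource": trail_arn,
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder account ID and trail name.
    arn = "arn:aws:cloudtrail:us-east-1:123456789012:trail/Allregionstrail"
    print(json.dumps(trail_protection_policy(arn), indent=2))
```

Attached to a user or group, a policy like this leaves other trails manageable while protecting the named one.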

Viewing and Managing Trails Across Regions
We are also announcing an important enhancement to the CloudTrail Console!

You can now view and manage trails across all regions in a partition, no matter which region you are in. You will see all the trails for your account in every region. You can click on the trail name and CloudTrail will navigate to the trail configuration page automatically:

As you can see, the trail named Allregionstrail applies to all regions. This means that the Allregionstrail exists in every region and log files for all regions are recorded and delivered to one S3 bucket and an optional CloudWatch Logs log group. Other trails are specific to a region and log files for those specific regions are recorded and delivered as per the trail configuration. You can click on a trail name to view, edit or delete a trail.

Pricing
All new and existing AWS customers can create one trail per region and record API activity for services supported by CloudTrail as a part of the free tier. The free tier does not have an expiration.

A trail that applies to all regions exists in each region and is counted as 1 trail per region.

You pay $2.00 per 100,000 events recorded in each additional trail. There is no charge for creating additional trails.
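Here is a quick way to estimate the charge for additional trails in Python. The sketch assumes that the charge scales proportionally with the number of events (rather than in whole blocks of 100,000), which this post does not spell out; the function name and event counts are made up for the example.

```python
from decimal import Decimal

PRICE_PER_BLOCK = Decimal("2.00")  # USD, from this post
EVENTS_PER_BLOCK = Decimal(100000)


def additional_trail_charge(events_in_additional_trails):
    """Estimate the charge for events recorded in additional trails.

    Pass one event count per additional trail. The free first trail in each
    region is not included. Proportional billing is an assumption.
    """
    total_events = Decimal(sum(events_in_additional_trails))
    charge = total_events / EVENTS_PER_BLOCK * PRICE_PER_BLOCK
    return charge.quantize(Decimal("0.01"))


if __name__ == "__main__":
    # Two additional trails recording 250,000 and 100,000 events.
    print(additional_trail_charge([250000, 100000]))  # prints 7.00
```

With 350,000 events across the additional trails, the estimate is 3.5 blocks at $2.00 each, or $7.00.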

Sivakanth Mundru, Senior Product Manager