🎯 Practical: AWS Blog | Tech Viral News

New AWS Competency – AWS Migration

by Jeff Barr | on 08 JUN 2016 | in AWS Partner Network

More and more, I hear from customers who want to migrate large-scale workloads to AWS, and seek advice regarding their cloud migration strategy. We provide customers with a number of cloud migration tools and services, including the AWS Database Migration Service, and resources, such as the AWS Professional Services Cloud Adoption Framework. Further, we have a strong and mature ecosystem of AWS Partner Network (APN) Consulting and Technology Partners who’ve demonstrated expertise in helping customers like you successfully migrate to AWS.

In an effort to make it as easy as we can for you to identify APN Partners who’ve demonstrated technical proficiency and proven customer success in migration, I’m pleased to announce the launch of the AWS Migration Competency.

New Migration Competency – Migration Partner Solutions
Migration Competency Partners provide solutions or have deep experience helping businesses move successfully to AWS through all phases of complex migration projects: discovery, planning, migration, and operations.

The AWS Partner Competency Program has validated that the partners below have demonstrated that they can help enterprise customers migrate applications and legacy infrastructure to AWS.

Categories and Launch Partners

Migration Delivery Partners – Help customers through every stage of migration, accelerating results by providing personnel, tools, and education in the form of professional services. These partners either are, or have a relationship with, an AWS-audited Managed Service Provider to help customers with ongoing support of AWS workloads. Here are the launch partners:

Migration Consulting Partners – Provide expertise and training to help enterprises quickly develop specific capabilities or achieve specific outcomes. They provide consulting services to enable adoption of DevOps practices, to modernize applications, and implement solutions. Here are our launch partners:

Migration Technology for Discovery & Planning – Discover IT assets across your application portfolio, identify dependencies and requirements, and build your comprehensive migration plan with these technology solutions. Here are our launch partners:

Migration Technology for Workload Mobility – Execute migrations to AWS by capturing your host server, configuration, storage, and network states, then provision and configure your AWS target resources. Here are our launch partners:

Migration Technology for Application Profiling – Gain valuable insights into your applications by capturing and analyzing performance data, usage, and monitoring dependencies before and after migration. Here are our launch partners:

Launch Partners in Action
Do you want to hear from a few of our launch Partners? Visit the videos below to hear Cloud Technology Partners (CTP), REAN Cloud, and Slalom discuss the evolution of enterprise cloud migrations, and the value of the AWS Migration Competency for customers:

Cloud Technology Partners – The Evolution of Cloud Migration

REAN Cloud – the Role of DevOps in Cloud Migrations

Slalom – The Value of the AWS Migration Competency for Customers

— Jeff;

Semi-Autonomous Driving Using EC2 Spot Instances at Mapbox

by Jeff Barr | on 08 JUN 2016 | in Customer Success, EC2 Spot Instances, Guest Post

Will White of Mapbox shared the following guest post with me. In the post, Will describes how they use EC2 Spot Instances to economically process the billions of data points that they collect each day.

I do have one note to add to Will’s excellent post. We know that many AWS customers would like to create Spot Fleets that automatically scale up and down in response to changes in demand. This is on our near-term roadmap and I’ll have more to say about it before too long.

— Jeff;

The largest automotive tech conference, TU-Automotive, kicked off in Detroit this morning with almost every conversation focused on strategies for processing the firehose of data coming off connected cars. The volume of data is staggering – last week alone we collected and processed over 100 million miles of sensor data into our maps.

Collecting Street Data
Rather than driving a fleet of cars down every street to make a map, we turn phones, cars, and other devices into a network of real-time sensors. EC2 Spot Instances process the billions of points we collect each day and let us see every street, analyze the speed of traffic, and connect the entire road network. This anonymized and aggregated data protects user privacy while allowing us to quickly detect road changes. The result is Mapbox Drive, the map built specifically for semi-autonomous driving, ride sharing, and connected cars.

Bidding for Spot Capacity
We use the Spot market to bid on spare EC2 instances, letting us scale our data collection and processing at one-tenth the cost. When you launch an EC2 Spot instance you set a bid price for how much you are willing to pay for the instance. The market price (the price you actually pay) constantly changes based on supply and demand in the market. If the market price ever exceeds your bid price, your EC2 Spot instance is terminated. Since Spot instances can spontaneously terminate, they have become a popular cost-saving tool for non-critical environments like staging, QA, and R&D – services that don’t require high availability. However, if you can architect your application to handle this kind of sudden termination, it becomes possible to run extremely resource-intensive services on Spot and save a massive amount of money while maintaining high availability.

The infrastructure that processes the 100 million miles of sensor data we collect each week is critical and must always be online, but it uses EC2 Spot Instances. We do it by running two Auto Scaling groups, a Spot group and an On-Demand group, that share a single Elastic Load Balancer. When Spot prices spike and instances get terminated, we simply fall back by automatically launching On-Demand instances to pick up the slack.

Handling Termination Notices
We use termination notices, which give us a two-minute warning before any EC2 Spot instance is terminated. When an instance receives a termination notice it immediately makes a call to the Auto Scaling API to scale up the On-Demand Auto Scaling group, seamlessly adding stable capacity to the Elastic Load Balancer. We have to pay On-Demand prices for the replacement EC2s, but only for as long as the Spot interruption lasts. When the Spot market price falls back below our bid price, our Spot Auto Scaling group will automatically launch new Spot instances. As the Spot capacity scales back up, an aggressive Auto Scaling policy scales down the On-Demand group, terminating the more expensive instances.
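The termination-notice flow described above can be sketched roughly as follows. This is a minimal illustration, not Mapbox's actual code: the polling helper and the `scale_up_on_demand` function are hypothetical names, and the `autoscaling` argument is assumed to be a boto3 Auto Scaling client (or anything exposing the same two methods).

```python
import urllib.error
import urllib.request

# Instance metadata path that EC2 populates once a Spot termination is scheduled.
TERMINATION_URL = "http://169.254.169.254/latest/meta-data/spot/termination-time"


def termination_scheduled(url=TERMINATION_URL, timeout=1.0):
    """Return True once the metadata service reports a termination time."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # A 404 (no notice yet) or an unreachable endpoint both mean "keep running".
        return False


def scale_up_on_demand(autoscaling, group_name, extra=1):
    """Raise the On-Demand group's desired capacity by `extra`, capped at MaxSize."""
    groups = autoscaling.describe_auto_scaling_groups(
        AutoScalingGroupNames=[group_name]
    )["AutoScalingGroups"]
    current = groups[0]["DesiredCapacity"]
    target = min(current + extra, groups[0]["MaxSize"])
    autoscaling.set_desired_capacity(
        AutoScalingGroupName=group_name,
        DesiredCapacity=target,
        HonorCooldown=False,  # react immediately; the warning is only two minutes
    )
    return target
```

On a Spot instance, a small agent could call `termination_scheduled()` every few seconds and, on the first `True`, call `scale_up_on_demand(boto3.client("autoscaling"), "my-on-demand-group")`, where the group name is a placeholder.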

Building our data processing pipeline on Spot worked so well that we have now moved nearly every Mapbox service over to Spot too. As the traffic generated by over 170 million unique users of apps like Foursquare, MapQuest, and Weather.com grows each month, our cost of goods sold (COGS) continues to fall. Spot interruptions are relatively rare for the instance types we use, so the fallback is only triggered 1-2 times per month. This means we are running on discounted Spot instances more than 98% of the time. On our maps service alone, this has resulted in a 90% savings on our EC2 costs each month.

Going Further with Spot
To further optimize our COGS we’re working on a “waterfall” approach to fallback, pushing traffic to other configurations of Spot Instances first and only using On-Demand as an absolute last resort. For example, an application that normally runs on c4.xlarge instances is often compatible with other instance sizes in the same family (c4.2xlarge, c4.4xlarge, etc.) and instance types in other families (m4.2xlarge, m4.4xlarge, etc.). When our Spot EC2s get terminated, we’ll bid on the next cheapest option on the Spot market. This will result in more Spot interruptions, but our COGS decrease further because we’ll fall back on Spot instances instead of paying full price for On-Demand EC2 instances. This maximizes our COGS savings while maintaining high availability for our enterprise customers.
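One way to picture the waterfall is as a selection function over the current Spot prices. The sketch below is an assumption about how such a choice could be made, not the implementation described in the post; `pick_capacity`, the prices, and the bids are all hypothetical, and a real version would probably normalize price by vCPU or memory before comparing different instance sizes.

```python
def pick_capacity(spot_prices, bids, on_demand_type):
    """Choose the cheapest Spot option whose market price is at or below our bid.

    spot_prices: dict of instance type -> current Spot market price (USD/hour)
    bids:        dict of instance type -> the most we are willing to pay
    Returns (market, instance_type); On-Demand is used only as a last resort.
    """
    affordable = [
        (price, itype)
        for itype, price in spot_prices.items()
        if itype in bids and price <= bids[itype]
    ]
    if affordable:
        _, itype = min(affordable)  # lowest price wins
        return ("spot", itype)
    return ("on-demand", on_demand_type)
```

For example, if the c4.xlarge price spikes above its bid while c4.2xlarge and m4.2xlarge remain below theirs, the function returns the cheaper of those two Spot options instead of falling straight back to On-Demand.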

It’s worth noting that similar fallback functionality is built into EC2 Spot Fleet, but we prefer Auto Scaling groups due to a few limitations with Spot Fleet (for example, there’s no support for Auto Scaling without implementing it yourself) and because Auto Scaling groups give us the most flexibility.

Over the last 12 months, data collection and processing has increased our consumption of EC2 compute hours by 1044%, but our COGS actually decreased. We used to see our costs increase linearly with consumption, but now see these hockey stick increases in consumption while costs stay basically flat for the same period.

If you’re building a resource-hungry application that requires high availability and the costs for On-Demand EC2 instances make it unsustainable to run, take a close look at EC2 Spot Instances. Combined with the right architecture and some creative orchestration, EC2 Spot Instances will allow you to run your application with extremely low COGS.

— Will White, Development Engineering, Mapbox

Amazon EMR 4.7.0 – Apache Tez & Phoenix, Updates to Existing Apps

by Jeff Barr | on 02 JUN 2016 | in Amazon EMR

Amazon EMR allows you to quickly and cost-effectively process vast amounts of data. Since the 2009 launch, we have added many new features and support for an ever-increasing roster of applications from the Hadoop ecosystem. Here are a few of the additions that we have made this year:

April – Support for Apache HBase 1.2 (EMR 4.6).
March – Support for Sqoop, HCatalog, Java 8, and more (EMR 4.4).
February – Support for EBS volumes, M4 instances, and C4 instances.
January – Support for Apache Spark, with updates to other applications.

Today we are pushing forward once again, with new support for Apache Tez (dataflow-driven data processing task orchestration) and Apache Phoenix (fast SQL for OLTP and operational analytics), along with updates to several of the existing apps. In order to make use of these new and/or updated applications, you will need to launch a cluster that runs release 4.7.0 of Amazon EMR.

New – Apache Tez (0.8.3)
Tez runs on top of Apache Hadoop YARN. Tez provides you with a set of dataflow definition APIs that allow you to define a DAG (Directed Acyclic Graph) of data processing tasks. Tez can be faster than Hadoop MapReduce, and can be used with both Hive and Pig. To learn more, read the EMR Release Guide. The Tez UI includes a graphical view of the DAG:

The UI also displays detailed information about each DAG:

New – Apache Phoenix (4.7.0)
Phoenix uses HBase (another member of the Hadoop ecosystem) as its datastore. You can connect to Phoenix using a JDBC driver included on the cluster or from other applications that are running on or off of the cluster. Either way, you get access to fast, low-latency SQL with full ACID transaction capabilities. Your SQL queries are compiled into a series of HBase scans, the scans are run in parallel, and the results are aggregated to produce the result set. To learn more, read the Phoenix Quick Start Guide or review the Apache Phoenix Overview Presentation.
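The scan-in-parallel-then-aggregate pattern mentioned above can be illustrated with a small scatter-gather sketch. This is only a conceptual model in plain Python, not Phoenix or HBase code; `scan_region` and `parallel_count` are hypothetical names, and each inner list stands in for the rows held by one HBase region.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_region(rows, predicate):
    """Stand-in for one HBase scan: count matching rows in a single region."""
    return sum(1 for row in rows if predicate(row))


def parallel_count(regions, predicate):
    """Run one scan per region in parallel, then aggregate the partial counts."""
    workers = max(1, len(regions))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(lambda rows: scan_region(rows, predicate), regions)
        return sum(partials)
```

A query such as `SELECT COUNT(*) ... WHERE value > 4` would correspond here to `parallel_count(regions, lambda v: v > 4)`: each region is filtered independently and only the small partial results are combined.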

Updated Applications
We have also updated the following applications:

HBase 1.2.1 – HBase provides low-latency, random access to massive datasets. The new version includes some bug fixes.
Mahout 0.12.0 – Mahout provides scalable machine learning and data mining. The new version includes a large set of math and statistics features.
Presto 0.147 – Presto is a distributed SQL query engine designed for large data sets. The new version adds features and fixes bugs.

Amazon Redshift JDBC Driver
You can use the new Redshift JDBC driver to allow applications running on your EMR clusters to access and update data stored in your Redshift clusters. Two versions of the driver are included on your cluster:

JDBC 4.0-compatible – /usr/share/aws/redshift/jdbc/RedshiftJDBC4.jar.
JDBC 4.1-compatible – /usr/share/aws/redshift/jdbc/RedshiftJDBC41.jar.

To start using the new and updated applications, simply launch a new EMR cluster, and select release 4.7.0 along with the desired set of applications.
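If you launch clusters from code rather than the console, the request might look something like the sketch below. It builds the keyword arguments for the EMR `RunJobFlow` call as exposed by boto3 (`run_job_flow`); the helper name, cluster name, instance type, and node count are assumptions chosen for illustration, and the two role names are the EMR default roles, which must already exist in your account.

```python
def emr_cluster_request(name, applications, instance_type="m4.large", count=3):
    """Build run_job_flow keyword arguments for an EMR 4.7.0 cluster."""
    return {
        "Name": name,
        "ReleaseLabel": "emr-4.7.0",  # the release that adds Tez and Phoenix
        "Applications": [{"Name": app} for app in applications],
        "Instances": {
            "MasterInstanceType": instance_type,
            "SlaveInstanceType": instance_type,
            "InstanceCount": count,
            "KeepJobFlowAliveWhenNoSteps": True,  # keep the cluster up for interactive use
        },
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }
```

With boto3 installed and credentials configured, the request could be submitted with `boto3.client("emr").run_job_flow(**emr_cluster_request("tez-phoenix-demo", ["Hadoop", "Tez", "Phoenix", "HBase"]))`.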

— Jeff;

Learn about Amazon Redshift in our new Data Warehousing on AWS Class

by Jeff Barr | on 02 JUN 2016 | in Amazon Redshift, Big Data, Training and Certification

As our customers continue to look to use their data to help drive their missions forward, finding a way to simply and cost-effectively make use of analytics is becoming increasingly important. That is why I am happy to announce the upcoming availability of Data Warehousing on AWS, a new course that helps customers leverage the AWS Cloud as a platform for data warehousing solutions.

New Course
Data Warehousing on AWS is a new three-day course that is designed for database architects, database administrators, database developers, and data analysts/scientists. It introduces you to concepts, strategies, and best practices for designing a cloud-based data warehousing solution using Amazon Redshift. This course demonstrates how to collect, store, and prepare data for the data warehouse by using other AWS services such as Amazon DynamoDB, Amazon EMR, Amazon Kinesis, and Amazon S3. Additionally, this course demonstrates how you can use business intelligence tools to perform analysis on your data. Organizations who are looking to get more out of their data by implementing a Data Warehousing solution or expanding their current Data Warehousing practice are encouraged to sign up.

These classes (and many more) are available through AWS and our Training Partners. Find upcoming classes in our global training schedule or learn more at AWS Training.

— Jeff;

New – Cross-Region Read Replicas for Amazon Aurora

by Jeff Barr | on 01 JUN 2016 | in Amazon Aurora, Amazon RDS

You already have the power to scale the read capacity of your Amazon Aurora instances by adding additional read replicas to an existing cluster. Today we are giving you the power to create a read replica in another region. This new feature will allow you to support cross-region disaster recovery and to scale out reads. You can also use it to migrate from one region to another or to create a new database environment in a different region.

Creating a read replica in another region also creates an Aurora cluster in the region. This cluster can contain up to 15 more read replicas, with very low replication lag (typically less than 20 ms) within the region (between regions, latency will vary based on the distance between the source and target). You can use this model to duplicate your cluster and read replica setup across regions for disaster recovery. In the event of a regional disruption, you can promote the cross-region replica to be the master. This will allow you to minimize downtime for your cross-region application. This feature applies to unencrypted Aurora clusters.

Before you actually create the read replica, you need to take care of a pair of prerequisites. You need to make sure that a VPC and the Database Subnet Groups exist in the target region, and you need to enable binary logging on the existing cluster.

Setting up the VPC
Because Aurora always runs within a VPC, ensure that the VPC and the desired Database Subnet Groups exist in the target region. Here are mine:

Enabling Binary Logging
Before you can create a cross-region read replica, you need to enable binary logging for your existing cluster. Create a new DB Cluster Parameter Group (if you are not already using a non-default one):

Enable binary logging (choose MIXED) and then click on Save Changes:

Next, Modify the DB Instance, select the new DB Cluster Parameter Group, check Apply Immediately, and click on Continue. Confirm your modifications, and then click on Modify DB Instance to proceed:

Select the instance and reboot it, then wait until it is ready.
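The same binary-logging change can be expressed as an API request. The sketch below builds the arguments for the RDS `ModifyDBClusterParameterGroup` call (boto3: `modify_db_cluster_parameter_group`); the helper name and the parameter group name are hypothetical. It assumes `binlog_format` is the cluster parameter behind the console setting and that it takes effect only after a reboot, which matches the reboot step above.

```python
VALID_FORMATS = {"ROW", "STATEMENT", "MIXED"}


def binlog_parameter_change(group_name, binlog_format="MIXED"):
    """Build the request that turns on binary logging in a cluster parameter group."""
    if binlog_format not in VALID_FORMATS:
        raise ValueError(f"unsupported binlog_format: {binlog_format}")
    return {
        "DBClusterParameterGroupName": group_name,
        "Parameters": [
            {
                "ParameterName": "binlog_format",
                "ParameterValue": binlog_format,
                "ApplyMethod": "pending-reboot",  # applied when the instance reboots
            }
        ],
    }
```

A script could then call `boto3.client("rds").modify_db_cluster_parameter_group(**binlog_parameter_change("my-aurora-binlog-group"))` before attaching the group to the cluster and rebooting.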

Create Read Replica
With the prerequisites out of the way it is time to create the read replica! From within the AWS Management Console, select the source cluster and choose Create Cross Region Read Replica from the Instance Actions menu:

Name the new cluster and the new instance, and then pick the target region. Choose the DB Subnet Group and set the other options as desired, then click Create:

Aurora will create the cluster and the instance. The state of both items will remain at creating until the items have been created and the data has been replicated (this could take some time, depending on the amount of data stored in the existing cluster).
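For readers who script their infrastructure, the console steps roughly correspond to two RDS API calls made in the target region: `CreateDBCluster` with a `ReplicationSourceIdentifier` pointing at the source cluster's ARN, followed by `CreateDBInstance` to add an instance to the new cluster. The sketch below only builds the request arguments; the helper names, identifiers, account ID, and instance class are placeholders, and the `aurora` engine name is an assumption based on the engine naming in use when this feature launched.

```python
def cluster_arn(region, account_id, cluster_id):
    """Format the ARN of an existing Aurora cluster."""
    return f"arn:aws:rds:{region}:{account_id}:cluster:{cluster_id}"


def replica_requests(source_arn, replica_cluster_id, replica_instance_id,
                     subnet_group, instance_class="db.r3.large"):
    """Build create_db_cluster / create_db_instance arguments for the target region."""
    cluster = {
        "DBClusterIdentifier": replica_cluster_id,
        "Engine": "aurora",
        "ReplicationSourceIdentifier": source_arn,  # makes the new cluster a replica
        "DBSubnetGroupName": subnet_group,
    }
    instance = {
        "DBInstanceIdentifier": replica_instance_id,
        "DBInstanceClass": instance_class,
        "Engine": "aurora",
        "DBClusterIdentifier": replica_cluster_id,
    }
    return cluster, instance
```

Both requests would be sent through a client created for the target region, for example `boto3.client("rds", region_name="eu-west-1")`, calling `create_db_cluster(**cluster)` and then `create_db_instance(**instance)`.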

This feature is available now and you can start using it today!

— Jeff;

New in AWS Marketplace: Alces Flight – Effortless HPC on Demand

by Jeff Barr | on 01 JUN 2016 | in AWS Marketplace, HPC

In the past couple of years, academic and corporate researchers have begun to see the value of the cloud. Faced with a need to run demanding jobs and to deliver meaningful results as quickly as possible while keeping costs under control, they are now using AWS to run a wide variety of compute-intensive, highly parallel workloads.

Instead of fighting for time on a cluster that must be shared with other researchers, they accelerate their work by launching clusters on demand, running their jobs, and then shutting the cluster down shortly thereafter, paying only for the resources that they consume. They replace tedious RFPs, procurement, hardware builds and acceptance testing with cloud resources that they can launch in minutes. As their needs grow, they can scale the existing cluster or launch a new one.

This self-serve, cloud-based approach favors science over servers and accelerates the pace of research and innovation. Access to shared, cloud-based resources can be granted to colleagues located on the same campus or halfway around the world, without having to worry about potential issues at organizational or network boundaries.

Alces Flight in AWS Marketplace
Today we are making Alces Flight available in AWS Marketplace. This is a fully-featured HPC environment that you can launch in a matter of minutes. It can make use of On-Demand or Spot Instances and comes complete with a job scheduler and hundreds of HPC applications that are all set up and ready to run. Some of the applications include built-in collaborative features such as shared graphical views. For example, here’s the Integrative Genomics Viewer (IGV):

Each cluster is launched into a Virtual Private Cloud (VPC) with SSH and graphical desktop connectivity. Clusters can be of fixed size, or can be Auto Scaled in order to meet changes in demand. Once launched, the cluster looks and behaves just like a traditional Linux-powered HPC cluster, with shared NFS storage and passwordless SSH access to the compute nodes. It includes access to HPC applications, libraries, tools, and MPI suites.

We are launching Alces Flight in AWS Marketplace today. You can launch a small cluster (up to 8 nodes) for evaluation and testing or a larger cluster for research.

If you subscribe to the product, you can download the AWS CloudFormation template from the Alces site. This template powers all of the products, and is used to quickly launch all of the AWS resources needed to create the cluster.

EC2 Spot Instances give you access to spare AWS capacity at up to a 90% discount from On-Demand pricing and can significantly reduce your cost per core. You simply enter the maximum bid price that you are willing to pay for a single compute node; AWS will manage your bid, running the nodes when capacity is available at the desired price point.

Running Alces Flight
In order to get some first-hand experience with Alces Flight, I launched a cluster of my own. Here are the settings that I used:

I set a tag for all of the resources in the stack as follows:

I confirmed my choices and gave CloudFormation the go-ahead to create my cluster. As expected, the cluster was all set up and ready to go within 5 minutes. Here are some of the events that were logged along the way:

Then I SSH’ed in to the login node and saw the greeting, all as expected:

After I launched my cluster I realized that this post would be more interesting if I had more compute nodes in my cluster. Instead of starting over, I simply modified my CloudFormation stack to have 4 nodes instead of 1, applied the change, and watched as the new nodes came online. Since I specified the use of Spot Instances when I launched the cluster, Auto Scaling placed bids automatically. Once the nodes were online I was able to locate them from within my PuTTY session:

Then I used the pdsh (Parallel Distributed Shell) command to check on the up-time of each compute node:

Learn More
This barely counts as scratching the surface; read Getting Started as Quickly as Possible to learn a lot more about what you can do! You should also watch one or more of the Alces videos to see this cool new product in action.

If you are building and running data-intensive HPC applications on AWS, you may also be interested in another Marketplace offering. The BeeGFS (self-supported or support included) parallel file system runs across multiple EC2 instances, aggregating the processing power into a single namespace, with all data stored on EBS volumes. The self-supported product is also available on a 14 day free trial. You can create a cluster file system using BeeGFS and then use it as part of your Alces cluster.

— Jeff;

Amazon ElastiCache Update – Export Redis Snapshots to Amazon S3

by Jeff Barr | on 26 MAY 2016 | in Amazon ElastiCache, Amazon S3

Amazon ElastiCache supports the popular Memcached and Redis in-memory caching engines. While Memcached is generally used to cache results from a slower, disk-based database, Redis is used as a fast, persistent key-value store. It uses replicas and failover to support high availability, and natively supports the use of structured values.

Today I am going to focus on a helpful new feature that will be of interest to Redis users. You already have the ability to create snapshots of a running Cache Cluster. These snapshots serve as a persistent backup, and can be used to create a new Cache Cluster that is already loaded with data and ready to go. As a reminder, here’s how you create a snapshot of a Cache Cluster:

You can now export your Redis snapshots to an S3 bucket. The bucket must be in the same Region as the snapshot and you need to grant ElastiCache the proper permissions (List, Upload/Delete, and View Permissions) on it. We envision several uses for this feature:

Disaster Recovery – You can copy the snapshot to another environment for safekeeping.

Analysis – You can dissect and analyze the snapshot in order to understand usage patterns.

Seeding – You can use the snapshot to seed a fresh Redis Cache Cluster in another Region.

Exporting a Snapshot
To export a snapshot, simply locate it, select it, and click on Copy Snapshot:

Verify the permissions on the bucket (read Exporting Your Snapshot to learn more):

Then enter a name and select the desired bucket:

ElastiCache will export the snapshot and it will appear in the bucket:

The file is a standard Redis RDB file, and can be used as such.

You can also exercise this same functionality from your own code or via the command line. Your code can call CopySnapshot while specifying the target S3 bucket. Your scripts can use the copy-snapshot command.
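As a rough sketch of the programmatic route, the function below wraps the `CopySnapshot` call as exposed by boto3 (`copy_snapshot`), passing the S3 bucket as `TargetBucket`. The function name, snapshot names, and bucket name are hypothetical, and the `elasticache` argument is assumed to be a boto3 ElastiCache client.

```python
def export_snapshot(elasticache, snapshot_name, export_name, bucket):
    """Export an existing Redis snapshot to an S3 bucket in the same Region."""
    response = elasticache.copy_snapshot(
        SourceSnapshotName=snapshot_name,
        TargetSnapshotName=export_name,  # becomes the name of the exported .rdb object
        TargetBucket=bucket,
    )
    # Return the snapshot description, if the service included one.
    return response.get("Snapshot", {})
```

Example call: `export_snapshot(boto3.client("elasticache"), "my-redis-backup", "my-redis-export", "my-export-bucket")`. The bucket permissions described above still need to be in place first.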

This feature is available now and you can start using it today! There’s no charge for the export; you’ll pay the usual S3 storage charges.

— Jeff;

Amazon Elastic Transcoder Update – Support for MPEG-DASH

by Jeff Barr | on 24 MAY 2016 | in Amazon Elastic Transcoder

Amazon Elastic Transcoder converts media files (audio and video) from one format to another. The service is robust, scalable, cost-effective, and easy to use. You simply create a processing pipeline (pointing to a pair of S3 buckets for input and output in the process), and then create transcoding jobs. Each job reads a specific file from the input bucket, transcodes it to the desired format(s) as specified in the job, and then writes the output to the output bucket. You pay for only what you transcode, with price points for Standard Definition (SD) video, High Definition (HD) video, and audio. We launched the service with support for an initial set of transcoding presets (combinations of output formats and relevant settings). Over time, in response to customer demand and changes in encoding technologies, we have added additional presets and formats. For example, we added support for the VP9 Codec earlier this year.

Support for MPEG-DASH
Today we are adding support for transcoding to the MPEG-DASH format. This International Standard format supports high-quality audio and video streaming from HTTP servers, and has the ability to adapt to changes in available network throughput using a technique known as adaptive streaming. It was designed to work well across multiple platforms and at multiple bitrates, simplifying the transcoding process and sidestepping the need to create output in multiple formats.

During the MPEG-DASH transcoding process, the content is transcoded into segmented outputs at the different bitrates and a playlist is created that references these outputs. The client (most often a video player) downloads the playlist to initiate playback. Then it monitors the effective network bandwidth and latency, requesting video segments as needed. If network conditions change during the playback process, the player will take action, upshifting or downshifting as needed.
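The upshift and downshift decision can be sketched as a simple selection over the bitrates listed in the playlist. Real players use more elaborate heuristics (buffer level, throughput smoothing), so treat this as an illustration only; `choose_bitrate`, the bitrate ladder, and the 0.8 safety factor are assumptions.

```python
def choose_bitrate(available_kbps, measured_kbps, safety=0.8):
    """Pick the highest bitrate that fits within a fraction of measured bandwidth."""
    budget = measured_kbps * safety  # leave headroom for throughput fluctuations
    fitting = [rate for rate in sorted(available_kbps) if rate <= budget]
    # If nothing fits, fall back to the lowest rendition rather than stalling.
    return fitting[-1] if fitting else min(available_kbps)
```

With a ladder of 600, 1200, 2400, and 4800 kbps, a measured throughput of 3500 kbps selects the 2400 kbps rendition; if throughput drops to 500 kbps the player downshifts to 600 kbps for the next segment.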

You can serve up the transcoded content directly from S3 or you can use Amazon CloudFront to get the content even closer to your users. Either way, you need to create a CORS policy that looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<CORSConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
    <CORSRule>
        <AllowedOrigin>*</AllowedOrigin>
        <AllowedMethod>GET</AllowedMethod>
        <MaxAgeSeconds>3000</MaxAgeSeconds>
        <AllowedHeader>*</AllowedHeader>
    </CORSRule>
</CORSConfiguration>
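If you manage bucket settings from code, the same policy can be written as the dictionary that the S3 `PutBucketCors` call accepts in boto3 (`put_bucket_cors`). The helper name and the bucket name in the usage note are placeholders; the rule values simply mirror the XML above.

```python
def dash_cors_configuration():
    """Return the CORS policy shown above in the shape boto3's put_bucket_cors expects."""
    return {
        "CORSRules": [
            {
                "AllowedOrigins": ["*"],
                "AllowedMethods": ["GET"],
                "MaxAgeSeconds": 3000,
                "AllowedHeaders": ["*"],
            }
        ]
    }
```

Applying it would look like `boto3.client("s3").put_bucket_cors(Bucket="my-output-bucket", CORSConfiguration=dash_cors_configuration())`.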

If you are using CloudFront, you need to enable the OPTIONS method, and allow it to be cached:

You also need to add three headers to the whitelist for the distribution:

Transcoding With MPEG-DASH
To make use of the adaptive bitrate feature of MPEG-DASH, you create a single transcoding job and specify multiple outputs, each with a different preset. Here are your choices (4 for video and 1 for audio):

When you use this format, you also need to choose a suitable segment duration (in seconds). A shorter duration produces a larger number of smaller segments and allows the client to adapt to changes more quickly.
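To make the segment-duration trade-off concrete, and to show what a multi-output job might look like, here is a hedged sketch. `segment_count` is plain arithmetic; `dash_job_request` builds arguments for the Elastic Transcoder `CreateJob` call (boto3: `create_job`). The helper names, output keys, playlist name, and preset IDs are placeholders, and you would substitute the real IDs of the MPEG-DASH presets you choose.

```python
import math


def segment_count(duration_seconds, segment_seconds):
    """Number of segments produced for one output at a given segment duration."""
    return math.ceil(duration_seconds / segment_seconds)


def dash_job_request(pipeline_id, input_key, presets, segment_seconds):
    """Build create_job arguments: one output per preset plus a single DASH playlist."""
    outputs = [
        {
            "Key": f"{name}/stream",
            "PresetId": preset_id,
            "SegmentDuration": str(segment_seconds),  # the API takes this as a string
        }
        for name, preset_id in presets.items()
    ]
    return {
        "PipelineId": pipeline_id,
        "Input": {"Key": input_key},
        "Outputs": outputs,
        "Playlists": [
            {
                "Name": "index",
                "Format": "MPEG-DASH",
                "OutputKeys": [output["Key"] for output in outputs],
            }
        ],
    }
```

A ten-minute video cut into 2-second segments yields 300 segments per output, versus 60 at 10 seconds: more, smaller files, but faster adaptation.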

You can create a single playlist that contains all of the bitrates, or you can choose the bitrates that are most appropriate for your customers and your content. You can also create your own presets, using an existing one as a starting point:

Available Now
MPEG-DASH support is available now in all Regions where Amazon Elastic Transcoder is available. There is no extra charge for the use of this format (see Elastic Transcoder Pricing to learn more).

— Jeff;

Amazon Redshift – Up to 2X Throughput and 10X Vacuuming Performance Improvements

by Jeff Barr | on 24 MAY 2016 | in Amazon Redshift

My colleague Maor Kleider wrote today’s guest post!

— Jeff;

Amazon Redshift, AWS’s fully managed data warehouse service, makes petabyte-scale data analysis fast, cheap, and simple. Since launch, it has been one of AWS’s fastest growing services, with many thousands of customers across many industries. Enterprises such as NTT DOCOMO, NASDAQ, FINRA, Johnson & Johnson, Hearst, Amgen, and web-scale companies such as Yelp, Foursquare and Yahoo! have made Amazon Redshift a key component of their analytics infrastructure.

In this blog post, we look at performance improvements we’ve made over the last several months to Amazon Redshift, improving throughput by more than 2X and vacuuming performance by 10X.

Column Store
Large scale data warehousing is largely an I/O problem, and Amazon Redshift uses a distributed columnar architecture to minimize and parallelize I/O. In a column-store, each column of a table is stored in its own data block. This reduces data size, since we can choose compression algorithms optimized for each type of column. It also reduces I/O time during queries, because only the columns in the table that are being selected need to be retrieved.
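A toy model makes the I/O argument easier to see. The sketch below is not how Redshift stores data (it ignores blocks, compression, and distribution); it simply counts how many values must be read to answer a query under each layout. The function names and sample columns are made up for illustration.

```python
def to_columns(rows):
    """Pivot a list of row dicts into one list per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def values_read_row_store(rows):
    """A row store fetches whole rows, so every column of every row is read."""
    return sum(len(row) for row in rows)


def values_read_column_store(columns, selected):
    """A column store reads only the columns the query selects."""
    return sum(len(columns[name]) for name in selected)
```

For a table with four columns and a query that selects just one of them, the column layout touches a quarter of the values that the row layout does, and the gap widens as tables get wider.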

However, while a column-store is very efficient at reading data, it is less efficient than a row-store at loading and committing data, particularly for small data sets. In patch 1.0.1012 (December 17, 2015), we released a significant improvement to our I/O and commit logic. This helped with small data loads and queries using temporary tables. While the improvements are workload-dependent, we estimate the typical customer saw a 35% improvement in overall throughput.

Regarding this feature, Naeem Ali, Director of Software Development, Data Science at Cablevision, told us:

Following the release of the I/O and commit logic enhancement, we saw a 2X performance improvement on a wide variety of workloads. The more complex the queries, the higher the performance improvement.

Improved Query Processing
In addition to enhancing the I/O and commit logic for Amazon Redshift, we released an improvement to the memory allocation for query processing in patch 1.0.1056 (May 17, 2016), increasing overall throughput by up to 60% (as measured on standard benchmarks TPC-DS, 3TB), depending on the workload and the number of queries that spill from memory to disk. The query throughput improvement increases with the number of concurrent queries, as less data is spilled from memory to disk, reducing required I/O.

Taken together, these two improvements should double performance for customer workloads where a portion of the workload contains complex queries that spill to disk or cause temporary tables to be created.
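The "double" figure follows from compounding the two gains quoted above, assuming they multiply independently and that a workload sees the full 60% from the memory allocation change (that number is an upper bound, so actual results will vary). A quick check, with `combined_speedup` as an illustrative helper:

```python
def combined_speedup(*gains):
    """Multiply independent throughput gains, each given as a factor (1.35 = +35%)."""
    total = 1.0
    for gain in gains:
        total *= gain
    return total


# 35% from the I/O and commit logic, up to 60% from memory allocation.
best_case = combined_speedup(1.35, 1.60)  # roughly 2.16x
```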

Better Vacuuming
Amazon Redshift uses multi-version concurrency control to reduce contention between readers and writers to a table. Like PostgreSQL, it does this by marking old versions of data as deleted and new versions as inserted, using the transaction ID as a marker. This allows readers to build a snapshot of the data they are allowed to see and traverse the table without locking. One issue with this approach is the system becomes slower over time, requiring a vacuum command to reclaim the space. This command reclaims the space from deleted rows and ensures new data that has been added to the table is placed in the right sorted order.
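The visibility rule and the job of vacuum can be sketched in a few lines. This is a deliberately simplified model, not Redshift's implementation: it ignores in-flight and aborted transactions, and the names `RowVersion`, `visible`, and `vacuum` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RowVersion:
    value: str
    insert_xid: int                   # transaction that inserted this version
    delete_xid: Optional[int] = None  # transaction that deleted it, if any


def visible(row, snapshot_xid):
    """A version is visible if it was inserted, and not yet deleted, as of the snapshot."""
    inserted = row.insert_xid <= snapshot_xid
    deleted = row.delete_xid is not None and row.delete_xid <= snapshot_xid
    return inserted and not deleted


def vacuum(rows, oldest_active_xid):
    """Drop versions no active snapshot can still see, then restore sorted order."""
    kept = [
        row for row in rows
        if row.delete_xid is None or row.delete_xid > oldest_active_xid
    ]
    return sorted(kept, key=lambda row: row.value)
```

Readers never block writers here because an update only adds a new version and marks the old one deleted; the cost is that dead versions and unsorted rows accumulate until a vacuum pass removes and reorders them.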

We are releasing a significant performance improvement to vacuum in patch 1.0.1056, available starting May 17, 2016. Customers previewing the feature have seen dramatic improvements both in vacuum performance and overall system throughput as vacuum requires fewer resources.

Ari Miller, a Principal Software Engineer at TripAdvisor, told me:

We estimate that the vacuum operation on a 15TB table went about 10X faster with the recent patch, ultimately improving overall query performance.

You can query the VERSION function to verify that you are running at the desired patch level.

Available Now
Unlike on-premises data warehousing solutions, there are no license or maintenance fees for these improvements or work required on your part to obtain them. They simply show up as part of the automated patching process during your maintenance window.

— Maor Kleider, Senior Product Manager, Amazon Redshift

EC2 Instance Console Screenshot

by Jeff Barr | on 24 MAY 2016 | in Amazon EC2

When our users move existing machine images to the cloud for use on Amazon EC2, they occasionally encounter issues with drivers, boot parameters, system configuration settings, and in-progress software updates. These issues can cause the instance to become unreachable via RDP (for Windows) or SSH (for Linux) and can be difficult to diagnose. On a traditional system, the physical console often contains log messages or other clues that can be used to identify and understand what’s going on.

In order to provide you with additional visibility into the state of your instances, we now offer the ability to generate and capture screenshots of the instance console. You can generate screenshots while the instance is running or after it has crashed.

Here’s how you generate a screenshot from the console (the instance must be using HVM virtualization):

And here’s the result:

It can also be used for Windows instances:

You can also create screenshots using the CLI (aws ec2 get-console-screenshot) or the EC2 API (GetConsoleScreenshot).
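As a sketch of the API route, the function below calls `GetConsoleScreenshot` through a boto3 EC2 client (`get_console_screenshot`) and writes the result to disk. It assumes the response carries the image as base64-encoded data in `ImageData`; the function name, instance ID, and file path are placeholders.

```python
import base64


def save_console_screenshot(ec2, instance_id, path, wake_up=False):
    """Fetch a console screenshot and write the decoded image to `path`."""
    response = ec2.get_console_screenshot(InstanceId=instance_id, WakeUp=wake_up)
    image = base64.b64decode(response["ImageData"])  # the API returns base64 text
    with open(path, "wb") as handle:
        handle.write(image)
    return len(image)  # number of bytes written
```

Example call: `save_console_screenshot(boto3.client("ec2"), "i-0123456789abcdef0", "console.jpg")`.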

Available Now
This feature is available today in the US East (Northern Virginia), US West (Oregon), US West (Northern California), Europe (Ireland), Europe (Frankfurt), Asia Pacific (Tokyo), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), and South America (Brazil) Regions. There are no costs associated with it.

— Jeff;
