Underutilized Redshift Cluster

Risk Level: High (not acceptable risk)

Rule ID: RS-015

Identify any Amazon Redshift clusters that appear to be underutilized and downsize them to help lower your monthly AWS bill. By default, an AWS Redshift cluster is considered "underutilized" when it matches the following criteria:

The average CPU utilization has been less than 60% for the last 7 days.
The total number of ReadIOPS and WriteIOPS registered per day for the last 7 days has been less than 100 on average.

The AWS CloudWatch metrics utilized to detect underused Redshift clusters are:

CPUUtilization - the percentage of CPU utilization (Units: Percent).
ReadIOPS and WriteIOPS - the average number of disk I/O (Input/Output) operations per second (Units: Count/Second).

Note: You can change the default threshold values for this rule on the Cloud Conformity console and set your own values for CPU utilization, the total number of ReadIOPS and WriteIOPS to configure the underuse level for your Redshift clusters.
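
The decision the rule makes can be sketched as a small predicate. This is a minimal illustration, not the Conformity implementation: the names Thresholds and is_underutilized are hypothetical, the defaults mirror the values stated above, and it assumes that ReadIOPS and WriteIOPS are each compared against the IOPS threshold separately, as the Audit steps do.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    # Defaults mirror the rule's stated defaults; both are configurable.
    max_avg_cpu_percent: float = 60.0
    max_avg_daily_iops: float = 100.0


def is_underutilized(avg_cpu_percent, avg_daily_read_iops, avg_daily_write_iops,
                     thresholds=Thresholds()):
    """Return True when every criterion holds for the 7-day window.

    Assumption: read and write IOPS are checked independently against
    the same threshold, following the Audit section.
    """
    return (
        avg_cpu_percent < thresholds.max_avg_cpu_percent
        and avg_daily_read_iops < thresholds.max_avg_daily_iops
        and avg_daily_write_iops < thresholds.max_avg_daily_iops
    )


# A quiet cluster is flagged; a CPU-busy one is not.
print(is_underutilized(12.2, 0.4, 0.1))
print(is_underutilized(75.0, 0.4, 0.1))
```

Passing a custom Thresholds instance corresponds to changing the default values on the console.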

This rule can help you work with the AWS Well-Architected Framework.

This rule resolution is part of the Conformity Security & Compliance tool for AWS.

Sustainability

Cost optimisation

Downsizing underused AWS Redshift clusters to meet your capacity needs at the lowest cost is an efficient strategy for reducing your monthly AWS costs. For example, by resizing a ds2.xlarge Redshift cluster with underused CPU and IOPS resources to a dc1.large cluster, you can save roughly $440 per month (as of March 2017).
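
The arithmetic behind an estimate like this can be sketched as follows. The hourly rates used here are assumed, illustrative on-demand prices for a single node and are not taken from this article; the function name monthly_savings is hypothetical. Check the current AWS price list before relying on any figure.

```python
HOURS_PER_MONTH = 730  # average hours in a month (8760 / 12)


def monthly_savings(current_hourly_usd, target_hourly_usd, node_count=1):
    """Estimate the monthly saving from moving to a cheaper node type."""
    return (current_hourly_usd - target_hourly_usd) * node_count * HOURS_PER_MONTH


# Assumed rates (USD per hour): 0.85 for the larger node, 0.25 for the smaller.
print(round(monthly_savings(0.85, 0.25), 2))
```

With these assumed rates the result lands close to the rough monthly figure quoted above.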

Audit

To identify any underused Redshift clusters provisioned within your AWS account, perform the following:

Using AWS Console

01 Sign in to the AWS Management Console.

02 Navigate to the Redshift dashboard at https://console.aws.amazon.com/redshift/.

03 In the left navigation panel, under Redshift Dashboard, click Clusters.

04 Choose the Redshift cluster that you want to examine then click on its identifier link, listed in the Cluster column.

05 On the selected cluster settings page, choose the Performance tab to access the monitoring panel.

06 Click the Show Monitoring button in the dashboard top menu and select Show Multi-Graph View to expand the AWS CloudWatch monitoring panel.

07 On the monitoring panel displayed for the selected cluster, perform the following actions:

To verify the Redshift cluster CPU Utilization usage graph, follow the steps below:
- From the Time Range dropdown list, select Last 1 Week.
- From the Period list, select 1 Hour.
- From the Statistic dropdown list, select Average.
- And from the Metrics dropdown list, select CPU Utilization.
Once the monitoring data is loaded into the CPU Utilization usage graph, check the cluster CPU usage for the last 7 days. If the average usage has been less than 60%, the selected Redshift cluster qualifies as a candidate underutilized cluster.
To verify the cluster Read IOPS usage graph, follow the steps below:
- From the Time Range dropdown list, select Last 1 Week.
- From the Period list, select 1 Hour.
- From the Statistic dropdown list, select Sum.
- And from the Metrics dropdown list, select ReadIOPS.
Once the monitoring data is available in the ReadIOPS usage graph, verify the total number of Read operations per second recorded in the last 7 days. If the total number of ReadIOPS has been less than 100, the selected Redshift cluster qualifies as a candidate underutilized cluster.
To verify the cluster Write IOPS usage graph, follow the steps below:
- From the Time Range dropdown list, select Last 1 Week.
- From the Period list, select 1 Hour.
- From the Statistic dropdown list, select Sum.
- And from the Metrics dropdown list, select WriteIOPS.
Once the monitoring data is available in the WriteIOPS usage graph, verify the total number of Write operations per second recorded in the last 7 days. If the total number of WriteIOPS has been less than 100, the selected Redshift cluster qualifies as a candidate underutilized cluster.

08 Repeat steps no. 4 – 7 to verify the CPU, ReadIOPS and WriteIOPS metrics usage data recorded within the selected time frame (7 days) for the rest of the Redshift clusters available in the current region.

09 Change the AWS region from the navigation bar and repeat the audit process for other regions.

Using AWS CLI

01 Run the describe-clusters command (OSX/Linux/UNIX) with a custom query filter to list the identifiers of all AWS Redshift clusters created in the selected region:

aws redshift describe-clusters \
    --region us-east-1 \
    --output table \
    --query 'Clusters[*].ClusterIdentifier'

02 The command output should return a table with the requested cluster IDs:

------------------------
|   DescribeClusters   |
+----------------------+
|  cc-staging-cluster  |
|  cc-sandbox-cluster  |
+----------------------+

03 Run the get-metric-statistics command (OSX/Linux/UNIX) to get the statistics recorded by CloudWatch for the CPUUtilization metric, which represents the CPU usage of the selected Redshift cluster. The following command example returns the average CPU utilization for the AWS Redshift cluster identified by the ID cc-staging-cluster, captured over a 7-day time range with a 1-hour period as the granularity of the returned datapoints:

aws cloudwatch get-metric-statistics \
    --region us-east-1 \
    --metric-name CPUUtilization \
    --start-time 2017-03-05T17:29:41 \
    --end-time 2017-03-12T17:29:41 \
    --period 3600 \
    --namespace AWS/Redshift \
    --statistics Average \
    --dimensions Name=ClusterIdentifier,Value=cc-staging-cluster

04 The command output should return the cluster CPU usage details requested:

{
    "Datapoints": [
        {
            "Timestamp": "2017-03-05T17:29:41Z",
            "Average": 12.2085,
            "Unit": "Percent"
        },
        {
            "Timestamp": "2017-03-05T18:29:41Z",
            "Average": 11.033499999999999995,
            "Unit": "Percent"
        },
        {
            "Timestamp": "2017-03-05T19:29:41Z",
            "Average": 11.10425,
            "Unit": "Percent"
        },

        ...

        {
            "Timestamp": "2017-03-12T15:29:41Z",
            "Average": 1.430999999999999993,
            "Unit": "Percent"
        },
        {
            "Timestamp": "2017-03-12T16:29:41Z",
            "Average": 0.92833333333333333,
            "Unit": "Percent"
        },
        {
            "Timestamp": "2017-03-12T17:29:41Z",
            "Average": 0.52783333333333333,
            "Unit": "Percent"
        }
    ],
    "Label": "CPUUtilization"
}

If the average CPU usage returned is less than 60%, the selected AWS Redshift cluster qualifies as a candidate underutilized cluster.
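
Reading dozens of hourly datapoints by eye is error-prone, so the check can be sketched as a short script over the JSON returned at step no. 4. This is a minimal sketch: the function name average_cpu is hypothetical, the sample holds only three datapoints copied from the output above, and averaging the hourly averages is an approximation of the true 7-day average.

```python
import json


def average_cpu(cli_output):
    """Mean of the hourly "Average" datapoints, or None if there are none."""
    datapoints = json.loads(cli_output)["Datapoints"]
    if not datapoints:
        return None
    return sum(dp["Average"] for dp in datapoints) / len(datapoints)


# Three datapoints taken from the sample output; a real run returns many more.
sample = """
{
    "Datapoints": [
        {"Timestamp": "2017-03-05T17:29:41Z", "Average": 12.2085, "Unit": "Percent"},
        {"Timestamp": "2017-03-05T19:29:41Z", "Average": 11.10425, "Unit": "Percent"},
        {"Timestamp": "2017-03-12T17:29:41Z", "Average": 0.52783333333333333, "Unit": "Percent"}
    ],
    "Label": "CPUUtilization"
}
"""

cpu = average_cpu(sample)
print(cpu is not None and cpu < 60.0)
```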

05 Run the get-metric-statistics command (OSX/Linux/UNIX) again to get the statistics recorded by AWS CloudWatch for the ReadIOPS metric, which represents the number of Read I/O operations per second. The following command example returns the total number of ReadIOPS used by the Amazon Redshift cluster identified by the name cc-staging-cluster, captured over a 7-day time range with a 1-hour period as the granularity of the returned datapoints:

aws cloudwatch get-metric-statistics \
    --region us-east-1 \
    --metric-name ReadIOPS \
    --start-time 2017-03-05T17:44:09 \
    --end-time 2017-03-12T17:44:09 \
    --period 3600 \
    --namespace AWS/Redshift \
    --statistics Sum \
    --dimensions Name=ClusterIdentifier,Value=cc-staging-cluster

06 The command output should return the ReadIOPS usage details requested:

{
    "Datapoints": [
        {
            "Timestamp": "2017-03-05T17:44:09Z",
            "Sum": 0.4293539505765276,
            "Unit": "Count/Second"
        },
        {
            "Timestamp": "2017-03-05T18:44:09Z",
            "Sum": 0.4000329652228976,
            "Unit": "Count/Second"
        },
        {
            "Timestamp": "2017-03-05T19:44:09Z",
            "Sum": 0.4001483344299335,
            "Unit": "Count/Second"
        },

        ...

        {
            "Timestamp": "2017-03-12T15:44:09Z",
            "Sum": 0.0000557761644715,
            "Unit": "Count/Second"
        },
        {
            "Timestamp": "2017-03-12T16:44:09Z",
            "Sum": 0.133804686450845,
            "Unit": "Count/Second"
        },
        {
            "Timestamp": "2017-03-12T17:44:09Z",
            "Sum": 0.087927411198773,
            "Unit": "Count/Second"
        }
    ],
    "Label": "ReadIOPS"
}

If the total number of ReadIOPS has been less than 100 in the last 7 days, the selected Redshift cluster qualifies as a candidate underutilized cluster.
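
The rule's criterion is stated per day, so one way to evaluate the hourly "Sum" datapoints is to total them for each calendar day and then average the daily totals. The sketch below does that for either IOPS metric. The function name average_daily_total is hypothetical, and grouping by the date prefix of the timestamp is an assumption about how "per day" is meant.

```python
import json
from collections import defaultdict


def average_daily_total(cli_output):
    """Average of per-day totals of the "Sum" datapoints, or None if empty."""
    totals = defaultdict(float)
    for dp in json.loads(cli_output)["Datapoints"]:
        day = dp["Timestamp"][:10]  # YYYY-MM-DD prefix of the ISO 8601 timestamp
        totals[day] += dp["Sum"]
    if not totals:
        return None
    return sum(totals.values()) / len(totals)


sample = """
{
    "Datapoints": [
        {"Timestamp": "2017-03-05T17:44:09Z", "Sum": 1.0, "Unit": "Count/Second"},
        {"Timestamp": "2017-03-05T18:44:09Z", "Sum": 2.0, "Unit": "Count/Second"},
        {"Timestamp": "2017-03-06T17:44:09Z", "Sum": 3.0, "Unit": "Count/Second"}
    ],
    "Label": "ReadIOPS"
}
"""

daily = average_daily_total(sample)
print(daily is not None and daily < 100.0)
```

The sample values are made up for readability; the same function applies to the WriteIOPS output.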

07 Run the get-metric-statistics command (OSX/Linux/UNIX) to get the statistics recorded by Amazon CloudWatch for the WriteIOPS metric, which represents the number of Write I/O operations per second. The following command example returns the total number of WriteIOPS used by the AWS Redshift cluster identified by the name cc-staging-cluster, captured over a 7-day time range with a 1-hour period as the granularity of the returned datapoints:

aws cloudwatch get-metric-statistics \
    --region us-east-1 \
    --metric-name WriteIOPS \
    --start-time 2017-03-05T17:51:13 \
    --end-time 2017-03-12T17:51:13 \
    --period 3600 \
    --namespace AWS/Redshift \
    --statistics Sum \
    --dimensions Name=ClusterIdentifier,Value=cc-staging-cluster

08 The command output should return the WriteIOPS usage details requested:

{
    "Datapoints": [
        {
            "Timestamp": "2017-03-05T17:51:13Z",
            "Sum": 0.1293539505765276,
            "Unit": "Count/Second"
        },
        {
            "Timestamp": "2017-03-05T18:51:13Z",
            "Sum": 0.2000329652228976,
            "Unit": "Count/Second"
        },
        {
            "Timestamp": "2017-03-05T19:51:13Z",
            "Sum": 0.00083333333333333,
            "Unit": "Count/Second"
        },

        ...

        {
            "Timestamp": "2017-03-12T15:51:13Z",
            "Sum": 0.0,
            "Unit": "Count/Second"
        },
        {
            "Timestamp": "2017-03-12T16:51:13Z",
            "Sum": 0.0,
            "Unit": "Count/Second"
        },
        {
            "Timestamp": "2017-03-12T17:51:13Z",
            "Sum": 0.0,
            "Unit": "Count/Second"
        }
    ],
    "Label": "WriteIOPS"
}

If the total number of WriteIOPS has been less than 100 within the past 7 days, the selected Redshift cluster qualifies as a candidate underutilized cluster.

If the usage data returned at steps no. 3 - 8 satisfies the conditions set by the conformity rule, the selected Redshift cluster is considered "underutilized" and should be resized in order to reduce your AWS Redshift usage costs.

09 Repeat steps no. 3 – 8 to verify the CPU, ReadIOPS and WriteIOPS metrics usage data recorded within the selected time range for the rest of the Redshift clusters provisioned in the current region.

10 Change the AWS region by updating the --region command parameter value and repeat steps no. 1 - 9 to perform the entire audit process for other regions.
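
Because the audit repeats the same three queries for every cluster and region, the argument lists can be generated rather than typed, with the 7-day window computed from the current time instead of hard-coded timestamps. This is a minimal sketch: the function name metric_commands is hypothetical, and it only builds the arguments; running them (for example with subprocess.run) requires an installed and configured AWS CLI.

```python
from datetime import datetime, timedelta, timezone

# Metric name mapped to the statistic requested for it in the steps above.
METRICS = {"CPUUtilization": "Average", "ReadIOPS": "Sum", "WriteIOPS": "Sum"}


def metric_commands(cluster_id, region, now=None, days=7):
    """Build one get-metric-statistics argument list per audited metric."""
    end = (now or datetime.now(timezone.utc)).replace(microsecond=0)
    start = end - timedelta(days=days)
    fmt = "%Y-%m-%dT%H:%M:%S"
    commands = []
    for metric, statistic in METRICS.items():
        commands.append([
            "aws", "cloudwatch", "get-metric-statistics",
            "--region", region,
            "--metric-name", metric,
            "--start-time", start.strftime(fmt),
            "--end-time", end.strftime(fmt),
            "--period", "3600",
            "--namespace", "AWS/Redshift",
            "--statistics", statistic,
            "--dimensions", f"Name=ClusterIdentifier,Value={cluster_id}",
        ])
    return commands


for command in metric_commands("cc-staging-cluster", "us-east-1"):
    print(" ".join(command))
```

Calling the function once per cluster identifier and region covers steps no. 9 and 10.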

Remediation / Resolution

Option 1: Downsize (resize) the underused Redshift clusters provisioned within your AWS account. To resize any Amazon Redshift cluster that is currently running in "underutilized" mode, perform the following actions:

Using AWS Console

01 Sign in to the AWS Management Console.

02 Navigate to the Redshift dashboard at https://console.aws.amazon.com/redshift/.

03 In the navigation panel, under Redshift Dashboard, click Clusters.

04 Select the Redshift cluster that you want to resize then click on its identifier link, listed in the Cluster column (see Audit section part I to identify the right resource).

05 Click the Backup dropdown button from the dashboard top menu and select Take Snapshot.

06 Inside the Create Snapshot dialog box, enter a unique name for your cluster snapshot in the Snapshot Identifier box then click Create to take the snapshot. The process may take several minutes. Once the snapshot is created it will be listed on your Redshift Snapshots page.

07 In the navigation panel, under Redshift Dashboard, click Snapshots.

08 Select the Amazon Redshift cluster snapshot created at step no. 6.

09 Click the Actions dropdown button from the dashboard top menu and select Restore From Snapshot.

10 In the Restore From Snapshot dialog box, perform the following actions:

Select the node type to downsize to (e.g. dc1.large) from the Node Type dropdown list.
In the Cluster Identifier box, enter a unique name for the new (downsized) Redshift cluster.
Configure the rest of the options (Cluster Parameter Group, Availability Zone, VPC Security Groups, etc) based on the configuration information taken from the original Redshift cluster.
Click Restore to create the new (resized) Redshift cluster.

11 As soon as the build process is complete, update your application configuration to refer to the new cluster endpoint, e.g. new-cc-staging-cluster.dvopsgvyjfhe.us-east-1.redshift.amazonaws.com.

12 Once the Redshift cluster endpoint is changed within your application configuration, you can remove the initial (original) cluster from your AWS account by performing the following actions:

In the navigation panel, under Redshift Dashboard, click Clusters.
Choose the Redshift cluster that you want to remove, then click on its identifier link listed in the Cluster column.
On the selected cluster Configuration tab, click the Cluster dropdown button from the dashboard main menu then select Delete.
Inside the Delete Cluster dialog box, enter a unique name for the final snapshot in the Snapshot name box then click Delete to confirm the action. Once the snapshot is created the selected cluster removal process begins.

13 Repeat steps no. 4 - 12 to downsize (resize) any other underutilized Amazon Redshift clusters provisioned within the current region.

14 Change the AWS region from the navigation bar and repeat the entire remediation process for other regions.

Using AWS CLI

01 Run the describe-clusters command (OSX/Linux/UNIX) to describe the configuration of the AWS Redshift cluster that you want to downsize (see Audit section part II to identify the right cluster):

aws redshift describe-clusters \
    --region us-east-1 \
    --cluster-identifier cc-staging-cluster

02 The command output should return the requested configuration metadata, which will be useful later when the new Redshift cluster is created:

{
    "Clusters": [
        {
            "PubliclyAccessible": true,
            "MasterUsername": "ccclusteruser",
            "NumberOfNodes": 1,
            "PendingModifiedValues": {},
            "VpcId": "vpc-3b456985",
            "ClusterVersion": "1.0",
            "AutomatedSnapshotRetentionPeriod": 1,
            "DBName": "ccclusterdb",
            "PreferredMaintenanceWindow": "fri:03:00-fri:03:30",
            "AllowVersionUpgrade": true,
            "ClusterCreateTime": "2016-11-17T16:38:54.654Z",

            ...

            "ClusterSubnetGroupName": "default",
            "ClusterSecurityGroups": [],
            "ClusterIdentifier": "cc-staging-cluster",
            "ClusterNodes": [
                {
                    "NodeRole": "SHARED",
                    "PrivateIPAddress": "172.31.21.17",
                    "PublicIPAddress": "53.27.141.204"
                }
            ],
            "AvailabilityZone": "us-east-1a",
            "NodeType": "ds2.xlarge",
            "Encrypted": false,
            "ClusterRevisionNumber": "1106",
            "ClusterStatus": "available"
        }
    ]
}

03 Run the create-cluster-snapshot command (OSX/Linux/UNIX) to create a manual snapshot of the existing Redshift cluster:

aws redshift create-cluster-snapshot \
    --region us-east-1 \
    --cluster-identifier cc-staging-cluster \
    --snapshot-identifier cc-staging-cluster-snapshot-20171203

04 The command output should return the snapshot configuration metadata:

{
    "Snapshot": {
        "EstimatedSecondsToCompletion": -1,
        "OwnerAccount": "123456789012",
        "CurrentBackupRateInMegaBytesPerSecond": 0.0,
        "ActualIncrementalBackupSizeInMegaBytes": -1.0,
        "NumberOfNodes": 1,
        "Status": "creating",
        "VpcId": "vpc-3b456985",
        "ClusterVersion": "1.0",
        "Tags": [],
        "MasterUsername": "ccclusteruser",
        "TotalBackupSizeInMegaBytes": -1.0,
        "DBName": "ccclusterdb",
        "BackupProgressInMegaBytes": 0.0,
        "ClusterCreateTime": "2016-11-17T16:38:54.654Z",
        "EncryptedWithHSM": false,
        "ClusterIdentifier": "cc-staging-cluster",
        "SnapshotCreateTime": "2017-03-12T18:15:49.041Z",
        "AvailabilityZone": "us-east-1a",
        "NodeType": "ds2.xlarge",
        "Encrypted": false,
        "ElapsedTimeInSeconds": 0,
        "SnapshotType": "manual",
        "Port": 5439,
        "SnapshotIdentifier": "cc-staging-cluster-snapshot-20171203"
    }
}

05 Now run the restore-from-cluster-snapshot command (OSX/Linux/UNIX) to create a new AWS Redshift cluster from the snapshot created at step no. 3, using the configuration information returned at step no. 2 and the desired cluster node type as parameters:

aws redshift restore-from-cluster-snapshot \
    --region us-east-1 \
    --cluster-identifier new-cc-staging-cluster \
    --snapshot-identifier cc-staging-cluster-snapshot-20171203 \
    --node-type dc1.large \
    --vpc-security-group-ids sg-b47e2e89 \
    --cluster-subnet-group-name default \
    --availability-zone us-east-1a \
    --cluster-parameter-group-name default.redshift-1.0 \
    --publicly-accessible

06 The command output should return the metadata of the new (downsized) Redshift cluster:

{
    "Cluster": {
        "IamRoles": [],
        "ClusterVersion": "1.0",
        "NumberOfNodes": 1,
        "VpcId": "vpc-3b456985",
        "NodeType": "dc1.large",
        "PubliclyAccessible": true,
        "Tags": [],
        "MasterUsername": "ccclusteruser",

        ...

        "AvailabilityZone": "us-east-1a",
        "AutomatedSnapshotRetentionPeriod": 1,
        "ClusterStatus": "creating",
        "ClusterIdentifier": "new-cc-staging-cluster",
        "DBName": "ccclusterdb",
        "PreferredMaintenanceWindow": "fri:03:00-fri:03:30",
        "PendingModifiedValues": {}
    }
}
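
The restore parameters used at step no. 5 come from the describe-clusters output returned at step no. 2, so the mapping can be sketched as a small helper. This is an illustration under assumptions: the function name restore_args is hypothetical, and it assumes the cluster description contains the VpcSecurityGroups and ClusterParameterGroups fields, which fall in the elided part of the sample output above. It only builds the argument list and does not call AWS.

```python
def restore_args(cluster, region, new_id, snapshot_id, node_type):
    """Build restore-from-cluster-snapshot arguments from one cluster description."""
    args = [
        "aws", "redshift", "restore-from-cluster-snapshot",
        "--region", region,
        "--cluster-identifier", new_id,
        "--snapshot-identifier", snapshot_id,
        "--node-type", node_type,
        "--cluster-subnet-group-name", cluster["ClusterSubnetGroupName"],
        "--availability-zone", cluster["AvailabilityZone"],
    ]
    # Assumed fields: present in full describe-clusters output, elided above.
    sg_ids = [g["VpcSecurityGroupId"] for g in cluster.get("VpcSecurityGroups", [])]
    if sg_ids:
        args += ["--vpc-security-group-ids", *sg_ids]
    groups = cluster.get("ClusterParameterGroups", [])
    if groups:
        args += ["--cluster-parameter-group-name", groups[0]["ParameterGroupName"]]
    # Preserve the original cluster's public accessibility setting.
    args.append("--publicly-accessible" if cluster.get("PubliclyAccessible")
                else "--no-publicly-accessible")
    return args


original = {
    "ClusterSubnetGroupName": "default",
    "AvailabilityZone": "us-east-1a",
    "PubliclyAccessible": True,
    "VpcSecurityGroups": [{"VpcSecurityGroupId": "sg-b47e2e89", "Status": "active"}],
    "ClusterParameterGroups": [{"ParameterGroupName": "default.redshift-1.0"}],
}
print(" ".join(restore_args(original, "us-east-1", "new-cc-staging-cluster",
                            "cc-staging-cluster-snapshot-20171203", "dc1.large")))
```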

07 Run the describe-clusters command (OSX/Linux/UNIX) with the appropriate query filter to return the new Redshift cluster endpoint:

aws redshift describe-clusters \
    --region us-east-1 \
    --cluster-identifier new-cc-staging-cluster \
    --query 'Clusters[*].Endpoint.Address'

08 The command output should return the new cluster endpoint URL:

[
    "new-cc-staging-cluster.dvopsgvyjfhe.us-east-1.redshift.amazonaws.com"
]

09 As soon as the build process is complete, update your application configuration to point to the AWS Redshift cluster endpoint address returned at step no. 8.
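
"Build process is complete" can be read from the ClusterStatus field, which is "creating" in the restore output at step no. 6 and "available" for a running cluster at step no. 2. A minimal check over unfiltered describe-clusters JSON might look like the sketch below; the function name is_available is hypothetical. The AWS CLI also provides a waiter, aws redshift wait cluster-available, which blocks until the same condition holds.

```python
import json


def is_available(describe_output, cluster_id):
    """True when the named cluster is listed with ClusterStatus "available"."""
    for cluster in json.loads(describe_output)["Clusters"]:
        if cluster["ClusterIdentifier"] == cluster_id:
            return cluster["ClusterStatus"] == "available"
    return False


still_building = """
{"Clusters": [{"ClusterIdentifier": "new-cc-staging-cluster",
               "ClusterStatus": "creating"}]}
"""
print(is_available(still_building, "new-cc-staging-cluster"))
```

Running this check before step no. 10 avoids deleting the original cluster while the new one is still being created.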

10 Once the Redshift cluster endpoint is changed within your application configuration, run the delete-cluster command (OSX/Linux/UNIX) to remove the original Redshift cluster from your AWS account:

aws redshift delete-cluster \
    --region us-east-1 \
    --cluster-identifier cc-staging-cluster \
    --final-cluster-snapshot-identifier cc-staging-cluster-final-snapshot

11 The command output should return the metadata of the cluster selected for deletion:

{
    "Cluster": {
        "PubliclyAccessible": true,
        "MasterUsername": "ccclusteruser",
        "NumberOfNodes": 1,
        "ClusterVersion": "1.0",
        "AutomatedSnapshotRetentionPeriod": 1,
        "ClusterParameterGroups": [
            {
                "ParameterGroupName": "default.redshift-1.0",
                "ParameterApplyStatus": "in-sync"
            }
        ],

        ...

        "DBName": "ccclusterdb",
        "IamRoles": [],
        "AllowVersionUpgrade": true,
        "ClusterSubnetGroupName": "default",
        "ClusterSecurityGroups": [],
        "ClusterIdentifier": "cc-staging-cluster",
        "AvailabilityZone": "us-east-1a",
        "NodeType": "ds2.xlarge",
        "Encrypted": false,
        "ClusterStatus": "final-snapshot"
    }
}

12 Repeat steps no. 1 – 11 to downsize (resize) any other underused Redshift clusters available in the selected region.

13 Change the AWS region by updating the --region command parameter value and repeat the entire process for other regions.

Option 2: Disable the rule check. If the selected underused Redshift cluster configuration must remain unchanged due to project implementation design, you should turn off the conformity rule check for the specified cluster from the Cloud Conformity console.

References

AWS Command Line Interface (CLI) Documentation
redshift
describe-clusters
create-cluster-snapshot
restore-from-cluster-snapshot
delete-cluster
cloudwatch
get-metric-statistics

Publication date Mar 13, 2017
