Overutilized AWS EC2 Instances

Risk Level: High (not acceptable risk)

Rule ID: EC2-073

Identify any Amazon EC2 instances that appear to be overutilized and upgrade (resize) them so that your EC2-hosted applications can handle the workload better and respond faster. By default, an Amazon EC2 instance is considered "overutilized" when it matches the following criteria:

The average CPU utilization has been more than 90% for the last 7 days.
The average memory utilization has been more than 90% for the last 7 days. By default, Amazon CloudWatch can't record the memory utilization of an EC2 instance because the necessary metric cannot be collected at the hypervisor level. To report memory utilization through CloudWatch, you need to install an agent (Perl script) on the instance that you want to monitor and create a custom metric (we'll name it EC2MemoryUtilization) on the CloudWatch console. The instructions for installing the monitoring agent, based on the operating system used by the instance, are available at this URL.
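To illustrate what a custom memory metric measures, here is a minimal sketch that derives a used-memory percentage on Linux. It is not the monitoring agent referenced above. The function name mem_used_percent, the placeholder namespace Custom/EC2 and the INSTANCE_ID variable are assumptions made for this example, and it assumes a kernel that reports MemAvailable in /proc/meminfo.

```shell
# Sketch: compute the used-memory percentage from a /proc/meminfo-style file.
# Assumption: the file contains MemTotal and MemAvailable lines (values in kB).
mem_used_percent() {
  awk '
    /^MemTotal:/     { total = $2 }
    /^MemAvailable:/ { avail = $2 }
    END {
      if (total <= 0) exit 1            # no MemTotal line found
      printf "%.2f\n", (total - avail) * 100 / total
    }
  ' "${1:-/proc/meminfo}"
}

# Possible publication step (not executed here). "Custom/EC2" and INSTANCE_ID
# are placeholders; use the namespace that your audit commands will query.
#   aws cloudwatch put-metric-data \
#     --namespace "Custom/EC2" \
#     --metric-name EC2MemoryUtilization \
#     --unit Percent \
#     --value "$(mem_used_percent)" \
#     --dimensions InstanceId="$INSTANCE_ID"
```

Run on a schedule (for example from cron), a script like this would produce the datapoints that the memory condition of this rule evaluates.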

Note: You can change the default threshold values for this rule on the Trend Cloud One™ – Conformity console and set your own values for the CPU utilization (percent), the memory utilization (percent) and the number of days for each condition to configure a custom overuse level for your EC2 instances. You can also replace the default name of the memory utilization metric (i.e. EC2MemoryUtilization) with a custom name. For each EC2 instance marked as overutilized, the console also lists details such as region, ID, instance type, launch time and operating system to help you perform the right-sizing analysis.

This rule resolution is part of the Conformity Security & Compliance tool for AWS.

Performance efficiency

Sustainability

Overutilized Amazon EC2 instances could indicate that the applications running on these machines do not have enough hardware resources to perform optimally. Upgrading (upsizing) overutilized Amazon EC2 instances to meet your load needs will directly improve the health of your applications, resulting in a more stable environment and a faster response time.

Audit

To identify any overutilized Amazon EC2 instances that could benefit from a more efficient hardware configuration, perform the following operations:

Using AWS Console

01 Sign in to the AWS Management Console.

02 Navigate to the Amazon EC2 console at https://console.aws.amazon.com/ec2/.

03 In the navigation panel, under Instances, choose Instances.

04 Select the Amazon EC2 instance that you want to examine.

05 Choose the Monitoring tab from the console bottom panel to access the instance monitoring details.

06 On the Monitoring panel, perform the following actions:

Select the CPU utilization (%) graph, click on the 3-dot menu, and choose View in metrics to open the CPU utilization dashboard for the selected instance. On the CPU utilization (%) dashboard, configure the following parameters:
- Select 1w (1 week) from the time range top-right menu.
- Select the Graphed metrics tab, set Statistic to Average, and Period to 1 Hour.
Once the monitoring data is loaded, check the instance CPU utilization for the last 7 days. If the average usage (percentage) has been more than 90%, the selected Amazon EC2 instance qualifies as a candidate overutilized instance.

07 Determine the memory utilization for the selected Amazon EC2 instance by reading the EC2MemoryUtilization metric data (or whatever name you have used for your custom metric) reported by the CloudWatch agent installed on the EC2 instance (this conformity rule assumes that the script has been successfully installed and it has returned memory usage data in the past 7 days). To check the Amazon EC2 instance memory utilization, perform the following operations:

Navigate to the Amazon CloudWatch console at https://console.aws.amazon.com/cloudwatch/.
In the navigation panel, choose Metrics to access your CloudWatch metrics.
Choose EC2 from the Metrics section to access the metrics available for the Amazon EC2 resources.
Select Per-Instance Metrics to access the metrics available for the EC2 instances.
Select the EC2MemoryUtilization metric for the Amazon EC2 instance that you want to examine. The EC2MemoryUtilization metric is listed in the Metric Name column.
Select 1w (1 week) from the time range top-right menu to return the data recorded in the past week.
Select Number from the chart type dropdown menu for usage data visualization.
Once the monitoring data is loaded, check the instance memory usage for the last 7 days. If the average usage (percentage) has been more than 90%, the selected Amazon EC2 instance qualifies as a candidate overutilized instance.

08 If all the conditions outlined in steps 6 and 7 are met, the selected Amazon EC2 instance is considered "overutilized" and should be upgraded to a better hardware configuration in order to meet your workload needs.

09 Repeat steps no. 4 – 8 for each Amazon EC2 instance available within the current AWS region.

10 Change the AWS cloud region from the console navigation bar and repeat the audit process for other regions.
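Once the two weekly averages have been read from the console graphs, the decision in step 08 reduces to a single comparison. The sketch below shows that comparison; the function name is_overutilized is hypothetical, and the default of 90 simply mirrors the rule's default threshold.

```shell
# Sketch: succeed (exit 0) only when BOTH the CPU average and the memory
# average are strictly above the threshold (third argument, default 90).
is_overutilized() {
  awk -v cpu="$1" -v mem="$2" -v limit="${3:-90}" \
    'BEGIN { exit !(cpu > limit && mem > limit) }'
}

# Example usage with values read from the console graphs:
#   if is_overutilized 95.2 96.1; then echo "candidate for upsizing"; fi
```

An instance with a high CPU average but moderate memory usage does not satisfy the check, which matches the rule's requirement that all conditions are met.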

Using AWS CLI

01 Run the describe-instances command (OSX/Linux/UNIX) with custom query filters to list the IDs of all the active Amazon EC2 instances available in the selected AWS cloud region:

aws ec2 describe-instances \
  --region us-east-1 \
  --filters Name=instance-state-name,Values=running \
  --output table \
  --query 'Reservations[*].Instances[*].InstanceId'

02 The command output should return a table with the requested instance identifiers (IDs):

-------------------------
|   DescribeInstances   |
+-----------------------+
|  i-01234abcd1234abcd  |
|  i-0abcdabcdabcdabcd  |
|  i-0abcd1234abcd1234  |
+-----------------------+

03 Run the get-metric-statistics command (OSX/Linux/UNIX) to get the utilization data recorded by Amazon CloudWatch for the CPUUtilization metric, which represents the CPU usage of the selected Amazon EC2 instance. Change the --start-time and --end-time parameter values to choose your own time frame for the instance CPU usage. Configure the --period parameter value to define the granularity (in seconds) of the returned datapoints. A period can be as short as one minute (60 seconds) or as long as one day (86400 seconds). The following command example returns the average CPU usage of an Amazon EC2 instance identified by the ID i-01234abcd1234abcd, captured over a period of 7 days, using a 1-hour period as the granularity for the returned datapoints:

aws cloudwatch get-metric-statistics \
  --region us-east-1 \
  --metric-name CPUUtilization \
  --start-time 2017-04-21T15:10:00 \
  --end-time 2017-04-28T15:10:00 \
  --period 3600 \
  --namespace AWS/EC2 \
  --statistics Average \
  --dimensions Name=InstanceId,Value=i-01234abcd1234abcd

04 The command output should return the CPU usage details requested:

{
	"Datapoints": [
		{
			"Timestamp": "2017-04-21T14:21:00Z",
			"Average": 153.2085333333333333,
			"Unit": "Percent"
		},
		{
			"Timestamp": "2017-04-21T15:21:00Z",
			"Average": 137.03345,
			"Unit": "Percent"
		},
		{
			"Timestamp": "2017-04-21T16:21:00Z",
			"Average": 131.4999999999999993,
			"Unit": "Percent"
		},

		...

		{
			"Timestamp": "2017-04-28T12:21:00Z",
			"Average": 312.0365,
			"Unit": "Percent"
		},
		{
			"Timestamp": "2017-04-28T13:21:00Z",
			"Average": 290.0283,
			"Unit": "Percent"
		},
		{
			"Timestamp": "2017-04-28T14:21:00Z",
			"Average": 227.0278,
			"Unit": "Percent"
		}
	],
	"Label": "CPUUtilization"
}

If the average CPU usage returned by the get-metric-statistics command is above 90%, the selected Amazon EC2 instance qualifies as a candidate overutilized instance.

05 Determine the Amazon EC2 instance memory usage by querying the EC2MemoryUtilization metric data (or whatever name you have used for your custom metric) reported by the Amazon CloudWatch script installed on the selected EC2 instance (this rule assumes that the script has been successfully installed and has recorded memory usage data within the past 7 days). To check the instance memory usage reported by your custom Amazon CloudWatch metric, run the get-metric-statistics command (OSX/Linux/UNIX) using the metric name as the identifier parameter. Set --namespace to the namespace in which your custom metric is actually published; custom metrics are normally published outside the AWS/ namespaces, so the AWS/EC2 value shown below may need to be replaced. The following command example returns the average memory utilization for an Amazon EC2 instance identified by the ID i-01234abcd1234abcd, from the usage data captured by a metric named EC2MemoryUtilization over a period of 7 days, using a 1-hour period as the granularity for the returned datapoints:

aws cloudwatch get-metric-statistics \
  --region us-east-1 \
  --metric-name EC2MemoryUtilization \
  --start-time 2017-04-21T15:10:00 \
  --end-time 2017-04-28T15:10:00 \
  --period 3600 \
  --namespace AWS/EC2 \
  --statistics Average \
  --dimensions Name=InstanceId,Value=i-01234abcd1234abcd

06 The command output should return the memory usage details requested:

{
	"Datapoints": [
		{
			"Timestamp": "2017-04-21T15:10:00Z",
			"Average": 97.2085,
			"Unit": "Percent"
		},
		{
			"Timestamp": "2017-04-21T16:10:00Z",
			"Average": 95.0334,
			"Unit": "Percent"
		},
		{
			"Timestamp": "2017-04-21T17:10:00Z",
			"Average": 95.1062,
			"Unit": "Percent"
		},

		...

		{
			"Timestamp": "2017-04-28T13:10:00Z",
			"Average": 98.03999999999999993,
			"Unit": "Percent"
		},
		{
			"Timestamp": "2017-04-28T14:10:00Z",
			"Average": 98.02833333333333333,
			"Unit": "Percent"
		},
		{
			"Timestamp": "2017-04-28T15:10:00Z",
			"Average": 93.18783333333333333,
			"Unit": "Percent"
		}
	],
	"Label": "EC2MemoryUtilization"
}

If the average memory utilization recorded in the past 7 days is more than 90%, the selected Amazon EC2 instance qualifies as a candidate overutilized instance.

07 If the usage data returned in steps 3 – 6 satisfies all the conditions required by the conformity rule (i.e. average CPU and memory usage above 90%), the selected Amazon EC2 instance is considered "overutilized" and should be upgraded to a better hardware configuration in order to meet your workload needs.

08 Repeat steps no. 3 – 7 for each Amazon EC2 instance available in the selected AWS region.

09 Change the AWS cloud region by updating the --region command parameter value and repeat the audit process for other regions.
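The CLI audit above can be scripted so that the 7-day window is computed at run time and the hourly datapoints are reduced to a single figure. The following is a sketch under stated assumptions, not a definitive implementation: the helper names utc_days_ago, mean, weekly_average and list_running_instances are hypothetical, the timestamp helper assumes GNU date, and the result is the mean of the hourly averages returned by CloudWatch.

```shell
# Print the UTC timestamp N days before now (N = $1).
# Assumption: GNU date, which accepts "-d @EPOCH"; BSD/macOS date differs.
utc_days_ago() {
  date -u -d "@$(( $(date -u +%s) - $1 * 86400 ))" +%Y-%m-%dT%H:%M:%S
}

# Average every whitespace-separated number read from stdin (two decimals).
# Fails when no numbers are supplied, e.g. when the metric has no datapoints.
mean() {
  awk '{ for (i = 1; i <= NF; i++) { sum += $i; n++ } }
       END { if (n == 0) exit 1; printf "%.2f\n", sum / n }'
}

# weekly_average REGION INSTANCE_ID METRIC_NAME [NAMESPACE]
# Mean of the hourly averages of one metric over the past 7 days.
weekly_average() {
  aws cloudwatch get-metric-statistics \
    --region "$1" \
    --namespace "${4:-AWS/EC2}" \
    --metric-name "$3" \
    --start-time "$(utc_days_ago 7)" \
    --end-time "$(utc_days_ago 0)" \
    --period 3600 \
    --statistics Average \
    --dimensions Name=InstanceId,Value="$2" \
    --query 'Datapoints[].Average' \
    --output text | mean
}

# Print "REGION<TAB>INSTANCE_ID" for each running instance in each region
# returned by describe-regions.
list_running_instances() {
  for region in $(aws ec2 describe-regions \
      --query 'Regions[].RegionName' --output text); do
    for id in $(aws ec2 describe-instances \
        --region "$region" \
        --filters Name=instance-state-name,Values=running \
        --query 'Reservations[].Instances[].InstanceId' \
        --output text); do
      printf '%s\t%s\n' "$region" "$id"
    done
  done
}

# Example usage (requires configured AWS credentials):
#   weekly_average us-east-1 i-01234abcd1234abcd CPUUtilization
#   weekly_average us-east-1 i-01234abcd1234abcd EC2MemoryUtilization
```

Both figures can then be compared against the 90% threshold for every line printed by list_running_instances, which covers steps 08 and 09 without editing the --region value by hand.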

Remediation / Resolution

To upgrade (resize) the overutilized Amazon EC2 instances provisioned within your AWS cloud account so that they have more hardware resources, perform the following operations:

(!) IMPORTANT: The following procedure assumes that the Amazon EC2 instances selected for reconfiguration (upgrade) are NOT currently used in production or for critical operations.

Using AWS Console

01 Sign in to the AWS Management Console.

02 Navigate to the Amazon EC2 console at https://console.aws.amazon.com/ec2/.

03 In the navigation panel, under Instances, choose Instances.

04 Select the overused Amazon EC2 instance that you want to reconfigure.

05 Click on the Instance state dropdown button from the console top menu and select Stop instance.

06 In the Stop instance? confirmation box, review the instance details, then choose Stop.

07 Once the instance is stopped (i.e. Instance State is set to stopped), click on the Actions dropdown button from the console top menu, select Instance settings, and choose Change instance type.

08 On the Change instance type configuration page, select the appropriate instance type from the Instance type dropdown list, and choose Apply to resize (upgrade) the selected Amazon EC2 instance.

09 Click on the Instance state dropdown button from the console top menu and select Start instance. Once the boot sequence is complete, the EC2 instance status should change from Pending to Running.

10 Repeat steps no. 4 – 9 for each Amazon EC2 instance that you want to upgrade (upsize), available within the current AWS region.

11 Change the AWS cloud region from the console navigation bar and repeat the remediation process for other regions.

Using AWS CLI

01 Run the stop-instances command (OSX/Linux/UNIX) to stop the overutilized Amazon EC2 instance that you want to reconfigure:

aws ec2 stop-instances \
  --region us-east-1 \
  --instance-ids i-01234abcd1234abcd

02 The output should return the stop-instances command request metadata:

{
	"StoppingInstances": [
		{
			"InstanceId": "i-01234abcd1234abcd",
			"CurrentState": {
				"Code": 64,
				"Name": "stopping"
			},
			"PreviousState": {
				"Code": 16,
				"Name": "running"
			}
		}
	]
}

03 Once the instance has reached the stopped state, run the modify-instance-attribute command (OSX/Linux/UNIX) to change (upgrade) the instance type for your overutilized Amazon EC2 instance. The following command example changes the instance type for an EC2 instance identified by the ID i-01234abcd1234abcd from c5.xlarge to c5.2xlarge (the command does not produce an output):

aws ec2 modify-instance-attribute \
  --region us-east-1 \
  --instance-id i-01234abcd1234abcd \
  --instance-type "{\"Value\": \"c5.2xlarge\"}"

04 Run the start-instances command (OSX/Linux/UNIX) to restart the reconfigured Amazon EC2 instance (it may take a few minutes until the instance enters the running state):

aws ec2 start-instances \
  --region us-east-1 \
  --instance-ids i-01234abcd1234abcd

05 The output should return the start-instances command request metadata:

{
	"StartingInstances": [
		{
			"InstanceId": "i-01234abcd1234abcd",
			"CurrentState": {
				"Code": 0,
				"Name": "pending"
			},
			"PreviousState": {
				"Code": 80,
				"Name": "stopped"
			}
		}
	]
}

06 Repeat steps no. 1 – 5 for each Amazon EC2 instance that you want to upgrade (upsize), available in the selected AWS region.

07 Change the AWS cloud region by updating the --region command parameter value and repeat the remediation process for other regions.
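When the remediation steps are scripted, the instance type can only be changed after the instance has fully stopped, and stop-instances returns while the instance is still in the stopping state (as the sample output in step 02 shows). A minimal sketch that chains the steps with the AWS CLI waiters is given below; the function name resize_instance and its argument order are assumptions made for this example.

```shell
# resize_instance INSTANCE_ID NEW_TYPE [REGION]
# Sketch: stop, wait until stopped, change the type, start, wait until running.
resize_instance() {
  aws ec2 stop-instances \
    --region "${3:-us-east-1}" --instance-ids "$1" >/dev/null || return 1

  # Block until the instance reports the stopped state.
  aws ec2 wait instance-stopped \
    --region "${3:-us-east-1}" --instance-ids "$1" || return 1

  aws ec2 modify-instance-attribute \
    --region "${3:-us-east-1}" \
    --instance-id "$1" \
    --instance-type "{\"Value\": \"$2\"}" || return 1

  aws ec2 start-instances \
    --region "${3:-us-east-1}" --instance-ids "$1" >/dev/null || return 1

  # Block until the instance is running again.
  aws ec2 wait instance-running \
    --region "${3:-us-east-1}" --instance-ids "$1"
}

# Example usage (requires configured AWS credentials):
#   resize_instance i-01234abcd1234abcd c5.2xlarge us-east-1
```

Each call returns early if any step fails, so a failed stop never leads to an attempted type change.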

References

AWS Command Line Interface (CLI) Documentation
describe-instances
stop-instances
modify-instance-attribute
start-instances
cloudwatch
get-metric-statistics

Publication date May 2, 2017
