EMR In-Transit and At-Rest Encryption
Ensure that your Amazon Elastic MapReduce (EMR) clusters are encrypted in order to meet security and compliance requirements. Encryption helps prevent unauthorized personnel from reading sensitive data available on your EMR clusters and their associated data storage systems. This includes data saved to persistent media, known as data at-rest, and data that can be intercepted as it travels through the network, known as data in-transit.
This rule can help you with the following compliance standards:
- HIPAA
- GDPR
- APRA
- MAS
- NIST4
For further details, see the list of compliance standards supported by Conformity.
This rule can help you work with the AWS Well-Architected Framework.
This rule resolution is part of the Conformity Security & Compliance tool for AWS.
When working with production or confidential data, it is strongly recommended to implement encryption to protect your data from unauthorized access and to fulfill your organization's compliance requirements for at-rest and in-transit encryption. For example, a common compliance requirement is to protect sensitive data that could identify a specific individual, such as Personally Identifiable Information (PII), which is frequently handled in the Financial Services, Healthcare, and Telecommunications sectors.
Audit
To determine if in-transit and at-rest encryption is enabled for your Amazon EMR clusters, perform the following actions:
Using AWS Console
01 Sign in to the AWS Management Console.
02 Navigate to Amazon Elastic MapReduce (EMR) console at https://console.aws.amazon.com/elasticmapreduce/.
03 In the main navigation panel, under EMR on EC2, choose Clusters.
04 Click on the name (link) of the Amazon EMR cluster that you want to examine.
05 Select the Summary tab and look for the Security configuration attribute in the Security and access section. This attribute references the cluster security configuration that defines the encryption and authentication settings for the EMR cluster. If no Security configuration attribute is listed in the Security and access section, the selected Amazon EMR cluster is not associated with a security configuration; therefore, in-transit and at-rest encryption is not enabled for the cluster.
06 Repeat steps no. 4 and 5 for each Amazon EMR cluster available within the current AWS region.
07 Change the AWS cloud region from the navigation bar and repeat the Audit process for other regions.
Using AWS CLI
01 Run the list-clusters command (OSX/Linux/UNIX) with custom query filters to list the ID of each active Amazon EMR cluster provisioned in the selected AWS region:
aws emr list-clusters --region us-east-1 --active --output table --query 'Clusters[*].Id'
02 The command output should return a table with the requested EMR cluster ID(s):
--------------------
|   ListClusters   |
+------------------+
|  j-ABCDABCDABCD  |
|  j-ABCD1234ABCD  |
+------------------+
03 Run describe-cluster command (OSX/Linux/UNIX) using the ID of the Amazon EMR cluster that you want to examine as the identifier parameter and custom query filters to describe the name of the security configuration that defines the encryption and authentication settings for the selected EMR cluster:
aws emr describe-cluster --region us-east-1 --cluster-id j-ABCDABCDABCD --query 'Cluster.SecurityConfiguration'
04 The command output should return the name of the security configuration associated with the selected cluster:
null
If the describe-cluster command output returns null, as shown in the output example above, the selected Amazon EMR cluster is not associated with a security configuration; therefore, in-transit and at-rest encryption is not enabled for the cluster.
05 Repeat steps no. 3 and 4 for each Amazon EMR cluster available in the selected AWS region.
06 Change the AWS cloud region by updating the --region command parameter value and repeat the Audit process for other regions.
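The audit check above can be sketched in a few lines of Python. This is a minimal illustration, not part of the Conformity rule: it assumes you have saved the full JSON output of `aws emr describe-cluster` (without the `--query` filter) for each cluster, and the helper names `has_security_configuration` and `unencrypted_clusters` are hypothetical. It mirrors the rule's logic only, that is, a cluster with no `SecurityConfiguration` value is reported as unencrypted.

```python
import json


def has_security_configuration(describe_cluster_output: str) -> bool:
    """Return True when the describe-cluster JSON names a security configuration."""
    cluster = json.loads(describe_cluster_output).get("Cluster", {})
    # The key is missing, null, or empty when no security configuration is attached.
    return bool(cluster.get("SecurityConfiguration"))


def unencrypted_clusters(outputs_by_cluster_id: dict) -> list:
    """Return the sorted IDs of clusters that have no security configuration."""
    return sorted(
        cluster_id
        for cluster_id, output in outputs_by_cluster_id.items()
        if not has_security_configuration(output)
    )


if __name__ == "__main__":
    # Hypothetical sample data standing in for saved describe-cluster outputs.
    samples = {
        "j-ABCDABCDABCD": '{"Cluster": {"Id": "j-ABCDABCDABCD"}}',
        "j-ABCD1234ABCD": '{"Cluster": {"Id": "j-ABCD1234ABCD", '
                          '"SecurityConfiguration": "cc-emr-security-config"}}',
    }
    print(unencrypted_clusters(samples))
```

Any cluster ID printed by the sketch would need the remediation described in the next section.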
Remediation / Resolution
To enable in-transit and at-rest encryption for your existing Amazon EMR clusters, you must define a cluster security configuration, then re-create your clusters with that security configuration attached. To create the security configuration and relaunch the required Amazon EMR clusters, perform the following actions:
Using AWS CloudFormation
01 CloudFormation template (JSON):
{
  "AWSTemplateFormatVersion": "2010-09-09",
  "Description": "Enable In-Transit and At-Rest Encryption",
  "Parameters": {
    "ReleaseLabel": { "Type": "String" },
    "ClusterInstanceType": { "Type": "String" },
    "EbsRootVolumeSize": { "Type": "String" },
    "SubnetId": { "Type": "String" }
  },
  "Resources": {
    "EMRCluster": {
      "Type": "AWS::EMR::Cluster",
      "Properties": {
        "Name": "cc-emr-production-cluster",
        "ReleaseLabel": { "Ref": "ReleaseLabel" },
        "SecurityConfiguration": { "Ref": "ClusterSecurityConfiguration" },
        "Instances": {
          "MasterInstanceGroup": {
            "InstanceCount": 1,
            "InstanceType": { "Ref": "ClusterInstanceType" },
            "Market": "ON_DEMAND",
            "Name": "cc-master-instance"
          },
          "CoreInstanceGroup": {
            "InstanceCount": 1,
            "InstanceType": { "Ref": "ClusterInstanceType" },
            "Market": "ON_DEMAND",
            "Name": "cc-core-instance"
          },
          "TaskInstanceGroups": [
            {
              "InstanceCount": 1,
              "InstanceType": { "Ref": "ClusterInstanceType" },
              "Market": "ON_DEMAND",
              "Name": "cc-task-instance-1"
            },
            {
              "InstanceCount": 1,
              "InstanceType": { "Ref": "ClusterInstanceType" },
              "Market": "ON_DEMAND",
              "Name": "cc-task-instance-2"
            }
          ],
          "Ec2SubnetId": { "Ref": "SubnetId" }
        },
        "EbsRootVolumeSize": { "Ref": "EbsRootVolumeSize" },
        "ServiceRole": { "Ref": "EMRRole" },
        "JobFlowRole": { "Ref": "EMREC2InstanceProfile" },
        "VisibleToAllUsers": true
      }
    },
    "ClusterSecurityConfiguration": {
      "Type": "AWS::EMR::SecurityConfiguration",
      "Properties": {
        "Name": "cc-emr-security-config",
        "SecurityConfiguration": {
          "EncryptionConfiguration": {
            "EnableInTransitEncryption": true,
            "InTransitEncryptionConfiguration": {
              "TLSCertificateConfiguration": {
                "CertificateProviderType": "PEM",
                "S3Object": "s3://cc-config-store/artifacts/cc-certificates.zip"
              }
            },
            "EnableAtRestEncryption": true,
            "AtRestEncryptionConfiguration": {
              "S3EncryptionConfiguration": {
                "EncryptionMode": "SSE-S3"
              },
              "LocalDiskEncryptionConfiguration": {
                "EncryptionKeyProviderType": "AwsKms",
                "AwsKmsKey": "arn:aws:kms:us-east-1:123456789012:key/aaaabbbb-cccc-dddd-eeee-aaaabbbbcccc"
              }
            }
          }
        }
      }
    },
    "EMRRole": {
      "Type": "AWS::IAM::Role",
      "Properties": {
        "AssumeRolePolicyDocument": {
          "Version": "2008-10-17",
          "Statement": [
            {
              "Sid": "",
              "Effect": "Allow",
              "Principal": { "Service": "elasticmapreduce.amazonaws.com" },
              "Action": "sts:AssumeRole"
            }
          ]
        },
        "Path": "/",
        "ManagedPolicyArns": ["arn:aws:iam::aws:policy/service-role/AmazonElasticMapReduceRole"]
      }
    },
    "EMREC2Role": {
      "Type": "AWS::IAM::Role",
      "Properties": {
        "AssumeRolePolicyDocument": {
          "Version": "2008-10-17",
          "Statement": [
            {
              "Sid": "",
              "Effect": "Allow",
              "Principal": { "Service": "ec2.amazonaws.com" },
              "Action": "sts:AssumeRole"
            }
          ]
        },
        "Path": "/",
        "ManagedPolicyArns": ["arn:aws:iam::aws:policy/service-role/AmazonElasticMapReduceforEC2Role"]
      }
    },
    "EMREC2InstanceProfile": {
      "Type": "AWS::IAM::InstanceProfile",
      "Properties": {
        "Path": "/",
        "Roles": [ { "Ref": "EMREC2Role" } ]
      }
    }
  }
}
02 CloudFormation template (YAML):
AWSTemplateFormatVersion: '2010-09-09'
Description: Enable In-Transit and At-Rest Encryption
Parameters:
  ReleaseLabel:
    Type: String
  ClusterInstanceType:
    Type: String
  EbsRootVolumeSize:
    Type: String
  SubnetId:
    Type: String
Resources:
  EMRCluster:
    Type: AWS::EMR::Cluster
    Properties:
      Name: cc-emr-production-cluster
      ReleaseLabel: !Ref 'ReleaseLabel'
      SecurityConfiguration: !Ref 'ClusterSecurityConfiguration'
      Instances:
        MasterInstanceGroup:
          InstanceCount: 1
          InstanceType: !Ref 'ClusterInstanceType'
          Market: ON_DEMAND
          Name: cc-master-instance
        CoreInstanceGroup:
          InstanceCount: 1
          InstanceType: !Ref 'ClusterInstanceType'
          Market: ON_DEMAND
          Name: cc-core-instance
        TaskInstanceGroups:
          - InstanceCount: 1
            InstanceType: !Ref 'ClusterInstanceType'
            Market: ON_DEMAND
            Name: cc-task-instance-1
          - InstanceCount: 1
            InstanceType: !Ref 'ClusterInstanceType'
            Market: ON_DEMAND
            Name: cc-task-instance-2
        Ec2SubnetId: !Ref 'SubnetId'
      EbsRootVolumeSize: !Ref 'EbsRootVolumeSize'
      ServiceRole: !Ref 'EMRRole'
      JobFlowRole: !Ref 'EMREC2InstanceProfile'
      VisibleToAllUsers: true
  ClusterSecurityConfiguration:
    Type: AWS::EMR::SecurityConfiguration
    Properties:
      Name: cc-emr-security-config
      SecurityConfiguration:
        EncryptionConfiguration:
          EnableInTransitEncryption: true
          InTransitEncryptionConfiguration:
            TLSCertificateConfiguration:
              CertificateProviderType: PEM
              S3Object: s3://cc-config-store/artifacts/cc-certificates.zip
          EnableAtRestEncryption: true
          AtRestEncryptionConfiguration:
            S3EncryptionConfiguration:
              EncryptionMode: SSE-S3
            LocalDiskEncryptionConfiguration:
              EncryptionKeyProviderType: AwsKms
              AwsKmsKey: arn:aws:kms:us-east-1:123456789012:key/aaaabbbb-cccc-dddd-eeee-aaaabbbbcccc
  EMRRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2008-10-17'
        Statement:
          - Sid: ''
            Effect: Allow
            Principal:
              Service: elasticmapreduce.amazonaws.com
            Action: sts:AssumeRole
      Path: /
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/service-role/AmazonElasticMapReduceRole
  EMREC2Role:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2008-10-17'
        Statement:
          - Sid: ''
            Effect: Allow
            Principal:
              Service: ec2.amazonaws.com
            Action: sts:AssumeRole
      Path: /
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/service-role/AmazonElasticMapReduceforEC2Role
  EMREC2InstanceProfile:
    Type: AWS::IAM::InstanceProfile
    Properties:
      Path: /
      Roles:
        - !Ref 'EMREC2Role'
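Before deploying a template like the ones above, it can be useful to confirm that every EMR cluster resource actually references a security configuration. The following is a minimal sketch under stated assumptions: it works on a JSON template that has already been parsed into a dictionary, the function name `clusters_missing_security_configuration` is hypothetical, and it only checks that the `SecurityConfiguration` property is present, not what the referenced configuration enables.

```python
import json


def clusters_missing_security_configuration(template: dict) -> list:
    """Return logical IDs of AWS::EMR::Cluster resources with no SecurityConfiguration."""
    missing = []
    for logical_id, resource in template.get("Resources", {}).items():
        if resource.get("Type") != "AWS::EMR::Cluster":
            continue  # only EMR clusters are relevant to this rule
        if "SecurityConfiguration" not in resource.get("Properties", {}):
            missing.append(logical_id)
    return sorted(missing)


if __name__ == "__main__":
    # Hypothetical, heavily trimmed template used only to exercise the check.
    template = json.loads("""
    {
      "Resources": {
        "EMRCluster": {
          "Type": "AWS::EMR::Cluster",
          "Properties": {
            "Name": "cc-emr-production-cluster",
            "SecurityConfiguration": {"Ref": "ClusterSecurityConfiguration"}
          }
        },
        "LegacyCluster": {
          "Type": "AWS::EMR::Cluster",
          "Properties": {"Name": "cc-legacy-cluster"}
        }
      }
    }
    """)
    print(clusters_missing_security_configuration(template))
```

An empty list means every cluster in the template names a security configuration.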
Using Terraform (AWS Provider)
01 Terraform configuration file (.tf):
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 4.0"
    }
  }

  required_version = ">= 0.14.9"
}

provider "aws" {
  region = "us-east-1"
}

resource "aws_emr_cluster" "emr-cluster" {
  name          = "cc-prod-emr-cluster"
  release_label = "emr-5.35.0"
  applications  = ["Spark"]

  master_instance_group {
    instance_type = "m5.xlarge"
  }

  core_instance_group {
    instance_type  = "m5.xlarge"
    instance_count = 1

    ebs_config {
      size                 = 50
      type                 = "gp2"
      volumes_per_instance = 1
    }
  }

  ebs_root_volume_size = 50
  service_role         = aws_iam_role.iam_emr_service_role.arn

  ec2_attributes {
    subnet_id                         = "subnet-01234123412341234"
    emr_managed_master_security_group = "sg-01234abcd1234abcd"
    emr_managed_slave_security_group  = "sg-0abcd1234abcd1234"
    instance_profile                  = aws_iam_instance_profile.emr_instance_profile.arn
  }

  security_configuration = aws_emr_security_configuration.cluster-security-configuration.name
}

# Define Security Configuration to Enable In-Transit and At-Rest Encryption for the EMR Cluster
resource "aws_emr_security_configuration" "cluster-security-configuration" {
  name = "cc-emr-security-config"

  configuration = <<EOF
{
  "EncryptionConfiguration": {
    "EnableInTransitEncryption": true,
    "InTransitEncryptionConfiguration": {
      "TLSCertificateConfiguration": {
        "CertificateProviderType": "PEM",
        "S3Object": "s3://cc-config-store/artifacts/cc-certificates.zip"
      }
    },
    "EnableAtRestEncryption": true,
    "AtRestEncryptionConfiguration": {
      "S3EncryptionConfiguration": {
        "EncryptionMode": "SSE-S3"
      },
      "LocalDiskEncryptionConfiguration": {
        "EncryptionKeyProviderType": "AwsKms",
        "AwsKmsKey": "arn:aws:kms:us-east-1:123456789012:key/aaaabbbb-cccc-dddd-eeee-aaaabbbbcccc"
      }
    }
  }
}
EOF
}

resource "aws_iam_role" "iam_emr_service_role" {
  name = "cc-emr-service-role"

  assume_role_policy = <<EOF
{
  "Version": "2008-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Service": "elasticmapreduce.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

resource "aws_iam_role_policy" "iam_emr_service_policy" {
  name = "cc-emr-service-role-policy"
  role = aws_iam_role.iam_emr_service_role.id

  policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Resource": "*",
    "Action": [
      "ec2:AuthorizeSecurityGroupEgress",
      "ec2:AuthorizeSecurityGroupIngress",
      "ec2:CancelSpotInstanceRequests",
      "ec2:CreateNetworkInterface",
      "ec2:CreateSecurityGroup",
      "ec2:CreateTags",
      "ec2:DeleteNetworkInterface",
      "ec2:DeleteSecurityGroup",
      "ec2:DeleteTags",
      "ec2:DescribeAvailabilityZones",
      "ec2:DescribeAccountAttributes",
      "ec2:DescribeDhcpOptions",
      "ec2:DescribeInstanceStatus",
      "ec2:DescribeInstances",
      "ec2:DescribeKeyPairs",
      "ec2:DescribeNetworkAcls",
      "ec2:DescribeNetworkInterfaces",
      "ec2:DescribePrefixLists",
      "ec2:DescribeRouteTables",
      "ec2:DescribeSecurityGroups",
      "ec2:DescribeSpotInstanceRequests",
      "ec2:DescribeSpotPriceHistory",
      "ec2:DescribeSubnets",
      "ec2:DescribeVpcAttribute",
      "ec2:DescribeVpcEndpoints",
      "ec2:DescribeVpcEndpointServices",
      "ec2:DescribeVpcs",
      "ec2:DetachNetworkInterface",
      "ec2:ModifyImageAttribute",
      "ec2:ModifyInstanceAttribute",
      "ec2:RequestSpotInstances",
      "ec2:RevokeSecurityGroupEgress",
      "ec2:RunInstances",
      "ec2:TerminateInstances",
      "ec2:DeleteVolume",
      "ec2:DescribeVolumeStatus",
      "ec2:DescribeVolumes",
      "ec2:DetachVolume",
      "iam:GetRole",
      "iam:GetRolePolicy",
      "iam:ListInstanceProfiles",
      "iam:ListRolePolicies",
      "iam:PassRole",
      "s3:CreateBucket",
      "s3:Get*",
      "s3:List*",
      "sdb:BatchPutAttributes",
      "sdb:Select",
      "sqs:CreateQueue",
      "sqs:Delete*",
      "sqs:GetQueue*",
      "sqs:PurgeQueue",
      "sqs:ReceiveMessage"
    ]
  }]
}
EOF
}

resource "aws_iam_role" "iam_emr_profile_role" {
  name = "emr-instance-profile-role"

  assume_role_policy = <<EOF
{
  "Version": "2008-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

resource "aws_iam_instance_profile" "emr_instance_profile" {
  name = "emr-instance-profile"
  role = aws_iam_role.iam_emr_profile_role.name
}

resource "aws_iam_role_policy" "iam_emr_profile_policy" {
  name = "emr-instance-profile-policy"
  role = aws_iam_role.iam_emr_profile_role.id

  policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Resource": "*",
    "Action": [
      "cloudwatch:*",
      "dynamodb:*",
      "ec2:Describe*",
      "elasticmapreduce:Describe*",
      "elasticmapreduce:ListBootstrapActions",
      "elasticmapreduce:ListClusters",
      "elasticmapreduce:ListInstanceGroups",
      "elasticmapreduce:ListInstances",
      "elasticmapreduce:ListSteps",
      "kinesis:CreateStream",
      "kinesis:DeleteStream",
      "kinesis:DescribeStream",
      "kinesis:GetRecords",
      "kinesis:GetShardIterator",
      "kinesis:MergeShards",
      "kinesis:PutRecord",
      "kinesis:SplitShard",
      "rds:Describe*",
      "s3:*",
      "sdb:*",
      "sns:*",
      "sqs:*"
    ]
  }]
}
EOF
}
Using AWS Console
01 Sign in to the AWS Management Console.
02 Navigate to Amazon Elastic MapReduce (EMR) console at https://console.aws.amazon.com/elasticmapreduce/.
03 In the main navigation panel, under EMR on EC2, choose Security configurations.
04 Choose Create and perform the following operations:
- Provide a unique name for the new security configuration in the Name box.
- Select Enable at-rest encryption for EMRFS data in Amazon S3 under S3 encryption to enable encryption at rest for data stored on the EMR File System (EMRFS), then choose the encryption mode that you want to use from Default encryption mode dropdown list (either Server-Side Encryption – SSE or Client-Side Encryption – CSE). (Optional) Select Add bucket override under Per bucket encryption overrides to choose the optional encryption overrides for specific Amazon S3 buckets. You can specify different encryption modes and encryption materials for each selected S3 bucket.
- Select Enable at-rest encryption for local disks under Local disk encryption to enable encryption at rest for the storage volumes attached to the cluster instances, then choose the master key required for disk volume encryption. You can use either EBS encryption or Linux Unified Key Setup (LUKS) encryption.
- Select Enable in-transit encryption under Data in transit encryption to enable in-transit encryption for the EMR cluster. Choose PEM from the Certificate provider type dropdown list to use a PEM certificate that you provide in a .zip file. Two artifacts should be available in your zip file: a PrivateKey.pem file and a CertificateChain.pem file (see Create keys and certificates for data encryption for more details). Enter the Amazon S3 location of the .zip file that contains your PEM certificate in the Custom key provider location box. If you choose Custom from the Certificate provider type dropdown list, you need to specify a custom certificate provider and specify the Amazon S3 location of the custom certificate-provider file. In the Certificate provider class box, type the full name of a class declared within your EMR application that implements the TLSArtifactsProvider interface.
- Choose Create to create your new Amazon EMR cluster security configuration.
05 In the main navigation panel, under EMR on EC2, choose Clusters.
06 Select the unencrypted EMR cluster that you want to re-create and choose Clone from the console top menu.
07 In the Cloning <emr-cluster-id> dialog box, choose Yes to include the steps from the original cluster in the cloned cluster or No to clone the original cluster's configuration without including any of the existing steps. Choose Clone to start the cloning process.
08 On the Create Cluster - Advanced Options page, perform the following operations:
- Choose Step 1: Software and Steps from the left navigation panel and configure the software stack that will be installed on the new cluster. Choose Next to continue the setup process.
- For Step 2: Hardware, choose the VPC network and subnet where the EMR cluster instances will be deployed from the Networking section, set the EBS volume size for the root device, and configure the cluster nodes (instances) as needed. Choose Next to continue.
- For Step 3: General Cluster Settings, choose whether to enable the Termination Protection safety feature, configure the cluster logging, and create any required tag sets. Choose Next to continue.
- For Step 4: Security, make sure that the right permissions are applied to the new cluster, and select the appropriate EC2 key pair and the security groups. Select the name of the security configuration created earlier in the Remediation process from the Security configuration dropdown list to enable in-transit and at-rest encryption for the new cluster. Once everything is properly configured, choose Create cluster to provision your new Amazon Elastic MapReduce (EMR) cluster.
09 (Optional) You can now terminate the source (unencrypted) cluster in order to stop incurring charges for that EMR resource. To terminate the source Amazon EMR cluster, perform the following actions:
- Select the EMR cluster that you want to shut down and choose Terminate from the console top menu.
- Within the Terminate clusters confirmation box, review the cluster details, set the Termination protection to Off, then choose Terminate to remove the source EMR cluster from your AWS account.
10 Repeat steps no. 5 to 9 for each Amazon EMR cluster that you want to encrypt and redeploy within the current AWS region. The security configuration created at step no. 4 can be reused for every cluster in the same region.
11 Change the AWS cloud region from the navigation bar and repeat the Remediation process for other AWS regions.
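The console choices made in step 04 correspond to a JSON security configuration document of the same shape used in the other examples in this article. As a rough sketch of that mapping, the Python helper below assembles the document for the PEM, SSE-S3, and AWS KMS options only. The function name `build_security_configuration` and its parameters are hypothetical, and other encryption modes or a custom certificate provider would need additional fields that are not covered here.

```python
import json


def build_security_configuration(certificate_zip_s3_uri: str, kms_key_arn: str) -> dict:
    """Build a security configuration enabling in-transit and at-rest encryption.

    Assumes PEM certificates in a .zip file on Amazon S3, SSE-S3 for EMRFS data,
    and an AWS KMS key for local disk encryption.
    """
    if not certificate_zip_s3_uri.startswith("s3://"):
        raise ValueError("certificate location must be an s3:// URI")
    if not kms_key_arn.startswith("arn:aws:kms:"):
        raise ValueError("expected an AWS KMS key ARN")
    return {
        "EncryptionConfiguration": {
            "EnableInTransitEncryption": True,
            "InTransitEncryptionConfiguration": {
                "TLSCertificateConfiguration": {
                    "CertificateProviderType": "PEM",
                    "S3Object": certificate_zip_s3_uri,
                }
            },
            "EnableAtRestEncryption": True,
            "AtRestEncryptionConfiguration": {
                "S3EncryptionConfiguration": {"EncryptionMode": "SSE-S3"},
                "LocalDiskEncryptionConfiguration": {
                    "EncryptionKeyProviderType": "AwsKms",
                    "AwsKmsKey": kms_key_arn,
                },
            },
        }
    }


if __name__ == "__main__":
    config = build_security_configuration(
        "s3://cc-config-store/artifacts/cc-certificates.zip",
        "arn:aws:kms:us-east-1:123456789012:key/aaaabbbb-cccc-dddd-eeee-aaaabbbbcccc",
    )
    print(json.dumps(config, indent=2))
```

The printed JSON can be compared against the security configuration shown in the console after it is created.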
Using AWS CLI
01 Run the create-security-configuration command (OSX/Linux/UNIX) to create a new Amazon EMR cluster security configuration that defines the in-transit and at-rest encryption settings for your EMR clusters. The following command example creates a cluster security configuration named "cc-emr-security-config" with in-transit encryption based on PEM certificates (with the PrivateKey.pem and CertificateChain.pem files available at s3://cc-config-store/artifacts/cc-certificates.zip), and at-rest encryption that uses Server-Side Encryption (SSE-S3) for EMRFS data in Amazon S3 and an AWS KMS key for local disks:
aws emr create-security-configuration --region us-east-1 --name "cc-emr-security-config" --security-configuration '{ "EncryptionConfiguration": { "EnableInTransitEncryption": true, "InTransitEncryptionConfiguration": { "TLSCertificateConfiguration": { "CertificateProviderType": "PEM", "S3Object": "s3://cc-config-store/artifacts/cc-certificates.zip" } }, "EnableAtRestEncryption": true, "AtRestEncryptionConfiguration": { "S3EncryptionConfiguration": { "EncryptionMode": "SSE-S3" }, "LocalDiskEncryptionConfiguration": { "EncryptionKeyProviderType": "AwsKms", "AwsKmsKey": "arn:aws:kms:us-east-1:123456789012:key/aaaabbbb-cccc-dddd-eeee-aaaabbbbcccc" } } } }'
02 The command output should return the name of the newly created security configuration:
{ "CreationDateTime": 1512586797.435, "Name": "cc-emr-security-config" }
03 Get the configuration details from the source (unencrypted) EMR cluster. Run describe-cluster command (OSX/Linux/UNIX) using the ID of the Amazon EMR cluster that you want to re-create as the identifier parameter, to list the configuration information available for the selected cluster:
aws emr describe-cluster --region us-east-1 --cluster-id j-AAAABBBBCCCCD
04 The command output should return the requested cluster configuration information:
{
  "Cluster": {
    "Name": "cc-hadoop-cluster",
    "ServiceRole": "EMR_DefaultRole",
    "Tags": [],
    "TerminationProtected": false,
    "NormalizedInstanceHours": 4,
    ...
    "ScaleDownBehavior": "TERMINATE_AT_INSTANCE_HOUR",
    "VisibleToAllUsers": true,
    "BootstrapActions": [],
    "LogUri": "s3n://aws-logs-123456789012-us-east-1/elasticmapreduce/",
    "AutoTerminate": false,
    "Id": "j-AAAABBBBCCCCD"
  }
}
05 Run the create-cluster command (OSX/Linux/UNIX) to re-create your Amazon EMR cluster using the configuration information returned at the previous step, and enable in-transit and at-rest encryption using the security configuration created at step no. 1. The following command example creates an EMR cluster with one c5.xlarge-type master instance and two c5.xlarge-type core instances, named "cc-emr-production-cluster", that is associated with a security configuration named "cc-emr-security-config":
aws emr create-cluster --region us-east-1 --name cc-emr-production-cluster --release-label emr-5.35.0 --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=c5.xlarge InstanceGroupType=CORE,InstanceCount=2,InstanceType=c5.xlarge --service-role EMR_DefaultRole --ec2-attributes KeyName=SSHAccessKey,InstanceProfile=EMR_EC2_DefaultRole,EmrManagedMasterSecurityGroup=sg-0abcd1234abcd1234,EmrManagedSlaveSecurityGroup=sg-01234abcd1234abcd,SubnetId=subnet-0abcd1234abcd1234 --security-configuration cc-emr-security-config --visible-to-all-users --no-auto-terminate
06 The command output should return the ID of your new Amazon EMR cluster:
{ "ClusterId": "j-BBBBCCCCDDDDE" }
07 (Optional) You can now terminate the source (unencrypted) cluster in order to stop incurring charges for it. To terminate the source Amazon EMR cluster, run terminate-clusters command (OSX/Linux/UNIX) using the ID of the cluster that you want to delete as the identifier parameter (the command does not produce an output):
aws emr terminate-clusters --region us-east-1 --cluster-ids j-AAAABBBBCCCCD
08 Repeat steps no. 3 to 7 for each Amazon EMR cluster that you want to encrypt and redeploy in the selected AWS region. The security configuration created at step no. 1 can be reused for every cluster in the same region.
09 Change the AWS cloud region by updating the --region command parameter value and repeat the Remediation process for other regions.
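Carrying settings over from the source cluster to the create-cluster command by hand is error-prone, so the mapping can be sketched in Python. This is an illustration only, not a complete cloning tool. It assumes the `Cluster` object from `aws emr describe-cluster` exposes `Name`, `ReleaseLabel`, `ServiceRole`, `Ec2InstanceAttributes`, and `InstanceGroups` in the form used below, and the helper name `create_cluster_args` is hypothetical. Review the generated command before running it.

```python
import shlex

# Pairs of (create-cluster --ec2-attributes key, describe-cluster field name).
EC2_ATTRIBUTE_SOURCES = [
    ("KeyName", "Ec2KeyName"),
    ("InstanceProfile", "IamInstanceProfile"),
    ("SubnetId", "Ec2SubnetId"),
    ("EmrManagedMasterSecurityGroup", "EmrManagedMasterSecurityGroup"),
    ("EmrManagedSlaveSecurityGroup", "EmrManagedSlaveSecurityGroup"),
]


def create_cluster_args(cluster: dict, security_configuration: str) -> list:
    """Build an `aws emr create-cluster` argument list from a described cluster."""
    args = [
        "aws", "emr", "create-cluster",
        "--name", cluster["Name"],
        "--release-label", cluster["ReleaseLabel"],
        "--service-role", cluster["ServiceRole"],
    ]
    groups = [
        "InstanceGroupType={},InstanceCount={},InstanceType={}".format(
            group["InstanceGroupType"],
            group["RequestedInstanceCount"],
            group["InstanceType"],
        )
        for group in cluster.get("InstanceGroups", [])
    ]
    if groups:
        args += ["--instance-groups", *groups]
    ec2 = cluster.get("Ec2InstanceAttributes", {})
    ec2_attributes = ",".join(
        f"{key}={ec2[source]}" for key, source in EC2_ATTRIBUTE_SOURCES if ec2.get(source)
    )
    if ec2_attributes:
        args += ["--ec2-attributes", ec2_attributes]
    # Attach the security configuration that enables encryption.
    args += ["--security-configuration", security_configuration]
    return args


if __name__ == "__main__":
    # Hypothetical values standing in for a real describe-cluster response.
    source_cluster = {
        "Name": "cc-hadoop-cluster",
        "ReleaseLabel": "emr-5.35.0",
        "ServiceRole": "EMR_DefaultRole",
        "Ec2InstanceAttributes": {
            "IamInstanceProfile": "EMR_EC2_DefaultRole",
            "Ec2SubnetId": "subnet-0abcd1234abcd1234",
        },
        "InstanceGroups": [
            {"InstanceGroupType": "MASTER", "RequestedInstanceCount": 1, "InstanceType": "c5.xlarge"},
            {"InstanceGroupType": "CORE", "RequestedInstanceCount": 2, "InstanceType": "c5.xlarge"},
        ],
    }
    print(shlex.join(create_cluster_args(source_cluster, "cc-emr-security-config")))
```

Building a list of arguments rather than a single string keeps quoting explicit, and `shlex.join` is used only to print a shell-safe command for review.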