To protect the communication between ML compute instances in a distributed training job, ensure that inter-container traffic encryption is enabled for your Amazon SageMaker training jobs.
Distributed machine learning (ML) frameworks and algorithms typically transmit model-related information, such as weights, rather than the training dataset itself. Enabling inter-container traffic encryption protects this data as it moves between the container instances of a distributed training job, which can also help you meet regulatory compliance requirements.
Audit
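To determine if inter-container traffic encryption is enabled for your SageMaker training jobs, list the training jobs in each AWS region with list-training-jobs, then run describe-training-job for each job and check the EnableInterContainerTrafficEncryption attribute. If the attribute is false, inter-container traffic encryption is not enabled for that training job.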
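One way to run this audit programmatically is sketched below in Python. It assumes the boto3 SDK and configured AWS credentials; the helper names (is_traffic_encrypted, find_unencrypted_jobs, audit_region) are hypothetical and chosen only for illustration. The sketch treats a missing EnableInterContainerTrafficEncryption field as "not enabled", which is an assumption rather than documented behavior.

```python
from typing import Any, Dict, List


def is_traffic_encrypted(description: Dict[str, Any]) -> bool:
    """Return True only if the job description reports encryption as enabled."""
    # Assumption: a missing key is treated the same as an explicit False.
    return description.get("EnableInterContainerTrafficEncryption") is True


def find_unencrypted_jobs(client: Any) -> List[str]:
    """Return the names of training jobs without inter-container encryption.

    `client` is expected to behave like boto3.client("sagemaker").
    """
    unencrypted: List[str] = []
    # list_training_jobs is paginated, so walk every page of summaries.
    paginator = client.get_paginator("list_training_jobs")
    for page in paginator.paginate():
        for summary in page["TrainingJobSummaries"]:
            name = summary["TrainingJobName"]
            # The summary does not carry the encryption flag, so each job
            # has to be described individually.
            description = client.describe_training_job(TrainingJobName=name)
            if not is_traffic_encrypted(description):
                unencrypted.append(name)
    return unencrypted


def audit_region(region_name: str) -> List[str]:
    """Audit one region; boto3 is imported lazily so the helpers above stay testable."""
    import boto3

    return find_unencrypted_jobs(boto3.client("sagemaker", region_name=region_name))
```

Calling audit_region("us-east-1") (the region is only an example) would return the names of the jobs that fail the check. Repeat the call for every region you use.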
Remediation / Resolution
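To enable inter-container traffic encryption for an Amazon SageMaker training job, you have to re-create the job with the appropriate in-transit encryption configuration. Deploy the new training job with create-training-job, using the same configuration as the original job and setting EnableInterContainerTrafficEncryption to true (the --enable-inter-container-traffic-encryption option in the AWS CLI).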
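A minimal Python sketch of re-creating a job follows, again assuming boto3 and configured credentials. The helper names (build_encrypted_job_request, recreate_job) and the COPIED_KEYS whitelist are hypothetical. The whitelist is an assumption about which fields are worth carrying over, not a complete list, and some copied values may need adjusting before the API accepts the request.

```python
import copy
from typing import Any, Dict

# Assumption: fields from the describe_training_job response that are also
# valid create_training_job parameters and worth carrying over as-is.
COPIED_KEYS = (
    "AlgorithmSpecification",
    "RoleArn",
    "InputDataConfig",
    "OutputDataConfig",
    "ResourceConfig",
    "StoppingCondition",
    "HyperParameters",
    "VpcConfig",
    "EnableNetworkIsolation",
    "EnableManagedSpotTraining",
    "CheckpointConfig",
)


def build_encrypted_job_request(description: Dict[str, Any], new_job_name: str) -> Dict[str, Any]:
    """Build create_training_job arguments from an existing job description."""
    # Deep-copy so the original description is never mutated.
    request = {
        key: copy.deepcopy(description[key])
        for key in COPIED_KEYS
        if key in description
    }
    # Training job names must be unique, so the new job needs its own name.
    request["TrainingJobName"] = new_job_name
    # The one setting this remediation exists to change.
    request["EnableInterContainerTrafficEncryption"] = True
    return request


def recreate_job(region_name: str, old_job_name: str, new_job_name: str) -> Dict[str, Any]:
    """Describe the old job and submit an encrypted copy under a new name."""
    import boto3  # imported lazily so the builder above stays testable

    client = boto3.client("sagemaker", region_name=region_name)
    description = client.describe_training_job(TrainingJobName=old_job_name)
    request = build_encrypted_job_request(description, new_job_name)
    return client.create_training_job(**request)
```

Keeping the request builder separate from the API call lets you inspect or adjust the request before submitting it. Note that calling recreate_job starts a new, billable training run.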