You need to build, deploy, and maintain machine learning (ML) systems reliably and efficiently. MLOps, a practice that combines DevOps, data engineering, and ML techniques, helps you do this.
MLOps provides a systematic approach to evaluating and monitoring ML models. It covers the lifecycle management of ML projects: training, deploying, and maintaining models so that they keep working efficiently. Security is an essential component of every stage of the MLOps lifecycle, because it ensures that the complete lifecycle meets the required standards.
This article describes some pain points and MLOps best practices for mitigating security risks.
Protecting data storage
A model trained, deployed, and monitored using the MLOps method is traceable end to end. The method logs the model’s lineage, so you can easily trace the source code and the data used to train and test the model. Beyond traceability, protecting data storage, understanding data compliance policies, securing ML models, ensuring observability, and logging ML tasks all go a long way toward securing MLOps.
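As a rough illustration of lineage logging, the sketch below stores a fingerprint of the training and test data next to a code version identifier. The function names and record fields are hypothetical and not part of any particular MLOps tool.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 hex digest of a blob, such as a dataset file."""
    return hashlib.sha256(data).hexdigest()


def lineage_record(model_name: str, code_version: str,
                   train_data: bytes, test_data: bytes) -> dict:
    """Build a lineage entry linking a model to its code and data (hypothetical schema)."""
    return {
        "model": model_name,
        "code_version": code_version,  # for example, a commit identifier
        "train_data_sha256": fingerprint(train_data),
        "test_data_sha256": fingerprint(test_data),
    }


def matches_lineage(record: dict, train_data: bytes) -> bool:
    """Check whether a dataset is the one recorded in the lineage entry."""
    return record["train_data_sha256"] == fingerprint(train_data)
```

If the stored fingerprint no longer matches the dataset, the data changed after the model was trained, which is worth investigating.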
You can enforce zero trust to secure the data and the data infrastructure. This security policy requires every user to be authenticated and authorized before accessing applications or data in a data storage facility. It validates that users and their devices have the proper privileges, and it continuously monitors their activity.
Identity protection and risk-based adaptive authentication can be used to verify a system or user identity. You can also encrypt data, secure emails, and ascertain the state of endpoints before they connect to the application in the data storage facility.
Risk-based authentication (RBA) is a standard security practice that you can use to apply different levels of strictness to the authentication process. This strategy is also known as adaptive authentication because it calculates a risk score for an access attempt in real time. It gives a user an authentication option depending on their score. The authentication process employs stricter and more restrictive measures as the risk level increases.
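The idea of mapping a risk score to an authentication requirement can be sketched as follows. The signals, weights, and thresholds are illustrative assumptions, not values from any standard or product.

```python
from dataclasses import dataclass


@dataclass
class AccessAttempt:
    known_device: bool     # device has been seen for this user before
    usual_location: bool   # request comes from a location the user normally uses
    failed_attempts: int   # recent failed logins for this account


def risk_score(attempt: AccessAttempt) -> int:
    """Compute a 0-100 risk score from a few signals (illustrative weights)."""
    score = 0
    if not attempt.known_device:
        score += 40
    if not attempt.usual_location:
        score += 30
    score += min(attempt.failed_attempts, 3) * 10
    return score


def required_authentication(score: int) -> str:
    """Pick a stricter authentication step as the risk score rises."""
    if score < 30:
        return "password"
    if score < 70:
        return "password+otp"
    return "deny"
```

A real deployment would tune the signals and thresholds to its own threat model. The point of the sketch is only that strictness increases with the score.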
You can also use authentication, validation, and authorization measures on the data backup and recovery process to know who is initiating the backup or the recovery. However, this doesn't mean that your backup is 100% secure from malicious attackers. Therefore, you should consider immutable storage as an additional zero trust security measure. Once you store data this way, you can’t delete or alter it for a specified time, but you can read it many times. This prevents malicious insiders from deleting or modifying secure files and cyber attackers from encrypting data.
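The write-once, read-many behavior of immutable storage can be modeled with a small in-memory class. This is a toy sketch under the assumption of a fixed retention period per object. The class name `ImmutableStore` is hypothetical, and real systems enforce the lock in the storage layer rather than in application code.

```python
import time


class ImmutableStore:
    """In-memory write-once-read-many store with a retention period."""

    def __init__(self, retention_seconds, clock=time.time):
        self._retention = retention_seconds
        self._clock = clock      # injectable so the behavior can be tested
        self._objects = {}       # key -> (data, locked_until)

    def _is_locked(self, key):
        return key in self._objects and self._clock() < self._objects[key][1]

    def put(self, key, data):
        """Store data; refuse to overwrite an object that is still locked."""
        if self._is_locked(key):
            raise PermissionError(f"{key!r} is locked and cannot be overwritten")
        self._objects[key] = (data, self._clock() + self._retention)

    def get(self, key):
        """Reads are always allowed."""
        return self._objects[key][0]

    def delete(self, key):
        """Refuse to delete an object until its retention period has passed."""
        if self._is_locked(key):
            raise PermissionError(f"{key!r} is locked and cannot be deleted")
        del self._objects[key]
```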
Another MLOps best practice is the principle of least privilege (PoLP), which dictates that a user should have exactly the access they need to perform their tasks, no more and no less. For instance, users who need to back up their work should have the right to run backups but no other permissions, such as installing new software.
You reduce risk when users have access to only what they require in data storage. If an attacker gains access to one part of the data storage system, this principle keeps them from reaching the rest of it. It reduces the attack surface and leaves bad actors with fewer targets. Because privileges are restricted, attackers also find it much harder to elevate their permissions.
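A least-privilege check can be as simple as a deny-by-default lookup from role to permitted actions. The role and action names below are hypothetical and follow the backup example above.

```python
# Each role lists only the actions it needs (hypothetical roles and actions).
ROLE_PERMISSIONS = {
    "backup_operator": frozenset({"run_backup"}),
    "admin": frozenset({"run_backup", "restore_backup", "install_software"}),
}


def is_allowed(role: str, action: str) -> bool:
    """Deny by default: unknown roles and unlisted actions are rejected."""
    return action in ROLE_PERMISSIONS.get(role, frozenset())


def perform(role: str, action: str) -> str:
    """Run an action only if the role explicitly grants it."""
    if not is_allowed(role, action):
        raise PermissionError(f"role {role!r} may not perform {action!r}")
    return f"{action} executed"
```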
You should log every event in the data storage so that you know what happens each time there is an activity. Log files contain an audit trail that you can use to monitor activities within data storage. Log monitoring helps you detect malicious and accidental intrusion into your data storage system. Audit trails act as the next line of defense if an attacker bypasses other security controls, and they help you conduct a forensic investigation following a security breach.
An audit of the log files of confidential information can reveal any traces of unauthorized activities, policy violations, and security incidents. You can investigate and take the necessary action if you notice an unauthorized activity. This is important in guarding the data storage system against external threats and internal misuse of information. If a security breach occurs, the audit trails will help reconstruct events resulting in the breach, allowing you to know how the breach occurred and how you can resolve vulnerabilities.
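One way to make an audit trail easy to review is to write each event as a structured record and filter for denied access. The sketch below assumes JSON lines with hypothetical field names (`user`, `action`, `resource`, `allowed`).

```python
import json
from datetime import datetime, timezone


def audit_event(user: str, action: str, resource: str, allowed: bool) -> str:
    """Serialize one access event as a JSON line (hypothetical schema)."""
    return json.dumps({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "action": action,
        "resource": resource,
        "allowed": allowed,
    })


def unauthorized_events(log_lines):
    """Return the events in an audit trail where access was denied."""
    events = (json.loads(line) for line in log_lines)
    return [event for event in events if not event["allowed"]]
```

Running `unauthorized_events` over the collected lines gives you a starting point for investigating policy violations or reconstructing a breach.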
Securing ML models
Data is a significant input in training ML models. One effective way to secure ML models is to understand the data used to train the model, where it comes from, and what it contains.
Data poisoning is a significant threat to ML models. Even a slight deviation in the data can make your ML model ineffective. Attackers mainly aim to manipulate training data so that the resulting ML model is vulnerable to attacks. You should avoid sourcing your training data from untrusted datasets and follow standard data security detection and mitigation procedures. Poisoned data calls into question the trustworthiness and confidentiality of your data and, ultimately, of the ML model.
Typically, attackers feed their own inputs into the training data to trick the model into misclassifying. A model trained with tampered data may not output accurate predictions. An attacker can also reverse-engineer an ML model, replicate it, and exploit it for personal gain. If your model has been poisoned, you must identify the bad data samples, remove them, and retrain from the version of the model that predates the attack.
However, retraining may not fix the model, and it can be costly. A more practical approach for your next training cycle is to block attack attempts and detect malicious inputs through rate limiting, validity checking, regression testing, and so on. Rate limiting controls how often a user can repeat an activity (such as logging in to an account) within a specified timeframe.
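Rate limiting as described above can be sketched with a sliding window per user. The class name and parameters are hypothetical, and the clock is injectable only to make the behavior easy to test.

```python
import time
from collections import defaultdict, deque


class RateLimiter:
    """Allow at most max_requests per user within a sliding time window."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self._max_requests = max_requests
        self._window = window_seconds
        self._clock = clock
        self._events = defaultdict(deque)  # user_id -> timestamps of allowed requests

    def allow(self, user_id) -> bool:
        now = self._clock()
        events = self._events[user_id]
        # Drop timestamps that have fallen out of the window.
        while events and now - events[0] >= self._window:
            events.popleft()
        if len(events) >= self._max_requests:
            return False
        events.append(now)
        return True
```

The same pattern applies to any repeated activity, such as submitting training samples or querying a prediction endpoint.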
Validity checking helps you test the quality and accuracy of source data before training an ML model. Regression testing can help prevent ML bugs by keeping track of the ML model’s performance. You can also perform simulated attacks against your algorithms to learn how you can build defenses against data poisoning attacks. This allows you to discover the possible data points attackers could target and create mechanisms to dismiss such data points.
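Validity checking and regression testing can both be expressed as small gates in a training pipeline. The sketch below assumes numeric features scaled to a known range and a fixed label set. The ranges, labels, and tolerance are illustrative assumptions.

```python
def validate_rows(rows, feature_range=(0.0, 1.0), allowed_labels=frozenset({0, 1})):
    """Split (features, label) rows into valid rows and rejected rows."""
    low, high = feature_range
    valid, rejected = [], []
    for features, label in rows:
        # NaN fails the range comparison, so it is rejected as well.
        in_range = all(low <= value <= high for value in features)
        if in_range and label in allowed_labels:
            valid.append((features, label))
        else:
            rejected.append((features, label))
    return valid, rejected


def passes_regression(new_accuracy, baseline_accuracy, tolerance=0.01):
    """Accept a retrained model only if it is not clearly worse than the baseline."""
    return new_accuracy >= baseline_accuracy - tolerance
```

Range checks like these catch only crude tampering. They complement, rather than replace, the simulated attacks described above.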
Compliance policies
Sometimes you train ML models using sensitive or private data. You need to understand the laws in your jurisdiction as they relate to data handling and storage. Several data protection laws restrict the use of personal user data without consent.
For example, companies handling patient information in the European Union must comply with the General Data Protection Regulation (GDPR). The US has the Health Insurance Portability and Accountability Act (HIPAA), which governs the use of sensitive patient data. Under the GDPR, you must request consent when collecting such data and delete the data if a user requests it under the right to be forgotten.
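In a data pipeline, consent and erasure requests translate into filtering steps applied before training. The sketch below assumes records are dictionaries with hypothetical `user_id` and `consented` fields. It shows only the data-handling mechanics, not what any law requires.

```python
def consented_only(records):
    """Keep only records whose owner gave consent (hypothetical 'consented' flag)."""
    return [record for record in records if record.get("consented") is True]


def erase_user(records, user_id):
    """Remove every record of one user and report how many were removed."""
    kept = [record for record in records if record["user_id"] != user_id]
    return kept, len(records) - len(kept)
```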
Observability and logging of ML tasks
Observability seeks to understand the ML system in both its healthy and unhealthy states. Observability of ML tasks helps prevent failures by raising alerts before an incident occurs and by pointing to likely fixes. For example, a bug introduced in the training data may affect the model's functionality.
To fix such an issue effectively, you must trace it back to where it started. By looking at the performance data and metrics of the ML tasks, you gain insight into the security problems facing the ML model. Maintaining the model also involves collecting access, prediction, and model logs.
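A basic form of alerting on model metrics is to track accuracy over a rolling window of recent predictions and flag a drop below a threshold. The sketch assumes ground-truth labels eventually become available for logged predictions. The class name and default values are hypothetical.

```python
from collections import deque


class AccuracyMonitor:
    """Track accuracy over the most recent predictions and flag drops."""

    def __init__(self, window=100, min_accuracy=0.9):
        self._outcomes = deque(maxlen=window)  # True where prediction matched the label
        self._min_accuracy = min_accuracy

    def record(self, prediction, actual) -> bool:
        """Log one outcome; return True if an alert should be raised."""
        self._outcomes.append(prediction == actual)
        window_full = len(self._outcomes) == self._outcomes.maxlen
        accuracy = sum(self._outcomes) / len(self._outcomes)
        # Alert only once the window is full, to avoid noise from a few samples.
        return window_full and accuracy < self._min_accuracy
```

A sudden alert after a retraining run is one signal that the new training data deserves a closer look.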
Conclusion
This article described some security pain points and suggested MLOps best practices for mitigating them. Monitoring and logging access to data storage and following the zero trust and least privilege principles are significant steps in protecting data storage.
Observing and logging the tasks performed on the systems and the underlying data helps you audit what happened after a security breach. Attackers manipulate ML models through data poisoning, so it’s imperative to screen the source and the contents of the training data for security vulnerabilities. Understanding compliance policies such as HIPAA and GDPR is essential to protect personal user data when it is used in ML models.
Some ML models monitor security defects in other systems, so securing your MLOps infrastructure also helps assure the security of those systems. However, MLOps security is not always easy to achieve because there are many ways for attackers to get access to your data sets and models. Therefore, you must integrate security into all areas of MLOps systems.