9  Continuous validation, re-evaluation, decommissioning

Learning outcomes

After completing this chapter, you will:

  • Learn strategies for continuous learning and model retraining, including balancing large-scale retraining with incremental updates and preventing catastrophic forgetting.
  • Understand the importance of re-evaluating AI models to ensure alignment with evolving legal and societal norms.
  • Become familiar with the process for safely retiring and decommissioning AI systems.
  • Explore techniques for managing technical debt, ensuring future-proof security, and promoting the long-term sustainability of AI systems through energy-efficient practices and lifecycle planning.

In this chapter, we close the loop of the AI system lifecycle by considering the continuous validation, re-evaluation, and retirement stages, focusing on how to continuously improve, realign, and safely decommission an AI system that processes personal data.

9.1 Continuous learning and model retraining

We saw in the previous chapter that after an AI model is deployed, its performance can degrade due to various forms of model and data drift. While monitoring is crucial for identifying when an AI system is not performing as it should, monitoring without a reaction plan for model or system misalignment is of little use.

Once drift is detected or model performance suddenly drops, the next step is deciding how to retrain the model. Two common strategies are incremental updates and full re-training. In an incremental retraining approach, the model is updated with new data (which could arrive via online learning from model usage or through periodic batch updates) without starting from scratch. This approach is faster and can adapt continuously, but it comes with the risk of catastrophic forgetting. Catastrophic forgetting refers to the tendency of AI systems to lose performance on previously learned tasks when trained sequentially on new tasks (Ramasesh, Lewkowycz, and Dyer 2022). In neural networks, catastrophic forgetting happens because learned representations are overwritten during training on new tasks: the model’s parameters are updated to optimize performance on the current task, potentially at the cost of previous knowledge. This can occur when representations overlap (different classes or tasks are not well separated) or simply because of limited capacity in smaller models.

On the other hand, full re-training involves rebuilding the model from the ground up using a combination of old and new data (or only new data if the statistical properties of the data have changed substantially). Full re-training is more computationally expensive, but it can produce a cleaner model that avoids biases accumulated through incremental updates.
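To make the distinction concrete, the following sketch contrasts the two strategies on a toy scikit-learn classifier; the data, the model choice, and the number of passes are illustrative assumptions, not a recipe. Tracking accuracy on the old data after an incremental update is one simple way to spot catastrophic forgetting.

```python
# Minimal sketch: incremental update vs. full re-training (illustrative data/model).
import numpy as np
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X_old = rng.normal(size=(1000, 5))
y_old = (X_old[:, 0] > 0).astype(int)        # historical relationship
X_new = rng.normal(size=(300, 5))
y_new = (X_new[:, 1] > 0).astype(int)        # drifted relationship

# Incremental update: keep the existing model and continue training on new data only.
model = SGDClassifier(random_state=0)
model.partial_fit(X_old, y_old, classes=np.array([0, 1]))
acc_before = accuracy_score(y_old, model.predict(X_old))
for _ in range(20):                           # several passes over the new batch
    model.partial_fit(X_new, y_new)
acc_after = accuracy_score(y_old, model.predict(X_old))
print(f"accuracy on old data: {acc_before:.2f} -> {acc_after:.2f}")
# A large drop on previously learned data is the signature of catastrophic forgetting.

# Full re-training: rebuild the model from scratch on old and new data together.
full_model = SGDClassifier(random_state=0).fit(
    np.vstack([X_old, X_new]), np.concatenate([y_old, y_new]))
```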

So what is the best strategy to adopt? In practice, depending on the size of the model, many organizations use a hybrid approach: periodically performing a full re-train (e.g. monthly or yearly) to create a new version of the model on the latest data, interleaved with smaller incremental updates when urgent changes are needed.

Automated retraining is available in various MLOps pipelines (e.g. Amazon SageMaker, MLflow), so that monitoring events can trigger incremental retraining. It is perhaps now clearer why we insisted so much in previous chapters on the importance of version control of code and models, CI/CD, and testing: with fully automated responses to AI system misbehaviour, we need to be able to trace back what changed in the model and how the automated tests went, and, if necessary, revert to a previous version or choose an alternative strategy (e.g. taking the AI system offline, or using input/output filtering to block the cases that elicit lower performance from the AI system/model).
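The sketch below illustrates, in plain Python, the kind of decision logic such pipelines automate: a monitoring event triggers retraining and validation, and a candidate model is promoted only if the automated test suite passes, otherwise the previous version is kept or restored. The toy registry, thresholds, and placeholder callables are assumptions for illustration; a real implementation would delegate to the registry and pipeline features of your MLOps platform.

```python
# Minimal sketch of monitoring-triggered retraining with versioned rollback.
from dataclasses import dataclass, field

@dataclass
class ModelRegistry:
    """Toy stand-in for a real model registry (MLflow, SageMaker Model Registry, ...)."""
    versions: dict = field(default_factory=dict)   # version name -> model artifact
    current: str = ""

    def promote(self, name, model):
        self.versions[name] = model
        self.current = name

    def rollback(self, name):
        assert name in self.versions, "can only roll back to a recorded version"
        self.current = name

def handle_drift_alert(registry, drift_score, retrain_fn, validate_fn,
                       warning=0.1, critical=0.3):
    """React to a monitoring event: retrain, validate, then promote or fall back."""
    if drift_score < warning:
        return "no action"
    if drift_score >= critical and len(registry.versions) > 1:
        # Severe drift: fall back to the earliest recorded (validated) version
        # while a full re-train is planned outside this function.
        registry.rollback(next(iter(registry.versions)))
        return "rolled back; full re-train needed"
    candidate = retrain_fn(registry.versions[registry.current])
    if validate_fn(candidate):                      # automated tests must pass
        registry.promote(f"{registry.current}-retrained", candidate)
        return "incremental update promoted"
    return "candidate rejected; previous version kept"

# Usage with placeholder retraining and validation callables.
reg = ModelRegistry()
reg.promote("v1", object())
print(handle_drift_alert(reg, drift_score=0.2,
                         retrain_fn=lambda m: m, validate_fn=lambda m: True))
```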

With personal data in the loop, extra care is of course needed when fully automated pipelines perform retraining and deployment. In AI systems with personal data, any automated change must still respect privacy constraints and be carefully documented, both for regulatory compliance and to ensure that the fundamental rights of data subjects and ethical alignment continue to be respected when processing data from individuals.

9.2 Revising models for evolving ethical and regulatory alignment

If you have been following the AI regulatory landscape in Europe (and the rest of the world), you will know it can feel impossible to keep up with every new development, newly identified risk, or new interpretation of the regulations, any of which can trigger retraining of your AI model. Establishing processes for the ethical and regulatory re-alignment of your AI systems and models makes them more transparent and fair not only towards the data subjects whose data is processed, but also within your organisation and towards society at large.

Model auditing and fairness assessments can be conducted by interdisciplinary teams or external experts, examining an AI system for issues like bias, robustness, and compliance. For instance, an audit might reveal that a facial recognition model has higher false negative rates for darker-skinned individuals (Wehrli et al. 2022). On discovering these types of biases, the model could be retrained with more diverse data, or algorithmic fairness techniques (like equalizing thresholds or using adversarial debiasing) could be adopted. Audits should be done not just at deployment but periodically after deployment, since model updates or drifting data could introduce new biases. For a deeper treatment of fairness evaluations, the reader is encouraged to consult the IEEE 7003 standard or the NIST AI 600-1 Artificial Intelligence Risk Management Framework, which include bias and harm mitigation guidance.
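As a small illustration of what one such audit check could look like, the sketch below compares false negative rates across demographic groups on held-out data; the arrays and group labels are made up, and real audits would use properly collected evaluation data with reliable group annotations.

```python
# Minimal sketch: per-group false-negative rates as a simple fairness audit check.
import numpy as np

y_true = np.array([1, 1, 0, 1, 1, 0, 1, 0])                  # ground-truth labels
y_pred = np.array([1, 0, 0, 1, 0, 0, 0, 0])                  # model predictions
group  = np.array(["a", "a", "a", "a", "b", "b", "b", "b"])  # group membership

for g in np.unique(group):
    positives = (group == g) & (y_true == 1)                 # actual positives in group g
    fnr = float(np.mean(y_pred[positives] == 0)) if positives.any() else float("nan")
    print(f"group {g}: false-negative rate = {fnr:.2f}")
# A large gap between groups is a trigger to retrain on more representative data
# or to apply fairness interventions such as per-group decision thresholds.
```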

In the context of personal data, auditing also means ensuring that privacy is preserved – e.g., checking that the model isn’t inadvertently memorizing sensitive personal information. GDPR mandates that personal data should not be kept longer than necessary and should be used only for the purposes consented to. Over an AI system’s life, developers might need to update the model to forget specific data if a user invokes their right to erasure (right to be forgotten). This has led to research on machine unlearning (see later chapters), which aims to remove or suppress the influence of particular training data points without retraining from scratch. The European Data Protection Board has noted that controllers should consider post-training techniques to remove or suppress personal data from trained AI models despite technical challenges (European Data Protection Board (EDPB) 2024). This means that if an AI model is found to memorize personal information (e.g., an AI assistant reciting someone’s address from training data), the developers are expected to find ways to expunge that memorization or otherwise mitigate the privacy risk.
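As a point of reference for those later chapters, the sketch below shows the naive but exact baseline for honouring an erasure request: re-fitting the model on the training set minus the data subject's records. Machine unlearning techniques aim to approximate this outcome without the cost of a full re-fit; the data, model, and row indices here are purely illustrative.

```python
# Minimal sketch: exact "unlearning" baseline = retrain without the erased records.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 5))
y = (X[:, 0] > 0).astype(int)

subject_rows = np.array([10, 42, 317])       # rows belonging to the data subject
keep = np.setdiff1d(np.arange(len(X)), subject_rows)

# The retrained model has, by construction, no dependence on the erased records.
model_after_erasure = LogisticRegression().fit(X[keep], y[keep])
```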

For future MLOps engineers and security professionals, this stage of the AI lifecycle blurs even further into the realm of AI governance. It is no longer only a matter of technical decisions, versioning, and testing; it becomes an issue to be resolved with all parties involved, which is why the final loop-back link in the lifecycle returns to the “Inception” stage to re-design the AI system if needed.

9.3 Decommissioning: safe retirement of AI systems

All AI systems eventually reach the end of their useful life, whether due to obsolescence, replacement by better models, or changes in business needs or regulations. Decommissioning an AI system – especially one that processes personal data – must be done in a responsible and structured manner. This ensures that no personal data is improperly retained, dependencies are resolved safely, and compliance is maintained even as the system is retired. In this section, we cover the steps and considerations for responsible AI decommissioning, including data retention and deletion policies, archival of models/data, compliance audits, and risk assessments during the process.

9.3.1 Responsible AI System Decommissioning

Decommissioning is more than just “turning off” an AI system/model. It should follow a decommissioning plan or protocol prepared in advance, ideally during the inception stage.

Leaving aside the governance aspects of the process and focusing on the technical side, decommissioning revolves around the dependencies that the current AI system carries:

  • Upstream Dependencies: data sources and data pipelines. (E.g., does a data ingestion job need to be stopped? Do data providers need to be notified that we no longer require their data?)
  • Downstream Dependencies: applications or processes consuming the model’s output. (E.g., an API that other services call, a dashboard that displays model results, business processes that rely on AI outputs.)
  • Associated Resources: compute instances/clusters, databases, model artifacts, configuration files, documentation, and possibly third-party services or licenses.

By mapping all the dependencies, one can avoid orphaned processes or broken pipelines after decommissioning and, where personal data is processed, ensure that there are no unwanted consequences for data subjects (e.g. if an AI system for fraud detection is decommissioned, data subjects should not receive false credit card fraud alarms just because an API somewhere was never switched off). A dependency inventory like the sketch below can serve as a starting point for the decommissioning plan.
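The inventory can be as simple as a structured list that the team works through during decommissioning. The sketch below is one possible shape for it; all entries are hypothetical examples, not a prescribed schema.

```python
# Minimal sketch: a dependency inventory to work through during decommissioning.
decommission_inventory = {
    "upstream": [
        {"name": "clickstream-ingest-job",  "action": "stop scheduled job"},
        {"name": "partner-data-feed",       "action": "notify provider, cancel feed"},
    ],
    "downstream": [
        {"name": "fraud-alert-api",         "action": "deprecate endpoint, notify consumers"},
        {"name": "model-metrics-dashboard", "action": "remove model panel"},
    ],
    "resources": [
        {"name": "training-cluster",        "action": "release compute"},
        {"name": "feature-store-tables",    "action": "apply retention/deletion policy"},
        {"name": "model-artifact-bucket",   "action": "archive or securely erase"},
    ],
}

def open_items(inventory, done):
    """Return the names of dependencies that have not yet been handled."""
    return [dep["name"] for group in inventory.values()
            for dep in group if dep["name"] not in done]

print(open_items(decommission_inventory, done={"training-cluster"}))
```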

During decommissioning, it’s good practice to come up with a checklist. Typical steps might include:

  • Notify all parties: Inform all relevant parties (business owners, IT, data protection officer, end-users if needed) of the intention to decommission and the expected date. This ensures that no one is caught off guard and that everyone can voice concerns (perhaps someone relies on the system without the team knowing).
  • Documentation and Knowledge Capture: Before shutting down, ensure all documentation about the system is up to date and preserved. Write a decommissioning report that explains why the system is being retired, the date, the responsible personnel, and any important contextual information (like “model X was in production from 2022-2025, used for Y purpose”). This is useful for future audits or if questions arise later (e.g., “why did we discontinue that model?”).
  • Final Performance and Compliance Snapshot: It can be useful to record the system’s final state – performance metrics, any outstanding issues, the versions of software. Also verify compliance one last time (for instance, ensure that there are no outstanding data subject requests or regulatory holds on the data that would prevent deletion).
  • Gradual Phase-Out (if applicable): For critical systems, you might run the old and new system in parallel for a short period to ensure the replacement is fully functional. If the system is just being removed without direct replacement, ensure the business has adjusted (e.g., maybe a process reverts to manual review instead of AI). A sudden removal without adaptation can cause business disruption, which is a risk in itself.

9.3.2 Data Retention and Deletion

One of the most important aspects of decommissioning an AI system that processed personal data is handling that data properly at end-of-life. Data should not be retained longer than necessary for the purpose for which it was collected, and when an AI system is retired, that purpose most likely no longer applies.

For the technical team involved in this task, it is important to consider at least the following points:

  • Identify all personal data stores: This includes training datasets, validation datasets, data collected during operation (input logs, user feedback containing personal info), and any embedded data within the model. Also consider backup copies and data in non-production environments.
  • Determine retention requirements: Sometimes, laws or policies may require keeping data for a certain period even after system decommission (for example, financial records might need storage for X years, or medical AI decisions might need to be stored for liability reasons).
  • Securely erase data: Use cryptographic erasure (destroying encryption keys so the data becomes irretrievable; see the sketch after this list) or overwrite storage following recognised standards (for hardware you control, refer to standards like NIST SP 800-88 “Guidelines for Media Sanitization” or ISO/IEC 27040). Remember that simply deleting a file or shutting off a cloud instance might not fully remove the data from backups or persistent storage, so ensure that any personal data in the system is rendered unrecoverable.
  • Document the deletion process: Record what data was deleted, when, and who authorized it, so that the organization can prove that, as of the decommission date, all personal data in that system was deleted in line with its policy. Keeping logs of deletion operations (such as cryptographic erasure logs or a certificate of destruction if a third party performed it) is good practice.
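To illustrate the idea behind cryptographic erasure, the sketch below encrypts a record and then destroys the key, after which the stored ciphertext is unreadable. It uses the cryptography package purely for illustration; in production the key would live in a KMS/HSM and its destruction would follow a documented, logged procedure.

```python
# Minimal sketch of cryptographic erasure: destroy the key, not every copy of the data.
from cryptography.fernet import Fernet

key = Fernet.generate_key()                  # in practice: a key managed by a KMS/HSM
record = b"name=Jane Doe; account=12345"     # personal data to be stored
ciphertext = Fernet(key).encrypt(record)     # only the ciphertext is persisted

# ... at decommissioning time ...
del key                                      # stands in for KMS key destruction
# The ciphertext may still exist in backups, but without the key it is unreadable,
# which is the basis for treating the data as erased. Record who destroyed which
# key and when, as evidence for the deletion log.
```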

One complication concerns the model parameters themselves: a trained model could in theory retain information about individuals in its weights, especially if it overfit or memorized examples (European Data Protection Board (EDPB) 2024). This means a model might itself be subject to data protection rules unless you can demonstrate that no personal data can be extracted. During decommissioning, you should evaluate whether the model file needs special handling. If the model is going to be archived or handed off, ensure it is either thoroughly vetted for privacy or treated with the same care as raw personal data (access control, encryption, eventual deletion).

9.3.3 Celebrate

Finally, when all is done, celebrate the proper retirement of the system: it served its purpose, and now it is responsibly laid to rest, with no loose ends. Responsible decommissioning is a sign of a mature AI governance program: it shows respect for user data, maintains trust with the data subjects involved, and clears the way for new innovations.

9.4 Summary

With this chapter, our journey through the AI lifecycle comes to an end. There are a few broader perspectives still to consider on where to go next, including the implications AI systems have for sustainability, as well as some advanced cases in which more complex AI systems introduce further privacy risks; these are covered in the remaining chapters of the book.

Exercise 9.1: Multiple choice questions
  1. What is a primary risk of using incremental retraining in AI models?
     1) Increased latency.
     2) Catastrophic forgetting.
     3) Lack of model interpretability.
     4) Model overcompression.
  2. Why might full re-training be preferred over incremental updates?
     1) It is faster to deploy.
     2) It always uses less energy.
     3) It helps avoid biases from accumulated updates.
     4) It guarantees explainability.
  3. What is the role of automated retraining pipelines in MLOps?
     1) Encrypt data before training.
     2) Launch new models manually.
     3) Trigger model updates based on monitoring events.
     4) Avoid the need for model versioning.
  4. What is the ‘right to erasure’ under GDPR relevant to in the context of AI?
     1) Deleting input data logs.
     2) Removing the model entirely.
     3) Ensuring specific user data influence can be removed from a trained model.
     4) Terminating user accounts.
  5. Which of the following is a goal of machine unlearning?
     1) Speeding up model inference.
     2) Replacing outdated APIs.
     3) Enabling selective removal of training data influence.
     4) Compressing model weights.
  6. What is one reason to periodically audit AI models post-deployment?
     1) To reduce cloud computing costs.
     2) To check for new biases or misalignments.
     3) To enhance training datasets.
     4) To update license keys.
  7. Which standard offers guidance on bias and harm mitigation in AI systems?
     1) ISO/IEC 42001.
     2) IEEE 7003.
     3) GDPR Article 32.
     4) ISO 27017.
  8. Why is it important to identify both upstream and downstream dependencies during decommissioning?
     1) To allow new AI models to reuse components.
     2) To avoid data being stored in GPUs.
     3) To prevent orphaned processes and ensure graceful retirement.
     4) To migrate all services to Kubernetes.
  9. What is cryptographic erasure used for during AI system retirement?
     1) Obfuscating log files.
     2) Speeding up shutdown.
     3) Making data irretrievable by destroying encryption keys.
     4) Encrypting model weights for future use.
  10. What does responsible decommissioning of an AI system demonstrate?
     1) High energy efficiency.
     2) A complete handover to DevOps teams.
     3) A mature AI governance practice.
     4) Full automation of AI pipelines.

Solutions

  1. Answer: 2) Catastrophic forgetting
    Explanation: Incremental updates can cause a model to forget previously learned tasks if not handled properly.

  2. Answer: 3) It helps avoid biases from accumulated updates
    Explanation: Full re-training can result in a cleaner model and mitigate issues that arise from incremental changes.

  3. Answer: 3) Trigger model updates based on monitoring events
    Explanation: Automated MLOps pipelines can initiate retraining workflows when performance degrades.

  4. Answer: 3) Ensuring specific user data influence can be removed from a trained model
    Explanation: This is part of implementing the GDPR’s right to erasure in the AI context.

  5. Answer: 3) Enabling selective removal of training data influence
    Explanation: Machine unlearning techniques aim to remove the effect of certain data without full retraining.

  6. Answer: 2) To check for new biases or misalignments
    Explanation: Regular audits ensure ongoing fairness, accuracy, and regulatory compliance.

  7. Answer: 2) IEEE 7003
    Explanation: This standard specifically addresses bias and mitigation in AI systems.

  8. Answer: 3) To prevent orphaned processes and ensure graceful retirement
    Explanation: Mapping dependencies helps avoid unexpected disruptions and compliance issues.

  9. Answer: 3) Making data irretrievable by destroying encryption keys
    Explanation: Cryptographic erasure is a secure deletion method, especially for cloud environments.

  10. Answer: 3) A mature AI governance practice
    Explanation: Proper decommissioning reflects responsibility, accountability, and long-term planning.