Introduction
What is this book about?
This book is a training curriculum: a structured course for understanding the secure deployment of artificial intelligence systems that have been trained with, or are processing, personal data. Each chapter has clearly defined learning goals, exercises, and notes for instructors.
Who is this book for?
This book is intended for a technical audience who might already be familiar with some aspects of the AI deployment life-cycle and data privacy techniques. It also tries to stimulate discussion between different types of experts, so that together they gain a comprehensive view of AI systems development in the context of personal data. The security expert, the data scientist, the system administrator, the data governance expert, and the legal expert of an organisation will all learn from each other, and this book helps facilitate their discussions by setting a solid foundation and shared vocabulary for working together on deploying an AI system. Management boards will also find insights into the technological challenges that various team members might face when developing and deploying AI systems. The book has no formulas and no code examples, to make it accessible to a wide audience.
Book structure
The book is structured in five modules. Each module corresponds to a single-day workshop with an instructor (facilitator) who can cover the topics and challenge the learners with tasks and questions. Each module is divided into chapters. At the end of each chapter there are exercises, quizzes, and suggestions for in-class activities. The book can also be used as self-learning material. Module 1 sets out the basic terminology of AI systems and their life-cycle. Module 2 focuses on the data privacy aspects of the AI system life-cycle: data collection, privacy-enhancing technologies, and in general all the (personal) data flows that occur before the system is put into production. Module 3 focuses on the development of the AI system, considering good coding practices, secure sandboxes, and security testing of AI systems under development. Module 4 considers the deployment of an AI system, together with its monitoring, sustainability, and decommissioning. Module 5 focuses on checklists for auditing, with a series of use cases and a deeper overview of the technological, legal, and ethical challenges in the field.
Further considerations
This book tries to provide fundamentals on practical solutions for companies developing or acquiring AI systems. As of the date of this book (March 2025), however, there are still many unknowns about how AI regulation is developing and how AI compliance and liability will be implemented in practice, and novel insights, peer-reviewed articles, policies, and guidelines are being released almost daily. The open-source book model will hopefully enable active development of this curriculum over the coming months, making it a useful and reusable resource for anyone interested in the intersection of data privacy, AI, and cybersecurity.
Notice on the tools used for writing this book
The following process and tools were used to create this book. Starting from the initial requirements from the Greek data protection authority, the author generated user stories, i.e. statements from a potential learner in the form of “I would like to learn about …”. The initial user stories (n = 52) were then passed to ChatGPT to augment and expand them, resulting in 116 user stories. The stories were then rated by the reviewers, and it was agreed to focus on a subset of 70 user stories. The author then drafted the structure of the book and workshops based on the chosen user stories. All content was drafted by the author, except for the following content, which was first synthesised by ChatGPT and then revised by the author: the learning goals for each chapter and the final quiz with multiple-choice questions at the end of each chapter. Furthermore, the language and structure of many paragraphs were improved using AI tools such as ChatGPT and DeepL. The book is written using the open-source software tool Quarto (https://quarto.org/), and the book source code is made available. Zotero was used to manage the references. All images were made manually by the author using the open-source software Inkscape. The final version of the book was analysed with Turnitin to verify that it contains no plagiarised text (a risk that can arise when asking AI tools to rephrase paragraphs).
Contributing to the book
The book is released as an open-source book under the CC-BY-SA 4.0 license. Since the intersection between AI, privacy, and security is a field of active development, update requests to the book can be submitted by opening an issue in the GitHub repository. New contributions can be made through the git mechanism of “pull requests” (or merge requests). Forks of the book will need to explicitly mention the changes that have been made, following the requirements set by the CC-BY-SA license. We encourage new training materials stemming from this book, such as slides, videos, tests, flash cards, and code examples.