
Published by Alisha McKerron on 24 September 2025 and revised on 13 October 2025
It seems there is no avoiding artificial intelligence (AI) these days. The topic comes up constantly in the news, eye-watering amounts of money are being invested in AI systems, and it touches our lives ever more: through products we use at home, on the go, and at work, and in fields such as commerce and healthcare. Some say it may make the world’s economic growth explode! So how should we privacy lawyers tackle compliance issues arising from the use of AI in all its forms? A structured technical understanding of AI and the resulting privacy challenges is a good start.
Step 1: Differentiate True AI from “Smart” AI
As a first step it is important to recognise when an AI system is actually involved. It is a common misconception that many seemingly “smart” products are powered by AI; in reality, they often rely on older, simpler technology. This is because “AI” is frequently used as a marketing buzzword to suggest a level of intelligence and adaptability that isn’t there. Take chatbots as an example:
Rule-based chatbots work on simple “if/then” logic. If you type a specific question like “What are your opening hours?”, the bot is programmed to recognise the exact phrase and provide a pre-written answer. If you ask the same question in a slightly different way (e.g., “Are you open today?”), the bot may fail to understand and simply respond with “I’m sorry, I don’t understand.” This is a rule-based system, not a learning AI, as the sketch below illustrates.
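To make this concrete, here is a minimal sketch of rule-based matching in Python; the phrases and answers are invented purely for illustration:

```python
# A rule-based chatbot is essentially a lookup table: known phrases map
# to canned answers. Nothing is learned from the conversation.
RULES = {
    "what are your opening hours?": "We are open 9am-5pm, Monday to Friday.",
    "where are you located?": "Our office is at 1 Example Street.",
}

def rule_based_reply(message: str) -> str:
    key = message.strip().lower()
    # Any phrasing not in the table falls through to the fallback.
    return RULES.get(key, "I'm sorry, I don't understand.")

print(rule_based_reply("What are your opening hours?"))  # exact match: answered
print(rule_based_reply("Are you open today?"))           # rephrased: fallback
```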
AI-powered chatbots, on the other hand, employ AI components to process user input. The system analyses a user query by identifying the user’s intent (their goal) and extracting entities (key pieces of information), even with variations in phrasing, typos, or colloquialisms. Instead of relying on a limited, pre-written library of responses, it can generate novel, human-like text on the fly. It simulates human conversation, but on the basis of complex computational and statistical models rather than actual understanding or thought.
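By contrast, here is a minimal sketch of statistical intent classification, assuming scikit-learn and a tiny invented training set (a production system would use far more data, or an LLM):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Invented training examples: utterances labelled with the intent they express.
utterances = [
    "what are your opening hours", "are you open today", "when do you close",
    "where is your office", "how do I find you", "what is your address",
]
intents = ["hours", "hours", "hours", "location", "location", "location"]

# TF-IDF turns text into numeric features; logistic regression learns
# which words signal which intent.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(utterances, intents)

# Unseen phrasing can still map to the right intent: the model generalises.
print(model.predict(["do you open on saturdays"])[0])
```

Note the difference: the rule-based bot matches strings, while this one learns a statistical mapping from words to goals.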
Step 2: Grasp the Mechanics of Core AI Models
As a next step, it is important to have a basic understanding of the core models that power AI systems: machine learning (ML) models, deep learning (DL) models, and large language models (LLMs).
ML is a broad subset of AI in which systems learn from data to make predictions or decisions without being explicitly programmed. These models are effective for well-defined tasks with structured data. There are three main types of ML model, based on how they learn: (i) supervised learning models, which are trained on labelled datasets where the correct answer is provided to the model during training; they are commonly used for tasks like email spam detection or classifying accounts by type; (ii) unsupervised learning models, which find patterns in unlabelled data without human intervention; they are used for things like customer segmentation or grouping similar data points (see the sketch below); and (iii) reinforcement learning models, which learn through trial and error, receiving rewards for desirable behaviour and penalties for undesirable behaviour; this approach is often used to teach robots or to train AI to play games.
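To illustrate the unsupervised case, here is a minimal customer-segmentation sketch using scikit-learn’s KMeans; the customer records are invented, and note that no labels are supplied:

```python
import numpy as np
from sklearn.cluster import KMeans

# Invented customer records: [annual spend in £k, visits per month].
customers = np.array([
    [1.2, 1], [0.8, 2], [1.5, 1],      # low spend, infrequent visitors
    [9.5, 8], [11.0, 10], [10.2, 9],   # high spend, frequent visitors
])

# No labels are given: KMeans groups the points purely by similarity.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(customers)
print(kmeans.labels_)  # e.g. [0 0 0 1 1 1]: two discovered segments
```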
Deep learning is an advanced subset of ML that uses artificial neural networks with multiple layers to process complex, unstructured data like images, video, and text. The layered structure allows these networks to learn complex features from data automatically, with minimal human intervention. Two common types of network are: (i) convolutional neural networks (CNNs), a specialised type of deep learning architecture particularly well suited to analysing visual data, used in tasks like image classification, object detection, and facial recognition; and (ii) recurrent neural networks (RNNs), which are designed for sequential data, like time series or natural language, because they have a form of “memory” that allows them to process information in sequence.
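As a rough illustration of that layered structure, here is a minimal, untrained CNN sketch assuming PyTorch; the dimensions and class count are made up:

```python
import torch
import torch.nn as nn

# Each layer learns progressively more abstract visual features:
# edges, then textures, then shapes, then whole objects.
cnn = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),   # layer 1: low-level features
    nn.ReLU(),
    nn.MaxPool2d(2),                              # downsample 32x32 -> 16x16
    nn.Conv2d(16, 32, kernel_size=3, padding=1),  # layer 2: higher-level features
    nn.ReLU(),
    nn.MaxPool2d(2),                              # downsample 16x16 -> 8x8
    nn.Flatten(),
    nn.Linear(32 * 8 * 8, 2),                     # score two classes, e.g. cat/dog
)

image = torch.randn(1, 3, 32, 32)  # one random 32x32 RGB "image"
print(cnn(image).shape)            # torch.Size([1, 2]): two class scores
```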
LLMs are cutting-edge deep learning models pre-trained on vast amounts of text data from the internet. They use a transformer architecture, which allows them to track context and the relationships between words across a text. LLMs are incredibly flexible and are primarily used for: (i) generative AI, which creates new content such as text, code, or images; and (ii) natural language processing (NLP), for “understanding”, summarising, translating, and classifying text.
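A minimal sketch of both uses, assuming the Hugging Face transformers library (the models download on first run, and the exact output will vary):

```python
from transformers import pipeline

# Generative use: continue a prompt with novel, human-like text.
generator = pipeline("text-generation", model="gpt2")
result = generator("AI compliance matters because", max_new_tokens=20)
print(result[0]["generated_text"])

# NLP use: condense a longer passage into a short summary.
summariser = pipeline("summarization")
text = ("The GDPR is a regulation in EU law on data protection and privacy "
        "for all individuals within the European Union and the European "
        "Economic Area. It also addresses the transfer of personal data "
        "outside the EU and EEA areas.")
print(summariser(text, max_length=25, min_length=5)[0]["summary_text"])
```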
Step 3: Deconstruct the AI System Architecture
Before assessing risk, you must understand the individual components of the AI system your organisation uses and where the intelligence resides.
Continuing with the AI chatbot example, the components would include: a user interface, a natural language processing (NLP) engine, a dialogue manager, a knowledge base/database, and a natural language generation (NLG) engine. The user interface (the part of the system the user sees, such as a chat window or a voice interface) and the knowledge base/database (where the chatbot’s information is stored) do not use AI. But the other components do, or may do.
The NLP engine, which interprets human language by identifying the user’s intent (their goal) and extracting key entities (important pieces of information) from a query, uses ML. The NLG engine, which generates a human-like response from the data provided by the other components, uses DL and an LLM. The dialogue manager, which uses the information from the NLP engine to decide how the chatbot should respond, may or may not use AI, depending on how sophisticated it is.
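To see where the intelligence resides, here is a schematic sketch of that pipeline; every function is a stand-in with invented names and hard-coded values, marked with whether the real component would use AI:

```python
# Schematic chatbot pipeline: the comments mark which components use AI.

def nlp_engine(query: str) -> dict:
    """Uses ML: classifies the user's intent and extracts entities."""
    return {"intent": "opening_hours", "entities": {"day": "today"}}  # stubbed

def lookup_knowledge_base(intent: str) -> str:
    """No AI: plain data retrieval from a store of facts."""
    return "9am-5pm, Monday to Friday"  # stubbed

def dialogue_manager(parsed: dict) -> str:
    """May or may not use AI: decides what the bot should do next."""
    return lookup_knowledge_base(parsed["intent"])

def nlg_engine(facts: str) -> str:
    """Uses DL/an LLM: turns raw facts into a human-like reply."""
    return f"We're open {facts}. Hope to see you soon!"  # stubbed

# User interface (no AI): passes the message in and the reply out.
print(nlg_engine(dialogue_manager(nlp_engine("Are you open today?"))))
```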
Step 4: Map AI Model Type to GDPR Remediation Difficulty
A critical insight for privacy lawyers is that a model’s architectural complexity, coupled with its data processing methodology, determines both the level of GDPR compliance risk and the difficulty of remediation.
| Model Type | Compliance Challenges |
| --- | --- |
| Simpler ML | Trained on smaller, proprietary, and often labelled data. The model focuses on content, not identity, allowing for anonymisation/pseudonymisation and training within a secure, controlled environment. Remediation is manageable through data minimisation and strict access controls. |
| Massive LLM | Trained on a huge, diverse internet corpus that inevitably contains personal data. This creates a risk of verbatim memorisation and potential leakage of sensitive training data via prompts, violating the GDPR’s core principles of purpose limitation and storage limitation. Honouring the right to erasure and establishing clear data lineage are nearly impossible. |
Step 5: Assess AI Delivery Method for Data Control
The company’s chosen delivery method dictates its level of control and visibility over customer data, which is essential for defining the Controller/Processor relationship.
| Delivery Method | Control and Visibility | Primary Compliance Risk |
| --- | --- | --- |
| Software-as-a-Service (SaaS) | Low. The business (Data Controller) is reliant on the provider (Data Processor) for security and data handling within the provider’s system. | Data security, international data transfers, and the provider potentially using customer data to train public LLMs. A strong DPA is required. |
| API Integration | Medium. The business has more direct control over the data being sent via the API call. | The business must practise data minimisation (redacting personal data before sending; see the sketch after this table) and manage third-party data transfers. |
| On-Premises Deployment | High. The business is the sole Data Controller and Processor; data never leaves the company’s data centre. | Most GDPR-compliant. The main risk is internal security protocol failures. |
| Embedded AI in Hardware | High (Privacy by Design). Processing occurs locally on the device’s chip; data is minimised before any cloud transfer. | The business must be transparent about any data collected and sent to the cloud, requiring explicit consent for cloud processing. |
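On the API-integration row, here is a minimal sketch of redacting obvious personal data before a prompt leaves the business; the regular expressions are illustrative and catch only simple patterns, so a real deployment would need a proper PII-detection tool:

```python
import re

# Illustrative patterns only: real PII detection is far harder than this.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "phone": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}

def redact(text: str) -> str:
    """Replace matches with placeholders before sending text to a provider."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label.upper()} REDACTED]", text)
    return text

prompt = "Customer jane.doe@example.com (+44 20 7946 0958) wants a refund."
print(redact(prompt))
# -> Customer [EMAIL REDACTED] ([PHONE REDACTED]) wants a refund.
```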
Step 6: Familiarise Yourself with Applicable Laws
The final step is to stay current with the rapidly evolving regulatory landscape. New and revised local and extraterritorial laws relating to AI, such as the EU’s AI Act, are emerging constantly. Prioritising compliance efforts based on your company’s sector, jurisdiction, and role (Controller/Processor) is vital for building the necessary governance framework, a crucial topic I shall tackle in my next blog post.