GDPR, Data Residency & LLMs: A Practical Guide for Dev Teams

Under GDPR, where inference runs is a compliance question, not just a technical preference. If personal data passes through an LLM prompt (and more often than teams realize, it does), the inference provider is subject to the same framework as any other data processor, and the jurisdiction it runs in becomes part of your compliance analysis. This post explains the relevant GDPR principles, what they mean for LLM tooling, and what a practical compliance checklist looks like.

This post is for informational purposes only and does not constitute legal advice. GDPR compliance involves fact-specific legal analysis. Consult a qualified legal professional for guidance specific to your organization.

Why GDPR cares about where inference runs

GDPR's core principle is that personal data of people in the EU/EEA must be handled under the GDPR framework regardless of where the processing happens. Article 3 sets the territorial scope: GDPR applies to organizations established in the EU/EEA, and also to organizations based elsewhere when they offer goods or services to people in the EU/EEA or monitor their behavior there.

The key GDPR concepts in play for LLM inference:

Lawful basis. Article 6 requires that any processing of personal data has a lawful basis, most commonly legitimate interests, contract performance, or consent. Using an LLM to process data that includes personal data triggers this requirement. The lawful basis question is separate from the residency question, but you need both.

Data processor obligations. When you use a third-party API (an LLM provider) to process personal data, that provider is a data processor under GDPR. Article 28 requires a Data Processing Agreement (DPA) between you (the controller) and the processor. Many enterprise-tier LLM API offerings provide DPAs; consumer-facing products often don't. Check what tier you're on.

International transfers. Chapter V (Articles 44 to 50) governs transfers of personal data to countries outside the EEA. Sending data to a server in the US requires a transfer mechanism: an adequacy decision under Article 45 where one applies, or otherwise appropriate safeguards under Article 46, typically Standard Contractual Clauses (SCCs). Transfers that rely on SCCs also call for a Transfer Impact Assessment in most interpretations following the Schrems II ruling.

The inference location matters because it determines which transfer mechanisms you need and what level of protection the data has once it arrives. These are general principles; the specific analysis for your data and provider requires legal review.

Source code, personal data, and prompts

Teams often don't think carefully about whether the code in LLM prompts contains personal data. The answer is sometimes yes, and it's worth checking.

Code itself is typically not personal data: a function definition doesn't identify anyone. But code that's part of a developer's active work often includes adjacent material that can be:

Hardcoded test data with real names, emails, or identifiers that a developer pasted in for debugging
Database schema comments that reference fields storing personal data, sometimes with example values
Log snippets in context from debugging sessions, which may contain identifiers or behavioral data
Configuration that references users, such as user IDs, account identifiers, or email addresses in environment variable examples

Additionally, if your coding tool's context window includes open files from your IDE, and those files are data fixtures, migration scripts with sample data, or system logs, personal data can enter the prompt stream without the developer actively intending to send it.

This is worth raising in your team's data handling review, not because it's inevitable, but because "we never send personal data to this API" is an assumption that deserves to be tested against how the tool actually works in practice.
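One way to test that assumption is to scan outgoing prompt text before it leaves the developer's machine. The following is a minimal sketch, not a complete detector: the function name and both patterns are illustrative assumptions, and regular expressions will miss many forms of personal data (names, free-text identifiers) while also flagging harmless strings such as version numbers.

```python
import re

# Illustrative patterns only (assumptions, not an exhaustive list).
# Real personal data takes many more forms than these two.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def find_possible_personal_data(prompt: str) -> list[tuple[str, str]]:
    """Return (kind, matched_text) pairs for strings that look like personal data."""
    hits = []
    for kind, pattern in PATTERNS.items():
        for match in pattern.finditer(prompt):
            hits.append((kind, match.group(0)))
    return hits


if __name__ == "__main__":
    sample = "user = 'jane.doe@example.com'  # seen from 10.0.0.12"
    for kind, text in find_possible_personal_data(sample):
        print(f"possible {kind}: {text}")
```

A check like this is best treated as a tripwire that prompts a human review, not as evidence that a prompt is free of personal data.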

Transfers outside the EEA

For data originating in the EU/EEA, sending it to a processor outside the EEA requires a transfer mechanism under Chapter V of GDPR. The main options in practice:

Adequacy decisions. The European Commission has determined that certain countries provide adequate protection. The UK is covered by EU adequacy decisions (though these are subject to ongoing review and renewal). The US has the EU-US Data Privacy Framework (DPF), adopted in 2023 after Schrems II invalidated Privacy Shield. The DPF covers only transfers to US organizations that have self-certified under it, and its legal robustness continues to be analyzed and litigated. Japan has had an adequacy decision since 2019, covering transfers to businesses subject to Japan's data protection law. Australia does not currently have an adequacy decision.

Standard Contractual Clauses (SCCs). The standard mechanism when no adequacy decision covers the transfer. SCCs are model contracts approved by the European Commission that create a contractual framework obligating the data importer to handle data in line with GDPR principles. SCCs need to be supplemented by a Transfer Impact Assessment (TIA) that evaluates whether the legal framework in the destination country undermines the SCCs in practice, a requirement that arose from the Schrems II decision.

What this means practically: If you're sending personal data to an LLM API running in the US, you need to verify that the provider either is certified under the EU-US DPF or provides SCCs as part of their enterprise/API tier. If inference runs in a country without an EU adequacy decision and the provider offers no SCCs, the transfer would typically require a more complex legal analysis.
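The decision logic described in this section can be written down as a small lookup so that it lives in version control next to your tooling config. This is a sketch under stated assumptions: the region codes, the `Mechanism` names, and the table itself are hypothetical, and the table is an illustrative snapshot that your legal team would need to confirm and keep current. Regions not listed fall back to the most demanding path until someone has verified otherwise.

```python
from enum import Enum


class Mechanism(Enum):
    NONE_REQUIRED = "inside the EEA, no Chapter V transfer"
    ADEQUACY = "adequacy decision"
    DPF_OR_SCCS = "provider DPF certification, otherwise SCCs plus TIA"
    SCCS_AND_TIA = "SCCs plus Transfer Impact Assessment"


# Illustrative snapshot only (an assumption for this sketch).
# Adequacy status changes over time and must be verified before use.
REGION_MECHANISM = {
    "DE": Mechanism.NONE_REQUIRED,
    "UK": Mechanism.ADEQUACY,
    "US": Mechanism.DPF_OR_SCCS,
}


def mechanism_for(region: str) -> Mechanism:
    """Look up the transfer mechanism to verify for an inference region."""
    # Unknown regions default to the most demanding path.
    return REGION_MECHANISM.get(region.upper(), Mechanism.SCCS_AND_TIA)


if __name__ == "__main__":
    for region in ("DE", "UK", "US", "AU"):
        print(region, "->", mechanism_for(region).value)
```

Defaulting unknown regions to the strictest option means a newly added inference location cannot silently inherit a lighter requirement.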

This is a factual description of the legal framework, not legal conclusions about any specific provider or transfer scenario. Your organization's legal team should review the specific provider's documentation and your use case.

A practical compliance checklist for dev teams

This checklist is intended to help you identify the right questions to ask, not to replace legal analysis.

Before adopting an LLM API tool:

Does any personal data of EU/EEA residents appear in prompts? If yes, full GDPR analysis is required.
Where does the provider run inference? Is this documented and verifiable?
Is the provider prepared to sign a DPA? Does the API tier you're using include one, or is it only available at enterprise tier?
What transfer mechanism covers the transfer to the inference jurisdiction? Adequacy decision, SCCs, or other?
What data retention policies apply to prompts and responses? Is there an API-tier log purge SLA?
Has your legal or compliance team reviewed the tool against your obligations to data subjects?
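If it helps to track these answers alongside your code, the pre-adoption questions can be captured as a small record. This is a minimal sketch: the class and field names are hypothetical, and each field simply mirrors one of the questions above. A value of `None` means the question has not been answered yet.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class ProviderAssessment:
    """One record per LLM provider; None means 'not yet answered'."""
    personal_data_in_prompts: Optional[bool] = None
    inference_location_documented: Optional[bool] = None
    dpa_available_on_our_tier: Optional[bool] = None
    transfer_mechanism_identified: Optional[bool] = None
    retention_policy_reviewed: Optional[bool] = None
    legal_review_completed: Optional[bool] = None


def open_questions(assessment: ProviderAssessment) -> list[str]:
    """Return the names of checklist items that still have no answer."""
    return [
        f.name for f in fields(assessment)
        if getattr(assessment, f.name) is None
    ]


if __name__ == "__main__":
    draft = ProviderAssessment(
        personal_data_in_prompts=True,
        dpa_available_on_our_tier=True,
    )
    print("Still open:", open_questions(draft))
```

The record only shows which questions remain open; it does not decide whether the answers are acceptable, which remains a matter for legal review.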

Ongoing operations:

Are developers aware of what categories of data should not appear in LLM prompts?
Is there a technical or policy control to reduce the likelihood of accidental personal data ingestion (e.g., prohibiting use of the tool with live data files in scope)?
Are you maintaining records of processing activities (Article 30) that include this tool as a processing activity?
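As an example of a technical control for accidental ingestion, a tool wrapper could drop risky file types before they reach the context window. The sketch below assumes you can intercept the list of candidate file paths, which depends on your tool. The deny-list patterns and function names are hypothetical and should be adapted to your repository layout.

```python
from fnmatch import fnmatchcase

# Hypothetical deny-list: file types that commonly carry personal data.
EXCLUDED_PATTERNS = [
    "*.log",
    "*.sql",
    "*.env",
    "*/fixtures/*",
    "*/migrations/*",
]


def allowed_in_context(path: str) -> bool:
    """Return False if the path matches any excluded pattern."""
    # Normalize separators and add a leading slash so that patterns such
    # as "*/fixtures/*" also match a top-level "fixtures/" directory.
    normalized = "/" + path.replace("\\", "/").lstrip("/")
    return not any(
        fnmatchcase(normalized, pattern) for pattern in EXCLUDED_PATTERNS
    )


def filter_context_files(paths: list[str]) -> list[str]:
    """Keep only the files that may be sent as prompt context."""
    return [p for p in paths if allowed_in_context(p)]


if __name__ == "__main__":
    candidates = [
        "src/app.py",
        "tests/fixtures/users.json",
        "logs/server.log",
        "db/migrations/0001_init.py",
    ]
    print(filter_context_files(candidates))
```

A path filter reduces the likelihood of accidental ingestion but does not catch personal data pasted into ordinary source files, so it works best alongside developer guidance.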

For teams in regulated industries:

Does your industry-specific regulation (HIPAA for health data, PSD2 for payments, etc.) impose additional controls beyond GDPR?
If you process data of children, have you applied the higher protections required under GDPR Article 8?

Choosing tooling that supports residency

The cleanest GDPR story for EU-based teams using LLM coding tools is inference that runs inside the EEA or in a country with an adequacy decision, combined with a DPA from the provider.

Germany, for example, stays within the EEA: no international transfer mechanism is required, and GDPR applies directly to the processing.

The UK is covered by the current EU-UK adequacy decisions, though these are subject to renewal. For the US, the provider must be certified under the DPF, or you need SCCs. Jurisdictions without an adequacy decision require a more demanding analysis.

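EEA inference avoids the Chapter V transfer question entirely. UK inference is still a transfer to a third country, but it is covered by the adequacy decisions, so no additional safeguards such as SCCs are needed while those decisions remain in force. Either option simplifies the compliance posture.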

Sota runs inference for GLM-5.2 and Kimi K2.7 Code on Cloudflare's global network, with nodes in Germany, the UK, the US, Japan, and Australia. For teams that want EU-resident inference specifically, Germany is an available option. This does not substitute for a DPA or legal review, but it addresses the residency prerequisite. See our AI inference data residency guide for a full breakdown of how to evaluate inference location. For the broader risk picture around sending code to overseas APIs, see our guide to overseas LLM API risk. And for a systematic approach to data sovereignty across your whole AI tooling stack, see data sovereignty for AI coding tools.

Disclaimer and getting started

This post is provided for informational purposes only and does not constitute legal advice. GDPR compliance analysis is fact-specific and depends on the exact data being processed, the legal basis for processing, the provider's contractual terms, and your organization's regulatory context. Nothing in this post should be relied on as legal conclusions about any specific provider, data flow, or compliance posture. Consult a qualified data protection legal professional before making compliance determinations.

If your team is looking for LLM inference that runs within the EEA or in comparable Western jurisdictions, Sota provides access to GLM-5.2 and Kimi K2.7 Code on Cloudflare's network (Germany, UK, US, Japan, and Australia) with an OpenAI-compatible API and flat-rate per-user pricing.

Get started with Sota to access frontier open-weight models on Western infrastructure, with inference in the EU (Germany), UK, or US from $25/month per user.