Episode 36 — Encrypt AI Data Correctly: In Transit, At Rest, and In Use

In this episode, we’re going to build a clear, beginner-friendly picture of encryption in AI systems, because it is easy to treat encryption like a box you check while missing the places where it actually matters. Encryption is simply the practice of turning data into a form that cannot be read without the right key. The reason it matters for AI is that AI pipelines move data around more than people realize. Data is collected, stored, cleaned, enriched, retrieved, sent to a model, logged for monitoring, and sometimes shared across services and teams. Every movement and every storage location is a chance for exposure. Encryption is one of the strongest tools we have to reduce that exposure, but only when it is applied correctly and when the key management around it is solid. The goal here is to understand three contexts of encryption that you will hear constantly: encryption in transit, encryption at rest, and encryption in use, and to learn what each one protects, what it does not protect, and how beginners can think about doing it responsibly.

Before we continue, a quick note: this audio course is a companion to our course companion books. The first book focuses on the exam and explains in detail how best to prepare for and pass it. The second book is a Kindle-only eBook that contains 1,000 flashcards that can be used on your mobile device or Kindle. Check them both out at Cyber Author dot me, in the Bare Metal Study Guides Series.

Encryption in transit is the most familiar concept because it is about protecting data as it travels across networks. When you send data from a client to a server, or from one service to another, encryption in transit prevents eavesdroppers from reading that traffic. In AI systems, transit happens everywhere. Telemetry might travel from endpoints to a central store, then to a processing service, then to a retrieval system, then to a model service, then back to an application. If any link in that chain is unencrypted, someone who can observe the traffic might capture sensitive data. Even if you encrypt most links, one weak link can leak the entire payload. Beginners should learn that encryption in transit is not only for public internet traffic. It is also for internal service-to-service communication, because internal networks can be monitored, misconfigured, or compromised. A safe mindset is to encrypt all data in motion by default, regardless of whether it is internal or external.
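To make the "encrypt all data in motion by default" mindset concrete, here is a minimal Python sketch using only the standard library. It shows two small habits: building a TLS context that verifies certificates, and failing closed when a pipeline URL is not HTTPS. The URL in the usage note is a hypothetical example, not a real service.

```python
import ssl
from urllib.parse import urlparse

def make_tls_context() -> ssl.SSLContext:
    # create_default_context() verifies the server's certificate chain
    # and hostname by default; we assert that nothing has weakened it.
    ctx = ssl.create_default_context()
    assert ctx.verify_mode == ssl.CERT_REQUIRED
    assert ctx.check_hostname is True
    return ctx

def require_encrypted(url: str) -> str:
    # Fail closed: refuse to send pipeline traffic over a plaintext link,
    # whether the destination is internal or external.
    if urlparse(url).scheme != "https":
        raise ValueError(f"refusing unencrypted transport: {url}")
    return url
```

A call such as `require_encrypted("http://feature-store.internal/v1")` would raise immediately, which is exactly the behavior you want: one unencrypted link should be a loud error, not a silent leak.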

A practical issue with encryption in transit is authenticity, because encryption without identity checks can still be unsafe. You want to know you are talking to the right service, not just any service that can accept a connection. This is why encrypted protocols often include certificate-based verification, which helps confirm the other party’s identity. For beginners, the essential idea is that you do not want a man-in-the-middle, where an attacker intercepts and relays traffic while pretending to be the legitimate service. In AI workflows, this matters because sensitive prompts, retrieved documents, and model outputs can all be intercepted if identity checks are weak. A well-designed system uses encryption in transit along with proper authentication between services, so the pipeline not only hides data from observers but also reduces the chance of sending data to the wrong place.
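One concrete way a system can strengthen identity checks beyond default certificate validation is certificate pinning: comparing a fingerprint of the peer's certificate against a value you obtained out of band. The sketch below is a simplified illustration of that idea, assuming you already hold the peer's DER-encoded certificate bytes; it is not a complete TLS integration.

```python
import hashlib
import hmac

def fingerprint(der_cert: bytes) -> str:
    # SHA-256 fingerprint of the peer's DER-encoded certificate.
    return hashlib.sha256(der_cert).hexdigest()

def verify_pin(der_cert: bytes, expected: str) -> bool:
    # compare_digest avoids leaking, via timing, how much of the
    # fingerprint matched before the comparison failed.
    return hmac.compare_digest(fingerprint(der_cert), expected)
```

If `verify_pin` returns False, the connection should be refused even though the handshake succeeded, because you may be talking to a man-in-the-middle rather than the legitimate service.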

Encryption at rest is about protecting data while it is stored. If someone gains access to the storage medium, such as a disk, a database snapshot, or a backup, encryption at rest reduces the chance they can read the content. This is particularly important in AI systems because data often sits in multiple stores: raw data vaults, processed feature stores, retrieval indexes, prompt logs, model training files, and evaluation datasets. Even if the main database is encrypted, a copied export or a forgotten cache might not be. Encryption at rest is about making sure that if storage is stolen, copied, or accessed improperly, the raw content is still protected. However, beginners should understand the limitation: encryption at rest does not protect against someone who already has legitimate access through the running system. If an attacker compromises an account that can query the database, encryption at rest does not stop them from reading data because the system decrypts it for authorized access. That is why encryption is important, but it is not a substitute for access control.
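To see what "stored content is unreadable without the key" means mechanically, here is a deliberately toy stream cipher built from SHA-256 in the standard library. This is for illustration only and is not secure; real systems should use an authenticated cipher such as AES-GCM from a vetted cryptography library, never hand-rolled constructions like this one.

```python
import hashlib

def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Toy keystream: SHA-256(key || nonce || counter), block by block.
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def toy_encrypt(key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    # XOR the plaintext with the keystream; without the key, the
    # stored bytes carry no readable content.
    ks = _keystream(key, nonce, len(plaintext))
    return bytes(a ^ b for a, b in zip(plaintext, ks))

# XOR stream ciphers are symmetric: decrypting repeats the same XOR.
toy_decrypt = toy_encrypt
```

Notice what this does and does not give you: the ciphertext on disk is unreadable, but any process that legitimately holds the key can decrypt it, which is precisely why encryption at rest does not replace access control.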

Key management is the heart of encryption at rest and in transit, and it is also where beginners can get tripped up. Encryption is only as strong as the protection of the keys. If keys are stored in plain text in configuration files, or shared widely, or reused across environments, the encryption might look good on paper but provide little real safety. A healthier approach is to treat keys as highly sensitive secrets, to restrict who and what can use them, and to rotate them regularly so a compromise does not last forever. In an AI pipeline, you also want separation of duties, meaning the people who manage infrastructure should not automatically have access to the data keys, and the services that process data should only get the minimum key access they need. This connects back to least privilege and to data minimization. If you minimize who can decrypt, you minimize who can leak.
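The bookkeeping side of rotation can be sketched in a few lines. The toy key ring below (a hypothetical structure, not a real key-management API) keeps versioned data keys: new writes use the latest version, older ciphertext stays decryptable until it is re-encrypted, and only then is the retired key destroyed.

```python
import os

class ToyKeyRing:
    # Versioned data keys: rotation creates a new version for new
    # writes, while old versions remain until no ciphertext needs them.
    def __init__(self) -> None:
        self.versions: dict[int, bytes] = {}
        self.current = 0
        self.rotate()

    def rotate(self) -> int:
        self.current += 1
        self.versions[self.current] = os.urandom(32)  # fresh 256-bit key
        return self.current

    def key_for(self, version: int) -> bytes:
        return self.versions[version]

    def retire(self, version: int) -> None:
        # Only safe once no stored ciphertext references this version.
        del self.versions[version]
```

In a real deployment the keys themselves would live in a hardware security module or managed key service, and this object would hold only references, which is one way to keep infrastructure operators from ever touching raw key material.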

Now we get to encryption in use, which is the most misunderstood of the three and is increasingly important in AI. Encryption in use refers to protecting data while it is being processed, not just while it is traveling or sitting on disk. In a traditional system, data is typically decrypted in memory so it can be processed, which means a compromise of the running system can expose it. Encryption in use aims to reduce that exposure by using techniques that protect memory or ensure processing occurs in a trusted environment. For beginners, you can think of this as trying to keep data protected even when it is actively being used, which is hard but valuable. In AI systems, this matters because prompts and retrieved documents are often assembled in memory, and model services may handle sensitive inputs. If you are sending sensitive data to a model hosted in another environment, encryption in use concepts become part of how you decide whether that environment is trustworthy.

A practical way to think about encryption in use is that it is often achieved through trusted execution environments, hardware-backed protections, or other methods that reduce who can see data while it is processed. You do not need to memorize the specific technologies for this certification level. The key idea is that there is a gap between encrypted at rest and encrypted in transit, because the moment you process data, you usually decrypt it somewhere. That decrypted moment is the exposure window. If your threat model includes insiders, compromised hosts, or shared infrastructure risks, you care about shrinking that exposure window. Some systems also combine encryption with tokenization or de-identification so the model never sees the most sensitive data in the first place. In many real deployments, the most practical approach is not perfect encryption in use, but reducing sensitivity through minimization and de-identification, then using strong encryption for everything else.
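The tokenization idea in that last sentence can be sketched concretely. In this toy example, sensitive values are swapped for opaque tokens before any prompt is assembled, and the token-to-value mapping lives in a vault the model never sees. The class and its behavior are illustrative assumptions, not a production design.

```python
import hashlib
import hmac
import secrets

class ToyTokenizer:
    # Replaces sensitive values with opaque tokens before data reaches
    # a model; the mapping stays in a vault on the trusted side.
    def __init__(self) -> None:
        self._vault: dict[str, str] = {}   # token -> original value
        self._secret = secrets.token_bytes(32)

    def tokenize(self, value: str) -> str:
        # Keyed HMAC makes tokens deterministic per value, so repeated
        # mentions of the same value map to the same token.
        digest = hmac.new(self._secret, value.encode(), hashlib.sha256)
        token = "tok_" + digest.hexdigest()[:16]
        self._vault[token] = value
        return token

    def detokenize(self, token: str) -> str:
        return self._vault[token]
```

The model only ever sees strings like `tok_3fa9...`, so even if its inputs are logged or leaked, the most sensitive values were never present to begin with, which is the minimization point made above.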

Another beginner misunderstanding is the idea that if a provider says they encrypt, you are done. Encryption claims are only meaningful when you know what is encrypted, where, under whose keys, and under what access conditions. For example, a service might encrypt data at rest with keys managed by the provider, which can be acceptable in many cases but may not meet stricter requirements where you need customer-managed keys. A service might encrypt in transit but still log payloads in plaintext for debugging. A service might encrypt storage but allow broad access to decrypted data through internal roles. So encryption must be evaluated as part of a full control set, including access boundaries, logging discipline, retention rules, and incident response. In AI systems, you should be especially cautious about prompt logs and telemetry, because those are places where sensitive inputs and outputs can be stored even if the main database is well encrypted.

Encryption also interacts with integrity and provenance in important ways. Encryption alone does not tell you whether data was altered. You can encrypt a file and still have no idea whether it was modified before encryption or after decryption. That is why encryption is often paired with integrity checks like hashing and signing. When you sign an encrypted artifact, you can verify that it has not been tampered with. When you encrypt a signed artifact, you protect confidentiality. In AI pipelines, combining confidentiality and integrity gives you stronger assurance that data is both private and trustworthy. This matters for training datasets and model artifacts, because a tampered dataset can poison a model even if it was encrypted during storage. Encryption protects against reading, but not against malicious modification by someone with access.
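The hashing-and-signing pairing described above can be shown with the standard library's HMAC support. This sketch tags an artifact, such as a training shard or model file, with a keyed integrity tag and verifies it later; it demonstrates the integrity half only, with a symmetric key standing in for a full digital-signature scheme.

```python
import hashlib
import hmac

def tag_artifact(key: bytes, artifact: bytes) -> str:
    # HMAC-SHA256: anyone holding the key can detect tampering, because
    # changing even one byte of the artifact changes the tag.
    return hmac.new(key, artifact, hashlib.sha256).hexdigest()

def verify_artifact(key: bytes, artifact: bytes, tag: str) -> bool:
    # compare_digest resists timing attacks on the comparison itself.
    return hmac.compare_digest(tag_artifact(key, artifact), tag)
```

Encrypting the same artifact would hide its contents, but only a check like this tells you the dataset you are about to train on is the dataset you tagged, which is the provenance guarantee encryption alone cannot give.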

It is also important to recognize that encryption can create operational challenges, and those challenges can lead to unsafe shortcuts if teams are not careful. If encryption makes debugging hard, someone might disable it temporarily and forget to re-enable it. If key rotation is painful, teams may avoid rotating keys for years. If access policies are too complex, someone might grant broad permissions to make things work. Responsible encryption design includes usability, because controls that are too hard to operate tend to be bypassed. In security, we prefer controls that are strong and operationally realistic. That means automation for key rotation, clear separation between environments, and careful handling of development and testing so real sensitive data is not casually used in lower-security contexts.

By the end of this episode, the key takeaway should feel concrete: encryption is about protecting confidentiality across the whole AI data lifecycle, but it must be applied in the right places and supported by good key management. Encryption in transit protects data while it moves, but must include proper identity checks. Encryption at rest protects stored data and backups, but does not replace access control because authorized access still decrypts data. Encryption in use is about shrinking the window of exposure during processing, often by using trusted processing environments or by reducing sensitivity before processing. When you combine these with minimization, de-identification, integrity checks, and disciplined logging, you create an AI pipeline where sensitive data is less likely to be exposed even when things go wrong. That is the practical standard you should aim for in SecAI work: not one magic control, but a set of controls that work together to reduce risk throughout the system.
