<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://tomclancy.me/feed.xml" rel="self" type="application/atom+xml" /><link href="https://tomclancy.me/" rel="alternate" type="text/html" /><updated>2025-10-28T10:44:17+00:00</updated><id>https://tomclancy.me/feed.xml</id><title type="html">Tom Clancy</title><subtitle>Melbourne-based Software Engineer with expertise in cloud infrastructure, spatial data systems, and telecommunications solutions</subtitle><entry><title type="html">Why ‘AES-256 Encryption’ Tells You Almost Nothing About Data Security</title><link href="https://tomclancy.me/2025/10/28/why-aes-256-encryption-tells-you-almost-nothing-about-data-security.html" rel="alternate" type="text/html" title="Why ‘AES-256 Encryption’ Tells You Almost Nothing About Data Security" /><published>2025-10-28T07:00:00+00:00</published><updated>2025-10-28T07:00:00+00:00</updated><id>https://tomclancy.me/2025/10/28/why-aes-256-encryption-tells-you-almost-nothing-about-data-security</id><content type="html" xml:base="https://tomclancy.me/2025/10/28/why-aes-256-encryption-tells-you-almost-nothing-about-data-security.html"><![CDATA[<p>It’s common to see companies attesting that their customers’ data is secure because they use “AES-256” encryption. Some might go as far as saying that they use “military-grade” encryption, which conveys absolutely no information because it is not a well-defined term. Which military? The term “AES-256” is better defined, but is still not precise enough. The at-rest encryption of data is also only a small component of a service’s security posture.</p>

<h2 id="symmetric-encryption">Symmetric Encryption</h2>

<p>Symmetric encryption is a type of encryption that uses the same key to encrypt and decrypt the data. This is in contrast to public-key cryptography (asymmetric encryption), where a pair of keys is derived such that one can be used for encryption (the public key) and the other for decryption (the private key). Despite this useful property, asymmetric encryption is generally slower and provides less defence against brute-force attacks for keys of the same size as an equivalent symmetric algorithm.</p>

<p>For these reasons, symmetric encryption is <em>usually</em> preferred for at-rest encryption of data (e.g. data volumes or backups). A single secret key is used for both encryption and decryption.</p>
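<p>To make the “same key both ways” property concrete, here is a toy sketch in Python. The <code>xor_cipher</code> function is a made-up stand-in for illustration only - XOR with a repeating key is trivially breakable and nothing like AES - but it shows the defining feature of symmetric encryption: one secret key, and the same operation, both encrypts and decrypts.</p>

```python
import os

def xor_cipher(key: bytes, data: bytes) -> bytes:
    # Toy "cipher": XOR each byte of the data with the key, repeating
    # the key as needed. Illustration only -- trivially breakable.
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

key = os.urandom(32)  # a 256-bit key, the size AES-256 would use

ciphertext = xor_cipher(key, b"attack at dawn")
plaintext = xor_cipher(key, ciphertext)  # same key, same operation

assert plaintext == b"attack at dawn"
```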

<h2 id="block-ciphers">Block Ciphers</h2>

<p>Far and away the most common set of ciphers (algorithms) used for at-rest encryption comes from the Advanced Encryption Standard (AES), which was defined by the US National Institute of Standards and Technology (NIST) in 2001. NIST held a competition and invited cryptographers to submit their algorithms. The winning submission was the Rijndael block cipher, submitted by a pair of Belgian researchers, Vincent <strong>Rij</strong>men and Joan <strong>Dae</strong>men.</p>

<p>The AES ciphers are a form of block cipher: a cipher that operates on data of a fixed size. For AES, this block size is 128 bits. So where does the <em>-256</em> in AES-256 come from? It refers to the size of the key: <strong>AES-256</strong> uses 256-bit keys.</p>

<p>This property of block ciphers presents an immediate challenge. If the cipher only supports a block size of 128 bits, how is the algorithm used to encrypt data that is many orders of magnitude larger? Naturally, the answer is to loop over the data and apply the cipher repeatedly in 128-bit chunks until the entire payload has been encrypted. The simplest (naive) approach is something like:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">encrypted</span> <span class="o">=</span> <span class="p">[]</span>
<span class="c1"># apply the cipher to each 16-byte (128-bit) block in turn</span>
<span class="k">for</span> <span class="n">chunk</span> <span class="ow">in</span> <span class="n">blocks</span><span class="p">(</span><span class="n">plaintext</span><span class="p">,</span> <span class="mi">16</span><span class="p">):</span>
 <span class="n">encrypted</span> <span class="o">+=</span> <span class="n">AES256</span><span class="p">(</span><span class="n">privatekey</span><span class="p">,</span> <span class="n">chunk</span><span class="p">)</span>
</code></pre></div></div>

<p>This loop, describing how a block cipher is applied in chunks, is referred to as a <em>mode of operation</em>. This particular mode of operation is known as Electronic Codebook (ECB) mode. If you’re not familiar with block ciphers, this approach might seem reasonable, but despite the strength of the AES-256 cipher, the overall implementation is terribly insecure. This mode of operation has the property that two chunks with the same plaintext will produce the same encrypted output. A great example of why this is a bad idea is provided below ([source]). When looking at an individual encrypted pixel, it’s impossible to tell what the input is. But zooming out and seeing the image as a whole reveals very clear patterns in the input data.</p>

<p><img src="/images/An-example-of-image-encrypted-using-ECB-mode.png" alt="ECB Penguin" /></p>
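<p>This repeated-block leak is easy to reproduce. In the sketch below, <code>toy_block_encrypt</code> is a made-up, HMAC-based stand-in for AES-256 (a real block cipher is invertible; this stand-in is not), but it shares the one property that matters here: with a fixed key, the same 16-byte input block always maps to the same 16-byte output block.</p>

```python
import hmac
import hashlib

def toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    # Stand-in for AES-256: a deterministic keyed function on a
    # 16-byte block. Not invertible, not a real cipher -- illustration only.
    return hmac.new(key, block, hashlib.sha256).digest()[:16]

def ecb_encrypt(key: bytes, plaintext: bytes) -> bytes:
    # ECB mode: encrypt each 16-byte block independently.
    assert len(plaintext) % 16 == 0
    return b"".join(
        toy_block_encrypt(key, plaintext[i:i + 16])
        for i in range(0, len(plaintext), 16)
    )

key = b"k" * 32
ct = ecb_encrypt(key, b"SIXTEEN BYTES!!!" * 2)  # two identical blocks

# Identical plaintext blocks produce identical ciphertext blocks,
# so the structure of the input leaks straight through.
assert ct[:16] == ct[16:32]
```

<p>Swapping in the real AES-256 cipher would not help: ECB leaks the pattern regardless of how strong the underlying block cipher is.</p>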

<p>So when a service says they are using “AES-256”, it’s possible that they are using AES-256-ECB. This might sound unlikely, but it was discovered back in 2020 that while Zoom was claiming to use “AES-256” to encrypt meetings, what they were actually doing was:</p>

<ul>
  <li>Generating a single AES-128 key for each meeting on Zoom servers (these servers often resided in countries completely separate from the meeting participants)</li>
  <li>Supplying that key to every meeting participant</li>
  <li>Encrypting all traffic with AES-128 in <strong>Electronic Codebook Mode (!!!)</strong></li>
</ul>

<p>So while ECB is generally regarded as a terrible idea, it didn’t stop Zoom from using this scheme. There is a great <a href="https://citizenlab.ca/2020/04/move-fast-roll-your-own-crypto-a-quick-look-at-the-confidentiality-of-zoom-meetings/">blog post</a> that describes the shortcomings of Zoom’s cryptography, if you’re interested in learning more.</p>

<p>What’s the solution? There are much better modes of operation available. One classic method is known as Cipher Block Chaining (CBC). In this mode, a random (but not secret) initialisation vector (IV) is generated and XORed with the first block prior to encryption. For each subsequent block, the previous encrypted chunk is XORed with the current plaintext chunk prior to encryption. It looks something like this:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">encrypted</span> <span class="o">=</span> <span class="p">[]</span>
<span class="n">previous_block</span> <span class="o">=</span> <span class="n">IV</span> <span class="c1"># Initialization Vector (random, same size as block)
</span>
<span class="k">for</span> <span class="n">chunk</span> <span class="ow">in</span> <span class="n">blocks</span><span class="p">(</span><span class="n">plaintext</span><span class="p">,</span> <span class="mi">16</span><span class="p">):</span>
 <span class="n">xor_result</span> <span class="o">=</span> <span class="n">XOR</span><span class="p">(</span><span class="n">chunk</span><span class="p">,</span> <span class="n">previous_block</span><span class="p">)</span>
 <span class="n">cipher_block</span> <span class="o">=</span> <span class="n">AES256</span><span class="p">(</span><span class="n">privatekey</span><span class="p">,</span> <span class="n">xor_result</span><span class="p">)</span>
 <span class="n">encrypted</span> <span class="o">+=</span> <span class="n">cipher_block</span>
 <span class="n">previous_block</span> <span class="o">=</span> <span class="n">cipher_block</span>
</code></pre></div></div>

<p>This has the property that chunks with the same content will look totally different when encrypted. However, it’s critical that the IV used is unique and unpredictable. Older SSL/TLS cipher suites that used CBC were <a href="https://openssl-library.org/files/tls-cbc.txt">vulnerable</a> to exactly these sorts of attacks because the chosen IV was predictable.</p>
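<p>The pseudocode above can be made concrete with the same sort of made-up, HMAC-based stand-in for AES-256 (illustration only, not a real cipher). With chaining in place, two identical plaintext blocks no longer produce identical ciphertext blocks:</p>

```python
import hmac
import hashlib
import os

def toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    # Stand-in for AES-256 (deterministic keyed function; illustration only).
    return hmac.new(key, block, hashlib.sha256).digest()[:16]

def xor16(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    assert len(plaintext) % 16 == 0 and len(iv) == 16
    encrypted = []
    previous_block = iv  # random, but not secret
    for i in range(0, len(plaintext), 16):
        # XOR with the previous ciphertext block, then encrypt.
        cipher_block = toy_block_encrypt(key, xor16(plaintext[i:i + 16], previous_block))
        encrypted.append(cipher_block)
        previous_block = cipher_block
    return b"".join(encrypted)

key, iv = b"k" * 32, os.urandom(16)
ct = cbc_encrypt(key, iv, b"SIXTEEN BYTES!!!" * 2)  # two identical blocks

# Chaining breaks the pattern: identical plaintext blocks now differ.
assert ct[:16] != ct[16:32]
```

<p>Because the IV is random, encrypting the same message twice also yields entirely different ciphertexts.</p>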

<p>One of the most common modes you will see today is Galois/Counter Mode (GCM), which takes a different approach but provides similar protection against pattern identification in ciphertext, and additionally authenticates the data (it is an authenticated encryption, or AEAD, mode).</p>

<p>As an aside, be extremely careful using Universally Unique Identifiers (UUIDs) in cases where you don’t want an attacker to be able to guess the IDs. For example, storing user content behind URLs (with no additional access control) on the assumption that users will not be able to guess those URLs. Per the original <a href="">RFC</a>:</p>

<blockquote>
  <p><strong>Do not assume that UUIDs are hard to guess</strong>; they should not be used
as security capabilities (identifiers whose mere possession grants
access), for example. A predictable random number source will
exacerbate the situation.</p>
</blockquote>

<p>It seems possible to implement unpredictable UUIDs in newer versions of the standard, but this also depends on the implementation. Best to avoid them entirely in cases where unpredictability is important.</p>
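<p>If you’re working in Python, the standard library’s <code>secrets</code> module is designed for exactly this job, and reaching for it is simpler than reasoning about UUID versions and implementations:</p>

```python
import secrets

# 32 random bytes (~256 bits) from the OS CSPRNG, URL-safe encoded.
# Unlike a UUID, this value carries no version/variant structure;
# its only job is to be unguessable.
download_token = secrets.token_urlsafe(32)  # 43 URL-safe characters
```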

<h2 id="data-protection">Data Protection</h2>

<p>While I’ve only scratched the surface of symmetric encryption, it should now be clear that while AES-256 is a great block cipher, being told that “your data is protected with AES-256” shouldn’t give you much comfort. There are many ways to make the encryption vulnerable through the selection (and implementation) of the mode of operation. This is why the standard advice is to <em>“never roll your own crypto”</em>, preferring standard libraries like OpenSSL instead.</p>
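<p>The same advice applies to key handling. If, for example, an at-rest encryption key is derived from a password, use a standard key-derivation function rather than something homemade. A minimal sketch using Python’s standard library (password and iteration count are illustrative):</p>

```python
import hashlib
import os

# Derive a 256-bit key from a password using PBKDF2-HMAC-SHA256,
# a standard KDF, instead of a homemade hashing scheme.
salt = os.urandom(16)  # random; stored alongside the ciphertext
key = hashlib.pbkdf2_hmac(
    "sha256",
    b"correct horse battery staple",  # the password
    salt,
    600_000,  # iteration count (illustrative)
)

assert len(key) == 32  # suitable as an AES-256 key
```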

<p>Beyond shoddy crypto implementations, there is a famous <a href="https://amturing.acm.org/vp/shamir_2327856.cfm">quote</a> from Adi Shamir (the “S” in RSA encryption) stating that:</p>

<blockquote>
  <p>Cryptography is typically bypassed, not penetrated</p>
</blockquote>

<p>It’s rare that weak cryptography is the basis for a data breach. There have been countless breaches resulting from misconfigured permissions on S3 buckets, despite the fact that buckets use AES-256-GCM encryption by default. No systems or cryptographic ciphers behaved unexpectedly; the access control implemented was simply fundamentally broken.</p>

<p>So while it’s great that your service encrypts my data using the AES-256 block cipher, it really tells me nothing about the security of your systems.</p>]]></content><author><name></name></author><summary type="html"><![CDATA[It’s common to see companies attesting that their customer’s data is secure because they use “AES-256” encryption. Some might go as far as saying that they use “military-grade” encryption, which conveys absolutely no information because it is not a well-defined term. Which military? The term “AES-256” is better defined, but is still not precise enough. The at-rest encryption of data is also only a small component of a service’s security posture.]]></summary></entry><entry><title type="html">Preventing Cross-Service Confused Deputy Attacks in AWS ELB Logging</title><link href="https://tomclancy.me/2025/04/24/preventing-cross-service-confused-deputy-attacks-in-aws-elb-logging.html" rel="alternate" type="text/html" title="Preventing Cross-Service Confused Deputy Attacks in AWS ELB Logging" /><published>2025-04-24T07:00:00+00:00</published><updated>2025-04-24T07:00:00+00:00</updated><id>https://tomclancy.me/2025/04/24/preventing-cross-service-confused-deputy-attacks-in-aws-elb-logging</id><content type="html" xml:base="https://tomclancy.me/2025/04/24/preventing-cross-service-confused-deputy-attacks-in-aws-elb-logging.html"><![CDATA[<p>Like many AWS services, the Elastic Load Balancing service allows for the delivery of logs to an S3 bucket. Despite this, each service seems to take a different approach to bucket permissions. Until earlier this year the approach taken by the Cloudfront service was particularly <a href="https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/standard-logging-legacy-s3.html#AccessLogsBucketAndFileOwnership">idiosyncratic</a>, requiring the use of bucket ACLs.</p>

<p>Bucket encryption is also a mixed bag, with some services supporting <code class="language-plaintext highlighter-rouge">sse:kms</code> configuration while others only support <code class="language-plaintext highlighter-rouge">sse:s3</code>.</p>

<p>Despite the many things that AWS does well, I tend to agree with the view that they lack consistency between services.</p>

<p>But back to ELB. The process of setting up cross-account S3 logging is fairly straightforward. This is helpful when you want to aggregate all logs to a centralised log/audit account. When I had to implement this a few weeks ago, the <a href="https://web.archive.org/web/20250319001906/https://docs.aws.amazon.com/elasticloadbalancing/latest/application/enable-access-logging.html#verify-bucket-permissions">docs</a> provided the following example bucket policy:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "logdelivery.elasticloadbalancing.amazonaws.com"
      },
      "Action": "s3:PutObject",
      "Resource": "s3-bucket-arn/*"
    }
  ]
}
</code></pre></div></div>

<p>There is a slight caveat though - only regions created after 2022 support referencing the <code class="language-plaintext highlighter-rouge">logdelivery.elasticloadbalancing...</code> principal. For most regions, you need to grant access to a region-specific AWS account ID that presumably hosts infrastructure for the ELB service. For Sydney (<code class="language-plaintext highlighter-rouge">ap-southeast-2</code>) the policy looks like:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::783225319266:root"
      },
      "Action": "s3:PutObject",
      "Resource": "s3-bucket-arn/*"
    }
  ]
}
</code></pre></div></div>

<p>Once the bucket policy is in place, it’s a simple matter of turning on logging for the load balancer and supplying the bucket name.</p>

<p>Despite the relatively simple process, there is a flaw with these policies. It’s so easy to set up cross-account logging that, in fact, <em>any</em> account can direct its logs to your S3 bucket - assuming they know (or can guess) the name of the bucket. This is known as a Cross-Service Confused Deputy Attack: any situation where an attacker can leverage an AWS service to take actions on their behalf that they do not directly have the permissions to perform.</p>

<p>In the case of ELB access logging:</p>
<ul>
  <li>A user creates a bucket in account <code class="language-plaintext highlighter-rouge">111122223333</code> and attaches one of the above bucket policies to allow the ELB service to write logs to the bucket</li>
  <li>An attacker in account <code class="language-plaintext highlighter-rouge">444455556666</code> has permissions to access the ELB service via their account, but no direct access to the account <code class="language-plaintext highlighter-rouge">111122223333</code></li>
  <li>The attacker <em>confuses</em> the ELB service (<em>the deputy</em>) to write to the bucket in account <code class="language-plaintext highlighter-rouge">111122223333</code> on their behalf</li>
</ul>

<p>The AWS docs on <a href="https://docs.aws.amazon.com/IAM/latest/UserGuide/confused-deputy.html#cross-service-confused-deputy-prevention">Cross-Service Confused Deputy Attack</a> have a helpful diagram of a very similar attack involving delivery of logs to S3 from the Cloudtrail service.</p>

<p>For an example of a particularly nasty AppSync vulnerability that allowed confused deputy attacks, see <a href="https://securitylabs.datadoghq.com/articles/appsync-vulnerability-disclosure/">this post</a> from Datadog Security Labs.</p>

<p>In comparison, the confused deputy issue with ELB is minor because the documented ELB policy only allows <code class="language-plaintext highlighter-rouge">s3:PutObject</code>. Additionally, the ELB service always logs under the path <code class="language-plaintext highlighter-rouge">AWSLogs/&lt;source-account-id&gt;/...</code> or optionally <code class="language-plaintext highlighter-rouge">&lt;logging-prefix&gt;/AWSLogs/&lt;source-account-id&gt;/</code>.</p>

<p>There is no clear way that this situation could be leveraged to tamper with a victim account’s existing logs unless the attacker were able to identify a vulnerability in the ELB service (e.g. by abusing the optional prefix somehow). Despite this, the idea of my bucket being writable by any AWS account (even in a limited capacity) is uncomfortable.</p>

<p>The standard mitigation in this situation is to use IAM condition keys, like <code class="language-plaintext highlighter-rouge">aws:SourceAccount</code> which enforces that the request “originates” from a particular account. Per <a href="https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_policies_condition-keys.html#condition-keys-sourceaccount">AWS docs</a>:</p>

<blockquote>
  <p><em>This key provides a uniform mechanism for enforcing cross-service confused deputy control across AWS services. However, not all service integrations require the use of this global condition key. See the documentation of the AWS services you use for more information about service-specific mechanisms for mitigating cross-service confused deputy risks.</em></p>
</blockquote>

<p>I opened a case with AWS Support to clarify whether the <code class="language-plaintext highlighter-rouge">aws:SourceAccount</code> condition key was supported for ELB logging - because based on my testing it seemed to work as expected. The support engineer was certain that this condition key would <em>not</em> work, but was eventually able to replicate the behaviour that I saw. Perhaps the condition key is not <em>officially</em> supported by the service?</p>
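<p>For reference, the variant I tested looked something like the following, taking the documented Sydney policy and adding a condition with the placeholder account ID <code class="language-plaintext highlighter-rouge">111122223333</code> as the log source. Given the support engineer’s response, treat this as unofficial behaviour rather than something to rely on by itself:</p>

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::783225319266:root"
      },
      "Action": "s3:PutObject",
      "Resource": "s3-bucket-arn/*",
      "Condition": {
        "StringEquals": {
          "aws:SourceAccount": "111122223333"
        }
      }
    }
  ]
}
```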

<p>Support instead suggested updating the targeted <code class="language-plaintext highlighter-rouge">Resource</code> to include the ELB added prefix e.g. <code class="language-plaintext highlighter-rouge">arn:aws:s3:::s3-bucket-name/AWSLogs/111122223333/*</code>. If another account then tried to configure the bucket as a logging destination it would fail because the bucket path would not match the bucket policy.</p>
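<p>Concretely, with <code class="language-plaintext highlighter-rouge">111122223333</code> as the account running the load balancers, the suggested policy looks like:</p>

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::783225319266:root"
      },
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::s3-bucket-name/AWSLogs/111122223333/*"
    }
  ]
}
```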

<p>While I asked them to put in a request to have the docs updated to include the more restrictive policy, I half expected that it might sit in a backlog for some time. So when I came across the <a href="">awssecuritychanges.com</a> site earlier this week, I was happy to see the change listed:</p>

<p><img src="/images/elb-docs-change.png" alt="Documentation Change" /></p>

<p>Per the updated docs:</p>

<blockquote>
  <p><em>Ensure your AWS account ID is always included in the resource path of your Amazon S3 bucket ARN. This ensures only Application Load Balancers from the specified AWS account are able to write access logs to the S3 bucket.</em></p>
</blockquote>

<p>Good Advice!</p>]]></content><author><name></name></author><summary type="html"><![CDATA[Like many AWS services, the Elastic Load Balancing service allows for the delivery of logs to an S3 bucket. Despite this, each service seems to take a different approach to bucket permissions. Until earlier this year the approach taken by the Cloudfront service was particularly idiosyncratic, requiring the use of bucket ACLs.]]></summary></entry></feed>