S3 Idea


Making the world a better place, one bucket at a time.


S3 bucket names are drawn from a global namespace (strictly speaking, it’s the ARN that needs to be globally unique), which means anyone can create a bucket with any name (subject to the naming format) as long as nobody else is already using that name.
This creates a problem.
Because the namespace is global, short, simple bucket names are already taken, much like usernames on a popular website. For example, a bucket used to store logs can’t be called logs because someone else almost certainly already has it.
A typical pattern is therefore to create buckets whose names include the account or company name, for example: customerx-myenvironment-logs.
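
As a rough illustration of that pattern, here is a minimal Python sketch that builds a name from a company, environment and purpose and checks it against a subset of the S3 naming format. The function names are made up for this post, and the check covers only the rules I am sure of (length, allowed characters, no adjacent dots), not the full list.

```python
import re

# Subset of the S3 general-purpose bucket naming rules: 3 to 63 characters,
# lowercase letters, digits, dots and hyphens, starting and ending with a
# letter or digit. Adjacent dots are rejected separately below.
_BUCKET_NAME = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")


def is_valid_bucket_name(name: str) -> bool:
    """Return True if the name passes this (partial) format check."""
    return _BUCKET_NAME.fullmatch(name) is not None and ".." not in name


def conventional_bucket_name(company: str, environment: str, purpose: str) -> str:
    """Build a name following the company-environment-purpose convention."""
    name = "-".join(part.lower() for part in (company, environment, purpose))
    if not is_valid_bucket_name(name):
        raise ValueError(f"not a valid bucket name: {name!r}")
    return name
```

Note that passing the format check says nothing about whether the name is still available, which is exactly the problem described next.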

This is still not perfect: I can think of three undesirable scenarios that are possible with the current design.

  • A grinch takes one of those names, forcing a deviation from the company’s naming convention.
  • A log replication, Kinesis stream, CloudTrail trail, etc. is configured to send logs or data to an S3 bucket. If that bucket is deleted, the data has nowhere to go and stops being saved. If a malicious actor then creates a bucket with the same name in their own account, the data could start being saved to that new bucket, IN THE MALICIOUS ACTOR’S ACCOUNT!
    • A possible workaround is to encrypt using a customer-managed key, which either prevents the malicious actor from reading the data or prevents it from being saved to the bucket in the first place.
  • Subdomain takeover attack. An S3 bucket is configured as a static website, and a subdomain (for example, aaa.example.com) has a CNAME DNS record pointing to the S3 website endpoint (for example, aaa.example.com.s3-website-us-east-1.amazonaws.com). If the bucket is deleted, the website stops working. If a malicious actor then creates a bucket with the same name in the same region, the CNAME record still exists and still points to the same endpoint, because the endpoint is derived from the bucket name. Now aaa.example.com is serving content managed by the malicious actor and not by the owner of the domain example.com.
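
To make the last scenario a little more concrete, here is a small Python sketch of how you might audit your own DNS zone for CNAME records that point at S3 buckets you no longer own. It assumes you already have the CNAME records and your bucket list as plain data (fetching them is out of scope), and the endpoint pattern is my own approximation of the S3 endpoint formats rather than an official list. The function names are hypothetical.

```python
import re
from typing import Optional

# Approximate pattern for S3 endpoints that embed the bucket name, e.g.
#   <bucket>.s3.amazonaws.com
#   <bucket>.s3-website-<region>.amazonaws.com
#   <bucket>.s3-website.<region>.amazonaws.com
_S3_ENDPOINT = re.compile(
    r"(?P<bucket>.+)\.s3(?:-website)?(?:[.-][a-z0-9-]+)?\.amazonaws\.com"
)


def bucket_from_s3_cname(target: str) -> Optional[str]:
    """Return the bucket name embedded in a CNAME target, or None."""
    match = _S3_ENDPOINT.fullmatch(target.lower().rstrip("."))
    return match.group("bucket") if match else None


def takeover_candidates(cnames: dict[str, str], owned_buckets: set[str]) -> list[str]:
    """List hostnames whose CNAME points at an S3 bucket we do not own."""
    candidates = []
    for hostname, target in cnames.items():
        bucket = bucket_from_s3_cname(target)
        if bucket is not None and bucket not in owned_buckets:
            candidates.append(hostname)
    return sorted(candidates)
```

Any hostname this returns is a record to either delete or back with a bucket again before someone else claims the name.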

The solution seems simple, then.
What if you could reserve bucket names?
The current approach to reserving a bucket name is to create the bucket in advance and keep it until you need it.
This has a number of drawbacks: firstly, AWS limits the number of buckets per account (100 by default at the time of writing); secondly, you might not know which buckets you will need that far in advance.
It would not be enough to let anyone reserve all bucket names with a matching prefix or suffix. For example, a malicious actor could reserve all bucket names ending in example.com, locking the owner of example.com out of using S3 to host their website.

But what if you could reserve all buckets that end with a domain name and you could prove you own that domain name?
This could take a similar form to using a DNS challenge when creating TLS certificates.
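
A minimal sketch of what such a challenge might look like, in Python. Everything here is hypothetical: S3 offers no such feature, the _s3-reservation record label is invented for illustration, and the DNS lookup is passed in as a function so the example does not depend on any particular resolver library.

```python
import hmac
import secrets
from typing import Callable, Iterable

# Hypothetical label under which the domain owner publishes the token.
CHALLENGE_LABEL = "_s3-reservation"


def new_challenge(domain: str) -> tuple[str, str]:
    """Return the TXT record name to create and the random token to put in it."""
    record_name = f"{CHALLENGE_LABEL}.{domain.lower().rstrip('.')}"
    return record_name, secrets.token_urlsafe(32)


def challenge_satisfied(
    record_name: str,
    expected_token: str,
    lookup_txt: Callable[[str], Iterable[str]],
) -> bool:
    """Check whether any TXT value at record_name equals the issued token."""
    expected = expected_token.encode()
    return any(
        hmac.compare_digest(value.encode(), expected)
        for value in lookup_txt(record_name)
    )
```

The service would issue the challenge, the domain owner would publish the token as a TXT record, and the reservation would only be granted once challenge_satisfied returns True with a real resolver plugged in as lookup_txt.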

If you could prove you own a domain and then reserve all bucket names that end in that domain, you would eliminate all three of the cases above (the log bucket from the earlier example might need to be renamed myenvironment-logs.customerx.com).
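
One detail worth getting right is what "ends in that domain" means. A plain string suffix check would let a reservation for example.com swallow notexample.com, so the match should happen on a DNS label boundary. A small Python sketch of the check, with a hypothetical function name:

```python
def covered_by_reservation(bucket: str, domain: str) -> bool:
    """Return True if the bucket name is the domain itself or a name under it.

    The match is made on a label boundary, so "notexample.com" is not
    treated as falling under a reservation for "example.com".
    """
    bucket = bucket.lower()
    domain = domain.lower().strip(".")
    return bucket == domain or bucket.endswith("." + domain)
```

With this rule, logs.example.com and example.com are covered by a reservation for example.com, while notexample.com and example.com.evil.net are not.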

This is still not perfect: what if you need multiple accounts to use the same suffix?
The solution is to have a single account own the reserved domain of S3 bucket names and share it with other accounts, perhaps via RAM (AWS Resource Access Manager). This plays into AWS Organizations, allowing the owner to grant permissions to all accounts in an OU or in the whole Organization.
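
As a sketch of the sharing model only (no such resource type exists in RAM today), a reservation could be represented as an owner plus the accounts and OUs it is shared with. The class, field names and IDs below are placeholders invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reservation:
    """Hypothetical record of a reserved bucket-name domain and who may use it."""

    domain: str
    owner_account: str
    shared_accounts: frozenset[str] = frozenset()
    shared_ous: frozenset[str] = frozenset()


def may_use(
    reservation: Reservation,
    account_id: str,
    account_ous: frozenset[str] = frozenset(),
) -> bool:
    """Return True if the account owns the reservation or has been granted it,
    either directly or through one of the OUs it belongs to."""
    return (
        account_id == reservation.owner_account
        or account_id in reservation.shared_accounts
        or not reservation.shared_ous.isdisjoint(account_ous)
    )
```

Sharing with a whole Organization would then just be a matter of granting the reservation to the root OU, so every member account inherits it.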


WRITTEN BY
Kieran Goldsworthy
Cloud Engineer and Architect