24

I'm uploading files on S3, where the filename is a UUID. All files uploaded are publicly read/write, however these files are considered private to users.

Is it possible for someone to guess a UUID, or randomly try a bunch of combinations to access some of these files?

Is there a better way to secure S3 files?

Snowman
  • 3
    Guess? Probably depends on how the UUIDs are being generated. Note that if they're trying for a _specific_ file (or from a specific user), they're probably out of luck, but they can probably get a **random** one eventually. I'd imagine Amazon already has something setup for security - what does their service provide/documentation suggest? – Clockwork-Muse Mar 16 '14 at 03:16
  • See http://security.stackexchange.com/questions/58215/are-random-urls-a-safe-way-to-protect-profile-photos#comment-92500 – Pacerier Feb 16 '15 at 09:49

4 Answers

15

UUIDs generated following RFC 4122 come in several “versions”. In a UUID of the form xxxxxxxx-xxxx-Mxxx-Nxxx-xxxxxxxxxxxx (in hexadecimal):

  • If 8 ≤ N ≤ b and M=1 or M=2 then the UUID identifies the machine that generated it and the time it was generated.
  • If 8 ≤ N ≤ b and M=3 or M=5 then the UUID is generated deterministically using a cryptographic hash function.
  • If 8 ≤ N ≤ b and M=4 then the UUID is random (with 6 fixed bits and 122 random bits).

If the UUID doesn't meet any of these constraints, it wasn't generated according to RFC 4122. But even if the constraints are met, this doesn't guarantee that the RFC was followed, or that the random generator was a good one.
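
As a rough sketch of the rules above, Python's standard `uuid` module exposes the version (M) and variant (N) fields. The helper name `describe_uuid` is made up for illustration, and passing this check says nothing about how good the underlying random generator was.

```python
import uuid

def describe_uuid(text: str) -> str:
    """Classify a UUID string by its RFC 4122 fields (hypothetical helper)."""
    u = uuid.UUID(text)
    # The RFC 4122 variant corresponds to 8 <= N <= b in the textual form.
    if u.variant != uuid.RFC_4122:
        return "not an RFC 4122 UUID"
    if u.version in (1, 2):
        return "time and machine based"
    if u.version in (3, 5):
        return "name based (hash)"
    if u.version == 4:
        return "random (122 random bits)"
    return "version not defined by RFC 4122"
```

For example, `describe_uuid(str(uuid.uuid4()))` falls into the random case, while a value whose N digit is 0 is reported as not following the RFC.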

I don't know what S3 does for UUIDs.

Even if the UUIDs are properly random, which would make the URL unguessable, protecting access to a resource with only a URL is a bad idea. URLs tend to leak in many ways: shared in emails or chat, lying around in browser histories and bookmark lists, accidentally copied around, or worse, left as a link in a page that the author meant to keep private but that had no authentication and got indexed by Google.

If you want to protect access to a resource on the web, do not rely on the URL being kept secret. Add an authentication mechanism. Even plain old password authentication is a huge improvement on an unguessable URL — browsers and even people know that they shouldn't be sharing passwords around, whereas URLs are normally public knowledge.

Gilles 'SO- stop being evil'
  • But if I have access to your browser history, I'll have access to your browser, and I can access that secure page anyway (regardless of whether it's authenticated by URL or not). – Pacerier Feb 16 '15 at 08:23
  • 3
    @Pacerier Not necessarily. You could see the history via shoulder surfing. Or you could have access to my browser data but not to my passwords, which wouldn't let you log in to a site with authentication. – Gilles 'SO- stop being evil' Feb 16 '15 at 13:32
5

The 100% correct solution, obviously, would be not to make files publicly readable and writable. That's out of your control here, but it is also not strictly necessary.

The server is set up in a way similar to how incoming folders have been set up on FTP servers for 30-40 years: You can cd into that directory and create/write files, and read any files that you can open, but you cannot list the directory's contents (and optionally, you can create and write any file, but not overwrite an existing one).

Assuming that file names cannot be guessed, this is secure insofar as you can't do much without knowing a file's name. A UUID has 128 bits, but you could use any other random file name of the same or greater size (or a 160-bit random name, if you think 128 bits are not enough).
While it is in theory possible to guess a random 128-bit number (and there is even a chance that two filenames will collide by accident), the chance of such a thing happening is astronomically small given the time it takes to access a file over the network (which severely limits the number of operations per second that you can do). This is very different from, e.g., someone brute-forcing a hash from a stolen password database, where the attacker would typically try a few hundred million hashes per second.
You are probably more likely to die from being hit by a meteor than to ever see this in your life.
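
A minimal sketch of generating such names in Python, assuming the names should come from the operating system's cryptographic random source. The function names are hypothetical. CPython's `uuid.uuid4()` draws its bytes from `os.urandom`, but other languages and libraries may not make the same guarantee.

```python
import secrets
import uuid

def random_object_name(num_bytes: int = 20) -> str:
    """Return a hex name carrying num_bytes * 8 random bits (20 bytes = 160 bits)."""
    return secrets.token_hex(num_bytes)

def random_uuid_name() -> str:
    """Return a version 4 UUID string (122 random bits, 6 fixed bits)."""
    return str(uuid.uuid4())
```

The 160-bit variant yields a 40-character hexadecimal name, which can be used as an object key in the same way as a UUID.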

Also, a quick Google search says that S3 supports folders, so if you are in ultra-paranoia mode, you can create a folder with a UUID name and place your UUID-named files in there. Someone would then have to guess a 256-bit number correctly to access a file, and you can be pretty sure that this won't happen during your lifetime.

Damon
2

The chance of someone randomly guessing a version 4 UUID is infinitesimally small. It's so tiny that it's not worth serious consideration.

Take a look at the math.

The following equation shows that the number of guesses (n) required for a 50% probability of a correct match is about 2.71 quintillion.

n ≈ 0.5+sqrt(0.25+2*ln2*(2^122))
  ≈ 2,714,922,669,395,445,248
  ≈ 2.71 × 10^18

Let's lower the probability (p) of a match to one in a billion.

Let: p = 1/(1 billion) = 0.000000001
n ≈ sqrt(2*2^122*ln(1/(1-p)))
  ≈ 103,120,461,418,554
  ≈ 103.1 × 10^12

I.e., it would take about 103 trillion guesses.
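
The two figures above can be reproduced with a few lines of Python. Note that the formula is the birthday-problem approximation, which strictly speaking gives the number of randomly generated UUIDs at which any two of them are likely to coincide. The `single_target_probability` helper is an added assumption about what an attacker guessing one particular file name would face, and both function names are hypothetical.

```python
import math

SPACE = 2 ** 122  # number of possible version 4 UUIDs

def birthday_count(p: float, space: int = SPACE) -> float:
    """Approximate count of random UUIDs at which the chance that
    any two of them coincide reaches p."""
    # -log1p(-p) equals ln(1 / (1 - p)) but stays accurate for tiny p.
    return math.sqrt(2 * space * -math.log1p(-p))

def single_target_probability(guesses: int, space: int = SPACE) -> float:
    """Chance that `guesses` distinct guesses hit one specific UUID."""
    return guesses / space

n_half = birthday_count(0.5)        # roughly 2.71e18
n_billionth = birthday_count(1e-9)  # roughly 1.03e14
```

Under this reading, hitting one specific UUID is far harder than the collision figures suggest: a 50% chance needs about 2^121 guesses, since `single_target_probability(2 ** 121)` is 0.5.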

Given these odds, and the compute power required to engineer the hack, the likelihood of someone randomly hitting one of the associated UUIDs is so low that it just wouldn't be worth losing any sleep over.

Wikipedia has an in-depth description of the model and the use of UUIDs in database keys.

Always assume that any addressable network is hackable. But since your files will be effectively "unlisted" and semi-private, using the UUID model should be sufficient to make them virtually inaccessible.

ssent1
  • Your post assumes that the v4 UUID is generated from a cryptographically secure RNG. That is the case for many implementations (e.g. Java guarantees it), but by no means guaranteed for every API. – CodesInChaos May 05 '20 at 19:41
  • 1
    Also you're confusing collisions with (multi target) guessing. – CodesInChaos May 05 '20 at 19:44
  • Whilst your answer is supported by correct facts and calculations, your conclusion is wrong. It is not safe to rely on obscure file names to keep objects secure. If an attacker wants to find files, they could: use a link or a list provided by someone else, or found elsewhere, or simply enumerate all UUIDs - it would take a while but it's certainly feasible. Without a correctly implemented access control layer, there's no guarantee that the files are safe from unauthorised access. – Pedro May 06 '20 at 05:49
  • 1
    Thanks for your feedback. > "Files uploaded are *publicly read/write*, [but] *private to users*." Based on this description, I assume a "bulletproof" security model is not needed. In a similar use case, my client wants to make it easy for their users to access files from a shared link. At the same time, they want to prevent the discovery of other files. Would you change anything in your analysis given this need? Is there a better or more cost-effective way of doing it on S3 or shared hosting? – ssent1 May 06 '20 at 20:25
  • @ssent1 I'll try: a) I am not sure what you mean by "files are publicly read/write but private to users" - Are they meant to be public or private? and if public, why the effort to hide them? b) bulletproof does not exist. the definition of public and private however makes things clearer, to the lack of a better security profile definition (or threat assessment). having false assurances does not help - it is not safe to rely on identifiers being difficult to guess or enumerate to keep resources/objects private; – Pedro May 06 '20 at 21:13
  • @ssent1 (cont) c) you must have authentication, session management and access controls in place. i.e. you can't serve files directly to the user if you want to keep them private to an extent. You need to use a piece of code (an application) that maps access to these resources. The application pulls the file from a private S3 bucket and serves it in an HTTP response via a dynamic link. This enables you to safely share publicly, privately, to a group, etc. – Pedro May 06 '20 at 21:16
  • @Pedro Here's the problem. To keep certain government contracts, the files must be available in public "at-will." It would be inconvenient, but not fatal if non-government customers got the information. We're trying to use the YouTube definition of "private." The videos are public since anyone with the link has access. The distinction is this. 'Privacy' as it relates to "access to information" and security as it relates to "protecting" it. We can't make it secure since it would violate the contract terms. But we can make it close to impossible to find without a link. – ssent1 May 06 '20 at 22:09
  • @ssent1 well, it appears as though you have answered your own questions above. this is what I meant by threat assessment and looking at your requirements. if this model fits your requirements and threat assessment, you're doing security right. – Pedro May 06 '20 at 22:59
0

This is a very old post, but there will be people who come across this because it still shows up in Google searches. Most of the answers given here are wrong. You should not use a UUID as a security mechanism. The answer to the question of "How to secure files in S3" is to use S3 pre-signed URLs.

See documentation here: https://docs.aws.amazon.com/AmazonS3/latest/userguide/ShareObjectPreSignedURL.html
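
To illustrate the general idea behind a signed, expiring link, here is a minimal sketch using only the Python standard library. It is not AWS Signature Version 4 and is not how S3 computes its signatures; real pre-signed URLs should come from the AWS SDK (for example `generate_presigned_url` in boto3). `SECRET_KEY`, `sign_url` and `verify` are hypothetical names.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import urlencode

SECRET_KEY = b"server-side-secret"  # hypothetical; never sent to clients

def _signature(path: str, expires: int) -> str:
    """HMAC-SHA256 over the path and its expiry time."""
    message = f"{path}\n{expires}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()

def sign_url(path: str, ttl_seconds: int, now: Optional[float] = None) -> str:
    """Return the path with an expiry time and a signature appended."""
    start = time.time() if now is None else now
    expires = int(start + ttl_seconds)
    query = urlencode({"expires": expires, "signature": _signature(path, expires)})
    return f"{path}?{query}"

def verify(path: str, expires: int, signature: str,
           now: Optional[float] = None) -> bool:
    """Accept a request only if the link is unexpired and untampered."""
    current = time.time() if now is None else now
    if current > expires:
        return False
    # compare_digest avoids leaking information through timing.
    return hmac.compare_digest(_signature(path, expires), signature)
```

Unlike a bare UUID in the URL, a link built this way stops working after its expiry time, and changing the path invalidates the signature.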

You should not store anything in public S3 buckets that you do not want the entire world to see every day. I won't even mention the potential abuse case this opens you up to - think about data transfer charges for large objects if you get trolled endlessly by a Reddit group.

gk0r