
tl;dr: Is a pseudorandomly generated string unguessable enough to be used as a token in a URL?

I'm making a web application, let's for the sake of argument assume it's another PasteBin. I allow users to paste text and I generate a random URL for them that they can pass to their friends but otherwise will remain unknown to the public.

How should I generate this URL? It'll be something like https://pastebin.example.com/%s where %s is a random string.

How secure is it to simply use a pseudo-random value, such as the result of uniqid? How would using something like openssl_random_pseudo_bytes improve how unguessable the ID is?

Assume that users will not be able to submit arbitrarily many pastes per minute; there is some rate-limiting system (such as a captcha and an IP check). Also assume that I have taken measures to prevent search engines from indexing the page. The risk of sharing information protected only by an unguessable URL is outside the scope of this question.

For clarity: the random string is ONLY used as an identifier; it is NOT used for cryptographic operations afterwards.

Anders
jornane

2 Answers


uniqid might not be the best choice, in part because the docs do mention (in red):

Warning This function does not guarantee uniqueness of return value. Since most systems adjust system clock by NTP or like, system time is changed constantly. Therefore, it is possible that this function does not return unique ID for the process/thread. Use more_entropy to increase likelihood of uniqueness.

And I would tend to suspect that ID collisions would be fairly disastrous in your case.

I think the basic answer is that if you really have a requirement for unguessable, as opposed to simply unique, you probably want to use a cryptographically secure RNG.

Or use a UUID (https://en.wikipedia.org/wiki/Universally_unique_identifier) - there are options, specifically UUIDv5, that would seem better suited to your use case than uniqid.
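
To make the difference between the UUID versions concrete, here is a minimal sketch in Python (chosen only for illustration; the question itself is about PHP). The namespace URL and the paste text are hypothetical. Version 4 is built from random bits, while version 5 is a SHA-1 hash of a namespace and a name, so the same input always yields the same ID.

```python
import uuid

# Version 4: 122 random bits. CPython takes them from os.urandom().
random_id = uuid.uuid4()

# Version 5: a SHA-1 hash of (namespace, name), so it is deterministic.
# The namespace and the paste text below are made up for this example.
namespace = uuid.uuid5(uuid.NAMESPACE_URL, "https://pastebin.example.com/")
paste_text = "hello world"

content_id = uuid.uuid5(namespace, paste_text)
same_again = uuid.uuid5(namespace, paste_text)

print(random_id)
print(content_id == same_again)  # identical input gives an identical ID
```

Note that anyone who knows the namespace and can guess the content can recompute a v5 ID, which matters if the ID is supposed to be unguessable.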

iwaseatenbyagrue
  • I would argue v4 UUID is more suitable than v5. V4 is the random one. – Marko Vodopija Mar 18 '17 at 19:53
  • v5 could be useful in a pastebin context where the content cannot be changed after upload. You can then make the UUID dependent on the content. – jornane Mar 22 '17 at 08:09
  • On the topic of uniqueness, you should always check for duplicate IDs, regardless of how good your random generator is. If the generator is perfect, you have an extremely low chance of hitting a collision, but it can still happen and it's easy to mitigate. – jornane Mar 22 '17 at 08:37
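
The duplicate check suggested in the last comment could look roughly like this. It is a sketch in Python, not PHP, and `new_paste_id`, `existing_ids` and `max_attempts` are hypothetical names; a real application would more likely rely on a unique constraint in its database and retry on a failed insert.

```python
import secrets

def new_paste_id(existing_ids, nbytes=16, max_attempts=5):
    """Return a fresh random ID that is not yet in existing_ids."""
    for _ in range(max_attempts):
        candidate = secrets.token_urlsafe(nbytes)
        if candidate not in existing_ids:
            existing_ids.add(candidate)  # reserve the ID
            return candidate
    # With 128-bit IDs this should effectively never be reached.
    raise RuntimeError("could not find an unused ID")

ids = set()
first = new_paste_id(ids)
second = new_paste_id(ids)
```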

Do you need something "cryptographically secure"?

Yes, if you want the URLs to be unguessable (through other means than brute force) you will need to use a cryptographically secure pseudo-random number generator (CSPRNG).

The numbers will still be pseudo-random - computers cannot generate "true" random numbers. The only difference is that future numbers are not guessable from past numbers. And that is what makes all the difference here.

I understand that the words "cryptographically secure" are causing some confusion here. As you say, you are not doing cryptography, so why would you need that? Just translate them to "unguessable" and things become clearer.

Rate limiting helps some here. It will definitely make it harder for an attacker to get hold of a long list of sequential IDs. But does it make it impossible for an attacker to predict future IDs? No, probably not.
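
To illustrate what "guessable from past numbers" can mean, here is a small sketch in Python rather than PHP. It assumes a hypothetical server that seeds a non-cryptographic generator (Python's Mersenne Twister) with a Unix timestamp; the timestamp, alphabet and function names are made up for the example. An attacker who sees a single ID and knows roughly when it was created can recover the seed and predict the next ID.

```python
import random

ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789"

def weak_tokens(seed, count, length=8):
    """IDs from a non-cryptographic PRNG seeded with a guessable value."""
    rng = random.Random(seed)
    return ["".join(rng.choice(ALPHABET) for _ in range(length))
            for _ in range(count)]

# Server side: hypothetical timestamp used as the seed.
server_seed = 1490000000
issued = weak_tokens(server_seed, 2)

def recover_seed(seen_token, start, end):
    """Attacker side: try every timestamp in a plausible window."""
    for guess in range(start, end):
        if weak_tokens(guess, 1)[0] == seen_token:
            return guess
    return None

# The attacker only knows the first ID and the approximate time.
found = recover_seed(issued[0], server_seed - 600, server_seed + 600)
predicted_next = weak_tokens(found, 2)[1]
print(predicted_next == issued[1])
```

Rate limiting does not stop this, because the attacker needs only one observed ID, not a long list.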

What about those PHP functions?

uniqid is based on the current time in microseconds. According to the manual, it is not cryptographically secure:

Caution: This function does not generate cryptographically secure values, and should not be used for cryptographic purposes. If you need a cryptographically secure value, consider using random_int(), random_bytes(), or openssl_random_pseudo_bytes() instead.

Use one of the suggested functions instead, and make sure your ID is long enough to prevent brute forcing.
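
As a rough illustration of "long enough", the sketch below uses Python's `secrets` module as a stand-in for the PHP functions named above (in PHP 7, `random_bytes()` fills the same role). The number of stored pastes and the attacker's guess rate are assumptions picked for the example, and `expected_years_to_guess` is a hypothetical helper.

```python
import secrets

# 16 bytes from the OS CSPRNG = 128 bits, encoded as 22 URL-safe characters.
token = secrets.token_urlsafe(16)
url = "https://pastebin.example.com/%s" % token

SECONDS_PER_YEAR = 365 * 24 * 3600

def expected_years_to_guess(bits, stored_ids, guesses_per_second):
    """Rough average time until a random guess hits any existing ID."""
    # Each guess succeeds with probability stored_ids / 2**bits,
    # so about 2**bits / stored_ids guesses are needed on average.
    expected_guesses = 2 ** bits / stored_ids
    return expected_guesses / guesses_per_second / SECONDS_PER_YEAR

# Assumed scenario: one million pastes, 1000 guesses per second.
print(expected_years_to_guess(128, 10**6, 1000))
print(expected_years_to_guess(32, 10**6, 1000))
```

Under these assumptions a 32-bit ID falls within seconds, while a 128-bit ID is far out of reach of brute force.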

Anders
  • The first two functions are PHP7 only, the last one will also set a `crypto_strong` value. Should that be checked before using the value? – jornane Mar 22 '17 at 09:03
  • @jornane As the doc says it is likely to be `true`, but you might as well check it just to be sure. If it is false, though, I don't think you have any alternative other than to fail hard and fast. – Anders Mar 22 '17 at 11:50
  • Fail or retry? When you generate a large key with OpenSSL, you'll see output with a lot of `.` and `+`, where OpenSSL discards "bad" random numbers. – jornane Mar 23 '17 at 09:07
  • Fail. If it gives `false` it is not because something accidentally went wrong. It is because PHP does not have access to a CSPRNG. So retrying will just give you `false` again. – Anders Mar 23 '17 at 10:43