8

This is a potentially hypothetical question, since it has not yet been determined whether we will be required to do this, but I figure it's a question that will come up more and more often.

Background

When implementing RESTful services, the standard approach is that the URI contains the information required to identify the resource in question. While I am not a RESTafarian, there are a lot of advantages to this approach that we wish to take advantage of.

Problem

We are subject to regulatory requirements around protecting certain types of information. Some of these protected elements will end up in resource paths if we follow the basic REST formula.

TLS will cover most of these concerns because the URLs are encrypted in transit. However, there are a few places where they may be stored in the clear. Notably, URLs are often written to access logs by default (which is useful), and if people use browsers to make these requests, the URLs can be stored in the clear in their browsing history.

Approach

Assuming this is an issue and we want to use such an approach, it seems there is one prime contender for a solution: encrypt parts of the URI, or all of it. Hashing does not seem to be an option since the data is highly predictable and calculating the hashes of the entire set of possible values is easy. Encrypting the entire URI would negate many of the advantages of the approach. Symmetric encryption requires distributing a secret key widely and therefore defeats the purpose. The option that seems best is to encrypt the sensitive data with a public key.
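To see why hashing fails here, consider a toy identifier space (the four-digit IDs below are made up for illustration): because the set of possible values is small and predictable, an attacker can precompute every hash and invert anything that leaks into a log:

```python
import hashlib

# Toy example: a four-digit identifier space. Because the space is small
# and predictable, an attacker can precompute the hash of every possible
# value and invert anything seen in an access log or browser history.
rainbow = {hashlib.sha256(f"{n:04d}".encode()).hexdigest(): f"{n:04d}"
           for n in range(10_000)}

leaked = hashlib.sha256(b"4821").hexdigest()  # hash found in a log
recovered = rainbow[leaked]
print(recovered)  # 4821
```

Salting doesn't help when the salt must be known to every legitimate client; the table is just recomputed once per salt.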

As I understand it, the encrypted output needs to be at least as long as the modulus. So if we go with a 2048-bit key, each encrypted element will need to be at least 256 bytes. This will result in some fairly long URIs, but I think that should be OK for a while anyway, and that limit may not really apply to our solution.
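As a quick sanity check on those lengths: a 2048-bit RSA ciphertext is always 256 bytes, and unpadded URL-safe base64 encoding turns that into 342 characters per encrypted element. A sketch of the arithmetic:

```python
import base64
import os

# Stand-in for a 2048-bit RSA ciphertext: always 256 bytes, no matter how
# short the underlying identifier is.
ciphertext = os.urandom(256)

# URL-safe base64, padding stripped, as it would appear in a URI segment.
token = base64.urlsafe_b64encode(ciphertext).rstrip(b"=").decode()
print(len(token))  # 342 characters per encrypted path element
```

A handful of such elements plus a host and path stays well under common server URI limits (often around 8 KB), though intermediaries may impose their own.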

Does this approach seem valid, and/or is there anything else I am missing? Would it be valid to use the same key-pair that would be used for TLS? There probably needs to be some other information in the encrypted portion to avoid replay attacks or mapping an encrypted value back to a given key, correct?

NOTE: I should mention that for my specific problem, only authenticated parties would be able to call. This is not something that we mean to expose on the public internet, although mistakes can happen. All parties involved are legally bound to protect the data. I think it's worth exploring more general usage of such an idea because of the growing popularity of RESTful services.

JimmyJames
  • Can I offer an alternative: you do a POST with the identity of the resources you want to access in exchange for a token/signed request and use that for later GETs. In this way, each URL can still be logged and access tracked, but there's no single private key to leak. – billc.cn Dec 09 '16 at 18:15
  • @billc.cn This is definitely workable, but I think the scope of this question needs to stay with REST-style interfaces. I don't think the solution you propose really fits into that model, at least not in a simple way. – JimmyJames Dec 09 '16 at 19:24
  • you can use hashing if you concat any sensitive info with the sessionID before hashing, but just use tokens. – dandavis Dec 10 '16 at 09:39
  • Agree with the other answers. This is a bad approach all round. URIs are not designed to hold sensitive data; indeed, the exact opposite is true. You cannot guarantee the safety of keys in front-end systems, and there are potentially many systems that will/could log your URIs between your client and your server (assuming use of the Internet). TLS helps but doesn't eliminate that issue. – Julian Knight Dec 10 '16 at 15:11
  • @dandavis If you mean a server side session, then there is none. Sessions are verboten. – JimmyJames Dec 12 '16 at 18:04
  • @JulianKnight How would the intermediaries get at the encrypted URI? – JimmyJames Dec 12 '16 at 18:06
  • @LamonteCristo I don't think so. It's nothing like a session and it, like any good service design, is stateless. – JimmyJames Dec 12 '16 at 19:02

5 Answers

6

it seems there is one prime contender for a solution: encrypt parts of the URI, or all of it. Hashing does not seem to be an option since the data is highly predictable and calculating the hashes of the entire set of possible values is easy. Encrypting the entire URI would negate many of the advantages of the approach. Symmetric encryption requires distributing a secret key widely and therefore defeats the purpose. The option that seems best is to encrypt the sensitive data with a public key.

Your logic is sound.

I know of one product that does exactly as you suggest: sensitive data is encrypted in the end-user's browser by JavaScript code, and an HTTPS GET request is made in which some of the 'key=value' parameter pairs contain a public-key-encrypted, base64-encoded string.

The advantage is that these requests can be passed across networks with multiple TLS endpoints without the sensitive data being subject to decryption, and any web server logs or proxies in the middle won't have access to the sensitive data.

Does this approach seem valid and/or is there anything else I am missing?

It is valid, but here are the things you don't want to miss:

  1. You need to have a client with enough intelligence to perform the encryption for you. That means JavaScript, a custom app, or something similar.
  2. Don't roll your own crypto. Use a solid, reputable library and get a code review of your app.
  3. Key management. You need to generate keys, publish keys, rotate keys, and retire keys... and you need to do it carefully, because to do it right means to balance between losing control of the data and losing access to the data.

Would it be valid to use the same key-pair that would be used for TLS?

I would recommend against it. Aside from the general rule-of-thumb of "don't overload cryptosystems," there are some situations where that key isn't "trusted". One that I've run into is that DDoS mitigation services often like to have a copy of your web certificates and keys if they're fronting your traffic during an attack.

There probably needs to be some other information in the encrypted portion to avoid replay attacks

That depends; if the REST transaction you're working with is idempotent, then it's not that important. If it does change something, then yes, replay protection is more important.
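If replay protection is needed, one common pattern is to bind a timestamp and a nonce into the plaintext before it is encrypted; a minimal sketch of such a payload (all field names are illustrative, and the output here is what would go under the public-key encryption):

```python
import json
import os
import time

def make_payload(sensitive_id: str, client_id: str) -> bytes:
    # Bind the sensitive value to a caller, a timestamp, and a random nonce
    # before encrypting, so identical IDs never produce identical
    # ciphertexts and stale requests can be rejected server-side.
    return json.dumps({
        "id": sensitive_id,
        "client": client_id,
        "ts": int(time.time()),
        "nonce": os.urandom(8).hex(),
    }).encode()

def is_fresh(payload: bytes, max_age_s: int = 300) -> bool:
    # Server-side check after decryption: reject anything outside the window.
    return (time.time() - json.loads(payload)["ts"]) <= max_age_s
```

The nonce also gives the randomization mentioned below: two requests for the same key no longer produce matching ciphertexts that an observer could correlate.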

or mapping an encrypted value back to a given key, correct?

Yes. It's not absolutely necessary - depending on how many keys you use and how long their originator and recipient usage periods (OUP and RUP) are - but it's much easier not to have to guess which key :)

gowenfawr
  • If they are using TLS, then replay detection is already covered. Isn't it? – Limit Dec 11 '16 at 03:49
  • @Limit, remember that the fundamental condition driving this question is how to protect the content of a request given that logs and caches may reveal the data even with TLS in place - and if it were to be revealed, it could be replayed. – gowenfawr Dec 11 '16 at 06:07
  • "One that I've run into is that DDoS mitigation services often like to have a copy of your web certificates and keys" If that's the case, wouldn't the entire communication be at risk? The content that would be returned in the response is far more sensitive than the keys in the URL and would expose what those keys were in many/most cases. – JimmyJames Dec 12 '16 at 18:11
  • @Limit The replay concern is that if someone can grab the URI, they could do a GET on it and get the response data which would likely identify the sensitive keys and much much more. – JimmyJames Dec 12 '16 at 18:15
  • @JimmyJames if you are worried about leaking private information, you should implement a role-based access system. Replay detection alone is not a solution. There can be multiple legitimate requests, and if you don't use probabilistic encryption, the encryption output will always be the same. – Limit Dec 12 '16 at 18:19
  • @Limit All users are authenticated and authorizations are checked, but that doesn't mean that they don't have the rights to make the same requests. The fact that someone made request X about key Y could be sensitive in itself. By adding the user/client identity to the request and a timestamp, each request for the same keys should have a different URI. – JimmyJames Dec 12 '16 at 18:24
  • @JimmyJames in the software I'm familiar with credit card data goes in, low value chits go out, so there's no concern about the return data. Obviously that's going to vary by application; this issue is far from one-size-fits-all. (And, quite frankly, DDoS providers that require your keys freak me out all kinds of ways). – gowenfawr Dec 12 '16 at 18:28
  • @JimmyJames forgive my ignorance but are you inserting the user identity within the URI? REST supports better (read: safer) ways to send the user identity AFAIK. – Limit Dec 12 '16 at 18:29
  • @Limit No, that's likely to be accomplished with client-certs. Most of the client calls are coming from other servers. The sensitive data is generally about other people and there are laws governing protecting it with significant penalties involved. – JimmyJames Dec 12 '16 at 18:33
  • @gowenfawr In my case the keys range from technically sensitive per the regs (but give me a break, really?) to pretty sensitive, and the responses from GETs range from pretty sensitive to hugely, massively (holy shitballs!) sensitive. So being able to take a URL and do a GET is probably a bigger deal than replaying a POST, because a replayed POST will generally create an invalid condition and go to manual review. So given that, are there any other reasons not to use the same TLS keys? – JimmyJames Dec 12 '16 at 19:24
  • @JimmyJames the only real reason is "best practices"; history has taught us repeatedly that overloading keys leads to failures, _even if none of us can see anything wrong with it today_. And when I say "best practices," don't forget it means "...and every auditor or security analyst you ever get will pounce on key reuse like a weasel on a maple-syrup covered hot dog." – gowenfawr Dec 12 '16 at 20:44
  • @gowenfawr Good points. I'd be interested in some examples of how this led to failures in the past. If it's not too much trouble, could you add some links to your answer? – JimmyJames Dec 12 '16 at 20:51
3

I was reminded of this because it came up as a "popular question." Since I asked this I have learned that the OWASP recommendation is to use request headers:

  • In POST/PUT requests sensitive data should be transferred in the request body or request headers
  • In GET requests sensitive data should be transferred in an HTTP Header

While this makes the API somewhat more difficult to use, there is support for this approach in at least some common frameworks.

dbreaux
JimmyJames
  • I'll note that the OWASP recommendation is specifically about "Passwords, security tokens, and API keys". So, say, search criteria that might include things like name/address/etc. don't strictly fit this pattern and, IMO, seem more out-of-character for HTTP request headers. Also, note OWASP is going to naturally err on one side of the trade-off. It might or might not be the "better" side for a particular situation. – dbreaux Aug 19 '19 at 14:27
2

Does this approach seem valid and/or is there anything else I am missing?

Your approach is valid; however, I wouldn't dismiss using a pre-shared symmetric key, for the following reasons.

  • Asymmetric decryption is far slower than symmetric. If the service has to decrypt multiple values from your URI, performance could be impacted.
  • The attack vector for this encryption is limited. TLS should be protecting this data in most cases.
  • Different consumers can be given different pre-shared keys for added security (addressing your concern about a widely distributed symmetric key). This also allows for more flexibility with key rotation.
  • Asymmetric encryption will result in a large ciphertext. If your URI + query string are long enough, it could possibly cause a 414 (Request-URI Too Long).
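As a concrete illustration of the per-consumer key point, here is a minimal sketch (consumer names and the key-ID scheme are made up) of keeping a distinct pre-shared key per consumer so each can be rotated or revoked independently:

```python
import secrets

# Sketch: one pre-shared key per consumer, tagged with a key ID so the
# server knows which key decrypts a given request. All names illustrative.
keys = {
    "consumer-a": {"kid": "2016-12-a1", "key": secrets.token_bytes(32)},
    "consumer-b": {"kid": "2016-12-b1", "key": secrets.token_bytes(32)},
}

def rotate(consumer: str, new_kid: str) -> None:
    # Issue a fresh key for one consumer without touching the others.
    keys[consumer] = {"kid": new_kid, "key": secrets.token_bytes(32)}

rotate("consumer-a", "2017-01-a2")
```

In practice the key ID would travel with the request (e.g., as an unencrypted prefix) so a compromised consumer's key can be revoked without re-keying everyone.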

Would it be valid to use the same key-pair that would be used for TLS?

In theory you shouldn't, because if an attacker obtained your key-pair they could decrypt both your TLS connections and your encrypted headers.

... Now, from a realistic/support standpoint, if your TLS key-pair is obtained, that's a doomsday scenario. Having a second key-pair would not prevent a "full breach" event, since you'd likely have sensitive data in the headers/body that could be decrypted.

I think you need to carefully weigh the options for your specific business need.

rdChris
0

The more standard approach is to move sensitive information into the body.

Really, this comes down to your logging practices. You could make the above suggestion worthless by including the body in your logs, or you could sidestep the issue by not logging the full URI. The information is always still available to your server (even in your encrypted scheme, it has to be decrypted by at least some part of the stack).

It seems to me that, rather than implementing a complex scheme that bloats URLs, makes it harder to debug, and causes users to wonder what's happening, you'd be better off just changing the defaults on your logs to something that suits your environment.
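As one concrete illustration of that last point, a small redaction helper can strip the query string before a URI ever reaches the log (a sketch; most servers expose a log-format setting that achieves the same thing without code):

```python
from urllib.parse import urlsplit, urlunsplit

def redact_uri(uri: str) -> str:
    # Keep scheme/host/path for debugging; drop the query string and
    # fragment, where sensitive parameters would otherwise land in logs.
    parts = urlsplit(uri)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))

print(redact_uri("https://api.example.com/patients?ssn=123-45-6789"))
# https://api.example.com/patients
```

If sensitive values appear in path segments rather than the query string, the same idea applies but the filter must know which segments to mask.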

Xiong Chiamiov
  • I don't necessarily disagree with your answer, but it side-steps the question. As noted in the question, the assumption is that the URI contains the information and that you must protect it. While disabling the access logs is an option, it's risky because it's a default that could easily be re-enabled by accident. – JimmyJames Dec 09 '16 at 21:02
  • Sure, I'm just saying that the question starts with a poor assumption, and there are better ways of solving your actual problem. – Xiong Chiamiov Dec 09 '16 at 21:16
  • I respect your opinion but moving the data to the body creates other issues. The logging issue is not part of the solution. It's an artifact of old technology that is emulated without thinking. – JimmyJames Dec 09 '16 at 21:20
0

There are a few points that your approach hasn't considered:

  • encrypt parts of the URI or all of it

How will you encrypt the URI? Does the browser do it, or the web page? If the browser does it, how does the browser know which parts of the URI to encrypt? And if the web page does it, how will you get the TLS certificate of the server? Read this. Also, you can't encrypt the entire URL, otherwise the browser will not know which server it should talk to.

There probably needs to be some other information in the encrypted portion to avoid replay attacks or mapping an encrypted value back to a given key, correct?

If you are using TLS, then replay detection is already handled by TLS (Are SSL encrypted requests vulnerable to Replay Attacks?).

There are several discussions about encrypting URL parameters on Stack Overflow: 1, 2, and some other Google search results: It's a bad idea, how to do it in Java.

Limit
  • Comments are not for extended discussion; this conversation has been [moved to chat](http://chat.stackexchange.com/rooms/50059/discussion-on-answer-by-limit-hiding-sensitive-data-in-uris). – Rory Alsop Dec 13 '16 at 23:00