I need to generate a MAC signature for data on a website, but it's in an HTML templating system where only a few functions are available. The only hashing function available is md5. Ideally, I'd use a real HMAC function, but since that's not an option here, is it secure to use something like the below?

secret_key = '...';
data = '{"user_id": 123, "timestamp": 12345}';
signature = md5(data + secret_key);

I know that hmac-md5 is considered secure, and I know length extension attacks are possible if you use the construction md5(secret_key + data), but is the reverse, md5(data + secret_key) also insecure? It seems like it shouldn't be vulnerable to length extension since the key gets added to the end. Or is this not an issue anyway for the case where data is a JSON string like above, since you can't add characters on to the end of a valid JSON string and have it still be valid?

The specific use-case is that I have a CMS that I'm theming using Liquid templates, and the template needs to make requests to an API that I control. I need to know which user is making the request, and verify that they are in fact logged in on the CMS when the request arrives at my API. My solution is to use the above md5 signature system to sign some info about the user during Liquid template compilation, and then verify that signature on my API server. Since the secret key is shared between my API server and the Liquid template, I know the data hasn't been tampered with. The CMS I'm using is a hosted cloud service and provides no other way to verify the identity of the logged-in user that's available to JavaScript running on the page (at least that I know of).

David Chanin
  • https://crypto.stackexchange.com/q/40413/18298 – kelalaka Dec 29 '20 at 00:00
  • you can easily convert it HMAC-MD5 where the collision is not necessary. – kelalaka Dec 29 '20 at 00:01
  • Can I ask what you're trying to achieve here? The answer to this question has a lot of caveats and nuances that tend to lead to a very fragile implementation, so it might be better to understand the use-case and propose an alternative approach. Why do you feel that a MAC is necessary here? What's your threat model? (probably best to edit this into your question rather than reply in a comment) – Polynomial Dec 29 '20 at 00:23
  • @Polynomial just updated the description with use-case – David Chanin Dec 29 '20 at 00:40
  • @kelalaka I don't fully understand the question in that link, does that mean md5(data + key) is not secure? I can't write arbitrary code because this is all inside of a Liquid HTML template, and it doesn't let you write arbitrary code. There's just a small number of available functions you can use, and the dialect I'm stuck with only has plain md5, no other hashing functions. – David Chanin Dec 29 '20 at 00:46
  • https://crypto.stackexchange.com/questions/2669/attacks-of-the-mac-construction-mathcalhm-mathbin-k-for-common-hashes – kelalaka Dec 29 '20 at 00:50
  • @DavidChanin Is the MAC being calculated on the client-side? If so, the user needs to know the "secret key" you're referring to, so what's to stop a malicious authenticated user from taking that key and using it to sign any data they like, including requests that look like they're from other users? – Polynomial Dec 29 '20 at 00:53
  • @Polynomial It's in the template code, but the user only sees the compiled output. The code looks like `{% data | append: "...secret_key..." | md5 %}`, but that runs on the server. Only the resulting value appears in the compiled HTML, the key never shows up in the client. – David Chanin Dec 29 '20 at 01:00
  • @kelalaka thanks for sharing that. If I understand the answers correctly, it's actually better to do `hash(key + data)` instead of `hash(data + key)` for collision resistance, as long as length extension isn't an issue. length extension is an issue for `md5`, but since I'm using a JSON-encoded string as the data, I think that should be resistant to length extension anyway because you can't add meaningful characters to the end of a JSON string and still have it be valid JSON? So tldr it's basically OK? – David Chanin Dec 29 '20 at 01:08
  • @DavidChanin Are you sure there's no `hmac_sha1` or `hmac_sha256` function available, or `sha256` function available? From the looks of the documentation, most Liquid template implementations (e.g. Shopify, Tines, Braze, etc.) have those functions as string filters. – Polynomial Dec 29 '20 at 01:08
  • @Polynomial Yeah, sadly, the Liquid variant this CMS is using doesn't have those functions, only md5. I tested all of them :( – David Chanin Dec 29 '20 at 01:09
  • Related: https://security.stackexchange.com/questions/79577/whats-the-difference-between-hmac-sha256key-data-and-sha256key-data – mti2935 Dec 29 '20 at 14:41

1 Answer

No, md5(data+key) is not secure. MD5 is vulnerable to dirt-cheap collision attacks. An attacker can craft a data1 that is innocuous and that your system, or the code that's calling your system, accepts as valid, and then later submit a data2 that is malicious but such that md5(data1+key) = md5(data2+key), all without knowing key (the colliding messages produced by known attacks stay in collision when any common suffix, here the key, is appended).

md5(key+data) would be more secure in this respect, because generating collisions would require knowing the key. However, it has other weaknesses, in particular a length extension attack, which allows computing md5(key+data2) from md5(key+data1) without knowing the key. The attack doesn't allow much freedom on the content of data2 given data1 (data2 must start with data1 followed by MD5's internal padding, and only what comes after that is attacker-chosen), but sometimes all it takes is to arrange for an admin bit to be 1 instead of 0, and all it might take is a few tries of different data1 until the corresponding data2 works for the attacker.
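To make the shape of a length-extended message concrete, here is a minimal Node.js sketch (not part of the original answer) that computes the padding MD5 appends internally. The forged data2 is data1, then that padding, then a suffix chosen by the attacker. The 16-byte key length and the suffix are hypothetical values picked for illustration. The sketch only builds the forged message; producing the matching digest would additionally require resuming MD5 from the known hash value, which is not shown.

```javascript
// Padding MD5 appends to a message of `messageLength` bytes:
// one 0x80 byte, zero bytes up to 56 mod 64, then the bit length
// as a 64-bit little-endian integer.
function md5Padding(messageLength) {
  // Zero bytes needed so that length + 1 + zeros + 8 is a multiple of 64.
  const zeroCount = (64 - ((messageLength + 9) % 64)) % 64;
  const padding = Buffer.alloc(1 + zeroCount + 8); // zero-filled
  padding[0] = 0x80;
  padding.writeBigUInt64LE(BigInt(messageLength) * 8n, 1 + zeroCount);
  return padding;
}

// Hypothetical: the attacker knows or guesses the key length, not the key.
const keyLength = 16;
const data1 = Buffer.from('{"user_id": 123, "timestamp": 12345}');
const suffix = Buffer.from(', "admin": 1}'); // hypothetical attacker-chosen bytes

// The only messages the attack can sign: data1 + padding + suffix.
const data2 = Buffer.concat([
  data1,
  md5Padding(keyLength + data1.length),
  suffix,
]);

console.log(data2.toString('latin1'));
```

Note that data2 contains a 0x80 byte and null bytes in the middle, so whether it is accepted depends entirely on how the receiving code parses it.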

The good news is that if you have a hash, you do have the corresponding HMAC. HMAC is a simple construction that just requires two applications of the hash function. In JavaScript-like pseudocode (I think this is valid code if xor is a function that performs xor on byte strings and pads key with null bytes if it's shorter, but I'm not competent in JavaScript and it's possible that string encodings make this a bit more difficult):

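function hmac_md5(key, data) {
  // MD5's block size is 64 bytes, so both pads are 64 bytes long.
  // md5() must return raw bytes here, not a hex string.
  // Outer hash: (key xor opad) followed by the inner digest.
  return md5(xor(key, '\x5c'.repeat(64)) +
             // Inner hash: (key xor ipad) followed by the data.
             md5(xor(key, '\x36'.repeat(64)) +
                 data));
}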
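As a cross-check on the pseudocode above, the following Node.js sketch builds HMAC-MD5 from nothing but a plain MD5 and a byte-wise XOR, then compares the result with the HMAC that ships in Node's crypto module. The helper names (md5, xorWithByte, hmacMd5) and the sample key and message are made up for illustration; this assumes a Node.js environment, which the templating system in the question does not provide.

```javascript
const { createHash, createHmac } = require('crypto');

const BLOCK_SIZE = 64; // MD5 processes input in 64-byte blocks

// Plain MD5 returning raw bytes (a Buffer), not a hex string.
function md5(bytes) {
  return createHash('md5').update(bytes).digest();
}

// XOR every byte of `bytes` with the single byte `pad`.
function xorWithByte(bytes, pad) {
  const out = Buffer.alloc(bytes.length);
  for (let i = 0; i < bytes.length; i++) out[i] = bytes[i] ^ pad;
  return out;
}

function hmacMd5(key, data) {
  let k = Buffer.from(key);
  // Keys longer than one block are hashed first; shorter keys are zero-padded.
  if (k.length > BLOCK_SIZE) k = md5(k);
  k = Buffer.concat([k, Buffer.alloc(BLOCK_SIZE - k.length)]);

  const inner = md5(Buffer.concat([xorWithByte(k, 0x36), Buffer.from(data)]));
  return md5(Buffer.concat([xorWithByte(k, 0x5c), inner]));
}

// Compare the hand-built version with Node's built-in HMAC.
const sampleKey = 'example-key'; // hypothetical
const sampleData = '{"user_id": 123, "timestamp": 12345}';
const builtin = createHmac('md5', sampleKey).update(sampleData).digest();
console.log(hmacMd5(sampleKey, sampleData).equals(builtin));
```

The two details that are easy to get wrong are the pad length (the hash's block size, not its output size) and feeding the inner digest to the outer hash as raw bytes.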

If expressing xor is really impossible, my recommendation would be to keep the important parts of HMAC, but avoid the known pitfalls (collisions, length extension, algebraic relations). You'd end up with a construction that hasn't been studied, but at least you can avoid known weaknesses within your constraints. The following construct keeps the gist of HMAC, which is to do hash(key+_) with two different keys (here with a constant prefix to ensure that the two hash invocations don't start with a common prefix). It's homemade and hasn't been studied, so it isn't great, but it's better than using a construct which has been studied and broken.

function homemade_mac(key, data) {
  return md5('a' + key +
             md5('bc' + key +
                 data));
}
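On the receiving side, the API server has to recompute the same value and compare it with the signature sent by the page. Below is a minimal Node.js sketch of that check, assuming the template's md5 filter emits a lowercase hex string (so the inner digest is appended as hex text) and that strings are UTF-8 encoded. The function names and the test key are hypothetical, and the server in the question may well be written in another language.

```javascript
const { createHash, timingSafeEqual } = require('crypto');

// MD5 as a lowercase hex string (assumed to match the template's md5 filter).
function md5Hex(text) {
  return createHash('md5').update(text, 'utf8').digest('hex');
}

// Same construction as homemade_mac above, with the inner digest as hex text.
function homemadeMac(key, data) {
  return md5Hex('a' + key + md5Hex('bc' + key + data));
}

// Recompute the MAC and compare it with the received one in constant time.
function verify(key, data, signature) {
  const expected = Buffer.from(homemadeMac(key, data), 'utf8');
  const received = Buffer.from(String(signature), 'utf8');
  // timingSafeEqual throws on unequal lengths, so check the length first.
  return expected.length === received.length &&
         timingSafeEqual(expected, received);
}

// Usage with a hypothetical key; the data string must match byte for byte
// what the template signed.
const key = 'test-key';
const data = '{"user_id": 123, "timestamp": 12345}';
const signature = homemadeMac(key, data);
console.log(verify(key, data, signature));
```

The server should verify the exact string it received before parsing it as JSON, since re-serializing the JSON can change whitespace or key order and with it the MAC.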

Using HMAC-MD5, or even worse a homemade construct, is not ideal, and you should arrange to upgrade your system to support an unbroken hash (SHA-2 or SHA-3), if only because it's a sign that other parts of the system are probably using MD5 in a vulnerable way. But HMAC-MD5 does not have any known vulnerability (other than being a bit too short at 16 bytes, but that at most leads to very expensive attacks which only become plausible if you have billions of data objects in your system).

Gilles 'SO- stop being evil'
  • sadly in this specific case `xor` isn't available since this is all happening inside of liquid HTML templates which don't have that function. I wish I could upgrade this system, but it's a cloud service that I have no control over. Would `md5(key + data + key)` get the best of both worlds then, basically preventing both collision attacks and length extension attacks? Or is `md5(key + data)` already OK since the data is a JSON-object string, which won't be valid if extra chars are added to the end? – David Chanin Dec 29 '20 at 15:42
  • @DavidChanin `xor` is a function that you have to write for yourself (or probably copy from Stack Overflow). AFAIR `md5(key+data+key)` isn't safe either but the attacks are more subtle. If xor is really a problem, I suggest `md5('a' + key + md5('bc' + key + data))`, which keeps the gist of HMAC (double hashing involving the key both times, but not exactly the same prefix for both hash invocations). As far as I know, this hasn't been studied and lacks some extra robustness granted by flipping key bits, but there are no known attacks that this extra robustness prevents. – Gilles 'SO- stop being evil' Dec 29 '20 at 15:54
  • Aaah yes, doing `md5('a' + key + md5('bc' + key + data))` is doable in the templating system! I know it's absurd that I can't just use a hmac like a sane person, but all that's available in the templating language is `md5`, and you can't write arbitrary code - you can just run strings through a few premade functions like `md5` and `append`, etc... – David Chanin Dec 29 '20 at 16:15