24

As I learned in a comment for How to encrypt in PHP, properly?, I was told using a string comparison like the following in PHP is susceptible to timing attacks. So it should not be used to compare two MACs or hashes (also password hashes) for equality.

if ($hash1 === $hash2) {
   //mac verification is OK
   echo "hashs are equal"
} else {
  //something bad happenend
  echo "hashs verification failed!";
}

Can someone please detail me what exactly the problem is, how an attack would look like and possibly provide a secure solution that avoid this particular problem. How should it be done correctly? Is this a particular Problem of PHP or do other languages like e.g. Python, Java, C++, C etc. have the same issues?

evildead
  • 624
  • 1
  • 4
  • 14
  • yes, and no. I made the question a little bit more general, whereas http://security.stackexchange.com/questions/74547/timing-attack-against-hmac-in-authenticated-encryption is more specific and only talks about MACs. I guess this question here is more general. Any other opinions? – evildead Mar 12 '15 at 22:27

3 Answers3

27

The problem here is that generic string comparison functions return as soon as they find a difference between the strings. If the first byte is different, they return after just looking at one byte of the two strings. If the only difference is in the last byte, they process both entire strings before returning. This speeds things up in general, which is normally good. But it also means someone who can tell how long it takes to compare the strings can make a good guess where the first difference is.

In an attack scenario, an attacker has total control of $mac1 (it's taken from the attacker-made message), while $mac2 is the real valid MAC for the attacker's message. $mac2 must remain secret from the attacker, or he can stick it on his message and thus forge a valid message. The attacker, by analyzing the time it takes to get a response, can probably figure out where the first difference is between his MAC and the real one. He can try all possibilities for that one byte, find the correct one, and then work on the next byte secure in the knowledge that the first k bytes are right. At the end, he tried just 256*len MACs (if len is the length of the MAC) instead of the 256^len he should have had to try.

cpast
  • 7,263
  • 1
  • 30
  • 35
  • 4
    The optimisation feature you're describing is called "early-out optimisation", just in case anyone wants to look into it further. – Polynomial Jun 30 '16 at 09:47
  • @cpast do you have a reference regarding the 256*len vs 256^len combinations difference? – Wadih M. Jan 31 '17 at 02:47
  • @WadihM. It's simple math. 256 possible byte values for each position in the MAC. If you can tell the first wrong one, you only need 256*length tries at the most. If you must try every single possibility, it's essentially a base 256 number with length digits, so it's 256^length combinations. – mbomb007 Apr 26 '18 at 17:06
  • the problem is not that the attacker knows the MAC, the problem is that the attacker can use the timing information as a side-channel to create an oracle that can be used to tell the attacker a valid MAC for his arbitrary message. Your explanation how the attack actually works is not correct. The way you are describing it, does not work and is not the purpose for doing comparison in constant time. – evildead Nov 20 '19 at 00:04
  • @evildead No, my description is *exactly* how it works. "Use the time information as a side-channel to create an oracle that can be used to tell a valid MAC" is exactly what I said, except rewritten to use fancy technical terminology. "Analyzing the time to get a response" is using timing as a side-channel. Using that to find the first difference is your oracle. The `$mac2` he ultimately discovers is the valid MAC for the message. If you think there's some incorrect aspect of the answer, can you explain what is actually wrong? – cpast Nov 20 '19 at 00:24
  • @cpast After carefully reading you answer, I guess you are meaning the correct thing and I got confused by your explanation. You are saying that `mac2` must remain secret to the attacker. I think what you really wanted to express is that `mac2` must not be _computable_ by the attacker by using an oracle that implemented string comparison function incorrectly. Maybe that's a picky comment, but the term secret confused me to believe that you meant that an attacker must not have access to a valid mac for a valid messages. (Which is not what you meant if I interpret you correctly now, right?) – evildead Dec 08 '19 at 12:12
24

I will add a list with time constant functions for different languages:

PHP:

Discussion: https://wiki.php.net/rfc/timing_attack

bool hash_equals ( string $known_string , string $user_string )

http://php.net/manual/en/function.hash-equals.php

Java Discussion : http://codahale.com/a-lesson-in-timing-attacks/

public static boolean  MessageDigest.isEqual(byte[] digesta, byte[] digestb)

http://docs.oracle.com/javase/7/docs/api/java/security/MessageDigest.html#isEqual(byte[],%20byte[])

C/C++ Discussion: https://cryptocoding.net/index.php/Coding_rules

int util_cmp_const(const void * a, const void *b, const size_t size) 
{
  const unsigned char *_a = (const unsigned char *) a;
  const unsigned char *_b = (const unsigned char *) b;
  unsigned char result = 0;
  size_t i;

  for (i = 0; i < size; i++) {
    result |= _a[i] ^ _b[i];
  }

  return result; /* returns 0 if equal, nonzero otherwise */
}

More I found here: http://www.levigross.com/2014/02/07/constant-time-comparison-functions-in-python-haskell-clojure-java-etc/

Python (2.x):

#Taken from Django Source Code

def constant_time_compare(val1, val2):
    """
    Returns True if the two strings are equal, False otherwise.

    The time taken is independent of the number of characters that match.

    For the sake of simplicity, this function executes in constant time only
    when the two strings have the same length. It short-circuits when they
    have different lengths.
    """
    if len(val1) != len(val2):
        return False
    result = 0
    for x, y in zip(val1, val2):
        result |= ord(x) ^ ord(y)
    return result == 0

Python 3.x

#This is included within the stdlib in Py3k for an C alternative for Python 2.7.x see https://github.com/levigross/constant_time_compare/
from operator import _compare_digest as constant_time_compare

# Or you can use this function taken from Django Source Code

def constant_time_compare(val1, val2):
    """
    Returns True if the two strings are equal, False otherwise.

    The time taken is independent of the number of characters that match.

    For the sake of simplicity, this function executes in constant time only
    when the two strings have the same length. It short-circuits when they
    have different lengths.
    """
    if len(val1) != len(val2):
        return False
    result = 0
    for x, y in zip(val1, val2):
        result |= x ^ y
    return result == 0

Haskell

import Data.Bits
import Data.Char
import Data.List
import Data.Function

-- Thank you Yan for this snippet 

constantTimeCompare a b =
  ((==) `on` length) a b && 0 == (foldl1 (.|.) joined)
  where
    joined = zipWith (xor `on` ord) a b

Ruby

def secure_compare(a, b)
     return false if a.empty? || b.empty? || a.bytesize != b.bytesize
     l = a.unpack "C#{a.bytesize}"

     res = 0
     b.each_byte { |byte| res |= byte ^ l.shift }
     res == 0
   end

Java (general)

// Taken from http://codahale.com/a-lesson-in-timing-attacks/
public static boolean isEqual(byte[] a, byte[] b) {
    if (a.length != b.length) {
        return false;
    }

    int result = 0;
    for (int i = 0; i < a.length; i++) {
      result |= a[i] ^ b[i]
    }
    return result == 0;
}
Benoit Esnard
  • 13,979
  • 7
  • 65
  • 65
evildead
  • 624
  • 1
  • 4
  • 14
  • Thanks for providing these, particularly the C version. This can be adapted to any language, for example if your PHP needs to be compatible with PHP < 5.6 which doesn't have the "hash_equals" function. – thomasrutter Oct 28 '16 at 04:14
  • 1
    Don't check length first and then fall out, since that leaks length info. ALWAYS hash each string and then check length of original. – fatal_error Jan 28 '17 at 17:48
  • Besides the C version, OpenSSL has [CRYPTO_memcmp](https://www.openssl.org/docs/manmaster/man3/CRYPTO_memcmp.html) which you could use with implementations in assembly. Note that you _must_ check string size equality or hash strings yourself, since it compares only up to the specified `len`. – Rafe Aug 18 '19 at 13:54
3

Timing attacks against string comparisons are not PHP-specific. They work in any context where a user-provided string is checked against a secret string using the standard “short circuit” comparison algorithm (the check stops on the first non-matching byte). This applies to PHP, Python, C and even database systems like MySQL.

The standard approach to this problem is to always iterate over all bytes of the string, regardless of the content. As pseudo code:

function safe_string_comp(str_1, str_2):
    if byte_length(str_1) =/= byte_length(str_2):
        return FALSE
    else:
        comparison_bit := 0  // 0 if the strings match, 1 otherwise
        for i := 0, i < byte_length(str_1), i := i + 1:
           comparison_bit := comparison_bit | (str_1[i] ^ str_2[i])

        return comparison_bit == 0

The symbol | denotes bit-wise OR operator, and ^ is the bit-wise XOR.

Recent PHP versions (>= 5.6.0) already have a built-in function named hash_equals. If it's not available, the above algorithm needs to be implemented. So a timing-safe string comparison function may look like this:

<?php

/**
 * Count the number of bytes in a string.
 *
 * Note that the strlen() function is ambiguous, because it will either return the number of *bytes* or the
 * number of *characters* with regard to mb_internal_encoding(), depending on whether the Mbstring extension
 * has overloaded the string functions:
 * http://php.net/manual/en/mbstring.overload.php
 *
 * For example, the non-overloaded strlen() function returns 2 for the string "\xC3\x84". However, if the
 * function is overloaded and the internal encoding set to UTF-8, the same string is interpreted as a single
 * character, namely the "Ä" umlaut. So the function returns 1 in this case.
 */
function byte_length($binary_string)
{
    if (extension_loaded('mbstring'))
        return mb_strlen($binary_string, '8bit');
    else
        return strlen($binary_string);
}



/**
 * Timing-safe string comparison.
 *
 * The standard string comparison algorithm stops as soon as it finds a non-matching byte. This leaks information
 * about the string contents through time differences, because the longer the common prefix, the longer the
 * comparison takes (e. g. checking "aaax" against "aaaa" theoretically requires slightly more time than checking
 * "xaaa" against "aaaa").

 * To avoid this problem in security contexts like MAC verification, iterate over *all* bytes of the strings
 * regardless of the content.
 */
function secure_string_equals($string_1, $string_2)
{
    // Use built-in hash_equals() function if available (PHP >= 5.6.0)
    if (function_exists('hash_equals'))
    {
        return hash_equals($string_1, $string_2);
    }
    else
    {
        $equals = false;

        if (!is_string($string_1) || !is_string($string_2))
        {
            trigger_error('One of the arguments is not a string.', E_USER_ERROR);
        }

        if (byte_length($string_1) == byte_length($string_2))
        {
            // 0 if the strings are equal, 1 otherwise
            $comparison_bit = 0;
            for ($byte_index = 0; $byte_index < byte_length($string_1); $byte_index++)
            {
                $comparison_bit |= ord($string_1[$byte_index]) ^ ord($string_2[$byte_index]);
            }

            $equals = ($comparison_bit == 0);
        }

        return $equals;
    }
}
Fleche
  • 4,024
  • 1
  • 17
  • 20