-8

I have also asked a similar question here:

To create variants of MD5, I made the following changes:

  1. MD5 derives its constants from the non-linear function sin(i) * pow(2,32); I plan to use cos(i) * pow(2,32) instead.
  2. Instead of the original values of A, B, C, D (the four initial seeds, or states, which change additively as the input text is processed), I plan to start with values different from those given in MD5's RFC.
  3. I would also replace the functions F, G, H, I (in MD5's RFC) with others used in SHA or another algorithm.

I just want to know what effect these changes would have on the properties of MD5.
I want to use the variant as a good hash function. I am not using the MD5 variant for authentication.

I have created four variants using the above ideas and tested them on inputs for 10 to 20 minutes, and they work fine. Am I doing this correctly?

  • 4
    Why would you want to do that? – CodesInChaos Oct 08 '12 at 10:06
  • Its client requirement only. Also I want to know. – Grijesh Chauhan Oct 08 '12 at 10:31
  • 9
    I cannot advise against this strongly enough. **Never roll your own crypto.** Use a proper RNG, designed for this purpose. Windows exposes [`CryptGenRandom`](http://msdn.microsoft.com/en-us/library/windows/desktop/aa379942.aspx) as part of its CSP, and Linux-like OSes have `/dev/urandom`. – Polynomial Oct 08 '12 at 12:49
  • 1
    Is this for a uni project of some sort? I can see you may get some extra marks for this but in practice it's pointless and is more likely to make holes in your security than reinforce it. – Inverted Llama Oct 08 '12 at 15:00
  • 3
    @InvertedLlama If I had rolled my own crypto in a uni project, my professors would have bitchslapped me to hell for it. **Never ever roll your own crypto, full stop.** – Polynomial Oct 08 '12 at 19:00
  • 4
    For the love of Schneier, why??? – Gilles 'SO- stop being evil' Oct 08 '12 at 23:50
  • Hashes are not random. –  Sep 15 '12 at 13:54
  • What is the problem with the output of MD5? –  Oct 08 '12 at 23:25
  • Lack of collision does not correlate to random - a monotonically increasing sequence will not collide and is certainly not random - changing sin() to cos() is (roughly) an inversion, but the rest of the alg may not feel good if you do this - the seeds you want to fiddle with are not random numbers - just use an existing algorithm or buy a hardware generator – Mark Mullin Oct 10 '12 at 17:19
  • @Polynomial Actually I just wanted to know how the variants affect the properties of MD5. I did it just for fun, learning and understanding. I mentioned random number generation by mistake; I just wanted to check whether my variants are weaker than the original. I posted a good question in the wrong way. Thanks for your response. – Grijesh Chauhan Jul 08 '13 at 19:59
  • @InvertedLlama No, I was just learning MD5. I wanted to explore why the functions are written that way and how the functions in my variants would affect MD5. – Grijesh Chauhan Jul 08 '13 at 20:01
  • http://security.stackexchange.com/questions/21339/varient-of-md5-and-sha#comment35080_21341 – Grijesh Chauhan Jul 08 '13 at 20:04
  • @Gilles For the love of understanding!! Just for fun – Grijesh Chauhan Jul 08 '13 at 20:08
  • @CodesInChaos Just to understand how the original constants can affect MD5... just to understand. – Grijesh Chauhan Jul 08 '13 at 20:09
  • @Polynomial At least tell us why we should never do it... This got me curious – nl-x Oct 04 '17 at 08:31
  • @nl-x See this question: https://security.stackexchange.com/questions/18197/why-shouldnt-we-roll-our-own – Polynomial Oct 06 '17 at 15:20

5 Answers

7

SHORT ANSWER:

Most likely, your RNG looks random to you but doesn't pass the simplest of tests (range tests, mean tests, variance tests, bucket tests). Why ask us? The statistics PhDs have a ton of simple to code tests designed specifically for RNGs, go test your algorithm.
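
To give an idea of what such tests look like, here is a minimal Python sketch of a mean test and a bucket (chi-square) test on byte streams. The function names are hypothetical and chosen for illustration, and two tests are nowhere near a real RNG test suite; passing them proves very little, while failing them is a clear red flag.

```python
import os
from collections import Counter


def byte_mean(data: bytes) -> float:
    """Mean of the byte values; close to 127.5 for uniformly distributed bytes."""
    return sum(data) / len(data)


def chi_square_buckets(data: bytes) -> float:
    """Chi-square statistic over the 256 possible byte values (one bucket each)."""
    expected = len(data) / 256
    counts = Counter(data)
    return sum((counts.get(v, 0) - expected) ** 2 / expected for v in range(256))


sample = os.urandom(1 << 16)                     # output of the OS generator
counter = bytes(i % 4 for i in range(1 << 16))   # an obviously non-random stream

print("urandom:", byte_mean(sample), chi_square_buckets(sample))
print("counter:", byte_mean(counter), chi_square_buckets(counter))
```

For uniform bytes the chi-square statistic should land in the neighbourhood of 255 (the number of degrees of freedom); the repeating counter stream is off by several orders of magnitude.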

LONG ANSWER:

Never, ever, EVER roll your own cryptographic algorithm. Almost certainly, you don't have the skills required to do it securely. Instead, survey the available algorithms, determine which one meets your requirements, then BE SURE TO USE IT PROPERLY.

In this case, if you don't like your options from the crypto libraries available to you, search for a cryptographic PRNG (Pseudo Random Number Generator) and find a way to get it some pretty good seeds (no, current time is not a good seed, at least if it's the only seed).

HASH functions like MD5, SHA-1 and Keccak (SHA-3) are not good random number generators. They weren't designed for that purpose and do not pass the most basic of RNG tests. If you want to know more about RNG testing, a search for RNG TESTS will give you more information. Most secure RNGs are implemented on top of secure PRNGs and given good seeding. The seeding can be Hashed, but hashing does nothing to increase security of the seed or the PRNG. Seeds are measured in Entropy and Entropy is equivalent to NON-GUESSABLE bits (information). However, total number of bits doesn't increase entropy. That's why current time isn't a very secure seed. Yes, it is unique (most of the time), but if outsiders can use good guesses to predict the seed time, then they can also predict the output of your PRNG!
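
The point about guessable seeds can be sketched in a few lines of Python. Everything here is hypothetical: `generate_token` stands in for any generator seeded only with a timestamp, the timestamp value is made up, and Python's `random.Random` (a Mersenne Twister, not a cryptographic PRNG) is used purely to keep the sketch short.

```python
import random


def generate_token(seed: int) -> int:
    """Hypothetical token generator whose only seed is a Unix timestamp."""
    rng = random.Random(seed)
    return rng.getrandbits(128)


def recover_seed(token: int, approx_time: int, window: int = 3600):
    """Try every second within +/- window of the attacker's guess of the time."""
    for candidate in range(approx_time - window, approx_time + window + 1):
        if generate_token(candidate) == token:
            return candidate
    return None


secret_seed = 1349690760          # made-up timestamp used as the only seed
token = generate_token(secret_seed)

# The attacker only knows the token and roughly when it was generated.
guessed = recover_seed(token, approx_time=1349690000)
print(guessed == secret_seed)
```

The token is 128 bits long, but the seed only hides a few thousand possibilities, so the search finishes almost instantly. The length of the output does not add entropy.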

"Repetition should be rare" - I'm not sure what you mean by this, but you can either use really long output from your RNG (512 - 1024 bits) or perhaps you're trying to implement some other algorithm (like choosing cards from a deck).

Andrew Philips
7

Warning: some parts of this answer might be unpleasant.


To answer the specific question of how your changes alter the characteristics of MD5, we must first restate what the MD5 security characteristics are. MD5 is a cryptographic hash function, so it is supposed to be resistant to collisions, preimages and second preimages. MD5 is not good at resisting collisions, since efficient methods for building collisions have been discovered since 2004. My own code (an implementation of Klíma's attack) produces, on average, one collision every 14 seconds, when it runs on a 2.4 GHz Intel Core2 CPU. As far as we know, MD5 seems to be strong against preimages and second preimages (a theoretical attack with cost 2^123.4 has been described, and that's better than the generic attack of cost 2^128, but not much better).

How do your variants fare? Although this depends a lot on the precise modifications you intend, we can say the following:

  • Changing the round constants, replacing the sine with a cosine, should have no influence whatsoever on the security. None of the published attacks exploits any special structure of the constants; and, indeed, the known collision attacks are differential cryptanalysis which is oblivious to actual constant values (at least for the attack's efficiency).

  • Changing the fixed IV should have no bearing either. When the first MD5 collision was publicized, the result was much decried because the researchers got the endianness wrong and thus computed the collision not on MD5, but on some other function which differs from MD5 precisely on the IV. Their method did not depend on specific IV characteristics, so a few days later they ran the code again, this time with the right IV, and produced the real first collision on MD5.

  • Altering the bitwise functions in the rounds will have an impact, since it will change the differential paths which collision attacks use. However, the functions that MD5 uses are not especially weak; there is no indication that any other choice would make the function stronger. There is, however, a good probability that other functions might make any MD5 variant substantially weaker.

To sum up, among the changes that you suggest, there are some changes which should not change security, and some others which have a rather high probability of decreasing security quite a bit -- and beginning with a function which is already broken. That's an achievement, but not a very positive one.
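
For reference, the two "harmless" changes discussed above touch the following values. This is a small Python sketch: the sine table and the initial state are the ones defined in RFC 1321, while `cosine_table` is a hypothetical stand-in for the asker's variant and is not part of any standard.

```python
import math


def md5_sine_table():
    """The 64 MD5 constants from RFC 1321: floor(abs(sin(i)) * 2**32), i = 1..64."""
    return [int(abs(math.sin(i)) * 2**32) & 0xFFFFFFFF for i in range(1, 65)]


def cosine_table():
    """Hypothetical variant table using cos instead of sin (this is not MD5)."""
    return [int(abs(math.cos(i)) * 2**32) & 0xFFFFFFFF for i in range(1, 65)]


# Standard MD5 initial state (A, B, C, D) from RFC 1321.
MD5_IV = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476)

sine = md5_sine_table()
cosine = cosine_table()

print(hex(sine[0]))      # 0xd76aa478, the first constant listed in the RFC
print(sine == cosine)    # False: the variant uses a different table
```

Either table is just a list of 64 fixed 32-bit words with no particular structure, which is why swapping one for the other is not expected to help or hurt against the known differential attacks.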


It seems that you want to use your MD5 variant to generate randomness. It must be said that:

  1. Using a hash function to build a PRNG yields poor performance. Hash functions like MD5 are very good at processing a lot of input data -- but for a PRNG, you want to spew a lot of output data, and hash functions usually suck at it. A properly optimized MD5 function, used to hash, e.g., successive values of a counter, might get you 100 MB/s worth of pseudo-random data, which is not bad, but any decent AES implementation on the same CPU will be almost twice as fast.

  2. Using a hash function to build a PRNG yields poor security. To get proper randomness out of a hash function, you need to believe that the function is a random oracle. "Believe" is the right word: we already know that MD5 (or SHA-1 or SHA-256) is not a random oracle (the length extension attack is enough to show it, and it can even be proven that it is not ultimately possible, in a very mathematical way, for a concrete function to be a random oracle). To make a sturdy PRNG from a hash function, it is best to use a more elaborate (but more expensive) construction, namely HMAC_DRBG.

  3. Regardless of how you extend an initial random seed into a long sequence of pseudo-random bits, obtaining that initial seed is a harder problem. Many have failed.
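
The "hash successive values of a counter" construction mentioned in the first point can be sketched as follows. This is only an illustration of the idea under simple assumptions (an 8-byte big-endian counter appended to the seed, a hypothetical function name); it is not HMAC_DRBG and not something to deploy.

```python
import hashlib


def md5_counter_stream(seed: bytes, n_bytes: int) -> bytes:
    """Expand seed by hashing seed || counter for counter = 0, 1, 2, ..."""
    out = bytearray()
    counter = 0
    while len(out) < n_bytes:
        out += hashlib.md5(seed + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:n_bytes])


stream_a = md5_counter_stream(b"example seed", 40)
stream_b = md5_counter_stream(b"example seed", 40)
print(stream_a == stream_b)   # True: the output is fully determined by the seed
```

Each call to the hash yields only 16 bytes, and anyone who learns or guesses the seed can regenerate the entire stream, which is where the third point comes in.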

Considering the flimsy and tenuous nature of the security properties of a hash function on which a hash-based PRNG relies, altering that hash function without any rational reason does not look like the best way to achieve security. It shall be noted that being a "perfect" hash function, with rock solid resistance against collisions and all kinds of preimages, does not make it good for a PRNG job. Being appropriate at being integrated in a PRNG is another security characteristic, that given hash functions may or may not exhibit; and it has been much less studied than resistance to collisions.


Your methodology sounds, at best, dubious. Things look as if you were throwing arbitrary tweaks at MD5, without any justification or rationality. This is more ritualistic than scientific. I have nothing against religions; however, while theosophical reasoning might give you good insights into Good and Evil, it is known to be inefficient at thwarting evildoers on an immediate basis. Jesus ensured redemption for us all (at least so goes the theory, according to the pope), but he still got nailed to the cross and died.

To sum up, your variants are unlikely to increase the security of MD5, while they are likely to decrease it; and the baseline is that MD5 is not strong to begin with; and even if it was strong as a hash function, it would not necessarily be strong as a PRNG.

You are basically taking the problem in the totally wrong direction; in fact, in several totally wrong directions simultaneously. Before even thinking about assessing the appropriateness of any given algorithm for a job, you should first define that job with precision. If the job is generating randomness (to be precise, generating bytes which appear random, aka "unpredictable" for outsiders -- a computer being the ultimate deterministic machine, it cannot be really random), then using a hash function is not the smartest idea ever. Even less so a hash function of questionable repute like MD5. And twisting the function internals in the hope of making the function "better" is akin to performing brain surgery with a soldering iron: at least, it will make the failure indisputably spectacular.

If you need randomness, use what your operating system provides (/dev/urandom, CryptGenRandom(), os.urandom, java.security.SecureRandom... depending on your OS and programming environment). The OS is better at it than you. Just let it do its job.
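
In Python, for instance, that amounts to a couple of standard-library calls. This is a minimal sketch; the sizes and the list of choices are arbitrary examples.

```python
import os
import secrets

key = os.urandom(16)                    # 16 bytes from the OS generator
token = secrets.token_hex(16)           # 16 random bytes as 32 hex characters
pick = secrets.choice(["a", "b", "c"])  # unpredictable selection from a list

print(len(key), len(token), pick)
```

The `secrets` module draws from the same OS source as `os.urandom`, so there is nothing to seed and no algorithm to choose.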

Thomas Pornin
  • Excellent Boss, you got me – Rahul Gautam Oct 12 '12 at 06:34
  • @Thomas Pornin: Thanks a Lots... I was waiting since so long...:) – Grijesh Chauhan Oct 12 '12 at 07:04
  • Actually I didn't want to generate random numbers. I just made some modifications to MD5 for fun and learning, and wanted to check whether the variants I created affect the strength of MD5's properties and what the result would be. To check this I used random numbers for 10-20 minutes, and as I expected the changes were not noticeable. I was also aware that I shouldn't use a modified MD5 (I am not a mathematician). Initially I posted my question in the wrong way, so it got too many down-votes. I just wanted to say thanks; you answered what I wanted. Other people couldn't understand my changes. – Grijesh Chauhan Jul 08 '13 at 19:43
3

You should not be implementing your own cryptography. Use an existing cryptographically secure random number generator.

MD5 is not a random number generator. Not only will it (and any variant you are likely to create) fail basic randomness tests, but it would still require a random input, and the output will only be as random as the input you provide.

Stephen Touset
2

Hashes are not random number generators.

You are inventing your own crypto. Do not invent your own crypto.

Worse, say you modify MD5. Now you have n = MD5'(x). You still need a random x in order to get your "random" number. Where, exactly, will this input come from? You haven't solved the problem; you've simply moved it somewhere else.

Stephen Touset
  • 1
    @Stephen Touset Thanks for the answer. But sorry, you misunderstood my question. I asked whether the changes I made would affect the properties of the original MD5 or not, and I thought they would not! I have already verified this for 10 to 20 minutes on random input, but I still want the opinion of others. MD5 is a hash function useful in authentication and data integrity, but many times it has been used as an RNG. I need a hash function that is different from the existing ones. Randomness is needed to select randomly, and it is well suited to the application. Please give your opinion on how the changes affect MD5's properties. –  Oct 09 '12 at 04:48
  • What properties do you think it will not modify? Running a hash on 20 minutes of input is not even close to proof. – Stephen Touset Oct 09 '12 at 05:59
  • Yes I can't trust on 20 mins output. That why I need your suggestion. Properties I required: Low-Collision, Quite different output hash values even on single input bit change...Also I can't use original one. –  Oct 09 '12 at 06:07
  • @Chauhan - We are telling you that MD5 and SHA are not good solutions for generating a random number. The chances of a collision with SHA and MD5 are higher than you realize. Why don't you use a solution like a GUID, where a unique value is nearly guaranteed? – Ramhound Oct 09 '12 at 12:39
  • Why can't you use the original one? And why would you use MD5, which is utterly broken at this point? – Stephen Touset Oct 09 '12 at 15:58
  • @Ramhound, you just recreated a variant of the same flaw as Enigma: a generator in which the same number can't come up twice is not random either. In a true random number generator there is a chance that the same number comes up twice in a row, and GUIDs can still collide anyway. – ewanm89 Oct 09 '12 at 18:50
  • @GrijeshChauhan random number generators are tricky things, leave it to the experts who then send it to other experts who take years verifying it. – ewanm89 Oct 09 '12 at 18:51
1

If you need a unique hash function, the most common thing to do is to use your own unique salt with a standard algorithm - that way you don't risk accidentally making the hash weaker by messing around with its internal maths.

You can include the salt by appending it onto your plaintext:

hashSHA256('thing to hash' + 'my secret salt UYT23gje6t21313d72')
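
As a runnable version of the line above, here is a minimal Python sketch using the standard `hashlib` module. The function names are hypothetical and the salt is the placeholder from the example. The HMAC variant is an addition that is not part of the original suggestion; it is shown because HMAC is the standard way to key a hash with a secret.

```python
import hashlib
import hmac

SALT = b"my secret salt UYT23gje6t21313d72"   # placeholder value from the example


def salted_sha256(message: bytes) -> str:
    """SHA-256 of message || salt, as suggested above."""
    return hashlib.sha256(message + SALT).hexdigest()


def keyed_sha256(message: bytes) -> str:
    """Assumed alternative: HMAC-SHA256 with the secret used as the key."""
    return hmac.new(SALT, message, hashlib.sha256).hexdigest()


print(salted_sha256(b"thing to hash"))
print(keyed_sha256(b"thing to hash"))
```

Both give outputs that differ from plain SHA-256 of the same input, without touching the internals of the algorithm.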
James
  • So you are suggesting me put md5 output as salt in next round .... –  Sep 15 '12 at 17:59
  • No! I'm suggesting you choose a (constant) salt, and use it to make your "new" hash function, instead of trying to modify one internally. – James Sep 15 '12 at 18:03