59

I have a PDF with important information that may contain malware. What would be the best way to view it?

Anders
  • 1
    Is it of a JS kind? I think you can turn off JS. – curiousguy Aug 19 '12 at 17:26
  • The thing I would do is open it in a virtual machine without network access. – Luc Aug 19 '12 at 19:25
  • There may be a question here as to whether static or dynamic analysis is most effective. – adric Aug 24 '12 at 11:15
  • 1
    If a PDF contains malicious software then it no longer should be viewed. Besides non-malicious content likely doesn't even exist. You could also open the PDF file in a Linux virtual machine, but like I said, the content is likely gone. – Ramhound Aug 24 '12 at 16:06
  • @curiousguy - OK, I didn't know this. Why would someone have legitimate JS code in a PDF, or what does it do that a non-JS PDF cannot? – FirstName LastName Feb 05 '13 at 10:13
  • Open it up... but not in an old version of Adobe Reader etc. that has known vulnerabilities. The safest way is in a VM, but finding nothing there doesn't mean it is safe to open on your normal machine. – Arlix Mar 08 '17 at 13:26

11 Answers

38

Document-based exploits are directed not at the document itself, but rather at some vulnerability in the viewer. If you view the document in a program that isn't vulnerable (or in a configuration that inhibits the vulnerability), then you won't be exploited.

The real issue is knowing whether or not your viewer is vulnerable, which usually means knowing specifically what the exploit is. But there are alternative PDF viewers, such as Foxit or even Google Chrome's built-in viewer, that do not necessarily have the same vulnerabilities as Adobe's official viewer. This is not necessarily true for all vulnerabilities, so it's important to understand what you're getting into ahead of time.

EDIT
If you find yourself frequently dealing with potentially malicious materials, it would be very wise to set up a hardened virtual environment. I'd recommend booting into a Linux system and running your target OS (usually Windows) in VirtualBox or a similar environment. Save a snapshot of the virtual OS, and then revert to that snapshot after you're done interacting with the malicious content. Also, it's not a bad idea to run the host Linux environment from a read-only installation (e.g. a live CD).
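
The snapshot-and-revert cycle could be scripted roughly like this. It is only a minimal sketch, assuming VirtualBox's `VBoxManage` command-line tool; the VM name `pdf-sandbox` and the snapshot name `clean` are hypothetical placeholders, and the `DRY_RUN` switch is an illustrative addition that prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch: keep a VirtualBox guest disposable by reverting to a clean snapshot.
# VM and SNAP are hypothetical names; substitute your own.
VM="pdf-sandbox"
SNAP="clean"

# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Take the baseline snapshot once, while the guest is in a known-good state.
take_baseline() {
    run VBoxManage snapshot "$VM" take "$SNAP"
}

# Detach the first network adapter (guest must be powered off), then boot.
start_isolated() {
    run VBoxManage modifyvm "$VM" --nic1 none
    run VBoxManage startvm "$VM"
}

# Power the guest off and discard everything that happened since the baseline.
revert_to_baseline() {
    run VBoxManage controlvm "$VM" poweroff
    run VBoxManage snapshot "$VM" restore "$SNAP"
}
```

Typical use would be `take_baseline` once, then `start_isolated` before each suspicious file and `revert_to_baseline` afterwards. Running with `DRY_RUN=1` first lets you check the commands before they touch a real VM.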

Mike Samuel
tylerl
  • The main vulnerability in Adobe (which I don't use) is using JavaScript to call an insecure, undocumented API to run shellcode. I used origami to decrypt and decompress and pdfid to check if it has JavaScript triggers (which it doesn't)... but I guess this doesn't even matter for anyone not using the Adobe viewer. –  Aug 19 '12 at 19:01
  • 4
    Reasonably simple setup would be a VM + [Sandboxie](http://www.sandboxie.com/) + [DigiSigner](http://en.wikipedia.org/wiki/DigiSigner) – Polynomial Aug 19 '12 at 19:59
  • I don't use Foxit or Adobe. I use an obscure reader. Recently, it crashed when I opened a PDF file. Can this be a malware attack? How do I check? – FirstName LastName Feb 05 '13 at 10:11
  • 1
    Note about the edit - most modern Linux systems have several native PDF viewers available (including an ancient version of Adobe Reader, though usually you don't need to bother with that; I suggest using Okular, and most versions of evince and mupdf work great as well), so you don't need to use a Windows VM. – Wilf Apr 23 '15 at 20:10
  • 3
    @FirstNameLastName be wary of using lesser-known products to avoid infection. 1: the product may use a common library and unknowingly be actively exploitable, and 2: it may not be getting patched as often or as quickly as more mainstream products. A hardened VM really is the only way to be sure. – rob Nov 30 '16 at 10:38
23

Put it through a PDF viewer that isn't vulnerable to the exploit. If it's someone else's viewer, that's even safer. Try Google Docs, where they will parse it and display it as HTML, so the malicious payload won't harm you. (I'm sure that their PDF parser is extremely secure, so you shouldn't feel bad about possibly infecting them.)

B-Con
  • 11
    I don't want to give the information in the PDF to google but thanks. –  Aug 20 '12 at 03:21
  • 5
    Using Google Docs is good advice, but "Put it through a PDF viewer that isn't vulnerable to the exploit" sounds strange to my ears. Usually, you don't know whether a particular viewer is vulnerable until it's too late. – Dmitry Grigoryev Jun 21 '17 at 10:42
  • 2
    @DmitryGrigoryev if the exploit depends on javascript as almost all of them do, then a viewer that does not support javascript makes that exploit impossible. An exploit that depends on file attachments is rendered impossible by a viewer that doesn't support attachments. An exploit that depends on retrieving data from a URL cannot work if the viewer does not support retrieving data from a URL. And so forth. – barbecue Jun 21 '17 at 15:09
  • @barbecue If avoiding "almost all" exploits is good enough then by all means one should stick to a PDF viewer without JS support. – Dmitry Grigoryev Jun 21 '17 at 16:59
  • @DmitryGrigoryev I'm not sure what your point is. An exploit specifically targeted to work with Sumatra is possible, as I stated in my answer. Its likelihood is exceedingly small. If you're claiming there is a way to eliminate 100% of all exploits, you're simply wrong. No such method exists. – barbecue Jun 21 '17 at 18:29
  • @barbecue "If you're claiming there is a way to eliminate 100% of all exploits, you're simply wrong." - I'm not claiming that, it's B-Con who suggests to "Put it through a PDF viewer that isn't vulnerable to the exploit". Also, if you're not sure what my point is, you can just let B-Con answer. – Dmitry Grigoryev Jun 22 '17 at 07:27
  • @barbecue: The phrase "isn't vulnerable to the exploit" is a casual term and the context of the question is important. Despite the fact that there are no 100% foolproof methods to thwart malware, in practice we find ways around malware. I'm recommending the OP analyze the perceived threat and put it through a PDF parser that is likely to not be vulnerable. A third-party PDF parser transfers the majority of the risk to a third party and that party's risk is likely low because they likely have designed to handle malware cases. – B-Con Nov 13 '17 at 11:09
  • @B-Con I think we're in complete agreement. Almost all PDF exploits depend on executing malicious scripts. Completely eliminating the ability to execute scripts means you're immune to all such exploits, which is the majority. – barbecue Nov 13 '17 at 20:05
11

Use pdf.js with a sandboxed browser (such as Chromium or Firefox) in a virtual machine without network access.

It should be quite tricky for malware to get out of this.

ysdx
7

A simple and straightforward way to open possibly malicious PDFs on a Windows computer is to use the Sumatra PDF viewer. Sumatra is a small, lightweight PDF viewer that has no support whatsoever for interactive fillable forms or JavaScript in PDF files.

Sumatra also has configuration options to lock it down even further, such as preventing file system or internet access.

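The PDF file format has many interactive features intended to make the format more useful, but which create significant security risks, including JavaScript execution, interactive fillable forms, file attachments, and retrieval of data from URLs.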

These abilities combined together make a powerful toolkit for an attacker. Many so-called "drive-by download" attacks rely on the use of PDF files.

Common PDF viewers attempt to provide safety for these features by creating sandbox environments or giving the user prompts, but these solutions are both more complex (and therefore subject to their own vulnerabilities) and less compatible with other parties' products than the simpler solution of leaving out that functionality entirely.

Sumatra is one example of a PDF viewer that does not provide many of the functions which are most commonly used in PDF exploits. By completely eliminating entire categories of potential attacks, such programs greatly reduce the risk of viewing unknown PDF files.

A further advantage of using a less popular viewer is that because it's both less common and less powerful, it's a less interesting target.

The Sumatra viewer could possibly be exploited by a specially crafted PDF which takes advantage of some unknown bug to cause a buffer overflow, for example. Such cases are rare, however, and there have not been any significant security exploits for Sumatra in recent years.

barbecue
  • What makes you think Sumatra is safer than any of the 1001 other PDF viewers out there? – Dmitry Grigoryev Jun 21 '17 at 10:37
  • @DmitryGrigoryev My reasons for thinking this are clearly stated in my answer. I recommend re-reading the first paragraph and looking at the link in the second paragraph. You will find your answers there. – barbecue Jun 21 '17 at 15:04
  • 1
    There's no need to be rude. I did read your answer in full, yet I fail to see what makes Sumatra so special. There are plenty of PDF viewers which either don't support JS or let the user disable it. – Dmitry Grigoryev Jun 22 '17 at 07:18
7

In this situation I've always used the Unix/Linux/OSX shell command "strings". On *nix systems, do this:

strings ScaryFile.pdf | less

You can also get "strings" for Windows from Sysinternals, as mentioned by Polynomial in the comments below. It runs on XP or higher. Here is an example of using it on Windows:

strings ScaryFile.pdf | findstr /i TextToSearchFor

But for the rest of my answer here I'll assume you're on *nix, since that is my experience with strings. Assuming all you're looking for is text content (not bitmaps or vector graphics), you can scroll down or search and find bits of the text you need. Unfortunately, to find it you have to wade through tons of metadata, most of which is in XML, and formatting settings in some other markup, plus some binary (shown as ASCII, not raw bytes). So you may want to use the search capabilities of the "less" command. To search down the document for the case-sensitive string "thingyouwant", use the slash key + your string + return:

/thingyouwant

Then hit the "n" key to see the next instance of "thingyouwant", over and over till you find what you want. You can use the "?" key to do the same thing in the upward direction. See the less man page (type "man less") for more magic.

You could also analyze things like which URLs the document links to:

strings ScaryFile.pdf | grep -i "http" | sort | uniq | less

But, as stated above, 99% of what you'll see from the output of "strings" is going to be metadata and formatting settings.
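
Along the same lines, a quick triage pass can count the PDF keys that usually accompany active content before you read anything else. The following is a rough sketch, not a replacement for a real tool such as pdfid: the function name `pdf_triggers` and the keyword list are my own illustrative choices, and it uses plain `grep` on the raw bytes.

```shell
#!/bin/sh
# Sketch: count occurrences of PDF name keys that often signal active content.
# The keyword list is an illustrative assumption, not a complete signature set.
pdf_triggers() {
    file="$1"
    for kw in /JavaScript /JS /OpenAction /AA /Launch /EmbeddedFile /URI; do
        # -a: treat binary data as text, -o: print each match on its own line,
        # -F: match the keyword literally rather than as a regular expression
        count=$(grep -a -o -F -e "$kw" "$file" | wc -l | tr -d ' ')
        printf '%s %s\n' "$kw" "$count"
    done
}
```

Call it as `pdf_triggers ScaryFile.pdf`. Treat the numbers as hints only: these are plain substring matches, and keys that sit inside compressed streams or are written with hex escapes (for example `/J#53`) will not be counted, so a row of zeros does not prove the file is clean.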

Luke Sheppard
  • 2
    [Windows has strings too](http://technet.microsoft.com/en-us/sysinternals/bb897439.aspx). – Polynomial Aug 20 '12 at 06:04
  • +1 for strings(1) and pdfinfo with a followup in evince. Paging through the file looking for JS and calls to outside resources is quite effective if a bit slow. – adric Aug 24 '12 at 11:14
  • 3
    You shouldn't rely on strings for security: http://lcamtuf.blogspot.ca/2014/10/psa-dont-run-strings-on-untrusted-files.html?m=1 – Tanath Feb 17 '17 at 17:45
6

Use a virtual machine that can be reverted to a clean slate after tests. If the PDF reader is vulnerable, your real workstation will be much less likely to be affected.

user65388
2

The latest versions of Adobe Reader (version 10.1 and up) support "Protected Mode", a sandboxing feature that can be used to view untrusted PDF files. This effectively restricts the access of the process displaying the PDF file to %appdata%\Adobe\Acrobat and other PDFs which are explicitly opened by the user.

Protected Mode has to be activated by going to the Edit->Preferences menu and selecting either the General or the Security tab, depending on the version.

Obviously, you'll want to close any sensitive PDFs like your bank statements before opening the untrusted one.

Dmitry Grigoryev
  • What is considered an "unsafe location"? For example, would that be triggered if you open a document from a USB thumb drive? – Jean-Francois T. May 10 '20 at 06:09
  • @Jean-FrancoisT. I don't use Acrobat anymore, but as far as I remember, yes, thumbdrives are considered unsafe locations. Trusted locations are your local drives, excluding problematic folders such as "Downloads" and "Temp", anything else is untrusted. – Dmitry Grigoryev May 10 '20 at 16:36
1

Another easy and less time-consuming option is to open it in the Sandboxie app, which would isolate it.

schroeder
Lee
  • 1
    Considering this answer was written in 2017, I wonder why you advocate using Sandboxie instead of the [sandbox](https://security.stackexchange.com/a/162444/71607) built into Adobe Reader. – Dmitry Grigoryev Jun 21 '17 at 10:33
1

You can open the PDF in a container. Here's a docker image you can use: https://hub.docker.com/r/chrisdaish/acroread/

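# Host directory containing the PDF to inspect
MY_PDF_DIR='/tmp/foobar'
docker pull chrisdaish/acroread
# Mount the PDF directory and the X11 socket, and pass the host user's IDs
docker run -v "$MY_PDF_DIR":/home/acroread/Documents:rw \
        -v /tmp/.X11-unix:/tmp/.X11-unix \
        -e uid="$(id -u)" \
        -e gid="$(id -g)" \
        -e DISPLAY="unix$DISPLAY" \
        --name acroread \
        chrisdaish/acroread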

This will open an Acrobat Reader that will display via the local X server.

This approach reduces the attack surface, but it is not 100% safe, since the container has access to your X server.
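
If you want to tighten the container a little further, a variation on the command above could drop networking and mount the PDF directory read-only. This is a sketch under assumptions: I have not verified that the chrisdaish/acroread image starts cleanly with these restrictions, the function name `open_pdf_isolated` and the `DRY_RUN` switch are illustrative additions, and the X server exposure remains.

```shell
#!/bin/sh
# Sketch: same image, but without network access and with a read-only PDF mount.
# Untested assumption: the image still starts under these restrictions.
MY_PDF_DIR='/tmp/foobar'

# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

open_pdf_isolated() {
    # --rm:           remove the container when the viewer exits
    # --network none: give the container no network interfaces besides loopback
    # :ro             mount the PDF directory read-only
    run docker run --rm \
        --network none \
        -v "$MY_PDF_DIR":/home/acroread/Documents:ro \
        -v /tmp/.X11-unix:/tmp/.X11-unix \
        -e uid="$(id -u)" \
        -e gid="$(id -g)" \
        -e DISPLAY="unix$DISPLAY" \
        --name acroread \
        chrisdaish/acroread
}
```

Running `DRY_RUN=1` followed by `open_pdf_isolated` prints the full command so you can review it first. The X11 socket is shared through the filesystem mount, so display forwarding should not depend on container networking, but that is worth confirming on your own setup.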

Valer
1

We can say that ALL in-the-wild or targeted attacks using malicious PDF files use obfuscation techniques to hinder analysis and detection.

Most of these techniques rely on JavaScript obfuscation, such as eval(), String.fromCharCode(), arguments.callee(), and base64 encoding, and some even hide data in PDF key values such as /Author, /Keywords, and /CreationDate.

We might be unable to view the content of a malicious PDF file (the data within PDF object streams) because it is commonly compressed with FlateDecode. But there are tools that can inflate the content of PDF object streams, such as pdf-parser (http://blog.didierstevens.com/programs/pdf-tools/) and FileInsight (http://www.mcafee.com/us/downloads/free-tools/fileinsight.aspx). Most of the obfuscated JavaScript code lies within the inflated PDF streams.

We can advise you to open the file with the latest patched version of a PDF reader with JavaScript turned off, but a better solution is to use a virtual machine that you can delete or revert to a snapshot after opening the file.

d3t0n4t0r
0

You can use a less popular viewer/OS combination. I guess no one targets Okular running on FreeBSD (though it can still be vulnerable), so if you open the file in a VM you should be very safe.

In order to do harm the rogue payload must match the viewer version and the OS and the CPU architecture of course. It is really low-level assembly and memory stuff (the payload expects to be placed at a particular memory address and expects some standard system functions to be available). If you change any of those, then the payload may not execute properly (or the viewer may simply crash without doing harm).

filo