7

A web application allows users to upload files. Is it necessary to scan those files with an antivirus?

I'd like to hear answers for two scenarios:

  1. The file type isn't checked on upload. Files are stored in a folder that isn't directly accessible from the internet, and they don't have execute permissions. A user may then request some URL, and the file is returned with Content-Disposition: attachment.
  2. The file type isn't checked on upload. Its content is parsed with some HTML/XML/JSON/etc. parser. After parsing, the file isn't stored on the server.
Andrei Botalov
  • In case 1, beware of pranksters submitting EICAR and then having the download get rated by their anti-virus program. All alarm bells probably go off and your domain is blocked for everybody using that AV, or at least that happened to me once... Only I uploaded it myself so I wouldn't have to look EICAR up all the time. I figured I'd better move it to a private directory :P – Luc Nov 01 '12 at 20:04
  • Nobody mentioned an important detail: threat model. In your threat model, are the people uploading these files considered to be potential attackers? If so, then #1 is a must. – Mark E. Haase Nov 02 '12 at 01:41

4 Answers

6

I see no reason to use a virus scanner in the second case. The chance of a code-execution vulnerability in an anti-virus scanner seems larger than the chance of one in my parsing libraries, especially if the parsing libraries are written in a memory-safe language.

In the first case I'd use an anti-virus, since offering infected files for download doesn't sound like a good idea. Both people and automated website reputation systems (for example Google Safe Browsing) might associate virus downloads with your website, hurting its reputation.
I might still offer such a file for download, but only after an explicit warning and user confirmation, preferably a confirmation that prevents bots from downloading it.

CodesInChaos
4

In scenario 1, you can safely assume that the content will not be executed on the server, and the user will be forced to download the file rather than viewing it in their browser. This is reasonably safe from your point of view, because you offload the responsibility of checking the file to your users.
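
As a rough illustration of the "forced download" part of scenario 1, the response headers could be built along these lines. This is only a sketch: the function name `attachment_headers` is hypothetical, and the `X-Content-Type-Options: nosniff` header and the generic `application/octet-stream` type are extra assumptions on top of the `Content-Disposition: attachment` header the question mentions.

```python
from urllib.parse import quote


def attachment_headers(filename):
    """Build response headers that ask the browser to download, not render."""
    # Keep only the last path component of the user-supplied name.
    safe = filename.replace("\\", "/").rsplit("/", 1)[-1]
    # Drop control characters (including CR/LF) and quote characters.
    safe = "".join(ch for ch in safe if ch.isprintable() and ch != '"')
    if not safe:
        safe = "download"
    # Plain-ASCII fallback for clients that ignore the filename* parameter.
    ascii_fallback = safe.encode("ascii", "replace").decode("ascii")
    disposition = (
        'attachment; filename="' + ascii_fallback + '"'
        + "; filename*=UTF-8''" + quote(safe)
    )
    return {
        # Generic type (assumption): never echo back an uploader-chosen type.
        "Content-Type": "application/octet-stream",
        "Content-Disposition": disposition,
        # Assumption: also ask the browser not to guess the type itself.
        "X-Content-Type-Options": "nosniff",
    }
```

How these headers are attached to the response depends on the web framework in use, which the question does not specify.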

In scenario 2, I'd definitely run it through an AV. Your parser library might have a vulnerability, so you most definitely want to do some basic checks before handing it off to the library.
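
Separately from any AV scan, the "basic checks" before handing data to a parser might look like the following for a JSON upload. This is a minimal sketch under stated assumptions: the 1 MB limit and the name `parse_untrusted_json` are made up for illustration, and other formats (HTML, XML) would need their own checks.

```python
import json

MAX_BYTES = 1_000_000  # hypothetical upload limit, tune to your application


def parse_untrusted_json(raw, max_bytes=MAX_BYTES):
    """Apply cheap sanity checks, then parse an uploaded JSON document."""
    # Reject oversized uploads before the parser ever sees them.
    if len(raw) > max_bytes:
        raise ValueError("upload too large")
    try:
        # Require valid UTF-8 rather than guessing an encoding.
        text = raw.decode("utf-8")
        return json.loads(text)
    except (UnicodeDecodeError, json.JSONDecodeError, RecursionError) as exc:
        # RecursionError covers pathologically deep nesting.
        raise ValueError("upload is not valid JSON") from exc
```

The parsed values should still be validated against whatever structure the application expects before they are used.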

Polynomial
1

If someone asked you to carry a sealed package as part of your luggage on a commercial flight, would you?

Even if you're just providing a repository for users to store their own files rather than re-publishing the uploaded content, it would be reckless to accept anything that is thrown at your server.

However, AV scanning is not the only way to protect against malware: verifying the MIME type and using a lossless conversion method to switch between formats can neutralise a large proportion of threats (e.g. convert JPG to PNG, MS Word doc to PDF*). If you have implemented your own parser for the file and it's a basic format such as XML, then you're already verifying the contents, so AV scanning is probably overkill. But like all code, the parser should be secure.

*) but not using MSWord to do the conversion!
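
To illustrate the MIME type verification step only (not the format conversion), a check of the leading "magic" bytes against the declared type could be sketched like this. The signature table covers just three formats as an example, and the names `sniff_type` and `matches_declared_type` are hypothetical.

```python
# Well-known leading byte signatures for a few formats (illustrative subset).
SIGNATURES = {
    "image/png": b"\x89PNG\r\n\x1a\n",
    "image/jpeg": b"\xff\xd8\xff",
    "application/pdf": b"%PDF-",
}


def sniff_type(data):
    """Return the MIME type whose signature matches the data, or None."""
    for mime, magic in SIGNATURES.items():
        if data.startswith(magic):
            return mime
    return None


def matches_declared_type(data, declared):
    """True only if the content is recognised and agrees with the declared type."""
    detected = sniff_type(data)
    return detected is not None and detected == declared
```

A matching signature only shows that the file starts like the claimed format. It does not prove the rest of the file is harmless, which is why the conversion step is suggested as well.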

symcbean
  • Good suggestion to convert files first, but I think that doesn't work in this case. There is no way you could do this for all possible filetypes. – Luc Nov 01 '12 at 22:43
1

Type of file isn't checked when uploading. Files are stored in a folder that isn't directly accessible from the internet. Files don't have permissions to execute. Then the user may query some URL, and the files will be returned with Content-Disposition: attachment.

Scanning before delivering the attachment is just the courteous thing to do, but there are lots of things that are problematic to store but which will pass a virus scan just fine.

This may be a tad paranoid, but if you wouldn't intentionally write software that lets people plant evidence that incriminates you, don't unintentionally write software that does that.

What kind of information do you log about the uploaders? Might you have enemies who would upload child pornography or bomb-making plans to your server and then send an anonymous tip to the police?

Type of file isn't checked when uploading. Its content is parsed with some HTML/XML/JSON/etc. parser. After parsing the file isn't stored at server.

Anti-virus scanners are unlikely to be helpful here. Same question as above. Can you sanity check the information you extract from the files or prove that whatever information you store came from an external source?

Mike Samuel
  • I never understood that part. If a site like Google Drive or Drop Box gets files with illegal contents, how does it protect itself? – AturSams Jun 11 '14 at 10:40
  • @ArthurWulfWhite, many legal systems have some protections for [common carriers or public carriers](http://en.wikipedia.org/wiki/Common_carrier#Legal_implications). – Mike Samuel Jun 11 '14 at 11:12