1

I am trying to accomplish the following and have been unsuccessful. I would appreciate any insight. Scenario: http://www.mydomain.com/filename.html is a webpage. On this webpage I am running a viewer script that calls up documents (.png, .swf, .pdf) and shows them to my visitors when they visit the .html page. Right now you can still access the png, swf, and pdf files by going to http://www.domainname.com/viewerfiles/nameoffile.png (or swf or pdf). How can I block direct access (http://www.domainname.com/viewerfiles/nameoffile.png) but still allow the viewer on the html page to access the files located in the viewerfiles folder? I have tried the .htaccess method, but you can still access the files directly. My goal is to allow visitors to view and print the files from the webpage, but I do not want scrapers and bots indexing or taking the .pdf, .png, .swf, etc. files and hosting them on other sites. Thanks!

makerofthings7
Tracy
  • What web server are you using, and how exactly are you linking in the new files? .htaccess will work to deny access to a file or directory for all IPs if the server is configured for it. – ewanm89 Mar 02 '12 at 22:54
  • Would something like one-time identification via Captcha help prevent bots? – logicalscope Mar 02 '12 at 23:11
  • you might get some inspiration from this question: http://stackoverflow.com/questions/6030750/control-access-to-files-based-on-db-values-with-php-apache – Kristoffer la Cour Mar 03 '12 at 12:14
  • @logicalscope CAPTCHA isn't any form of identification, what do you mean by one-time identification? – AviD Mar 03 '12 at 18:32
  • 1
    Hi @Tracy, welcome to [security.se]! I suggest taking a moment to review the [FAQ], it will help you get used to the format of this site. – AviD Mar 03 '12 at 18:40
  • @AviD: Human identification vs. bot. – logicalscope Mar 03 '12 at 19:35

5 Answers

3

It is not possible to do what you want to do, for fundamental reasons. The user controls the client. From the server side, the server has no way to distinguish a legitimate browser that happens to be making a request for an image on the page (a case where you want to serve the image) from a malicious user who is trying to directly access the image (a case where you don't want to serve the image). These two situations are indistinguishable from the server's point of view, so the server has to behave the same way in both cases: either serve the image in both cases, or don't serve it.

You might want to revisit what you are actually trying to accomplish, and see if there is some other way to achieve it. What are you trying to prevent? What threat are you trying to protect against?

If you are trying to prevent scraping, one thing you could consider doing is making the images available only to logged-in users. You will have to decide whether that is appropriate for your site, or if it is not a good match (maybe you want the web pages and images viewable by everyone). Another option is to set up your robots.txt file to kindly request that bots not download the images. You will be forced to rely upon the good will of the bots. Well-behaved bots and scrapers will usually obey the instructions in robots.txt. A malicious user can always ignore the robots.txt file and scrape your site anyway, but unfortunately, there is no way to prevent that: a malicious user will always be able to download any content that you've decided to make available to unauthenticated users. So robots.txt is in some ways arguably about the best you can reasonably do, in most situations.
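
As a rough illustration of the robots.txt option, the sketch below uses Python's standard `urllib.robotparser` to show how a well-behaved crawler would interpret a rule that excludes the viewerfiles folder. The rule set and the `ExampleBot` user agent are assumptions made for illustration, and nothing forces a crawler to perform this check at all.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content asking all crawlers to stay out of /viewerfiles/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /viewerfiles/
"""


def allowed_by_robots(url: str, user_agent: str = "ExampleBot") -> bool:
    """Return True if a compliant crawler would consider `url` fetchable."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The viewer page stays crawlable; the files behind it are excluded.
    print(allowed_by_robots("http://www.domainname.com/filename.html"))
    print(allowed_by_robots("http://www.domainname.com/viewerfiles/nameoffile.png"))
```

The check runs entirely on the crawler's side, which is why it only helps against bots that choose to honor it.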

See also How do I prevent people from stealing photos from my website? and how can I prevent people from using my images from my website.

D.W.
  • 1
    +1, but one other thing (not really worth an answer) that might help in *some* scenarios, again depending on what @Tracy really wants to do, is generate long random names for each file. Obviously, if these are all linked from an index page, or if the OP is trying to DRM the files, this would not help - but, it would help prevent illicit users from grabbing all the existing files. – AviD Mar 03 '12 at 18:36
  • It is possible to do what the guy is asking. Look at my solution below. – Salvador Dali Nov 16 '12 at 22:17
1

You can only decide whether a user has access to download a file from your site or not, you cannot control what a determined user does with the file. You could attempt to inspect properties of the request to recognize behavior typical of a browser embedding the file as a resource into your main page, but these properties are easily spoofable, and you are prone to breaking the normal usage of your site.
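
To make the spoofing point concrete, here is a minimal sketch using Python's standard `urllib.request`. It only builds a request object and does not send anything. The URLs are taken from the question's examples, and the User-Agent string is an arbitrary browser-like value chosen for illustration.

```python
from urllib.request import Request

# URLs based on the question's examples; treat them as placeholders.
FILE_URL = "http://www.domainname.com/viewerfiles/nameoffile.png"
PAGE_URL = "http://www.mydomain.com/filename.html"


def build_spoofed_request(file_url: str, page_url: str) -> Request:
    """Build (but do not send) a request whose headers imitate a browser embed."""
    return Request(
        file_url,
        headers={
            # Claims the request was triggered by the viewer page.
            "Referer": page_url,
            # Arbitrary browser-like identifier, chosen for illustration.
            "User-Agent": "Mozilla/5.0 (X11; Linux x86_64)",
        },
    )


if __name__ == "__main__":
    req = build_spoofed_request(FILE_URL, PAGE_URL)
    print(req.full_url)
    print(req.header_items())
```

Any script can set these headers, so a server that relies on them sees the same values it would see from a visitor's browser loading the page.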

JGWeissman
  • Thanks ewanm89, logicalscope, and JGWeissman. It sounds like this won't be possible, since I have to allow my server to access the images for the viewer, hence a visitor could find the pdf, png, etc. files as well. I will look at hosting the files on another site and blocking all except the one domain to allow the viewer to access the files. I am not technical, nor have I ever posted a question online, and I am TOTALLY THRILLED at the prompt and courteous responses. Thanks again – Tracy Mar 03 '12 at 01:52
  • @Tracy, I'm afraid that putting it on another domain and blocking all but one domain isn't going to accomplish what you want, either. The request comes from the user's browser, not from the site that is hosting the web page, so the server that hosts the image has no secure way to tell what triggered the request for the image. – D.W. Mar 03 '12 at 03:27
0

You might want to check mod_rewrite in Apache configuration:

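# Refuse image requests whose Referer is not this site (replace localhost with your domain).
RewriteEngine On
# True when the Referer does not start with http://localhost or http://www.localhost
RewriteCond %{HTTP_REFERER} !^http://(www\.)?localhost [NC]
# [F] answers 403 Forbidden; extend the extension list to cover other file types
RewriteRule \.(gif|jpg)$ - [F]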

This returns a 403 Forbidden response if the file is requested directly through a browser link, but serves the file when the request comes from the website's own pages.
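
The decision made by the rewrite rule above can be mirrored in a few lines of Python to see which requests it would refuse. This is only a simulation of the two regular expressions, not of Apache itself; `localhost` is the answer's placeholder host, and the function name is hypothetical.

```python
import re

# Patterns mirroring the rewrite rule above; "localhost" is the answer's placeholder host.
ALLOWED_REFERER = re.compile(r"^http://(www\.)?localhost", re.IGNORECASE)
PROTECTED_PATH = re.compile(r"\.(gif|jpg)$")


def is_forbidden(path: str, referer: str = "") -> bool:
    """Return True if the rule would answer 403 for this path and Referer value."""
    if not PROTECTED_PATH.search(path):
        return False  # the RewriteRule pattern does not match, so nothing is blocked
    # The condition holds (and the request is refused) when the Referer is not the site.
    return ALLOWED_REFERER.search(referer) is None


if __name__ == "__main__":
    # Typed into the address bar: no Referer header is sent.
    print(is_forbidden("/viewerfiles/photo.jpg"))
    # Embedded in one of the site's own pages.
    print(is_forbidden("/viewerfiles/photo.jpg", "http://localhost/filename.html"))
    # Linked from another site.
    print(is_forbidden("/viewerfiles/photo.jpg", "http://other.example/page.html"))
```

Because the check looks only at the Referer value supplied by the client, a request that sets that header to the site's own address passes in exactly the same way as a genuine page view.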

grab-a
0

Try Googling how to use image Hotlink protection to prevent direct access to them.

If your hosting account has cPanel, their instructions are at: http://docs.cpanel.net/twiki/bin/view/AllDocumentation/CpanelDocs/HotLinkProtection

  • 2
    Welcome to IT Security! We generally discourage answers that only provide a link without any real substance. Perhaps you could expand your answer with more information? –  Nov 17 '12 at 13:13
-1

You said in your post that you are using .htaccess.

I already answered this question a few days ago on Stack Overflow.

That post was about images, but you can easily modify it for any file type.

Salvador Dali
  • This only stops access via a browser. It doesn't stop a user from downloading the file via `curl` (specifying extra headers on the command line if necessary). It doesn't prevent scrapers and bots from downloading the file, so it doesn't answer this question. The question you link to on StackOverflow is a different question: the SO question asks about using up your bandwidth, whereas this question is about preventing people from accessing the file. – D.W. Nov 17 '12 at 02:10
  • 1
    Also, blocking users who don't send a `Referer` header is a bit unfriendly. Some corporate firewalls, and many privacy tools, block the `Referer` header to protect the privacy of the user. Blocking those users is not a very friendly thing to do if you care about privacy, and will annoy some users (admittedly not many). – D.W. Nov 17 '12 at 02:12
  • I was giving a partial solution, which is better than no solution at all, because, as you said in your post, if a person wants to take the file, they will take it. On the Referer part I totally agree with you. I was not thinking about it when I was responding. Thanks for pointing this out. – Salvador Dali Nov 17 '12 at 02:31
  • 1
    Salvador Dali - @D.W. [has a more elaborate answer on REFERER blocking here](http://security.stackexchange.com/q/7944/396) – makerofthings7 Nov 17 '12 at 14:29