49

Let's say I create a microsite for a client that contains confidential business information. We need to place this in a location the client can access, in order for them to approve for launch.

If we place this microsite behind a login, we have a guarantee no one can just stumble across the content and compromise it. But what if we publish it to an undisclosed, unindexed directory with a name of the same "strength" as the aforementioned password? For the sake of argument, "undisclosed and unindexed" means it won't be manually or automatically linked to/from anywhere, or indexed by any website search on the same domain. It also won't be placed in its own subdomain, so DNS crawling is not a concern.

My initial instinct says this is simply security by obscurity, and is much less secure due to the possibility of someone just stumbling over it. But, after thinking about it, I'm not so sure. Here's my understanding:

  • Even using a dictionary-weak, two-word string for both the password and the URL, there are still billions of guessable options. Placing it in the URL doesn't magically reduce that list.
  • Login pages can have brute-force protection, so an attacker would get optimistically 20 attempts to guess. URL guessing would have to be caught by the server's DoS or spam protection, and may allow 200 404-producing guesses if you're anticipating an attack - still not statistically significant to billions of options.
  • The login page is linked from a website - it's a visible wall for an attacker to beat on. It's evidence that something exists worth attacking for. Guessing the URL, however, is blind. It requires being on the right domain (and subdomain), and operating on faith that, even after tens of thousands of incorrect guesses, you're still going to turn something up.
  • The URL has an extra susceptibility to being indexed/spidered externally. However, most respectable spiders don't "guess" at sites, they just follow links. A malicious "guessing" spider would be caught by the same DoS/spam protection as point 2.
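
To put rough numbers on the guessing bullets above, here is a minimal back-of-the-envelope sketch. The 50,000-word dictionary is an assumption (the question only says "billions of guessable options"), and `guess_success_probability` is a hypothetical helper; the 20 and 200 attempt budgets come from the second bullet.

```python
from fractions import Fraction

# Assumption: a 50,000-word dictionary; two words give 50,000**2 combinations.
DICTIONARY_SIZE = 50_000
SEARCH_SPACE = DICTIONARY_SIZE ** 2  # 2,500,000,000 candidate strings


def guess_success_probability(attempts: int, search_space: int = SEARCH_SPACE) -> Fraction:
    """Chance that one of `attempts` distinct guesses hits a single secret
    drawn uniformly at random from `search_space` candidates."""
    attempts = min(attempts, search_space)
    return Fraction(attempts, search_space)


if __name__ == "__main__":
    for label, attempts in (("login lockout", 20), ("404 throttling", 200)):
        p = guess_success_probability(attempts)
        print(f"{label}: {attempts} guesses -> {float(p):.2e}")
```

Under these assumptions both budgets leave an attacker with a vanishingly small chance, which is consistent with the point that the raw guessing math does not separate the two schemes.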

From what I can tell, the only meaningful difference between the two is imagined peace of mind. The possibility that the URL can be stumbled over makes people nervous, and the login makes things feel secure, despite them seeming comparable based on the points above. The URL option still feels like it should be much less secure, though. What am I failing to consider?


EDIT: A lot of valid human-error concerns are popping up. This question was inspired by a client that implements a degree of human-proofing security - VPN login via keyfob, screen dimmers, 5min sleep timeouts, social media blackout, etc. For this question, please assume no public-network access and no incidental breaches like shoulder-watching or "oops! I posted the link to twitter!". I'm looking for a more systematic answer, or at least one more satisfying than "humans screw up".


EDIT 2: Thanks for pointing out the possible duplicate. IMHO, each has value as an individual question. That question addresses image security specifically, and delves into alternate methods of securing and encoding that data (e.g. base64 encoding). This question more specifically addresses the concept of secrecy vs obscurity, and applies it to why a login is better than a URI independent of the type of data in question. Furthermore, I don't think the accepted answer there explains my particular question as deeply or thoroughly as @SteveDL's great answer below.

CodeMoose
  • 13
    This is not a comprehensive answer so I will put it as a comment: When the URL is the secret, often people won't treat it as carefully as they would some other info like a user/pass. Sure, a sufficiently complex URL is used all the time to protect confidential data (cloud file sharing sites do it all.the.time) but that doesn't make it the same as a user/pass which people generally (but not always) regard with a bit more reverence. If the time scale is short (i.e. it's only live for a few days and then wiped) then you are not really notching up the risk by doing it that way. – Jeff Meden May 12 '15 at 20:25
  • @JeffMeden For cloud sharing websites, stuff that actually needs to be secure isn't just accessible with a URL; you also have to be logged in. – cpast May 12 '15 at 20:52
  • 13
    Secrets such as passwords are kept secret *by design*. They're used and transferred and stored when needed, and then let go of. URLs are not considered secrets and as such are not handled with the purpose of minimizing their lifetime by all the actors of the Internet ecosystem. That's what you're missing. – Steve Dodier-Lazaro May 12 '15 at 22:02
  • @SteveDL I think you're right - even ignoring accidental/incidental exposures, it's the perception of the link vs the password that's causing the problem. If you post it as an answer, I'll accept it! – CodeMoose May 12 '15 at 22:19
  • Ditto @JeffMeden – CodeMoose May 12 '15 at 22:19
  • 1
    Done. Note that all the answers are equally good at this point. Also, don't assume that people who reason exclusively in terms of security have got the "human error" aspect covered. That requires hiring interaction designers, ethnographers and the like :-) – Steve Dodier-Lazaro May 12 '15 at 23:04
  • 2
    Adding a password can be very easy when you just use a plain HTTP login instead of a fancy HTML form. On Apache, for example, you can do this by placing a .htaccess file with the username and password in the same directory on your webserver - it's ugly but quick to do and does the job. – Philipp May 13 '15 at 09:05
  • Note that posting the URL to someone in a *private* Facebook chat (and a few other services) will trigger an access to that URL. – pjc50 May 13 '15 at 10:46
  • I didn't see anyone else mention this, but you could mitigate the problem of exposing the URI during transmission by placing the "password" behind a crosshatch (#), and having some javascript move it into the body of an XHR request. Browsers won't send the "URI Hash" (not to be confused with a cryptographic hash). It's not intended to hold secrets though, so all the other browser history logging problems and such still apply. – Adam Hart May 13 '15 at 18:10
  • @ZevChonoles while both questions bring up similar concerns about URI security, IMHO I think each has a value as an individual question. That question addresses image security specifically, and delves into alternate methods of securing and encoding that data (eg base64 encoding). This question more specifically addresses the concept of secrecy vs obscurity, and applies it to why a login is better than a URI independent of the type of data in question. – CodeMoose May 14 '15 at 11:37
  • Also look at this question and its answers (especially think of your client testing the site with Google Chrome, ouch!): http://security.stackexchange.com/questions/63124/how-can-outsiders-discover-the-pages-that-are-being-hosted-on-my-server/63167#63167 – Sir Cornflakes May 15 '15 at 12:07
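
Adam Hart's comment about the URI fragment (the part after `#`) can be illustrated with a small sketch. It only models which part of a URL ends up on the HTTP request line; the example URL, the token and the helper names are made up for illustration.

```python
from urllib.parse import urlsplit


def request_target(url: str) -> str:
    """Return the path (plus query, if any) that goes on the request line.

    The fragment is left out on purpose: it is resolved client-side and
    is not part of the request a browser sends to the server.
    """
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return target


def fragment_of(url: str) -> str:
    """Return the fragment, which stays in the browser."""
    return urlsplit(url).fragment


if __name__ == "__main__":
    url = "https://example.com/review/#s3cr3t-token"  # hypothetical URL
    print("sent to server:", request_target(url))
    print("kept in browser:", fragment_of(url))
```

As the comment itself notes, this only helps with transmission; the full URL, fragment included, can still land in browser history and similar places.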

6 Answers

69

I'll extend on one point at a slightly more abstract level about why public authenticated spaces are preferable to hidden unprotected spaces. The other answers are all perfectly good and list multiple attacks one should know better to avoid.

Everyone with formal training should've heard at some point of the Open Design security principle. It states that systems must not rely on details of their design and implementation being secret for their functioning. What does that tell us about secret passwords vs. secret URLs?

Passwords are authentication secrets. They are known by a challenged entity that provides them to a challenging entity in order to authenticate. Both parties need a form of storage, and a communication channel. Stealing the password requires compromising any of the three. Typically:

  1. The user must be trapped or forced into revealing the password
  2. The server must be hacked into so that it reveals a hashed version of the password
  3. The confidentiality of the channel between the user and the server must be compromised
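
Point 2 mentions that a hacked server reveals only a hashed version of the password. A minimal sketch of that storage model, using Python's standard library, might look as follows. PBKDF2, the iteration count and the function names are illustrative assumptions, not something this answer prescribes.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 200_000  # assumption: tune to your own hardware and policy


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Derive a salted hash; the server stores (salt, digest), not the password."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the hash for a login attempt and compare in constant time."""
    _, candidate = hash_password(password, salt)
    return hmac.compare_digest(candidate, expected)
```

A secret URL has no comparable storage model: the server has to know the path in the clear in order to route requests to it.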

Note that there are plenty of ways for authentication to be toughened, starting by adding an additional authentication factor with different storage requirements and transmission channels, and therefore with different attack channels (Separation of Privileges principle).

We can already conclude that obscure URLs cannot be better than passwords because in all attack vectors on passwords, the URL is either known (2 and 3) or obtainable (1).

Obscure URLs on the other hand are manipulated much more commonly. This is in large part due to the fact that multiple automated and manual entities in the Internet ecosystem process URLs routinely. The secrecy of the URL relies on it being hidden in plain sight, meaning it must be processed by all these third-parties just as if it were a public, already-known commodity, exposing it to the eyes of all. This leads to multiple issues:

  • The vectors through which these obscure URLs can be stored, transmitted and copied are much more numerous
  • Transmission channels are not required to be confidentiality-protected
  • Storage spaces are not required to be confidentiality or integrity protected, or monitored for data leakage
  • The lifetime of the copied URLs is by and large out of the control of the original client and server principals

In short, all possibilities of control are immediately lost when you need a secret to be treated openly. You should only hide something in plain sight if it is impossible for third parties to make sense of that thing. In the case of URLs, the URL can only be functional in the whole Internet ecosystem (including your client's browser, a variety of DNS servers and your own Web server) if it can be made sense of, so it must be kept in a format where your adversaries can use it to address your server.

In conclusion, respect the open design principle.

DavidTheWin
Steve Dodier-Lazaro
  • 6
    Wish I could +2 this - exactly what I was looking for. I was looking at the problem too closely under a microscope, and failed to consider the implications in the ecosystem at large. Thanks for the great answer! – CodeMoose May 13 '15 at 02:44
54

Since we're talking theoretically, here are several reasons why a random URL alone is not sufficient to protect confidential data:

  • URLs can be bookmarked.
  • URLs are recorded in the browser history (public kiosk).
  • URLs are displayed in the address bar (shoulder surfers).
  • URLs are logged (think 3rd party proxy).
  • URLs can be leaked via Referer headers.
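
The "URLs are logged" bullet can be illustrated with a sketch of a typical access-log entry. The format below only approximates the Common Log Format; the paths, the IP address and the helper name are made up for illustration, and real servers or proxies may be configured to log more or less than this.

```python
from datetime import datetime, timezone
from typing import Optional


def access_log_line(client_ip: str, method: str, target: str, status: int,
                    when: Optional[datetime] = None) -> str:
    """Format one request roughly in Common Log Format.

    Only the request line (method, target, protocol) is recorded;
    a request body, such as a POSTed password, has no field here.
    """
    when = when or datetime.now(timezone.utc)
    timestamp = when.strftime("%d/%b/%Y:%H:%M:%S %z")
    return f'{client_ip} - - [{timestamp}] "{method} {target} HTTP/1.1" {status} -'


if __name__ == "__main__":
    # Hypothetical secret directory: the whole secret is written to the log.
    print(access_log_line("203.0.113.7", "GET", "/blue-giraffe-7f3a/", 200))
    # Hypothetical login: the path is logged, the submitted password is not.
    print(access_log_line("203.0.113.7", "POST", "/login", 200))
```

Anyone who can read such a log, on the server or on an intermediate proxy, gets the secret directory name for free, whereas the login entry reveals only that a login happened.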

I'm unclear about some of your bullet points.

Are you saying that this potential webserver / website / platform does indeed have directory fuzzing protection, or is this hypothetical?

Even so, it doesn't protect against the items I mentioned above.

l0b0
k1DBLITZ
  • 16
    and please last but not least: URLs are collected by your government for your own good, or something. – Steve Dodier-Lazaro May 12 '15 at 22:00
  • 7
    Also, if a laptop had the page open, it is a lot easier to accidentally reload the URL over an insecure network than to accidentally POST the password over an insecure network. – kasperd May 12 '15 at 22:50
  • +1 for being concise and informative - thanks for the answer! – CodeMoose May 13 '15 at 02:40
  • 2
    @Steve DL Not only the government. If you browse from your employer's network, a free Starbucks Wi-Fi, your ISP's DSL or your mobile phone data connection, they have access to your browsing history. Your employer might be interested to see whether you visited Facebook, at least if you do not use Tor or similar software. – borjab May 14 '15 at 10:43
  • 2
    @SteveDL ... and even by organisations standing above governments. For example, an innocuously configured Internet Explorer may ask Microsoft whether it would be safe to visit that URL, which they test by visiting that URL. In other words, you will *really* notice unintended surprise accesses to the obscure URLs in your logs. – Hagen von Eitzen May 14 '15 at 15:19
11

Guessing the URL, however, is blind. It requires being on the right domain (and subdomain)

However, most respectable spiders don't "guess" at sites, they just follow links

Considering major search engines not to be respectable is a defensible position, but it doesn't change the fact that they do more than follow links. In particular, search engines can and do enumerate DNS entries, so the mere existence of a subdomain is a risk.

A lot of stuff ends up on Google even though people swear they never linked to it from anywhere and Google doesn't return any page that links to the site.

That's in addition to the problem that people generally don't treat URLs as confidential, and that URLs appear in all kinds of places such as server, browser and proxy logs. URLs are also visible to, and used by, many more browser extensions than passwords. If the “hidden” site has outgoing links, the URL is likely to appear in Referer: headers.

There's also the risk that through a misconfiguration, a link to the hidden site appears in a non-hidden place, for example if the hidden site is hosted on a site that offers a local search facility.

The login page is linked from a website - it's a visible wall for an attacker to beat on. It's evidence that something exists worth attacking for.

That doesn't make sense. Use decent software and a randomly-generated password, and there's no attack surface worth pursuing. In contrast, a hidden directory doesn't even look like something worth attacking, it looks like something that's open to the public.

A secret URL is particularly risk-prone because if the URL is leaked accidentally and a search engine discovers it, the whole site content will become exposed through that search engine. A password doesn't fail as catastrophically: if the password is leaked, it still takes some voluntary action for someone to start downloading the data, it doesn't automatically start a machinery that will publish it for everyone to see.

Gilles 'SO- stop being evil'
  • 1
    While you raise good points, I don't feel you've addressed the spirit of the question. I agree on the DNS vulnerability - that's why this question involves subdirectories, not subdomains. The accidental indexing is moot, since one of the assumptions of the question is that the subdirectory is established undisclosed and unindexed. Lastly, I respectfully disagree that the login page point "doesn't make any sense". What value is there in an attacker picking a random domain and combing for possibly hidden content? It'll almost never be successful. A login page, even locked down, provides feedback. – CodeMoose May 12 '15 at 22:16
  • While I understand the points you're trying to make, it may be beneficial to put a bit more effort into understanding the actual question so you can answer accurately. I'll also try to make edits to make it more clear what I'm asking. – CodeMoose May 12 '15 at 22:17
  • 1
    @CodeMoose Feedback that just says “wrong password” isn't useful feedback. Accidental indexing is not a moot point, it's a pretty common risk. I'm reasonably confident that I have understood the question, and I addressed quite a few of your points directly. – Gilles 'SO- stop being evil' May 12 '15 at 22:23
  • And yet, what I'm reading seems to say "here's why this isn't a realistic question to ask", not "given those points, here's the x factor you haven't considered". Again, I appreciate the input, you very clearly know what you're talking about :) I'm just looking for an answer in a different avenue, and was trying to give feedback to that effect. – CodeMoose May 12 '15 at 22:32
  • 6
    @CodeMoose I really struggle to understand your comment. My answer is pretty much only “here's the x, y, z factors you haven't considered”, and I nowhere say anything like “this isn't a realistic question”. It's a realistic and reasonable question, and the answer is a strong no. – Gilles 'SO- stop being evil' May 12 '15 at 22:37
  • @CodeMoose I think you were expecting a high-level comment on the design of your security requirement, but you asked the question in implementation terms, hence you attracted implementation-level replies. Is that correct? – Steve Dodier-Lazaro May 12 '15 at 23:06
  • @SteveDL That might be correct - I thought I had a solid grasp on the question, but it looks like it's more nuanced than I thought. Thanks for bearing with me! – CodeMoose May 13 '15 at 02:39
  • @Gilles thanks also for the excellent answer, learned a lot of unexpected things in this process. – CodeMoose May 13 '15 at 02:44
  • But what if the password is in fact leaked in a search engine friendly way, such as a link to http://username:password@example.com/secretpage.html ? – Hagen von Eitzen May 14 '15 at 15:22
  • 1
    @HagenvonEitzen Do people still use basic HTTP authentication these days? – Gilles 'SO- stop being evil' May 14 '15 at 15:25
3

I agree with the other answers that it is a bad idea, simply because people (=> developers => applications that log information) do not consider URLs to be private, and thus there are a lot of different ways the key could be leaked. What you have correctly recognized, however, is that passwords essentially are a form of security through obscurity, and that conceptually there is nothing wrong with the scheme you're proposing. The only problem is that the scheme misuses systems in ways they were not intended for.

Even using a dictionary-weak, two-word string for both the password and the URL, there are still billions of guessable options. Placing it in the URL doesn't magically reduce that list.

True, but it doesn't make it safer either.

Login pages can have brute-force protection, so an attacker would get optimistically 20 attempts to guess. URL guessing would have to be caught by the server's DoS or spam protection, and may allow 200 404-producing guesses if you're anticipating an attack - still not statistically significant to billions of options.

If you're anticipating an attack you will probably limit it equally to best practices for brute-force protection for your type of application. So indeed, it isn't worse if done right, but it definitely isn't better and will likely be worse as you will have to do a lot more custom work.
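
As an illustration of the "custom work" involved, here is a minimal sketch of throttling clients that produce too many 404s. The class name, the limit of 20 misses and the one-hour window are assumptions chosen for the example, and a real deployment would more likely rely on the web server's or firewall's own rate limiting.

```python
import time
from collections import defaultdict, deque
from typing import Deque, Dict, Optional


class NotFoundThrottle:
    """Block a client after too many 404 responses in a sliding window."""

    def __init__(self, max_misses: int = 20, window_seconds: float = 3600.0) -> None:
        self.max_misses = max_misses
        self.window_seconds = window_seconds
        self._misses: Dict[str, Deque[float]] = defaultdict(deque)

    def record_miss(self, client: str, now: Optional[float] = None) -> None:
        """Call this whenever a request from `client` ends in a 404."""
        now = time.monotonic() if now is None else now
        self._misses[client].append(now)

    def is_blocked(self, client: str, now: Optional[float] = None) -> bool:
        """Return True if `client` has used up its budget of wrong guesses."""
        now = time.monotonic() if now is None else now
        misses = self._misses.get(client)
        if not misses:
            return False
        # Drop misses that have aged out of the window.
        while misses and now - misses[0] > self.window_seconds:
            misses.popleft()
        return len(misses) >= self.max_misses
```

A login form gets equivalent lockout behaviour from most authentication frameworks, which is the point made above: the URL scheme is not worse if done right, but it has to be built by hand.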

The login page is linked from a website - it's a visible wall for an attacker to beat on. It's evidence that something exists worth attacking for. Guessing the URL, however, is blind. It requires being on the right domain (and subdomain), and operating on faith that, even after tens of thousands of incorrect guesses, you're still going to turn something up.

Absolutely true, and for this reason I have seen some companies hiding their intranet login pages on slightly unpredictable URLs. Is it something to rely on? Definitely not. Is it something that might stop certain low-profile attackers? Definitely.

Either way, this only provides limited benefit on its own, compared to the large trade-off described in the first paragraph.

The URL has an extra susceptibility to being indexed/spidered externally. However, most respectable spiders don't "guess" at sites, they just follow links. A malicious "guessing" spider would be caught by the same DoS/spam protection as point 2.

The only issue with spiders is that they might find a random cache somewhere where the URL was linked and index this in a way that is easier to find for others. Random guessing is indeed not a problem.

David Mulder
-2

I've personally used HTTPS-with-unguessable-URL as a protocol for delivering files securely. With browser history turned off at the receiving side, and if the URL is communicated as securely as a password would be, this is pretty much as secure as an HTTPS login page. Which is much less secure than, e.g., GnuPG.
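
For reference, a minimal sketch of how such an unguessable URL could be generated is shown below. The base URL and function name are hypothetical; `secrets.token_urlsafe` is Python's standard-library source of URL-safe random tokens.

```python
import secrets

BASE_URL = "https://example.com/review/"  # hypothetical base location


def make_unguessable_url(base_url: str = BASE_URL, entropy_bytes: int = 32) -> str:
    """Append a random, URL-safe token built from `entropy_bytes` random bytes."""
    token = secrets.token_urlsafe(entropy_bytes)
    return f"{base_url}{token}/"


if __name__ == "__main__":
    print(make_unguessable_url())
```

This only addresses guessability. It does nothing about the URL being copied into logs, histories or Referer headers, which is the concern raised by the other answers.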

Atsby
  • 1
    The URL itself is not encrypted, though. Any node in the path can see the URL and then retrieve that file. – schroeder May 13 '15 at 22:29
  • 1
    @schroeder The use of HTTP**S** prevents that (yes, URLs are sent through SSL/TLS in HTTP**S**). Of course, you need a real certificate, as opposed to self-signed, to avoid MITM attacks. – Atsby May 13 '15 at 22:40
  • True, but with notable exceptions: client-side proxies, the server's access logs, and load-balancers/SSL off-loaders on the server's side. – schroeder May 13 '15 at 23:02
  • @schroeder Well, technically a POST password passes through all of those as well in those scenarios. The only thing different would be the logging. – Atsby May 14 '15 at 01:22
-3

Like others have said, if you plan on leaving this directory and website up for only a day or two with confidential information and data, and you are in a serious time crunch, then this would be OK but not "best practice". In other words, it's not recommended, but it is an option if you feel you need to take the chance.

The main issue with this concept is: what if the client has a rootkit or key logger on his or her machine? What if another party obtains the link? What I'm saying is that anyone who obtains this link will be able to access this confidential data. I would put a quick login on the page so that only the clients the data is intended for can access it.

702cs
  • 2
    If a client's device is compromised, then you need authentication factors that do *not* require trusting the client machine to keep the client's input to itself. A password would equally be captured by a keylogger. – Steve Dodier-Lazaro May 12 '15 at 23:08
  • 702cs, as @SteveDL already commented, your solution does not protect against loggers. Also, this type of risky behavior should be avoided, not endorsed, as it's high risk and you cannot predict the outcome. Keep up and keep on answering (we all get 'bad' answers before we learn what good answers are ;) ) – LvB May 13 '15 at 01:35