How do proxy servers sniff data?

Question

I live in a country where most websites on the internet are blocked by the government, so we mostly use a wide variety of proxies such as Web proxies, VPNs, and SOCKS, and most of them are free.

My question is: is there any way for those proxy servers to sniff our data even if we use SSL on the webpage? If there is, how?

There are some technical details here that don't seem to be addressed. Does sniffing mean recording traffic, or more specifically that confidentiality is breached? A proxy or gateway can record all of your traffic whether it is encrypted or not. You would then be relying on key secrecy and perfect forward secrecy (PFS) features to protect current and past communications. Is this a concern in your scenario? — adric, May 01 '13 at 12:07

score 18 · Accepted Answer · edited Oct 07 '21 at 06:58

When a Web browser uses an HTTP proxy, things go the following way. Let's assume that the target URL is http://www.example.com/index.html. The browser connects to the proxy and says to it: "I want to get the page at http://www.example.com/index.html". The proxy complies, gets the data, and sends it back to the browser. By construction, the proxy sees all the data. The connection between proxy and browser could be encrypted, but the proxy still sees everything.
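To make the plain-HTTP case concrete, here is a minimal sketch of the bytes a browser would send to the proxy. The helper name `build_proxy_get` and the exact header set are my own illustration, not what any particular browser emits; the point is only that the full URL travels in clear text to the proxy.

```python
from urllib.parse import urlsplit


def build_proxy_get(url: str) -> bytes:
    """Build a request for a plain http:// URL as it would be sent to an HTTP proxy.

    Hypothetical helper for illustration; real browsers send more headers.
    """
    parts = urlsplit(url)
    if parts.scheme != "http":
        raise ValueError("this sketch only covers plain http:// URLs")
    lines = [
        f"GET {url} HTTP/1.1",    # the request line carries the absolute URL
        f"Host: {parts.netloc}",  # the proxy learns the target host
        "Connection: close",
        "",                       # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


request = build_proxy_get("http://www.example.com/index.html")
print(request.decode("ascii"))
```

Everything in that request, and everything in the response, is readable by the proxy.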

If the target URL uses SSL (https://www.example.com/index.html: mind the s in https), then the browser connects to the proxy, and tells it: "I want to connect to port 443 of www.example.com; do it and then relay all bytes in both directions". Section 5.2 of RFC 2817 describes this mechanism. The proxy then acts as a relay for all the bytes between the browser and the Web server, these bytes encoding whatever the browser and Web server wish them to be, i.e. in practice an SSL handshake and subsequent data. In that case, the proxy is outside of the SSL tunnel and cannot see the exchanged data. The SSL tunnel is still between the Web server and the browser, and the proxy is purely a transport mechanism.
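The CONNECT exchange described above can be sketched as follows. The function names are mine, and this only builds and parses the messages; it does not open a socket. Note what the proxy learns here: the host name and port, nothing else.

```python
def build_connect(host: str, port: int = 443) -> bytes:
    """Build the CONNECT request asking the proxy to open a raw tunnel."""
    target = f"{host}:{port}"
    return (
        f"CONNECT {target} HTTP/1.1\r\n"
        f"Host: {target}\r\n"
        "\r\n"
    ).encode("ascii")


def tunnel_established(status_line: bytes) -> bool:
    """Return True if the proxy answered with a 2xx status.

    After a 2xx reply the proxy just relays bytes, and the browser starts
    the SSL/TLS handshake with the real server through that tunnel.
    """
    parts = status_line.split(None, 2)
    return len(parts) >= 2 and parts[1][:1] == b"2"


connect_request = build_connect("www.example.com")
print(connect_request.decode("ascii"))
```

The path (`/index.html`), cookies, and page contents are only sent after the handshake, inside the encrypted tunnel.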

The proxy is still in an ideal position to attempt Man-in-the-Middle attacks, since all communications go through it; but SSL is protected against that, namely by virtue of the Web server certificate being validated by the client. This of course relies on the human user not clicking through warnings about invalid certificates. In some organizations, local sysadmins install extra "rogue" root CA certificates in desktop systems so that they may create fake server certificates on the proxy, and do the MitM attack, granting them access to the exchanged data (so that they can apply antivirus filters even on HTTPS traffic, or so they say at least).

Barring such a rogue CA installation (which is, basically, a breach into the security of the client machine), the proxy won't be able to peek at the data exchanged between the browser and the HTTPS server. The proxy will still be able to know which server was contacted, and to observe the size of the data exchanges: encryption hides the contents, not the size.
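One way to look for such an interception yourself is to check who issued the certificate you are actually being served. The sketch below uses Python's standard `ssl` module; `fetch_peer_cert` and `issuer_fields` are names I made up, and the sample certificate is invented for the demonstration. Two caveats: the script connects directly rather than through your browser's proxy settings, so it only reflects the network path that connection takes, and `getpeercert()` reports the immediate issuer of the server certificate, which is not necessarily the root CA.

```python
import socket
import ssl


def fetch_peer_cert(host: str, port: int = 443, timeout: float = 10.0) -> dict:
    """Connect with normal validation and return the server certificate as a dict."""
    context = ssl.create_default_context()  # verifies the chain and the host name
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()


def issuer_fields(cert: dict) -> dict:
    """Flatten the nested 'issuer' tuples from getpeercert() into a plain dict."""
    return {key: value for rdn in cert.get("issuer", ()) for key, value in rdn}


# Invented sample in the shape getpeercert() returns, as an intercepting
# proxy with a locally installed CA might present it.
sample_cert = {
    "issuer": (
        (("countryName", "XX"),),
        (("organizationName", "Example Corp Internal CA"),),
        (("commonName", "Example Corp Proxy CA"),),
    )
}
issuer = issuer_fields(sample_cert)
print(issuer["organizationName"])
```

If the issuer of a public site turns out to be your employer or your proxy vendor, validation succeeded only because that CA was added to the machine's trust store.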

The situation for SOCKS is conceptually similar. The dialog between browser and proxy is different, but the basics remain the same: the proxy will be able to see all the HTTP traffic, and none of the HTTPS traffic.
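For comparison, here is roughly what the different dialog looks like, assuming SOCKS version 5 (RFC 1928) with no authentication and the target given as a domain name. Those choices and the function names are mine; the answer above does not say which SOCKS version is meant.

```python
import struct

SOCKS_VERSION = 5
CMD_CONNECT = 1
ATYP_DOMAIN = 3


def socks5_greeting() -> bytes:
    """Client greeting: version 5, one method offered, method 0 (no authentication)."""
    return bytes([SOCKS_VERSION, 1, 0])


def socks5_connect_request(host: str, port: int) -> bytes:
    """CONNECT request naming the target by domain name and port."""
    name = host.encode("ascii")
    if len(name) > 255:
        raise ValueError("domain name too long for a SOCKS5 request")
    header = bytes([SOCKS_VERSION, CMD_CONNECT, 0, ATYP_DOMAIN, len(name)])
    return header + name + struct.pack("!H", port)  # port in network byte order


socks_request = socks5_connect_request("www.example.com", 443)
print(socks_request)
```

As with HTTP CONNECT, the proxy sees the target host and port in the clear and then relays opaque bytes.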

Edit: apparently, some other kinds of "proxies" exist, in which your browser contacts a dedicated server, which itself runs a browser, does your browsing for you, and returns the pages to you. Unfortunately, some people call those "Web proxies", a term which has been in use for two decades to designate what is, technically, an HTTP proxy. Confusion ensues.

Such "browsing proxies" are similar, in concept, to opening a session on the server with a remote desktop protocol and running your browser from there. This grants all conceivable power to the proxy administrators over your browsing activities. You don't want to do that if you don't absolutely trust the proxy sysadmins.

And you'd better not use the "Web proxy" expression if there is any risk of ambiguity.

Could you elaborate on what happens when a rogue CA is installed by the sysadmins? For example do EV certificates still show as EV, and is it always possible to see the rogue CA on top of the certification path? There is a rogue CA here, but they don't seem to be using it right now (I make sure the EV cert shows whenever I log in). — Luc, May 02 '13 at 07:55
"EV" certificates are normal certificates which are marked as "EV", and ultimately come from a root CA that the browser knows as being "EV-compliant" (the marking is done with a Certificate Policy OID which is specific to each CA). At that point, it really depends on the browser and whether it can accept new "trusted CA" _with EV-issuing power_. Of course, the sysadmin who planted the rogue CA had administrative power on the machine, so he could do about everything he wished. — Thomas Pornin, May 02 '13 at 11:07

score 3 · Answer 2 · edited Mar 17 '17 at 13:14

The only plausible way I can think of is that you somehow allow the proxy software in question to add its own CA to your list of trusted root CAs. This will allow the proxy server to perform a MITM attack on your connection.

If this step is not done, there should be no way for the proxy server to read data sent over SSL.

See: Man-in-the-middle Blue Coat proxy SSL or what? for more information.

Adi · Answer 3 · 2013-05-01T20:53:12.970

First of all, I'd like to make a distinction between what are commonly known as HTTP proxies, to which the other answers apply (confidentiality will not be broken unless you add the proxy's CA to your trusted CAs), and Web proxies (we still can't agree on a name; you may call them HTML proxies or browsing proxies).

I won't go into details, but basically, a web proxy is a website which you visit and ask to "proxy"/"route" a web page for you. For example, HideMyAss, ZendProxy, AWebProxy, or basically any of the results if you Google for "Web proxy".

Those web proxies are capable of breaking the confidentiality of your HTTPS session, and of doing so without being detected. So, only use web proxies you trust, or, better, don't use them at all.

A web proxy stands between you and the SSL-enabled website you're visiting. It acts as an intermediary browser. It decrypts what the server sends you, and it encrypts what you send to the server. It manages your session and handles your cookies. Sitting in the application layer, it can do all of that and possibly more.

Since you're visiting https://web-proxy.com all the time, there's no reason for your browser to warn you when the proxy breaks the confidentiality.

On the other hand, proxies that you configure in the browser's or system settings can be secure. When visiting a website that should be secure, verify that you're on the right domain (such as "example.com") and that the HTTPS connection is not broken (usually a lock icon shows when it's secure). This is different from web proxies because with a web proxy, the domain name is the proxy's domain.
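The domain check described above can be written down as a small test. This is only a sketch: `is_direct_https` is a name I picked, and the `/browse?u=` URL layout is an invented example of how a web proxy might embed the real target.

```python
from urllib.parse import urlsplit


def is_direct_https(url: str, expected_host: str) -> bool:
    """True only if the address-bar URL is HTTPS and its host is the site itself."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    return parts.scheme == "https" and host == expected_host.lower()


# Configured HTTP/SOCKS proxy: the address bar still shows the real site.
print(is_direct_https("https://example.com/login", "example.com"))

# Web proxy: the TLS session is with the proxy's domain, not the site.
print(is_direct_https(
    "https://web-proxy.com/browse?u=https://example.com/login", "example.com"
))
```

In the second case the lock icon is genuine, but it vouches for web-proxy.com, which then talks to example.com on your behalf.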

Manishearth · Answer 4 · 2013-05-01T20:57:00.707

Not really. However, make sure you type https in the URL bar and not http -- most websites 301 redirect to HTTPS, but the proxy server can stop this from happening.

Also, watch out for phishing-like attacks: the proxy server may serve you a site like mаil.google.com (the a is Cyrillic). This ought to pop up a warning on most implementations, though, since redirecting from an HTTPS site requires the certificate of that site. Also, a lot of proxy servers that have a landing page (with login) generally ask you to add an SSL exception (because trying to access an HTTPS page before login will lead to an insecure redirect). There is a chance that this SSL exception will be exploited.
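Since the look-alike character is invisible to the eye, a quick way to spot it is to list every non-ASCII character in the host name together with its Unicode name. A minimal sketch using Python's standard library follows; the function name is mine, and Python's built-in `idna` codec implements the older IDNA 2003 rules, so treat the encoded form as illustrative.

```python
import unicodedata


def non_ascii_chars(hostname: str) -> list:
    """Return (character, Unicode name) pairs for every non-ASCII character."""
    return [
        (ch, unicodedata.name(ch, "UNKNOWN"))
        for ch in hostname
        if ord(ch) > 127
    ]


spoofed = "m\u0430il.google.com"  # U+0430 looks like a Latin "a"
print(non_ascii_chars(spoofed))
print(non_ascii_chars("mail.google.com"))

# The ASCII form actually used in DNS makes the difference obvious:
# the first label becomes a Punycode label with the "xn--" prefix.
wire_form = spoofed.encode("idna")
print(wire_form)
```

Note that the spoofed name here is a different host under google.com, so in this particular example it would still need a record in Google's own zone; the risk is greater when the substitution is in the registrable part of the name.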

Note that the proxy server can see the domain you are trying to access.

As @Adnan correctly mentions, Web proxies (web pages where you just enter a website address, with no browser or network configuration) are completely insecure. Web proxies fetch pages on your behalf and relay the HTML to you (unlike other proxies, which relay a slightly modified raw response to you). They post requests on your behalf, and they are the "client" in HTTPS connections (they just hand you a copy of the decrypted HTML and encrypt your responses for you). So they have the option to read all your data. If you want to be secure from sniffing, use HTTPS over an HTTP/SOCKS proxy.

@oleksii: oops. Don't know why I wrote that :P Fixed, thanks :) — Manishearth, May 01 '13 at 08:32
I knew I should look out for "mai1.google.com", but that Cyrillic "a" just looks entirely the same. This is scary. How about stаckexchange.com (note the Cyrillic "а" in there), can you register a domain like that too? Or does it work on subdomains only? I wonder because the top- and second-level domains are not under your control, but the subdomain(s) can be used without registration and managed by your own nameservers. — Luc, May 01 '13 at 10:24
@Luc: Technically mai1.google.com is safe as well, since Google owns the domain (until someone uses a Cyrillic o in the `google`). I'm not sure how easy it is to register a domain with such characters and get a certificate. These things are generally automated, so there's a chance one can slip by unnoticed. They seem to have gotten wise after the null-character fiasco, though, so they may have specific checks in place to prevent phishing. — Manishearth, May 01 '13 at 10:31
