Cross-IPFS-site scripting

IPFS vs same-origin policy

November 09, 2018 (Last Modified: November 11, 2018)

Introduction

These days, browsers are pretty secure, and some (Firefox, Brave) are even privacy conscious and block third-party trackers by default. But today’s browsers are ultimately designed for HTTP, not IPFS, and they have a different threat model in mind. All the sites on IPFS that you reach through a given gateway are served from the same origin, the gateway’s, which has some interesting implications for privacy and security.

How IPFS gateway works

An IPFS gateway is a web server that connects to an IPFS node daemon. When you run ipfs daemon, you start a network-connected IPFS node, which opens an HTTP API server and starts a web server as the gateway, by default at localhost:8080.

When you want to visit some site on IPFS, you simply open your browser, and in the address bar you type something like

http://localhost:8080/ipfs/QmTWbkEyU2acYD2NHNgZRc4tN943m1vGEdguFQAjxbHZtH

The hash points to an old version of the root of this blog. Here we use our IPFS gateway at localhost:8080. The browser sends a GET request to the gateway, telling the gateway that the page it wants has that particular multihash. The gateway talks to the local IPFS node, which does some libp2p magic to find a peer that has the file via the DHT (distributed hash table), then fetches the file from that peer. The local node returns the file to the gateway server, and the gateway server returns the page to the browser in the response body of the GET request.

The browser has no idea that this is an IPFS link, so it treats it like a regular HTTP site, applying all the security measures as if it is a regular website.

If you don’t run your own local IPFS node and instead use someone else’s gateway, for example ipfs.io or cloudflare-ipfs.com, basically the same thing happens, except that the browser sends the GET request to an external gateway instead of your localhost.

Side note 1: Using an external IPFS gateway is pretty bad for privacy, since the gateway can see which files you are getting, and it may well log every request.

Side note 2: Most gateway operators host their gateway behind a firewall, a load balancer, a web server (e.g. Nginx, Apache), or some combination of these. Your request may pass through other services before it reaches the gateway server.

Violation of same-origin policy

Browsers have a security feature called the same-origin policy. To quote from MDN:

The same-origin policy is a critical security mechanism that restricts how a document or script loaded from one origin can interact with a resource from another origin. It helps to isolate potentially malicious documents, reducing possible attack vectors.

An IPFS gateway by design violates this policy, because all webpages are served from the same origin, the gateway’s, regardless of whether the pages are under the same merkle root. So all the webpages on IPFS (on the same gateway) are considered “same-origin”, when they should not be.
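
To see this concretely, compare the origin a browser computes for two unrelated IPFS sites on one gateway. This is a minimal sketch using the standard URL API; the second hash is a made-up placeholder, and my-dapp.com stands in for any site served from its own domain.

```javascript
// An origin is scheme + host + port. The path, and so the IPFS hash, is not part of it.
const siteA = new URL('http://localhost:8080/ipfs/QmTWbkEyU2acYD2NHNgZRc4tN943m1vGEdguFQAjxbHZtH/')
// Hypothetical second site: this hash is a placeholder, not real content.
const siteB = new URL('http://localhost:8080/ipfs/QmSomeOtherSiteHashPlaceholder/')
// Hypothetical site served from its own domain.
const siteC = new URL('https://my-dapp.com/')

const sameOrigin = (a, b) => a.origin === b.origin

console.log(siteA.origin)             // http://localhost:8080
console.log(sameOrigin(siteA, siteB)) // true
console.log(sameOrigin(siteA, siteC)) // false
```

Everything the browser scopes by origin (local storage, cookies, DOM access between frames) is therefore shared between siteA and siteB.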

Negative examples

In this section I will list some negative examples where a user’s privacy can be violated, or sensitive data leaked to adversaries.

SPAs (single-page apps) that use local storage

Alice made a wonderful SPA that lets users send Ether, similar to myetherwallet.com. The user enters their private key, the app gets the blockchain state from some upstream RPC server (say, etherscan.io), then the app uses the private key to generate a transaction and an ECDSA signature, and finally the app posts the signed transaction to the upstream RPC.

Alice wants to improve the user experience by storing the private key in local storage, under the key privKey, so users do not have to type their key every time.

Mallory knows about Alice’s app. So Mallory hosts a webpage on IPFS with some JavaScript that reads privKey from local storage, then uses XHR to send this data to a server Mallory controls. Because both pages are served from the gateway’s origin, any user who has used Alice’s app and saved a private key gets that key stolen when they visit Mallory’s site through the same gateway.
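
The shared storage can be simulated outside a browser. The sketch below models local storage as one bucket per origin, which is an assumption that mirrors how browsers scope it; the gateway host, the hashes and the key value are all placeholders.

```javascript
// Stand-in for the browser's per-origin local storage (an assumption for this
// sketch; a real browser does this bookkeeping internally).
const storageByOrigin = new Map()

function localStorageFor(pageUrl) {
  const origin = new URL(pageUrl).origin
  if (!storageByOrigin.has(origin)) storageByOrigin.set(origin, new Map())
  return storageByOrigin.get(origin)
}

// Hypothetical pages on the same gateway; the hashes are placeholders.
const alicePage = 'https://gateway.example/ipfs/QmAliceWalletApp/'
const malloryPage = 'https://gateway.example/ipfs/QmMalloryPage/'

// Alice's app saves the key for convenience.
localStorageFor(alicePage).set('privKey', 'not-a-real-key')

// Mallory's page gets the same bucket, because the origin is the same.
const leaked = localStorageFor(malloryPage).get('privKey')
console.log(leaked) // not-a-real-key
```

A page on any other origin would get its own empty bucket, which is the isolation the same-origin policy is meant to provide.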

Gateways that set cookies

Alice runs an IPFS gateway and she wants to spy on her users, maybe so she can sell users’ browsing history to advertisers. So upon first connection to the gateway, the gateway sets an alice_gateway_uid cookie in the response. In subsequent requests from the same browser, the browser happily includes the alice_gateway_uid cookie. Unfortunately, Alice forgot to make the cookie HttpOnly.

Mallory knows about Alice’s gateway. So Mallory hosts a webpage on IPFS with some JavaScript that reads this alice_gateway_uid cookie. Not only does Mallory learn that the user is using Alice’s gateway, Mallory also learns the user’s UID. Because the cookie is set by Alice, Mallory can use the same UID to track users too. Further, Mallory can even edit the cookie in JavaScript for impersonation attacks.
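
Reading such a cookie takes only a few lines. The sketch below parses a document.cookie-style string; readCookie is a hypothetical helper and the cookie values are invented for illustration.

```javascript
// Extract one cookie value from a `document.cookie`-style string
// ("name1=value1; name2=value2"). Returns null if the cookie is not visible.
function readCookie(cookieString, name) {
  for (const pair of cookieString.split('; ')) {
    const eq = pair.indexOf('=')
    if (eq !== -1 && pair.slice(0, eq) === name) {
      return pair.slice(eq + 1)
    }
  }
  return null
}

// In a browser this would be readCookie(document.cookie, 'alice_gateway_uid').
// The string below is an invented example of what that property could hold.
const visibleCookies = 'theme=dark; alice_gateway_uid=u-12345'
console.log(readCookie(visibleCookies, 'alice_gateway_uid')) // u-12345
```

Had the cookie been marked HttpOnly, the browser would leave it out of document.cookie and the lookup would return null, although the gateway itself would still receive it on every request.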

Example of a cross-IPFS-site tracking script

This is a fairly simple example. This script assigns the user a tracker ID, and logs the last 1000 links the user has been to. It can be effectively used to track a user across sites that include the same script. Furthermore, once the history is stored, any website on IPFS (on the same gateway) can read it out of local storage without any restrictions.

Note there are limitations:

- The history only includes links to sites that include the script on their pages
- It cannot track a user across gateways

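// Assign a persistent random tracker ID on first visit
let trID = localStorage.getItem('trackerID')
if (!trID) {
  const bytes = crypto.getRandomValues(new Uint8Array(32))
  // hex-encode the bytes, since local storage only holds strings
  trID = Array.from(bytes, (b) => b.toString(16).padStart(2, '0')).join('')
  localStorage.setItem('trackerID', trID)
}
// Prepend the current page to the stored history, newest first
// (declared with let because the array may be replaced below)
let trHistory = JSON.parse(localStorage.getItem('trackerHistory')) || []
trHistory.unshift({
  url: window.location.href,
  date: new Date(),
})
// Keep only the 1000 most recent entries
if (trHistory.length > 1000) {
  trHistory = trHistory.slice(0, 1000)
}
localStorage.setItem('trackerHistory', JSON.stringify(trHistory))
// or post the history elsewhere on the internet, or pastebin, etc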

Security recommendations

For gateway operators…

Domain separation: Serve the gateway on a domain dedicated to IPFS.
- A good example: Cloudflare has its IPFS gateway served on cloudflare-ipfs.com.
- A bad example: ipfs.io uses the same domain for both the IPFS homepage (ipfs.io/media) and the gateway (ipfs.io/ipfs/xxx).
  - I suspect they use some path-based load balancer or reverse proxy, so anything that starts with /ipfs or /ipns goes to the gateway, and everything else goes to another web server.
No cookies: It’s a good idea not to set any cookies or store any data in local storage under the gateway domain.
- If you must set cookies, make sure they are HttpOnly
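
For operators who do need a cookie, the header could be built along these lines. This is a sketch only: buildSetCookie and the cookie name are hypothetical, and the Secure and SameSite attributes are common hardening additions that go beyond the HttpOnly advice above.

```javascript
// Build a Set-Cookie header value. HttpOnly keeps the cookie out of
// document.cookie, Secure restricts it to HTTPS, and SameSite=Strict stops it
// from being attached to cross-site requests.
function buildSetCookie(name, value, { path = '/', maxAgeSeconds } = {}) {
  const parts = [`${name}=${encodeURIComponent(value)}`, `Path=${path}`]
  if (maxAgeSeconds !== undefined) parts.push(`Max-Age=${maxAgeSeconds}`)
  parts.push('HttpOnly', 'Secure', 'SameSite=Strict')
  return parts.join('; ')
}

console.log(buildSetCookie('gateway_session', 'abc123', { maxAgeSeconds: 3600 }))
// gateway_session=abc123; Path=/; Max-Age=3600; HttpOnly; Secure; SameSite=Strict
```

HttpOnly only hides the cookie from scripts on pages served by the gateway; it does not stop one of those pages from setting its own cookie with the same name.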

For web developers…

Use a separate domain with CNAME and dnslink: Instead of telling people that your site is at ipfs.io/ipfs/some-hash, you should get your own domain for your site, say my-dapp.com. Then you set up CNAME and dnslink in your DNS records. I have written a short post on how to do this.
Never use cookies or local storage: Other sites on IPFS can read and write your app’s storage. Even if you have a separate domain, you cannot prevent your users from visiting your site directly through a gateway, under the gateway’s domain.
- While you can set a cookie with path=/ipfs/your-site-hash, which hides it from document.cookie on pages at a different IPFS path, it is still a bad idea. The Path attribute is not a real security boundary: a same-origin page can still set a cookie for that path, or read yours through an iframe. The cookie is also sent to the gateway whenever the browser makes a request under that path, so the gateway learns the cookie and can overwrite it in an HTTP response.
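
To make the path-scoped cookie case concrete, the sketch below follows the path-matching rule from RFC 6265 (section 5.1.4) to show which requests would carry a cookie set with path=/ipfs/your-site-hash. The function name and the request paths are illustrative.

```javascript
// Cookie path matching as described in RFC 6265, section 5.1.4: the cookie is
// sent when the paths are equal, or when the cookie path is a prefix of the
// request path that ends on a "/" boundary.
function pathMatches(requestPath, cookiePath) {
  if (requestPath === cookiePath) return true
  if (!requestPath.startsWith(cookiePath)) return false
  return cookiePath.endsWith('/') || requestPath[cookiePath.length] === '/'
}

const cookiePath = '/ipfs/your-site-hash'
console.log(pathMatches('/ipfs/your-site-hash/index.html', cookiePath)) // true
console.log(pathMatches('/ipfs/another-site-hash/', cookiePath))        // false
console.log(pathMatches('/ipfs/your-site-hash-2/', cookiePath))         // false
```

Every request for which this returns true hands the cookie to the gateway server, which is why path scoping does not keep the value private from the gateway operator.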

For IPFS users, content consumers…

Disable cookies and site storage: They can be used to track you, and browsers don’t block them because they aren’t third-party cookies, so it’s often best just to disable cookies and site storage altogether. It’s recommended to use a separate browser with such settings, specifically for IPFS.

Conclusion and proposed solutions

It’s generally a bad idea to use a regular web browser for IPFS, because those browsers have a different threat model, and the security measures are simply not built for IPFS.

Security can be improved by separating sites under different folders (or “merkle roots”) on IPFS. A webpage under /ipfs/siteA/page1 should be considered a different “origin” than /ipfs/siteB/page1. However, this can be hard to enforce, and it’s always possible to refer to a file by just the multihash.
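
One way to picture the per-root idea is a function that extends the usual origin with the first path component after /ipfs/ or /ipns/. This is purely a sketch of the proposal: ipfsAwareOrigin is a hypothetical name, and no browser computes origins this way.

```javascript
// Derive a hypothetical per-site "origin" from a gateway URL by appending the
// root (the first path component after /ipfs/ or /ipns/) to the normal origin.
function ipfsAwareOrigin(pageUrl) {
  const url = new URL(pageUrl)
  const match = url.pathname.match(/^\/(ipfs|ipns)\/([^/]+)/)
  if (!match) return url.origin // not an IPFS path: fall back to the normal origin
  return `${url.origin}/${match[1]}/${match[2]}`
}

console.log(ipfsAwareOrigin('http://localhost:8080/ipfs/siteA/page1')) // http://localhost:8080/ipfs/siteA
console.log(ipfsAwareOrigin('http://localhost:8080/ipfs/siteB/page1')) // http://localhost:8080/ipfs/siteB
```

The sketch also shows the enforcement problem: the same page1 file fetched directly by its own multihash would land in yet another “origin”, separate from siteA.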

An easier solution is to simply disable cookies in browser, but that limits what an SPA hosted on IPFS can do.