Discord Attachments and Privacy

Attachment privacy

Discord attachments have no access control list (ACL). Anyone with the URL to an attachment can download it without any authentication. This is insecure by design: it is a typical security-through-obscurity approach. We will see why that is the case in a second.

First, let us see how the attachment URL is generated. For example, here is a picture attachment uploaded by someone in an NYC mech keyboard group.

https://media.discordapp.net/attachments/457683857911840769/505482925517570048/IMG_20181026_022500.jpg

The construction of this URL is basically

hostname/attachments/{ChannelID}/{MessageID}/{FileName}

So here we have

ChannelID = 457683857911840769
MessageID = 505482925517570048
FileName = IMG_20181026_022500.jpg
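To make the URL layout concrete, here is a minimal sketch that splits an attachment URL into those three parts. The helper name parseAttachmentUrl is made up for this post and is not part of any Discord API; it only assumes the path layout shown above.

```javascript
// Split a Discord attachment URL into its path components.
// parseAttachmentUrl is a hypothetical helper, not a Discord API.
function parseAttachmentUrl(url) {
  const { hostname, pathname } = new URL(url);
  // Expected path: /attachments/{ChannelID}/{MessageID}/{FileName}
  const match = pathname.match(/^\/attachments\/(\d+)\/(\d+)\/([^/]+)$/);
  if (!match) return null;
  const [, channelId, messageId, fileName] = match;
  // IDs stay strings: they are larger than Number.MAX_SAFE_INTEGER.
  return { hostname, channelId, messageId, fileName: decodeURIComponent(fileName) };
}

const parsed = parseAttachmentUrl(
  'https://media.discordapp.net/attachments/457683857911840769/505482925517570048/IMG_20181026_022500.jpg'
);
console.log(parsed);
```

The IDs are kept as strings on purpose, since converting them to a plain JavaScript number would silently lose the low bits.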

I imagine Discord engineers argue that because channel IDs, message IDs, and maybe file names are practically impossible to guess, this scheme is secure against unauthorized access to attachments.

Is it true that they are impossible to guess? Before digging in any deeper, let us see how the IDs are generated.

Snowflakes

The IDs are in a format called Snowflake. See the Discord documentation for details. The gist is that the top 42 bits of the ID are essentially a timestamp: milliseconds since the Discord epoch, 2015-01-01 00:00:00 UTC. Let us take the image attachment posted above, which gives us a channel ID and a message ID.

A channel ID is generated when the channel is created, so the channel ID in an attachment link leaks some metadata (the creation time) of the channel. Usually you would need to be a member to get channel and guild information, but if someone shares an attachment link, you can tell when the channel was created.

Let’s look at our example, where the channel ID is 457683857911840769. We quickly see that the channel was created at Unix timestamp 1529190735, or Jun 16 2018 (UTC).

We can do the same thing with the message ID, which in our case is 505482925517570048. Decoding it gives us Unix timestamp 1540586920, which is Oct 26 2018 (UTC).

Moral of the story

  • To Discord users: Don’t share attachment links if you don’t like people knowing your channel’s creation time and the timestamp of the message. But in the worst case, consider all your attachments public.
  • To Discord devs: Implement a proper ACL for the attachments and don’t rely on the secrecy of the IDs.

Script to convert Snowflake to timestamp

Requires bignumber.js. Install that first.

const BigNumber = require('bignumber.js')

// Discord epoch: 2015-01-01T00:00:00Z, in Unix milliseconds.
const DISCORD_EPOCH = new Date("2015-01-01").getTime()

function sfDecode(sf) {
  // Snowflakes do not fit in a JS number, so work on a 64-character binary string.
  const padded = new BigNumber(sf).toString(2).padStart(64, '0')
  // Top 42 bits: milliseconds since the Discord epoch.
  const ts = new BigNumber(padded.slice(0, 42), 2)
  return {
    timestamp: new Date(DISCORD_EPOCH + ts.toNumber()),
    wid: parseInt(padded.slice(42, 47), 2), // 5-bit worker ID
    pid: parseInt(padded.slice(47, 52), 2), // 5-bit process ID
    cnt: parseInt(padded.slice(52, 64), 2), // 12-bit counter
  }
}
// Prints the channel creation time in Unix milliseconds: 1529190735081
console.log(sfDecode("457683857911840769").timestamp.getTime())

Wait, there is more!

I appreciate that you are still reading this meaningless post. In this section we’ll see how an attacker could exploit the weakness in the attachment URL to gain access to attachments in channels the attacker is not a member of.

Everything here is theoretical. Even though I don’t like Discord myself, I do not condone brute-forcing or any other attack against Discord.

There is no randomness in the URL; it relies entirely on the obscurity of the IDs, so it shouldn’t be too hard to attack.

Brute force and entropy

Here we assume the attacker knows the channel ID and maybe the filename. (Maybe the attacker used to be part of the channel before he/she got kicked out.)

We already know that the first 42 bits of the message ID are a timestamp. If the attacker knows roughly when a message was posted, how many numbers would the attacker have to guess before hitting the correct message ID? Let’s do some math and find out!

Let’s use our example, where message ID is

505482925517570048
=
0000011100000011110101100000101011100111100000000000000000000000

And let’s split it into the four snowflake fields: timestamp || worker ID || process ID || counter (each number below is printed in decimal)

120516520862 || 0 || 0 || 0

Adding the timestamp field to the Discord epoch, we finally get something readable by humans

Timestamp = Fri Oct 26 2018 20:48:40 GMT
WorkerID = 0
ProcessID = 0
Counter = 0
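As a sanity check on the field layout, the four fields can be packed back into the original ID. This is a small sketch using the built-in BigInt instead of bignumber.js; sfEncode is a name I made up for illustration, and the shift amounts simply follow the 42/5/5/12 bit split described above.

```javascript
// Discord epoch (2015-01-01T00:00:00Z) in Unix milliseconds.
const DISCORD_EPOCH_MS = 1420070400000n;

// Pack the four fields back into one 64-bit snowflake.
// sfEncode is a hypothetical helper name for this sketch.
function sfEncode({ ms, wid, pid, cnt }) {
  return (BigInt(ms) << 22n) | (BigInt(wid) << 17n) | (BigInt(pid) << 12n) | BigInt(cnt);
}

const id = sfEncode({ ms: 120516520862, wid: 0, pid: 0, cnt: 0 });
console.log(id.toString()); // 505482925517570048

// Unix time of the message: timestamp field plus the Discord epoch.
const unixMs = 120516520862n + DISCORD_EPOCH_MS;
console.log(new Date(Number(unixMs)).toISOString()); // 2018-10-26T20:48:40.862Z
```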

Some heuristics, based on my very limited data set:

  • Process ID is always 0; I have not seen an attachment link whose snowflake has a nonzero Process ID. So let’s give it 0 bits of entropy.
  • Worker ID is almost always 0, but sometimes goes up to 3 or 4. Let’s give it 3 bits.
  • Counter is almost always 0, but I’ve seen it go up to 20. Let’s give it 5 bits.

If our attacker only knows the day the message was posted, he/she would have to guess the exact millisecond within that day. There are 86400000 milliseconds in a day, which takes about 27 bits.

So to brute force the message ID given the day of the message, the attacker needs to brute force about 27 + 3 + 5 = 35 bits.

Similarly, if the attacker knows the hour of a message, there are 3600000 milliseconds to cover (about 22 bits), so the attacker needs to brute force about 30 bits. Luckily Discord is smart enough to use Cloudflare, likely with rate limiting, so an attacker can’t easily make 2^30 guesses without being throttled.
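The arithmetic behind those two numbers can be reproduced in a few lines. The bit allowances for worker ID, process ID and counter are the guesses from my small data set above, not anything Discord guarantees.

```javascript
// Bits needed to enumerate every millisecond in a window, rounded up.
function timestampBits(windowMs) {
  return Math.ceil(Math.log2(windowMs));
}

// Heuristic allowances from the observations above (assumptions, not guarantees).
const WORKER_BITS = 3;
const PROCESS_BITS = 0;
const COUNTER_BITS = 5;
const fixedBits = WORKER_BITS + PROCESS_BITS + COUNTER_BITS;

const DAY_MS = 24 * 60 * 60 * 1000; // 86,400,000
const HOUR_MS = 60 * 60 * 1000;     // 3,600,000

const dayBits = timestampBits(DAY_MS) + fixedBits;
const hourBits = timestampBits(HOUR_MS) + fixedBits;
console.log(dayBits, hourBits); // 35 30
```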

Google Cloud Storage

Any attachment can be accessed through two different links.

https://media.discordapp.net/attachments/457683857911840769/505482925517570048/IMG_20181026_022500.jpg
https://cdn.discordapp.com/attachments/457683857911840769/505482925517570048/IMG_20181026_022500.jpg

Both go through Cloudflare. But if you GET the second link, you will see some Google response headers. Maybe Discord is using Google Cloud Storage for the attachments. Let’s try asking Google Cloud Storage directly, with the bucket name discord and the object name being the path of the attachment URL with / encoded as %2F.

https://www.googleapis.com/storage/v1/b/discord/o/attachments%2F457683857911840769%2F505482925517570048%2FIMG_20181026_022500.jpg
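For reference, this is roughly how the URL above is put together. The sketch only builds the string and makes no request; gcsObjectUrl is a hypothetical helper name, and the bucket name discord is just the guess made above.

```javascript
// Build a Google Cloud Storage JSON API metadata URL for an object.
// gcsObjectUrl is a hypothetical helper; "discord" is the guessed bucket name.
function gcsObjectUrl(bucket, objectName) {
  // The object name is a single path segment, so "/" must become "%2F".
  return 'https://www.googleapis.com/storage/v1/b/' +
    encodeURIComponent(bucket) + '/o/' + encodeURIComponent(objectName);
}

const attachment = new URL(
  'https://media.discordapp.net/attachments/457683857911840769/505482925517570048/IMG_20181026_022500.jpg'
);
// Drop the leading "/" of the path to get the object name.
const objectName = attachment.pathname.slice(1);
const metadataUrl = gcsObjectUrl('discord', objectName);
console.log(metadataUrl);
```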

We get this response

{
 "kind": "storage#object",
 "id": "discord/attachments/457683857911840769/505482925517570048/IMG_20181026_022500.jpg/1540586921108736",
 "selfLink": "https://www.googleapis.com/storage/v1/b/discord/o/attachments%2F457683857911840769%2F505482925517570048%2FIMG_20181026_022500.jpg",
 "name": "attachments/457683857911840769/505482925517570048/IMG_20181026_022500.jpg",
 "bucket": "discord",
 "generation": "1540586921108736",
 "metageneration": "1",
 "contentType": "image/jpeg",
 "timeCreated": "2018-10-26T20:48:41.108Z",
 "updated": "2018-10-26T20:48:41.108Z",
 "storageClass": "STANDARD",
 "timeStorageClassUpdated": "2018-10-26T20:48:41.108Z",
 "size": "3212729",
 "md5Hash": "95Ke7eNUlRiBoY9f8aw8wA==",
 "mediaLink": "https://www.googleapis.com/download/storage/v1/b/discord/o/attachments%2F457683857911840769%2F505482925517570048%2FIMG_20181026_022500.jpg?generation=1540586921108736&alt=media",
 "cacheControl": "max-age=2592000",
 "crc32c": "kcqF/A==",
 "etag": "CIDKtsH9pN4CEAE="
}

Welp, it also leaks a bunch of metadata that’s supposed to be viewable only by message participants. And maybe this can be used to bypass Cloudflare’s rate limiting and anti-DDoS protection?
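To give a feel for what the metadata reveals, here is a small sketch that lines up the timeCreated field from the response with the timestamp inside the message ID. The two field values are copied from the response above; snowflakeMs is a helper name I made up, and reading the gap as "upload delay" is my interpretation, not something the API states.

```javascript
const DISCORD_EPOCH = 1420070400000n;

// Unix milliseconds encoded in the top 42 bits of a snowflake.
// snowflakeMs is a hypothetical helper name for this sketch.
function snowflakeMs(id) {
  return Number((BigInt(id) >> 22n) + DISCORD_EPOCH);
}

// Two fields copied from the response above.
const metadata = {
  timeCreated: '2018-10-26T20:48:41.108Z',
  size: '3212729',
};

const messageMs = snowflakeMs('505482925517570048');
const uploadMs = Date.parse(metadata.timeCreated);
console.log(uploadMs - messageMs); // 246 (milliseconds between ID and stored object)
console.log((Number(metadata.size) / 1024 / 1024).toFixed(2)); // 3.06 (MiB)
```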