DevTooling Discord

This post is very much inspired by Hussein Nasser's "Dev Tool them ALL!" series, where he picks an application he wants to learn more about, like Netflix or Reddit, and uses Chrome DevTools to get a better understanding of how the backend actually works. I noticed Discord hasn't been covered yet, so I figured I'd give it a shot myself. Discord has a well-documented API, which I've worked with before, mostly with bots. There are plenty of open-source libraries and wrappers for the bot API, but I'm more interested in how the web client interacts with regular user accounts. Hussein also takes note of specific things like request timing and HTTP version, but for the sake of time I've decided to omit that information for now as I'll be focusing mostly on the actual content and data.

Context

There is a lot to Discord. It is an absolutely massive application with lots of moving parts. For this post, I am only going to be devtooling the first page you see after logging in, the home/friends/DMs overview page at discord.com/channels/@me.

Initial HTML

Discord is built with React, so it initially serves a blank HTML page with a bunch of scripts that populate the page with data. What I did find interesting, however, was that embedded within the HTML page was a script that set window.GLOBAL_ENV to the following object:

text

API_ENDPOINT: '//discord.com/api',
API_VERSION: 9,
GATEWAY_ENDPOINT: 'wss://gateway.discord.gg',
WEBAPP_ENDPOINT: '//discord.com',
CDN_HOST: 'cdn.discordapp.com',
ASSET_ENDPOINT: '//discord.com',
MEDIA_PROXY_ENDPOINT: '//media.discordapp.net',
WIDGET_ENDPOINT: '//discord.com/widget',
INVITE_HOST: 'discord.gg',
GUILD_TEMPLATE_HOST: 'discord.new',
GIFT_CODE_HOST: 'discord.gift',
RELEASE_CHANNEL: 'stable',
MARKETING_ENDPOINT: '//discord.com',
BRAINTREE_KEY: 'production_5st77rrc_49pp2rp4phym7387',
STRIPE_KEY: 'pk_live_CUQtlpQUF0vufWpnpUmQvcdi',
NETWORKING_ENDPOINT: '//router.discordapp.net',
RTC_LATENCY_ENDPOINT: '//latency.discord.media/rtc',
ACTIVITY_APPLICATION_HOST: 'discordsays.com',
PROJECT_ENV: 'production',
REMOTE_AUTH_ENDPOINT: '//remote-auth-gateway.discord.gg',
SENTRY_TAGS: {"buildId":"9ab8626bcebceaea6da570b9c586172d02b9c996","buildType":"normal"},
MIGRATION_SOURCE_ORIGIN: 'https://discordapp.com',
MIGRATION_DESTINATION_ORIGIN: 'https://discord.com',
HTML_TIMESTAMP: Date.now(),
ALGOLIA_KEY: 'aca0d7082e4e63af5ba5917d5e96bed0',

This is actually pretty cool, because it gives us an overview of the various services and 'modules' that Discord seems to be composed of. Right off the bat, we have the API_ENDPOINT, API_VERSION, RELEASE_CHANNEL, and PROJECT_ENV. These just define the environment for Discord, and nothing all that special is going on.

The keys ending with _HOST and _ENDPOINT also are rather intuitive, and it's also clear that Discord is still using its legacy discordapp.com domain, especially for its CDN on cdn.discordapp.com which is much more difficult to migrate.
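
To illustrate how these values fit together, here is a minimal sketch that assembles the REST base URL from API_ENDPOINT and API_VERSION. The absolute_url helper and the choice of https as the default scheme are my own assumptions. Only the config values come from the page.

```python
# Values copied from the window.GLOBAL_ENV object above.
GLOBAL_ENV = {
    "API_ENDPOINT": "//discord.com/api",
    "API_VERSION": 9,
    "CDN_HOST": "cdn.discordapp.com",
}


def absolute_url(endpoint: str, scheme: str = "https") -> str:
    """Turn a protocol-relative URL ('//host/path') or a bare host into an absolute URL."""
    if endpoint.startswith("//"):
        return f"{scheme}:{endpoint}"
    if "://" in endpoint:
        return endpoint
    return f"{scheme}://{endpoint}"


api_base = f"{absolute_url(GLOBAL_ENV['API_ENDPOINT'])}/v{GLOBAL_ENV['API_VERSION']}"
cdn_base = absolute_url(GLOBAL_ENV["CDN_HOST"])

print(api_base)  # https://discord.com/api/v9
print(cdn_base)  # https://cdn.discordapp.com
```

The resulting base matches the /api/v9 prefix on the REST requests later in this post.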

There were a few domains I wasn't aware of at all here, especially the ACTIVITY_APPLICATION_HOST at discordsays.com. I decided to take a closer look and found this list detailing Discord's domains and their uses. Among those was discord.co, which just redirects you to discord.com, but if you visit admin.discord.co, you're redirected to what appears to be Discord's Cloudflare Access admin page.

I'm sure I could find plenty more internal tools hunting around, but honestly I'm not that interested and there's no way I'm going to be able to access them anyway. What is important to note however is that Discord is using the following services:

Stripe - payment processor and infrastructure
Braintree - payments partner
Sentry - application monitoring and error tracking
Algolia - search provider

That's a reasonable amount of information to discern from just the initial HTML, especially for a React app, so now let's take a look at the requests that follow.

Discord's Gateway

Right after loading some CSS and JS, the first connection we establish is to Discord's gateway at wss://gateway.discord.gg/?encoding=json&v=9&compress=zlib-stream. You might have noticed in the URL params that this connection is actually compressed with zlib, meaning that we won't be able to read the content of the messages without first decompressing them.
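
The query string is easy to pull apart with the standard library. This is just a sketch for reading the parameters, nothing Discord-specific:

```python
from urllib.parse import parse_qs, urlsplit

GATEWAY_URL = "wss://gateway.discord.gg/?encoding=json&v=9&compress=zlib-stream"

parts = urlsplit(GATEWAY_URL)
# parse_qs returns a list per key; each parameter appears only once here.
params = {key: values[0] for key, values in parse_qs(parts.query).items()}

print(parts.scheme, parts.netloc)  # wss gateway.discord.gg
print(params)  # {'encoding': 'json', 'v': '9', 'compress': 'zlib-stream'}
```

So the payloads are JSON, the gateway version matches API_VERSION from GLOBAL_ENV, and compression is applied to the stream as a whole.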

Upon first glance, it seems like the server sends us a small packet laying down some standards, the client responds with token and other information, and then the server sends us a massive 319 kilobyte packet of data. I assume this large packet contains the bulk of the information you see on the page, like all your friends, activities, servers, etc. Let's take a closer look.

I've put together a little Python script where you can paste in the raw hex data and it'll decompress it and output the JSON to a file:

Python

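import json
import zlib

payloads = [
    "Enter each copied hex value payload here",
    "One after another like this",
]

# zlib-stream shares one compression context across the whole connection,
# so a single decompressor must see every payload in the order it arrived.
decompressor = zlib.decompressobj()

for i, hex_string in enumerate(payloads):
    compressed = bytes.fromhex(hex_string)
    decompressed = decompressor.decompress(compressed)
    json_obj = json.loads(decompressed.decode("utf-8"))

    # Pretty-print each decoded payload to its own file.
    with open(f"payload-{i}.json", "w", encoding="utf-8") as file:
        json.dump(json_obj, file, indent=2)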
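
The script reuses one decompressor for every payload, and that matters: with zlib-stream the whole connection is a single zlib stream, so a frame from the middle generally won't decode on its own. Here is a small self-contained round trip that simulates the server side with zlib.compressobj. The sample messages are made up, and the trailing 00 00 ff ff marker is what a Z_SYNC_FLUSH leaves behind.

```python
import json
import zlib

ZLIB_SUFFIX = b"\x00\x00\xff\xff"  # marker left at the end of a Z_SYNC_FLUSH

# Simulated server side: one compressor for the whole connection.
compressor = zlib.compressobj()
messages = [{"op": 10, "d": {"heartbeat_interval": 41250}}, {"op": 0, "t": "READY"}]
frames = [
    compressor.compress(json.dumps(m).encode("utf-8")) + compressor.flush(zlib.Z_SYNC_FLUSH)
    for m in messages
]

# Client side: one decompressor, fed the frames in the order they arrived.
decompressor = zlib.decompressobj()
decoded = []
for frame in frames:
    assert frame.endswith(ZLIB_SUFFIX)  # each frame holds a complete message
    decoded.append(json.loads(decompressor.decompress(frame).decode("utf-8")))

print(decoded == messages)  # True
```

This is also why the hex payloads have to be pasted into the script in order, starting from the very first message on the socket.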

0. Initial Packet

JSON

{
  "t": null,
  "s": null,
  "op": 10,
  "d": {
    "heartbeat_interval": 41250,
    "_trace": ["[\"gateway-prd-us-east1-d-l709\",{\"micros\":0.0}]"]
  }
}

Yup, I was right. It specifies the heartbeat interval and seems to include some extra metadata about which specific gateway server we're connected to. I can recognize us-east1-d as a zone in Google Cloud's us-east1 region, which is located in Moncks Corner, South Carolina. I'm not sure what the l709 part is, but I'm guessing it's a specific server instance.
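
Reading these two values out of the packet takes only a few lines. One detail worth noting is that each _trace entry is itself a JSON string nested inside the JSON payload, so it needs a second parse. A minimal sketch using the packet above:

```python
import json

hello = {
    "t": None,
    "s": None,
    "op": 10,
    "d": {
        "heartbeat_interval": 41250,
        "_trace": ['["gateway-prd-us-east1-d-l709",{"micros":0.0}]'],
    },
}

# The interval is given in milliseconds.
interval_seconds = hello["d"]["heartbeat_interval"] / 1000

# Each _trace entry is a JSON-encoded string: [server name, timing info].
server_name, timing = json.loads(hello["d"]["_trace"][0])

print(interval_seconds)  # 41.25
print(server_name)       # gateway-prd-us-east1-d-l709
print(timing)            # {'micros': 0.0}
```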

Upstream Packet

After the server's initial packet, the client responds with this:

JSON

{
  "op": 2,
  "d": {
    "token": "[Redacted for hopefully obvious reasons]",
    "capabilities": 4093,
    "properties": {
      "os": "Linux",
      "browser": "Chrome",
      "device": "",
      "system_locale": "en-US",
      "browser_user_agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36",
      "browser_version": "107.0.0.0",
      "os_version": "",
      "referrer": "http://localhost:3000/",
      "referring_domain": "localhost:3000",
      "referrer_current": "",
      "referring_domain_current": "",
      "release_channel": "stable",
      "client_build_number": 158183,
      "client_event_source": null
    },
    "presence": {
      "status": "online",
      "since": 0,
      "activities": [],
      "afk": false
    },
    "compress": false,
    "client_state": {
      "guild_versions": {},
      "highest_last_message_id": "0",
      "read_state_version": 0,
      "user_guild_settings_version": -1,
      "user_settings_version": -1,
      "private_channels_version": "0",
      "api_code_version": 0
    }
  }
}

As expected, we see browser and referrer information, as well as the token we're using to authenticate. The capabilities field is interesting, as it's a bitfield: 4093 is 0b111111111101 in binary. This field isn't documented anywhere, but I'm guessing it behaves similarly to the intents field available for bots, which specifies which events you want to receive and the kind of privileges your application has. This is all very speculative, however.
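
Printing the value in binary and listing the set bit positions makes the bitfield easier to reason about. What each bit means is unknown to me, so this only shows which flags are on:

```python
capabilities = 4093

print(bin(capabilities))  # 0b111111111101

# Positions of the bits that are set, counting from the least significant bit.
set_bits = [bit for bit in range(capabilities.bit_length()) if capabilities >> bit & 1]
print(set_bits)  # [0, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
```

Every one of the lowest twelve bits is set except bit 1.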

From here we also set our presence information, which is all client authoritative, especially the activities. Discord integrates with its Game SDK which allows applications to manage their own presence information on the client.

1. READY

Here we go. The enormous 319 KB READY packet. Obviously I can't post the entire thing raw here, so I've removed any personal data and reduced long arrays to one or two elements. The "d" data field contains everything, so I've broken it up into multiple code snippets in order to explain things better. Keep in mind that this whole section is about one packet. Let's take a look.

JSON

{
  "t": "READY",
  "s": 1,
  "op": 0,
  "d": {
    "v": 9,
    "users": [
      {
        "username": "...",
        "public_flags": 0,
        "id": "123456789101112131",
        "discriminator": "1234",
        "avatar_decoration": null,
        "avatar": "b77e9d067cd4cb2998021c07fdc76b70"
      }
    ],
    "user_settings_proto": "[A long token I've redacted]",
    "user_guild_settings": {
      "version": 739,
      "partial": false,
      "entries": [
        {
          "version": 16,
          "suppress_roles": false,
          "suppress_everyone": false,
          "notify_highlights": 0,
          "muted": false,
          "mute_scheduled_events": false,
          "mute_config": null,
          "mobile_push": true,
          "message_notifications": 2,
          "hide_muted_channels": false,
          "guild_id": "123456789101112131",
          "flags": 0,
          "channel_overrides": [
            {
              "muted": true,
              "mute_config": null,
              "message_notifications": 2,
              "collapsed": false,
              "channel_id": "123456789101112131"
            }
          ]
        }
      ]
    }
  }
}

First we have the users array. This is all of the raw user data and is the base of every single display of a user on the page. In my case, this array contained over 200 users, and they all consistently had the same six fields. user_guild_settings contains the entries for every preference for every guild you're in. This includes things like whether or not you want to receive notifications for a specific channel, role, or anything else you can configure, as well as whether a channel is collapsed or not. It's pretty crazy how many minuscule preferences are stored on the server so that the experience across multiple clients is virtually the same.

JSON

{
  "user": {
    "verified": true,
    "username": "miapolis",
    "purchased_flags": 2,
    "public_flags": 128,
    "premium_type": 0,
    "premium": false,
    "phone": "[redacted]",
    "nsfw_allowed": true, // Shh...
    "mobile": true,
    "mfa_enabled": true,
    "id": "508420859476836364",
    "flags": 176,
    "email": "[my personal gmail, redacted]",
    "discriminator": "4249",
    "desktop": true,
    "bio": "\u27a5 **https://miapolis.io**",
    "banner_color": null,
    "banner": null,
    "avatar_decoration": null,
    "avatar": "6263cbeb027f04306b95bca72f488645",
    "accent_color": null
  }
}

About what I would expect, although I'm surprised that phone number and email information is that immediately accessible. I would think that there would be a more elaborate user object that would contain that information and wouldn't be sent to the client in the second packet.

JSON

{
  "sessions": [
    {
      "status": "dnd",
      "session_id": "all",
      "client_info": {
        "version": 0,
        "os": "unknown",
        "client": "unknown"
      },
      "activities": [
        {
          "type": 4,
          "name": "Custom Status",
          "id": "custom",
          "emoji": {
            "name": "\ud83e\udd48"
          },
          "created_at": 1668361470618
        }
      ],
      "active": true
    }
  ],
  "session_type": "normal",
  "session_id": "0f5e23fee8b995ea8fa3513ffd5b047e",
  "resume_gateway_url": "wss://gateway-us-east1-d.discord.gg"
}

The session information also contains additional sessions, which I'm assuming are other clients that are logged in to the same account and also online. These other sessions have an actual session id rather than just "all", and the session id for the current session is specified outside of the array.

JSON

{
  "relationships": [
    {
      "user_id": "123456789101112131",
      "type": 1,
      "nickname": "Your Mom",
      "id": "123456789101112131"
    }
  ]
}

The relationships field is interesting in how minimal the data is. There's the redundant user_id and id fields which are equivalent, but also a type field which has the value 1 for every single element in the array. The nickname field is nullable and seems to be a recent addition with the new friend nicknames feature.

I assume that this data is not used to populate the friends list and is only used to cache nicknames on the client.
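
However the client actually uses it, a relationship entry only carries IDs, so anything displayable has to be looked up in the users array. Here is a rough sketch of that join. The sample data is made up, and treating type 1 as "friend" is my assumption based on every entry in my list having that value.

```python
# Trimmed-down stand-ins for the READY fields shown above (made-up values).
ready = {
    "users": [
        {"id": "111", "username": "alice", "discriminator": "0001"},
        {"id": "222", "username": "bob", "discriminator": "0002"},
    ],
    "relationships": [
        {"id": "111", "user_id": "111", "type": 1, "nickname": None},
        {"id": "222", "user_id": "222", "type": 1, "nickname": "Bobby"},
    ],
}

# Index the users once so each relationship is a dictionary lookup.
users_by_id = {user["id"]: user for user in ready["users"]}


def display_name(relationship: dict) -> str:
    """Prefer the friend nickname, fall back to username#discriminator."""
    user = users_by_id[relationship["user_id"]]
    return relationship["nickname"] or f"{user['username']}#{user['discriminator']}"


# ASSUMPTION: type 1 marks a friend.
friends = [display_name(r) for r in ready["relationships"] if r["type"] == 1]
print(friends)  # ['alice#0001', 'Bobby']
```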

JSON

{
  "read_state": {
    "version": 320451,
    "partial": false,
    "entries": [
      {
        "mention_count": 0,
        "last_pin_timestamp": "1970-01-01T00:00:00+00:00",
        "last_message_id": "123456789101112131",
        "id": "1123456789101112131"
      }
    ]
  }
}

The read_state.entries array is one of the largest in this entire packet. The size of the array is equivalent to the sum of the number of channels in every single server you're in. Each entry carries mention, pin, and message information. In my case, I had some 877 channels.
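
The IDs in these entries are snowflakes, which per Discord's API documentation store a millisecond timestamp (relative to the start of 2015) in the bits above the lowest 22. That means a last_message_id can be turned into a time without any extra request. A small sketch, with helper names of my own choosing:

```python
from datetime import datetime, timezone

DISCORD_EPOCH_MS = 1420070400000  # 2015-01-01T00:00:00Z in Unix milliseconds


def snowflake_timestamp_ms(snowflake: str) -> int:
    """Return the Unix timestamp (ms) stored in the upper bits of a snowflake ID."""
    return (int(snowflake) >> 22) + DISCORD_EPOCH_MS


def snowflake_datetime(snowflake: str) -> datetime:
    """Same value as an aware UTC datetime."""
    return datetime.fromtimestamp(snowflake_timestamp_ms(snowflake) / 1000, tz=timezone.utc)


# An example snowflake, not one of the redacted IDs from this post.
print(snowflake_timestamp_ms("175928847299117063"))  # 1462015105796
```

The IDs are kept as strings in the JSON because they don't fit safely into a JavaScript number, hence the int() conversion here.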

JSON

{
  "private_channels": [
    {
      "type": 1,
      "recipient_ids": ["123456789101112131"],
      "last_message_id": "123456789101112131",
      "id": "123456789101112131",
      "flags": 0
    },
    {
      "type": 3,
      "recipient_ids": [
        "123456789101112131",
        "123456789101112131",
        "123456789101112131"
      ],
      "owner_id": "123456789101112131",
      "name": "Group Chat",
      "last_message_id": "123456789101112131",
      "id": "123456789101112131",
      "icon": null,
      "flags": 0
    }
  ]
}

This private_channels array is what actually populates the DM list on the left. I've included two different types of channels here: one is a regular DM channel (type 1), whose single recipient is whoever you're messaging, and the other is a group DM (type 3) with its recipients and owner listed. All channels on Discord are treated the same way, whether a channel is part of a server or not. To open a DM, you navigate to discord.com/channels/@me/[channel id], and the same is true for group DMs. The difference between DMs and a server is that the @me part changes to the guild id: discord.com/channels/[guild id]/[channel id].
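
Telling the two kinds of private channel apart comes down to the type field. Here is a sketch of how a DM list label could be derived from these objects. The sample values and the describe helper are made up for illustration, and the type numbers are the DM and group DM values from Discord's channel type table.

```python
CHANNEL_TYPES = {1: "DM", 3: "GROUP_DM"}

# Made-up stand-ins shaped like the private_channels entries above.
private_channels = [
    {"type": 1, "recipient_ids": ["111"], "id": "900"},
    {
        "type": 3,
        "recipient_ids": ["111", "222", "333"],
        "owner_id": "111",
        "name": "Group Chat",
        "id": "901",
    },
]


def describe(channel: dict) -> str:
    """Build a short label for one private channel."""
    kind = CHANNEL_TYPES.get(channel["type"], "UNKNOWN")
    if kind == "GROUP_DM":
        name = channel.get("name") or "Unnamed group"
        return f"{name} ({len(channel['recipient_ids'])} recipients)"
    if kind == "DM":
        return f"DM with user {channel['recipient_ids'][0]}"
    return f"Unknown channel type {channel['type']}"


labels = [describe(channel) for channel in private_channels]
print(labels)  # ['DM with user 111', 'Group Chat (3 recipients)']
```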

JSON

{
  "merged_members": [
    [
      {
        "user_id": "508420859476836364",
        "roles": [
          "123456789101112131",
          "123456789101112131",
          "123456789101112131"
          // ...
        ],
        "premium_since": null,
        "pending": false,
        "nick": "Ethan",
        "mute": false,
        "joined_at": "2021-06-19T20:22:37.518000+00:00",
        "flags": 0,
        "deaf": false,
        "communication_disabled_until": null,
        "avatar": null
      }
    ]
  ]
}

Now here's where it gets interesting. This was a new discovery for me: the merged_members array contains a unique member object for every single server you're in. It's your own member object in each server, and contains what you'd expect, like nickname, role, server boost and server avatar info.

The bulk of this entire packet is in the guilds array, which becomes obvious from how far the line numbers jump over it when all of the properties on the packet are collapsed.

These are the lazy guild objects, which contain enough information to populate the server list on the very left, plus additional things like member count, emojis, stickers, and full channel details. I can't show even one full object because it is pretty massive, but guilds are really well documented. Here's an overview of the fields though:

After the guilds field, there are a few more fields, mostly relating to experiments, regional data, and various tokens. That's about everything you need to know for this payload.

2. READY_SUPPLEMENTAL

This packet seems to provide additional information about the user's guilds as well as presences. Included in the payload is a merged_presences field, which contains two arrays: guilds and friends. An object in either of those arrays would look like this:

JSON

{
  "user_id": "123456789101112131",
  "status": "dnd",
  "client_status": {
    "desktop": "dnd"
  },
  "activities": [
    {
      "type": 4,
      "state": "This is my custom status",
      "name": "Custom Status",
      "id": "custom",
      "created_at": 1668361619691
    },
    {
      // Example of an Overwatch 2 activity presence
      "type": 0,
      "timestamps": {
        "start": 1668361616187
      },
      "name": "Overwatch 2",
      "id": "3ee3f6de7c0d087a",
      "created_at": 1668361619692,
      "application_id": "356875221078245376"
    }
  ]
}
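
The type numbers on the activities map onto the familiar status lines in the client. Here is a minimal sketch of rendering them. The prefix wording is my own, and the numeric values follow the activity type table in Discord's documentation (0 playing, 1 streaming, 2 listening, 3 watching, 4 custom, 5 competing).

```python
ACTIVITY_PREFIX = {
    0: "Playing",
    1: "Streaming",
    2: "Listening to",
    3: "Watching",
    5: "Competing in",
}


def describe_activity(activity: dict) -> str:
    """Turn one activity object into the kind of line shown under a username."""
    if activity["type"] == 4:
        # Custom status: the text lives in "state", the emoji is optional.
        emoji = activity.get("emoji", {}).get("name", "")
        return f"{emoji} {activity.get('state', '')}".strip()
    prefix = ACTIVITY_PREFIX.get(activity["type"], "Activity:")
    return f"{prefix} {activity['name']}"


presence = {
    "user_id": "123456789101112131",
    "status": "dnd",
    "activities": [
        {"type": 4, "state": "This is my custom status", "name": "Custom Status", "id": "custom"},
        {"type": 0, "name": "Overwatch 2", "id": "3ee3f6de7c0d087a"},
    ],
}

lines = [describe_activity(a) for a in presence["activities"]]
print(lines)  # ['This is my custom status', 'Playing Overwatch 2']
```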

Also, there's an additional merged_members array that has the same objects covered before, except that it's for members other than your own user. These contain more nickname and role information on a per-server basis.

Lastly, there's a guilds array specified in the payload, except this one's different from all the others we've seen. It's defined by three properties: id, voice_states, and embedded_activities. I believe this is the data used to show voice channel and activity information when hovering over server icons.

If a server has nobody in a public voice channel and there aren't any ongoing activities, then the voice_states and embedded_activities will simply be empty. Otherwise, a typical object would look something like this:

JSON

{
  "voice_states": [
    {
      "user_id": "123456789101112131",
      "suppress": false,
      "session_id": "0f5e23fee8b995ea8fa3513ffd5b047e",
      "self_video": false,
      "self_stream": true,
      "self_mute": false,
      "self_deaf": false,
      "request_to_speak_timestamp": null,
      "mute": false,
      "deaf": false,
      "channel_id": "123456789101112131"
    },
    // Additional members in the vc...
  ],
  "id": "123456789101112131",
  // Activity objects
  "embedded_activities": []
}

What is interesting here is that Discord makes the session id of other users public, meaning that it's not sensitive information after all. This information is used to distinguish between multiple clients for a user in a vc, as you can only be connected to a vc on one device at a time.
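
To get from a flat voice_states array to something like "who is in which voice channel", the entries just need to be grouped by channel_id. A small sketch with made-up IDs:

```python
from collections import defaultdict

# Made-up stand-in shaped like one entry of the guilds array above.
guild = {
    "id": "500",
    "voice_states": [
        {"user_id": "111", "channel_id": "900", "self_mute": False, "self_stream": True},
        {"user_id": "222", "channel_id": "900", "self_mute": True, "self_stream": False},
        {"user_id": "333", "channel_id": "901", "self_mute": False, "self_stream": False},
    ],
    "embedded_activities": [],
}

# Group the connected users by the voice channel they are in.
members_by_channel = defaultdict(list)
for state in guild["voice_states"]:
    members_by_channel[state["channel_id"]].append(state["user_id"])

streaming = [s["user_id"] for s in guild["voice_states"] if s["self_stream"]]

print(dict(members_by_channel))  # {'900': ['111', '222'], '901': ['333']}
print(streaming)                 # ['111']
```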

Additional Payloads

After READY and READY_SUPPLEMENTAL, Discord sends a SESSIONS_REPLACE packet. This just seems to contain additional information about all of the online clients for the user, their statuses, and activities. This is speculative, but I'm pretty sure that Discord combines activity information between multiple clients (it obviously supports multiple activities being shown for a single user), and that's why this session information is needed.

After that, Discord really only uses the gateway to emit some common updates. These are the following:

PRESENCE_UPDATE - Someone in your friend or relevant member list goes online/offline, changes status, rich presence, etc.
MESSAGE_CREATE - A member in a server sent a message. Contains guild, message and author information, but also raises an important question: How, when, and to whom does Discord propagate this event? While I was looking through the data, I saw a random message from a user I've never interacted with, in a channel I've never talked in, in a large server I'm not active in at all. There's no way that any message anyone sends in a large server is propagated to all online members in that server, that seems really expensive and impractical to me. I'm pretty confident that this behavior differs based on the size of the server, messages per day, and other factors. Something pretty interesting to think about.
MESSAGE_UPDATE - A member in a server edited their message. Contains the same information as in the MESSAGE_CREATE event, with a few more fields. I also assume that this event is only propagated to you if the message is recent enough or scrolled into view. It wouldn't make sense at all if editing an old message from years ago led to all online server members being notified.

There might be a few more packets that I'm missing, but those ones seem like the most important anyway. The client periodically sends a heartbeat packet based on the previously specified interval by the server, and that's about it.
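
Put together, the client side of this exchange is a small loop: remember the last sequence number ("s"), route dispatch events by their name ("t"), and echo the sequence number back in each heartbeat. Here is a rough sketch of that bookkeeping with no networking involved. The class and method names are my own, and the opcode numbers are the ones from Discord's gateway documentation.

```python
import json


class GatewaySession:
    """Tracks the last sequence number and routes dispatch events to handlers."""

    def __init__(self):
        self.last_sequence = None
        self.handlers = {}

    def on(self, event_name, handler):
        self.handlers[event_name] = handler

    def receive(self, raw: str) -> None:
        payload = json.loads(raw)
        if payload.get("s") is not None:
            self.last_sequence = payload["s"]
        # Only opcode 0 (dispatch) carries a named event in "t".
        if payload["op"] == 0 and payload["t"] in self.handlers:
            self.handlers[payload["t"]](payload["d"])

    def heartbeat(self) -> str:
        """Opcode 1, echoing the last sequence number received (None before any)."""
        return json.dumps({"op": 1, "d": self.last_sequence})


session = GatewaySession()
messages = []
session.on("MESSAGE_CREATE", lambda data: messages.append(data["content"]))

session.receive('{"t": "MESSAGE_CREATE", "s": 5, "op": 0, "d": {"content": "hello"}}')
session.receive('{"t": "PRESENCE_UPDATE", "s": 6, "op": 0, "d": {}}')  # no handler, ignored

print(messages)             # ['hello']
print(session.heartbeat())  # {"op": 1, "d": 6}
```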

Takeaways

This part is the most speculative by far, but seeing all this data actually gives me a pretty solid understanding of how Discord loads its home/friends/overview page. Everything is done on the basis of partial objects. A few bits of information are loaded in as a frame of reference, and then the more detailed objects come in.

With the READY packet, users and relationships serve as an outline of partial information, and from there the details are filled in from fields like merged_members in the following packet. Private channels and group DMs are also included. From there, the supplemental packet loads in presence and other finishing information, and with that, the information is able to persist on the client and be modified accordingly during events that mutate the data.

Of course, that's only the basis of the gateway covered, and Discord also has a REST API to interact with on top of that. Let's take a look at the requests we make from here.

REST Endpoints

https://discord.com/api/v9/users/@me/billing/payment-sources
https://discord.com/api/v9/users/@me/billing/country-code

Interestingly, these two are the first requests made to the HTTP API, and they just fetch billing info.

https://status.discord.com/api/v2/scheduled-maintenances/upcoming.json

I've seen Discord do scheduled maintenance before, but it really doesn't happen often now. status.discord.com also redirects to discordstatus.com, which uses Atlassian Statuspage and is kept separate from Discord.

https://discord.com/api/v9/applications/detectable

Returns a list of all well-known registered Game SDK integrations that work with Discord. I believe this is the data that's used when an activity has a recognized application id, and from there you can get icons and other asset information.

Spotify

If you have Spotify premium and have your account connected, Discord will make requests to the following endpoints:

https://api.spotify.com/v1/me/player - Spotify will return information about your current session(s) and what you're listening to
https://api.spotify.com/v1/me - Returns profile and account info
https://api.spotify.com/v1/me/player/devices - Devices, whether they're active or not, model information and even the volume percent within the app
https://api.spotify.com/v1/me/notifications/player?connection_id=[id] - finally Discord will attempt to establish a bridge and connect to Spotify's player in order to display rich presence information and more

Bad Domains

This is where content filtering comes into play. Discord, like any other internet platform, will have bad actors that attempt to phish and spread malware. One of the many technical safeguards in place is keeping track of bad domains:

We work to identify links that might mislead or redirect users such as by reviewing a regularly updated list of unsafe and phishing websites. Some of the things we block with this filter are sites that might try to steal personal information like passwords or credit card numbers, your Discord login information, or even financial information.

And so the client makes a request to https://cdn.discordapp.com/bad-domains/updated_hashes.json which contains the hashes for the constantly updating list of bad domains Discord maintains.
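
Shipping hashes rather than the domains themselves means the client can check a link locally without the list being readable. I haven't confirmed which hash function Discord uses or how domains are normalized first, so the sketch below assumes SHA-256 over the lowercased domain purely for illustration. The list is built locally from made-up domains instead of being downloaded.

```python
import hashlib


def domain_hash(domain: str) -> str:
    # ASSUMPTION: SHA-256 over the lowercased domain; the real scheme may differ.
    return hashlib.sha256(domain.lower().encode("utf-8")).hexdigest()


# Stand-in for the downloaded hash list, built from made-up domains.
bad_hashes = {domain_hash(d) for d in ["free-nitro.example", "steam-gifts.example"]}


def is_flagged(domain: str) -> bool:
    """Check a domain against the hash set without needing the plain-text list."""
    return domain_hash(domain) in bad_hashes


print(is_flagged("FREE-NITRO.example"))  # True
print(is_flagged("discord.com"))         # False
```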

Analytics

Discord uses the endpoint https://discord.com/api/v9/science for analytics. Every navigation, modal open, menu expand, and click is logged there with a simple POST request. For example, this is all of what's sent over when you click on someone's profile:

JSON

{
  "token": "[redacted]",
  "events": [
    {
      "type": "open_popout",
      "properties": {
        "client_track_timestamp": 1668734284516,
        "type": "Profile Popout",
        "guild_id": "123456789101112131",
        "channel_id": "123456789101112131",
        "other_user_id": "123456789101112131",
        "sku_id": null,
        "is_friend": true,
        "has_images": false,
        "party_platform": null,
        "game_platform": null,
        "profile_user_status": "dnd",
        "is_streaming": false,
        "has_custom_status": false,
        "profile_has_nitro_customization": false,
        "profile_has_theme_color_customized": false,
        "profile_has_theme_animation": false,
        "profile_has_theme_emoji": false,
        "has_nickname": false,
        "has_guild_member_avatar": false,
        "has_guild_member_banner": false,
        "has_guild_member_bio": false,
        "client_performance_memory": 0,
        "accessibility_features": 524416,
        "rendered_locale": "en-US",
        "accessibility_support_enabled": false,
        "client_uuid": "[redacted]",
        "client_send_timestamp": 1668734284530
      }
    }
  ]
}

I think it's safe to say that Discord certainly has a lot of insightful information as to how users are interacting with their app, and definitely no shortage of it.
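
The two timestamps and the events array suggest that events are stamped when they happen, queued, and then stamped again when the batch is sent. That is my guess at the flow, not something I've confirmed. A minimal sketch of building such a request body follows, with hypothetical function names and only a couple of the properties from above.

```python
import json
import time


def now_ms() -> int:
    return int(time.time() * 1000)


def track(event_type: str, properties: dict) -> dict:
    """Record an event at the moment it happens."""
    return {"type": event_type, "properties": {**properties, "client_track_timestamp": now_ms()}}


def build_batch(token: str, events: list) -> str:
    """Stamp every queued event with the send time and serialize the request body."""
    send_time = now_ms()
    stamped = [
        {**event, "properties": {**event["properties"], "client_send_timestamp": send_time}}
        for event in events
    ]
    return json.dumps({"token": token, "events": stamped})


queue = [track("open_popout", {"type": "Profile Popout", "is_friend": True})]
body = json.loads(build_batch("[redacted]", queue))
print(body["events"][0]["type"])  # open_popout
```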

Conclusion

At first glance, from a purely use-of-technology standpoint, Discord isn't doing anything all too special. What is interesting is how they manage to fragment large amounts of data and persist it on the client efficiently. Dealing with constant friend updates, presences, streams and more can be very taxing. Combine that with large-scale propagation of message events in massive servers, and you need very good infrastructure to handle it all.

Discord has a blog as well, and they'll occasionally post an engineering article or two, admittedly less now than they used to. I find all of it super interesting though, like when they explained how they optimized the member-list with Rust, or how they manage all of their voice servers. Even with Discord's technical blog, there's still a surprising amount of stuff to be learned just from tracing some packets and looking at a bit of back and forth data exchange. I hope you got something out of this, and if you did, be sure to also check out Hussein's Dev Tool them ALL! series I mentioned before.
