Translated by AI
WebSocket Authentication: A Survival Guide
Introduction
Are you using WebSockets?
WebSockets require several considerations that differ from HTTP communication, which web developers are likely most familiar with. Authentication is one of them.
For example, the browser's WebSocket API does not allow adding custom headers during the handshake[1]. Therefore, the standard method of setting a token in the request's Authorization header for server-side authentication cannot be used.
Even RFC 6455, which defines the WebSocket protocol, says little about authentication:
This protocol doesn't prescribe any particular way that servers can
authenticate clients during the WebSocket handshake. The WebSocket
server can use any client authentication mechanism available to a
generic HTTP server, such as cookies, HTTP authentication, or TLS
authentication.
So, what authentication methods are practical? Let's take an overview below.
Prerequisites
This article assumes a situation where[2][3]:
- The client connects to a WebSocket endpoint using the browser's WebSocket API.
- We want to implement some form of authentication upon connection, as letting anyone connect would be problematic.
Library and runtime support for authentication is covered in a later section.
Authentication Methods
1. "Ticket"-based Authentication
Below, we will primarily discuss "ticket"-based authentication, as introduced in WebSocket Security (devcenter.heroku.com).
This approach performs authentication as follows. It assumes the existence of an (HTTP) server capable of creating authentication "tickets"[4].
- To start a WebSocket connection, the client first connects to an HTTP server to obtain a "ticket."
- The HTTP server creates a ticket[5].
- The server stores this ticket (in a database or cache) and returns it to the client.
- The client connects to the WebSocket server and sends this "ticket" as part of the initial handshake.
- The WebSocket server validates this ticket.
This authentication method is easy to handle and covers common scenarios where the HTTP server and WebSocket server are separate.
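The ticket flow above can be sketched as a small store shared by the two servers. This is only an illustration under stated assumptions: the TicketStore class, its issue/redeem method names, and the 30-second lifetime are hypothetical, and an in-memory Map stands in for the database or cache mentioned above.

```javascript
const { randomBytes } = require("node:crypto");

// Hypothetical in-memory ticket store. A real deployment would more likely
// use a shared cache or database so that both servers can see the tickets.
class TicketStore {
  constructor({ ttlMs = 30_000, now = () => Date.now() } = {}) {
    this.ttlMs = ttlMs;
    this.now = now; // injectable clock, handy for tests
    this.tickets = new Map(); // ticket -> { userId, expiresAt }
  }

  // Called by the HTTP server after it has authenticated the user.
  issue(userId) {
    const ticket = randomBytes(32).toString("hex");
    this.tickets.set(ticket, { userId, expiresAt: this.now() + this.ttlMs });
    return ticket;
  }

  // Called by the WebSocket server. Returns the user ID, or null when the
  // ticket is unknown, expired, or already used.
  redeem(ticket) {
    const entry = this.tickets.get(ticket);
    if (!entry) return null;
    this.tickets.delete(ticket); // a ticket can be redeemed only once
    return entry.expiresAt > this.now() ? entry.userId : null;
  }
}
```

Deleting the ticket on the first redeem attempt makes it single-use, which limits the damage if the ticket leaks after the connection has been established.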
Putting it in Query Parameters
const token = "your_auth_token";
const uri = `wss://example.com/api/websocket?token=${token}`;
const webSocket = new WebSocket(uri);
In this method, the token is passed as a query parameter and verified on the server.
Since query parameters are part of the URL, the token may end up recorded in logs somewhere[6]. When adopting this method, take precautions such as giving the token an extremely short expiration time or making it usable only once.
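On the server, the token has to be read back out of the handshake request's URL. The following is a rough sketch, not a definitive implementation: extractToken and buildWebSocketUrl are hypothetical helper names, and the parameter name "token" simply mirrors the client sample above.

```javascript
// Extracts the "token" query parameter from the request target of the
// handshake (for example "/api/websocket?token=abc"). Returns null if absent.
function extractToken(requestTarget) {
  // The request target is relative, so a placeholder base is needed to parse it.
  const url = new URL(requestTarget, "http://placeholder.invalid");
  return url.searchParams.get("token");
}

// Builds the client-side URL with the token percent-encoded, so characters
// such as "&" or spaces in the token cannot break the query string.
function buildWebSocketUrl(endpoint, token) {
  const url = new URL(endpoint);
  url.searchParams.set("token", token);
  return url.toString();
}
```

With Node's built-in http server, the request target would typically be available as req.url inside the "upgrade" event handler.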
"First Message" Method
const token = "your_auth_token";
const uri = "wss://example.com/api/websocket";
const websocket = new WebSocket(uri);
websocket.onopen = () => websocket.send(token);
This approach is often referred to as the "First Message" method.
First, the WebSocket server establishes the connection without authentication. Then, after the connection is opened, the client sends the token as the very first message. If authentication fails, the server terminates the connection.
This is one of the more viable methods because it carries a lower risk of credential leakage compared to using query parameters and is easy to implement. However, since the WebSocket connection is established before authentication, you should consider whether this matches your application's requirements[7].
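A server-side sketch of this method might look like the following. It is written against no particular WebSocket library: createAuthGate, verifyToken, and close are hypothetical names supplied for this example, and the close codes 4001 and 4003 are arbitrary picks from the 4000-4999 range that RFC 6455 reserves for private use. The timeout addresses the concern that unauthenticated connections could otherwise stay open indefinitely.

```javascript
// Hypothetical helper: wraps one connection and treats its first message as
// the token. `verifyToken` and `close` are supplied by the caller.
function createAuthGate({ verifyToken, close, timeoutMs = 5000 }) {
  let user = null;
  let closed = false;

  const fail = (code, reason) => {
    closed = true;
    close(code, reason);
  };

  // Drop connections that never authenticate, so they cannot hold a slot forever.
  const timer = setTimeout(() => {
    if (!user && !closed) fail(4001, "authentication timeout");
  }, timeoutMs);

  return {
    // Returns { user, data } for messages from an authenticated client,
    // and null for the token message itself or for rejected input.
    handleMessage(data) {
      if (closed) return null;
      if (user) return { user, data };

      clearTimeout(timer);
      user = verifyToken(String(data));
      if (!user) fail(4003, "invalid token");
      return null;
    },
  };
}
```

The server would create one gate per connection, pass every incoming message through handleMessage, and process only the messages for which it returns a non-null result.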
Using the Sec-WebSocket-Protocol Header
const token = "your_auth_token";
const uri = `wss://example.com/api/websocket`;
// new WebSocket(uri, protocols)
const webSocket = new WebSocket(uri, ["your-protocol", token]);
This is a method that uses the Sec-WebSocket-Protocol header.
This header lets the WebSocket client and server agree on a subprotocol, that is, an application-level convention for how the two sides use the connection.
For example, by specifying Sec-WebSocket-Protocol: json[9], both parties can agree that data will be exchanged in JSON format.
This header is the only one that can be freely set via the WebSocket API during the WebSocket handshake.
Furthermore, besides choosing from the WebSocket Subprotocol Name Registry, you can also set any custom name that the client and server agree upon[10].
For these two reasons, there is room for a "hack" in which the token is placed in the Sec-WebSocket-Protocol header and verified on the server.
While it may seem like a crude approach, it is used in some major software, perhaps surprisingly[11].
For example, the Kubernetes API uses it for WebSocket authentication.
This header is also utilized in AWS AppSync.
Since this approach doesn't align with the original intent of the WebSocket protocol, there might be some hesitation to use it, but it could be viable in certain situations.
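On the server, the header value arrives as a comma-separated list, so the token has to be separated from the real subprotocol name. The sketch below shows one way to do that. parseProtocolHeader is a hypothetical helper, and it assumes that anything the server does not recognize as a subprotocol is the token.

```javascript
// Splits a Sec-WebSocket-Protocol request header such as
// "your-protocol, <token>" into the subprotocol to echo back and the token.
// `supported` lists the real subprotocol names this server understands.
function parseProtocolHeader(headerValue, supported) {
  const offered = (headerValue ?? "")
    .split(",")
    .map((item) => item.trim())
    .filter((item) => item.length > 0);

  // Echo back only a genuine subprotocol, never the token.
  const protocol = offered.find((item) => supported.includes(item)) ?? null;
  // Assumption of this sketch: the first unrecognized entry is the token.
  const token = offered.find((item) => !supported.includes(item)) ?? null;

  return { protocol, token };
}
```

Note that subprotocol values must be valid HTTP tokens, so a credential containing characters such as "=" or "/" would need to be re-encoded first, for example as base64url without padding.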
2. Other Authentication Methods
Cookies
This method involves sending a cookie containing authentication information during the WebSocket connection establishment and verifying it on the server.
While this method works, there are several points to consider when adopting it.
First, since WebSocket connections are not subject to CORS policies, a Cross-Site WebSocket Hijacking (CSWSH) attack is possible if no countermeasures are taken.
This attack works on the same principle as CSRF (Cross-Site Request Forgery). Because WebSockets lack CORS control, a site like evil.example.com can easily open a connection to a WebSocket endpoint on good.example.com and force the browser to send cookies.
Several countermeasures can be considered, the main ones being:
- Verifying the Origin header on the server side.
- Using a mechanism similar to CSRF tokens.
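The first countermeasure, checking the Origin header, could look roughly like this. It is a minimal sketch: ALLOWED_ORIGINS and isAllowedOrigin are hypothetical names, and rejecting requests that lack an Origin header is a design choice of this example rather than a requirement.

```javascript
// Hypothetical allow-list; replace with the origins that serve your front end.
const ALLOWED_ORIGINS = new Set(["https://good.example.com"]);

// Returns true only when the handshake's Origin header exactly matches an
// allowed origin. A missing header is rejected here.
function isAllowedOrigin(originHeader, allowed = ALLOWED_ORIGINS) {
  if (typeof originHeader !== "string") return false;
  // Exact match on purpose: prefix or substring checks would also accept
  // look-alike hosts such as "https://good.example.com.evil.example.com".
  return allowed.has(originHeader);
}
```

This check only defends against attacks mounted through a victim's browser. Non-browser clients can send any Origin value they like, so it complements authentication and does not replace it.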
Also, as a general point not specific to WebSockets, cookies are scoped to a domain: if the WebSocket server runs on a different domain from the one that set the cookie, the browser will not send it. WebSocket servers are often operated separately from HTTP servers, so this is worth keeping in mind.
Basic Authentication
This method embeds Basic authentication credentials in the URL, as in wss://username:password@example.com/api/websocket/.
Although Basic authentication is listed as an available method in the RFC, it is practically unusable because modern browsers are moving toward disabling the use of credentials in URLs like the one above[12].
TLS
The TLS authentication mentioned in RFC 6455 refers to what is known as "mutual TLS authentication."
For typical web services, requiring users to configure client certificates is usually not realistic.
Supplement: Libraries and Runtimes Supporting Authentication
Some libraries and runtimes support authentication via WebSockets.
Socket.IO
In Socket.IO, you can use authentication tokens with code like the following.
Client side
// plain object
const socket = io({
  auth: {
    token: "abc"
  }
});
// or with a function
const socket = io({
  auth: (cb) => {
    cb({
      token: "abc"
    });
  }
});
Server side
io.use((socket, next) => {
  const token = socket.handshake.auth.token;
  // ...
});
Internally, this appears to utilize the "First Message" method.
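As one way to fill in the server-side middleware, the sketch below rejects connections whose token does not verify. makeAuthMiddleware and verifyToken are hypothetical names introduced for this example. Passing an Error to next() and storing values on socket.data are standard Socket.IO v4 behavior.

```javascript
// Builds a Socket.IO middleware around a caller-supplied `verifyToken`
// function (hypothetical; it should return a user object or null).
function makeAuthMiddleware(verifyToken) {
  return (socket, next) => {
    const token = socket.handshake.auth?.token;
    const user = typeof token === "string" ? verifyToken(token) : null;
    if (!user) {
      // Passing an Error to next() refuses the connection.
      return next(new Error("unauthorized"));
    }
    socket.data.user = user; // make the user available to later handlers
    next();
  };
}

// Usage with a Socket.IO server instance `io`:
// io.use(makeAuthMiddleware(myVerifyToken));
```

When the middleware refuses a connection, the client is notified through its connect_error event.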
Deno
// WebSocket with headers
const wsWithHeaders = new WebSocket("ws://localhost:8080", {
  headers: {
    "Authorization": "Bearer foo",
  },
});
Deno's WebSocket API includes a proprietary extension that allows the use of headers.
Conclusion
If you are implementing it yourself, the "First Message" method is likely the one with the fewest headaches.
If a library is available, it is also a good idea to rely on it.
Wishing you all a wonderful WebSocket life!
Reference Links
Explains the "First Message" method in Japanese.
A Stack Overflow answer that summarizes authentication options well.
Nothing has changed in the 15(!) years since this question was opened.
This passage was quite memorable.
An article written by Armin Ronacher (creator of Flask) in 2012. It's a great piece for understanding the complexities and difficulties of WebSockets, and it remains relevant today.
Websockets make you sad. There, I said it. What started out as a really small simple thing ended up as an abomination of (what feels like) needles complexity.
WebSockets make you unhappy. There, I said it. What started as a truly small and simple thing ended up as an abomination with (what feels like) unnecessary complexity.[13]
- Discussions regarding custom headers can be found in the whatwg/websockets issue "Support for custom headers for handshake #16".
- The goal of this article is to list methods that "look like they could be used for authentication"; it does not dive into the specifics of authentication. While this content might be useful for non-browser connections, the underlying assumptions may differ.
- Per-message authentication during a connection and handling credential revocation are outside the scope of this article.
- This assumes the HTTP server has some form of authentication. At first glance, this might look like just moving the goalposts, but in practice, a WebSocket server is rarely operated alone. The typical pattern is to combine it with HTTP-based services, so this premise isn't that unusual.
- Tickets should contain the information required for tracking and management, such as user IDs.
- Many articles state that "query parameters are not encrypted and could be read by proxy servers," but when using HTTPS, everything except the IP address and hostname is encrypted. This means the URL path is also encrypted (Ref: Introduction to HTTPS). Another downside to query parameters is the risk of being recorded in browser history or bookmarks, or being seen over one's shoulder, but for a WebSocket endpoint, these may not be significant concerns. Of course, as noted in the body, precautions like using short-lived tokens are still necessary.
- For instance, an attack could involve consuming WebSocket connection slots by opening connections and never sending an authentication token. A possible countermeasure is setting a timeout for the "First Message" and closing the connection if it isn't received.
- The reason for including a protocol like your-protocol alongside the token in this sample lies in the Sec-WebSocket-Protocol spec. The server selects and responds with the first subprotocol it supports from the client's list. Since including the token in the response is undesirable, the server should respond with a non-token protocol when using this method.
- The documentation suggests using names that include a domain, such as "json.example.com," to prevent name collisions.
- This isn't meant to imply it's okay simply because major software uses it. Honestly, I'm not entirely sure to what degree this method is considered poor practice.
- It is also insecure without HTTPS. Though, in an age where even Hiroshi Abe's Homepage is HTTPS, this feels like common sense.
- Translation by the author.