
Understanding XSS, CSRF, CORS, Same-Origin Policy, and Cookies

I finally got these topics straight in my head recently, so I'm writing them down as a memo. Mr. Tokumaru's book is a more accurate source than this article. There may be mistakes here, so please read with care.

  1. XSS
  2. Cookie behavior
  3. Same-origin policy
  4. CORS and preflight requests
  5. CSRF and cookie SameSite flags
  6. Bonus: cookie vs LocalStorage

Quick explanations

  • XSS (Cross-Site Scripting) and CSRF (Cross-Site Request Forgery) are two of the most common attacks against web applications.
  • Same-origin policy is the default browser behavior that prevents a page from reading the response to a request it sent to a different origin (for example, a GET made from JS). It does not stop the request itself from being sent, which matters for form POSTs. You need to open a hole using CORS to allow the read.
  • CORS (Cross-Origin Resource Sharing) is a permission mechanism in which the server tells the browser which other origins may access its resources. The up-front check only happens when a preflight request is sent, and whether one is sent depends on specific conditions. A request that skips the preflight still reaches the server and is processed, so CORS alone cannot prevent malicious requests.
  • CSRF works when a request that skips the preflight is sent from a different domain and the browser automatically attaches the victim's cookie. Browsers have recently moved toward not sending cookies automatically on such requests. That is controlled by the SameSite flag (formally, the SameSite cookie attribute).
  • Authentication using access tokens kept in LocalStorage is one way of managing sessions without cookies. Since LocalStorage can be accessed from JS, there is a risk that access tokens are stolen after a successful XSS attack.

Looking at it this way, CSRF has too many moving parts. That's why it's so confusing.

Cross Site Scripting (XSS)

XSS stands for Cross-Site Scripting. Of the topics in this article, it is the easiest to understand.

What to remember first

XSS is an attack that targets "sites that display strings received from users as-is on the screen."

Examples:

  • Chat services
  • Input confirmation screens
  • SNS

Countermeasures

Always HTML-escape data when outputting it to the screen. Some template engines and SPA libraries do this by default.

Check this IPA page for details:
https://www.ipa.go.jp/security/vuln/websecurity-HTML-1_5.html
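
As a rough illustration of the escaping countermeasure, here is a minimal sketch using Python's standard `html.escape`. The function name `render_comment` and the payload are made up for this example. In a real application you would normally rely on the template engine's auto-escaping instead of building HTML by hand.

```python
import html


def render_comment(user_input):
    """Wrap user-supplied text in a <p> tag after HTML-escaping it.

    render_comment is a hypothetical helper for this article.
    """
    # html.escape converts <, >, & and (with quote=True) quotes into entities,
    # so the browser shows the text instead of interpreting it as markup.
    return "<p>" + html.escape(user_input, quote=True) + "</p>"


payload = '<script>alert("xss")</script>'
print(render_comment(payload))
```

The script tag arrives in the browser as plain text, so it is displayed rather than executed.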

Specifically

There are surprisingly many sites that display strings received from users as-is on the screen.

Examples:

  • Chat services
  • Input confirmation screens
  • SNS

The easiest examples are chat services and SNS. If a string received from one user is output as-is without HTML escaping, that user can make arbitrary HTML tags render in other users' browsers. By embedding a script tag, the attacker can run JS there, which in practice means being able to do almost anything on that page, and it also opens the door to further attacks.

Conceptual diagram of XSS

A "curveball" example of XSS is when a string taken from the URL query is output as-is without HTML escaping. In this case the attacker only needs to get the victim to open a crafted link.

Conceptual diagram of XSS using queries
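
The query-string case can be sketched the same way. This is only an assumed example: the `q` parameter, the URL, and `render_search_page` are invented here, and the point is simply that a value read from the query needs the same escaping as any other user input.

```python
import html
from urllib.parse import parse_qs, urlsplit


def render_search_page(url):
    """Build a heading that echoes the q parameter of the URL (hypothetical page)."""
    query = parse_qs(urlsplit(url).query)
    # parse_qs percent-decodes the value, so the raw "<script>" is back at this point.
    keyword = query.get("q", [""])[0]
    # Escape right before output, exactly as with stored user input.
    return "<h1>Results for " + html.escape(keyword) + "</h1>"


crafted = "https://example.com/search?q=%3Cscript%3Ealert(1)%3C%2Fscript%3E"
print(render_search_page(crafted))
```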

Next, I'd like to move on to CSRF, another typical attack, but it requires more prerequisite knowledge than XSS, so I'll go through that first. The prerequisites are cookie behavior, the Same-origin policy, and CORS.

Cookie behavior

First, cookies. They are a standard browser feature and are mainly used for session management. When you want to maintain a session, the two common choices are cookies and tokens such as JWT.

What to remember first

  • The server sets a cookie in the browser through the Set-Cookie HTTP response header, and the browser sends the cookie back in subsequent requests.
  • Cookies are stored in the browser on a per-domain basis.
  • When an HTTP request is sent to that domain, the cookie is automatically attached, even if the request originates from a site on a different domain. There are exceptions (SameSite flag).
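
To make the round trip concrete, here is a small sketch with Python's standard `http.cookies` module. The cookie name `session_id` and both helper functions are assumptions for illustration. The first helper produces the header a server would send, and the second parses the `Cookie` header the browser sends back.

```python
from http.cookies import SimpleCookie


def build_set_cookie(name, value):
    """Return the Set-Cookie response header line for one cookie."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["path"] = "/"
    return cookie.output(header="Set-Cookie:")


def parse_cookie_header(header_value):
    """Parse the value of a request's Cookie header into a plain dict."""
    cookie = SimpleCookie()
    cookie.load(header_value)
    return {key: morsel.value for key, morsel in cookie.items()}


print(build_set_cookie("session_id", "abc123"))
print(parse_cookie_header("session_id=abc123; theme=dark"))
```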

Specifically

When building a web service, you may want to maintain a logged-in state. This logged-in state is often called a session. There are several ways to manage sessions, but for HTTP requests, request headers or cookies are often used. Please search for how to implement session management; there are many sites that explain it clearly.

Now, the problem is that there are two types of attacks that can be performed against sites that use cookies for session management.

  1. Attacks that illegally piggyback on someone else's session by putting a guessed or stolen session ID in a cookie (session hijacking attack).
  2. Attacks that exploit the fact that cookies are automatically attached to an HTTP request to the site, even when the request comes from a site on a different domain (CSRF attack).

The first one will be explained in this section. The second one will appear in the CSRF section, so I won't explain it here.

About session hijacking attacks.
This attack succeeds when session IDs are simple and predictable, or when communication is not encrypted and the session ID is easy to intercept. The following sequence diagram should make it easy to understand. Also, depending on the cookie settings, JS may be able to read cookies (this is the case when the HttpOnly attribute is not set). In that case, a successful XSS attack could steal the session ID.

Example of a session hijacking attack

Check this IPA page for details:
https://www.ipa.go.jp/security/vuln/websecurity-HTML-1_4.html
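
The three conditions above (predictable IDs, unencrypted transport, JS-readable cookies) each have a corresponding setting. The following is a sketch under my own assumptions, not a complete session implementation: `new_session_id` and `hardened_session_cookie` are hypothetical names, and server-side storage of the session is left out.

```python
import secrets
from http.cookies import SimpleCookie


def new_session_id():
    """Generate an unpredictable session ID from a cryptographic RNG."""
    return secrets.token_urlsafe(32)


def hardened_session_cookie(session_id):
    """Return a Set-Cookie line with HttpOnly and Secure enabled."""
    cookie = SimpleCookie()
    cookie["session_id"] = session_id
    cookie["session_id"]["path"] = "/"
    # HttpOnly: JS (document.cookie) cannot read the cookie, which limits XSS theft.
    cookie["session_id"]["httponly"] = True
    # Secure: the browser only sends the cookie over HTTPS.
    cookie["session_id"]["secure"] = True
    return cookie.output(header="Set-Cookie:")


print(hardened_session_cookie(new_session_id()))
```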

Same-origin policy

What to remember first

  • Browser behavior that prohibits a page from reading the response to a request sent to a different origin (for example, a GET made from JS).
  • This behavior can be relaxed by server-side CORS settings.
  • It does not stop a form POST from being sent.
    • Reading the response fails, but the request itself reaches the server and is processed.
    • If a preflight is sent, the request can be stopped before it goes out, because the CORS check happens first.
      • Whether a preflight is sent depends on the request conditions.

Specifically

To be more precise, the policy applies when any of the scheme (http or https), host (example.com), or port (the 3000 in example.com:3000) differs, not just when the "domain" differs. Nowadays, frontend and backend are often separated and have different URLs. For example, the frontend might be https://example.com and the backend https://api.example.com. In this case the backend is on a subdomain, so the host is different. Therefore, even if everything worked while you were developing both on localhost, communication can fail due to the Same-origin policy once you deploy them to different domains. CORS settings are required on the backend side.
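
The comparison of scheme, host, and port can be written down as a small function. This is a simplified sketch of the idea rather than the browser's actual algorithm, and `origin` and `same_origin` are names I made up. A missing port is filled in with the scheme's default so that https://example.com and https://example.com:443 compare equal.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple that identifies an origin."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def same_origin(url_a, url_b):
    """True only when scheme, host and port all match."""
    return origin(url_a) == origin(url_b)


print(same_origin("https://example.com/a", "https://example.com:443/b"))
print(same_origin("https://example.com", "https://api.example.com"))
print(same_origin("http://localhost:3000", "http://localhost:8080"))
```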

In particular, please read this part of the documentation:

https://developer.mozilla.org/en-US/docs/Web/Security/Same-origin_policy#network_access_to_other_origins

  • Cross-origin writes are typically allowed. Examples are links, redirects, and form submissions. Some rarely used HTTP requests require a preflight.
  • Cross-origin embedding is typically allowed. Examples are listed below.
  • Cross-origin reads are typically disallowed, but read access is often leaked by embedding. For example, you can read the width and height of an embedded image, the actions of an embedded script, or the availability of an embedded resource.

Embedding, especially with iframe, is complicated, and I don't fully understand it yet (homework).
The most dangerous of the three is the first one.

Cross-origin writes are typically allowed. Examples are links, redirects, and form submissions. Some rarely used HTTP requests require a preflight.

In other words, a typical form submission is sent without a preflight request, so the Same-origin policy does not stop it.

Example of HTML on a malicious external domain making a POST request through the Same-origin policy loophole
<form action="https://example.com/send-money?to=hogehoge" method="POST">
  <input type="hidden" name="userId" value="あああ">
  <button type="submit">Click here to learn how to make big money</button>
</form>

It seems better to talk about CORS and preflight requests first. But why does the specification have such loopholes...
Also, the example above is actually a CSRF attack. To developers who say, "Our API can't be called without logging in!": if you use cookies for session management, the cookie specification causes trouble here... I'll leave that to the CSRF section.

CORS and preflight requests

What to remember first

  • CORS overrides the behavior of the Same-origin policy to allow access (whitelist format)
  • Configured on the server side
  • Specify permitted domains in the Access-Control-Allow-Origin response header
  • For cross-origin requests made from JS, the browser first sends a CORS preflight request using the HTTP method OPTIONS to check the CORS settings, and then sends the actual request only if it is permitted.
  • However, CORS preflight requests are not sent for simple requests, so those requests reach the server without that up-front check.
    • Simple requests generally do not occur in API requests exchanging application/json, but they do occur in requests using forms.
    • Simple requests remain for compatibility with the legacy browser form specification.

Specifically

CORS is a specification for allowing access over HTTP between different origins. By default, such access is restricted by the Same-origin policy. In other words, CORS is a whitelist format, not a blacklist format: nothing is allowed unless the server explicitly permits it. It is configured by specifying the permitted origin in the Access-Control-Allow-Origin response header.
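
The whitelist idea can be sketched as a function that decides which CORS headers to attach to a response. The allowed origin and the name `cors_response_headers` are assumptions for this example. Web frameworks usually provide middleware for this, so treat it as an illustration of the logic only.

```python
# Hypothetical whitelist: only the frontend origin may read responses.
ALLOWED_ORIGINS = {"https://example.com"}


def cors_response_headers(request_origin):
    """Return the CORS headers to add for a request with the given Origin header."""
    if request_origin in ALLOWED_ORIGINS:
        return {
            # Echo the permitted origin back so the browser allows the read.
            "Access-Control-Allow-Origin": request_origin,
            # Tell caches that the response varies with the Origin header.
            "Vary": "Origin",
        }
    # No header at all: the browser blocks the page from reading the response.
    return {}


print(cors_response_headers("https://example.com"))
print(cors_response_headers("https://evil.example"))
```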

How is it permitted? The browser uses a special HTTP request called a preflight request. The browser sends it automatically, ahead of the actual request, when a page's JS makes a request to an origin different from its own. The HTTP method used is OPTIONS. From the response to this preflight request, the browser determines whether the server allows requests from that origin. If it does, the browser then sends the actual request.
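
Here is a rough sketch of what the server side of a preflight could look like. The whitelists, the function name `handle_preflight`, and the choice of status codes are my assumptions. The header names themselves (Origin, Access-Control-Request-Method, Access-Control-Request-Headers and the Access-Control-Allow-* responses) are the standard CORS ones.

```python
ALLOWED_ORIGINS = {"https://example.com"}
ALLOWED_METHODS = {"GET", "POST", "PUT", "DELETE"}
ALLOWED_HEADERS = {"content-type", "authorization"}


def handle_preflight(request_headers):
    """Answer an OPTIONS preflight; returns (status_code, response_headers)."""
    origin = request_headers.get("Origin")
    method = request_headers.get("Access-Control-Request-Method", "")
    requested = request_headers.get("Access-Control-Request-Headers", "")
    wanted = {h.strip().lower() for h in requested.split(",") if h.strip()}

    permitted = (
        origin in ALLOWED_ORIGINS
        and method in ALLOWED_METHODS
        and wanted <= ALLOWED_HEADERS
    )
    if not permitted:
        # Without Access-Control-Allow-* headers the browser will not send
        # the actual request. The 403 status is an arbitrary choice here.
        return 403, {}
    return 204, {
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Methods": ", ".join(sorted(ALLOWED_METHODS)),
        "Access-Control-Allow-Headers": ", ".join(sorted(ALLOWED_HEADERS)),
        "Vary": "Origin",
    }


status, headers = handle_preflight({
    "Origin": "https://example.com",
    "Access-Control-Request-Method": "PUT",
    "Access-Control-Request-Headers": "Content-Type",
})
print(status, headers)
```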

However, there is a loophole. CORS preflight requests are not sent for "simple requests."

The conditions for a request to count as a simple request are quite complex. For details, please see the link below. The key point to remember here is that requests made with forms easily qualify as simple requests and therefore skip the preflight check.

https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS#simple_requests
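
To get a feel for the conditions, here is a deliberately simplified classifier. It only looks at the method, the header names, and the Content-Type, and it ignores the finer rules on the MDN page (for example, restrictions on header values), so please treat `needs_preflight` as an approximation I wrote for this article, not as the browser's algorithm.

```python
SIMPLE_METHODS = {"GET", "HEAD", "POST"}
SAFELISTED_HEADERS = {"accept", "accept-language", "content-language", "content-type"}
SIMPLE_CONTENT_TYPES = {
    "application/x-www-form-urlencoded",
    "multipart/form-data",
    "text/plain",
}


def needs_preflight(method, headers):
    """Approximate whether a cross-origin request would trigger a preflight."""
    if method.upper() not in SIMPLE_METHODS:
        return True
    for name, value in headers.items():
        lowered = name.lower()
        if lowered not in SAFELISTED_HEADERS:
            return True
        if lowered == "content-type":
            # Drop parameters such as "; charset=utf-8" before comparing.
            media_type = value.split(";", 1)[0].strip().lower()
            if media_type not in SIMPLE_CONTENT_TYPES:
                return True
    return False


# A plain HTML form POST: no preflight, so it reaches the server directly.
print(needs_preflight("POST", {"Content-Type": "application/x-www-form-urlencoded"}))
# A typical JSON API call: preflight is sent first.
print(needs_preflight("POST", {"Content-Type": "application/json"}))
```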

It's all starting to get a bit confusing, isn't it? (laughs).

CSRF

Now we have all the prerequisite knowledge to explain CSRF. There are so many moving parts... (orz).

What to remember first

  • CSRF is an attack that cleverly exploits security loopholes while using browser behavior.
  • It uses loopholes in the Same-origin policy and CORS.
  • In particular, it exploits the fact that cookies are automatically sent, even for requests from different domains.
    • You can control this behavior with the SameSite flag. Some recent browsers (Chrome, for example) treat cookies that do not specify it as SameSite=Lax by default.
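
Setting the SameSite flag explicitly, instead of relying on browser defaults, could look like the sketch below. It uses Python's standard `http.cookies` (the `samesite` key needs Python 3.8 or later). The cookie name and `session_cookie_value` are hypothetical, and `Lax` is just the value chosen for this example.

```python
from http.cookies import SimpleCookie


def session_cookie_value(session_id, samesite="Lax"):
    """Return the value part of a Set-Cookie header with SameSite set."""
    cookie = SimpleCookie()
    cookie["session_id"] = session_id
    # Lax: the cookie is withheld from cross-site POSTs such as a forged form.
    # Strict withholds it from all cross-site requests; None disables the protection.
    cookie["session_id"]["samesite"] = samesite
    cookie["session_id"]["secure"] = True
    return cookie.output(header="").strip()


print(session_cookie_value("abc123"))
```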

Countermeasures

  • If using cookie-based sessions, add a CSRF token to forms and ensure the cookie's SameSite flag is enabled.
  • Reject requests whose Content-Type your API does not use.
  • Require password entry before high-risk operations.

Check this IPA page for details:
https://www.ipa.go.jp/security/vuln/websecurity-HTML-1_6.html
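
The Content-Type countermeasure works because an HTML form can only send the "simple" content types, so an API that accepts nothing but application/json turns away form-based requests. A minimal sketch, assuming a JSON-only API (`ACCEPTED_MEDIA_TYPES` and `is_accepted_content_type` are names invented here), and meant to be combined with the other countermeasures rather than used alone:

```python
# Assumption for this example: the API only ever receives JSON bodies.
ACCEPTED_MEDIA_TYPES = {"application/json"}


def is_accepted_content_type(content_type_header):
    """Return True only for Content-Type values the API actually uses."""
    if not content_type_header:
        return False
    # Ignore parameters such as "; charset=utf-8".
    media_type = content_type_header.split(";", 1)[0].strip().lower()
    return media_type in ACCEPTED_MEDIA_TYPES


print(is_accepted_content_type("application/json; charset=utf-8"))
print(is_accepted_content_type("application/x-www-form-urlencoded"))
```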

Specifically

As a prerequisite, suppose Person A is logged into a fictional bank site called bank.com.

When Person A opens a malicious site, there is a link (actually a form submission button disguised as a link) that says "Click here to learn how to make big money," and Person A clicks it.

Since bank.com had no CSRF protection, the malicious request was executed, and money was sent from Person A to a malicious organization.

Example of HTML on a malicious external domain making a POST request through the Same-origin policy loophole
<form action="https://bank.com/send-money?to=hogehoge" method="POST">
  <input type="hidden" name="userId" value="あああ">
  <button type="submit">Click here to learn how to make big money</button>
</form>

CSRF sequence diagram

Why does this happen? There are two reasons:

  • The Same-origin policy and CORS have loopholes regarding forms.
  • Cookies are automatically sent even from different domains.

To prevent these causes, the following countermeasures are fundamentally effective:

  • When using cookies, protect with CSRF tokens.
  • Enable the SameSite flag so that cookies are not automatically sent from different domains.
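
The CSRF token countermeasure can be sketched as follows. The idea is that the server puts a random value into both the session and the legitimate form, and rejects any request whose submitted value does not match. A malicious site cannot read that value, so its forged form fails the check. The `session` dict and both function names are assumptions for illustration. Most web frameworks ship this feature, and using theirs is the safer choice.

```python
import hmac
import secrets


def issue_csrf_token(session):
    """Create a random token, remember it in the session, and return it.

    The returned value is what you would embed in a hidden form field.
    """
    token = secrets.token_urlsafe(32)
    session["csrf_token"] = token
    return token


def is_valid_csrf_token(session, submitted):
    """Compare the submitted token with the one stored in the session."""
    expected = session.get("csrf_token")
    if not expected or not submitted:
        return False
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected.encode(), submitted.encode())


session = {}
token = issue_csrf_token(session)
print(is_valid_csrf_token(session, token))
print(is_valid_csrf_token(session, "forged-value"))
```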

Bonus: cookie vs LocalStorage

Strengths of cookies

  • Can be set (with the HttpOnly attribute) so they cannot be accessed from JS, which prevents the session ID from being read out if an XSS attack occurs.
  • Expiration dates can be set without additional code.

Weaknesses of cookies

  • They are automatically sent on requests from different domains. However, browser vendors have recently been shifting toward restricting this by default when the SameSite flag is not set. The status differs between browsers, so it is best to check each one.

Strengths of LocalStorage

  • Good compatibility with JS.

Weaknesses of LocalStorage

  • Can be accessed from any JS running on the same origin. Access tokens could be stolen if an XSS attack occurs.
  • Expiration dates cannot be set. You need to keep the expiration date within the access token itself and check for expiry in your own code when making requests.
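
As an example of keeping the expiration date inside the token, here is a sketch that assumes the access token is a JWT with an `exp` claim (seconds since the epoch). That is my assumption, since the article does not fix a token format. The code only reads the claim and does not verify the signature, which remains the server's job. `make_demo_token` exists purely to build an unsigned sample token for the demo.

```python
import base64
import json
import time


def _b64url_decode(segment):
    """Decode a base64url segment, restoring the padding JWTs strip off."""
    padded = segment + "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(padded)


def token_expiry(token):
    """Read the exp claim from the payload (the middle part of the token)."""
    payload = json.loads(_b64url_decode(token.split(".")[1]))
    return payload["exp"]


def is_expired(token, now=None):
    """True when the token's exp time has been reached."""
    current = time.time() if now is None else now
    return token_expiry(token) <= current


def make_demo_token(exp):
    """Build an UNSIGNED sample token for this demo only."""
    def encode(obj):
        raw = json.dumps(obj).encode()
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()
    return ".".join([encode({"alg": "none"}), encode({"exp": exp}), ""])


demo = make_demo_token(exp=1000)
print(is_expired(demo, now=999))
print(is_expired(demo, now=1000))
```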

There also seems to be a method of managing access tokens in memory rather than in LocalStorage.
https://blog.flatt.tech/entry/auth0_access_token

Mr. Tokumaru's slides are very detailed on this topic.
https://www.slideshare.net/ockeghem/phpconf2021spasecurity

References

https://note.crohaco.net/2019/http-cors-preflight/
