
Trying to write a Semgrep rule that detects DOM-XSS

takutoy

First, check whether the Registry already has a suitable rule
https://semgrep.dev/r?q=dom+xss&lang=JavaScript


Let's build a rule using this one as the base.

To do

  1. Write test code
  2. Switch to taint mode so the rule detects source-to-sink flows
  3. Add more sources and sinks to improve detection coverage

Modify the test to cover the case where window.location is the source

dom-based-xss2.js
const qs = window.location.search;
const hash = window.location.hash;

// ruleid:dom-based-xss
document.write(qs);
document.write(hash);

// ok:dom-based-xss
document.write("<OPTION value=2>English</OPTION>");

Run the rule against the original test code

$ semgrep --config dom-based-xss.yaml dom-based-xss.js                                              
Scanning 1 file.


Ran 1 rule on 1 file: 0 findings.
If Semgrep missed a finding, please send us feedback to let us know!
  $ semgrep shouldafound --help
$ semgrep --config dom-based-xss.yaml dom-based-xss.js
Scanning 1 file.

Findings:

  dom-based-xss.js 
     dom-based-xss
        Detected possible DOM-based XSS. This occurs because a portion of the URL is being used to
        construct an element added directly to the page. For example, a malicious actor could send
        someone a link like this:
        http://www.some.site/page.html?default=<script>alert(document.cookie)</script> which would
        add the script to the page. Consider allowlisting appropriate values or using an approach
        which does not involve the URL.

          2┆ document.write("<OPTION value=1>"+document.location.href.substring(document.location.href.indexOf("default=")+8)+"</OPTION>");


Ran 1 rule on 1 file: 1 finding.

Run it against the modified test code. Confirmed that nothing is detected.

$ semgrep --config dom-based-xss.yaml dom-based-xss2.js
Scanning 1 file.

Ran 1 rule on 1 file: 0 findings.
If Semgrep missed a finding, please send us feedback to let us know!
  $ semgrep shouldafound --help

Strip the metadata first, since it gets in the way when modifying the rule

rules:
- id: dom-based-xss
  message: dom-xss
  languages:
  - javascript
  - typescript
  severity: ERROR
  pattern-either:
  - pattern: document.write(<... document.location.$W ...>)
  - pattern: document.write(<... location.$W ...>)

Switch to taint mode and write the sink and the source

rules:
- id: dom-based-xss
  mode: taint
  message: dom-xss
  languages:
  - javascript
  - typescript
  severity: ERROR
  pattern-sources:
  - pattern: window.location
  pattern-sinks:
  - pattern-either:
    - pattern: document.write(...)

Run it. Detected!

$ semgrep --config dom-based-xss.yaml dom-based-xss2.js 
Scanning 1 file.

Findings:

  dom-based-xss2.js 
     dom-based-xss  
        dom-xss     

          5┆ document.write(qs);
          ⋮┆----------------------------------------
          6┆ document.write(hash);


Ran 1 rule on 1 file: 2 findings

Next I want to add more sources. What is likely to act as a DOM-XSS source?

dom-based-xss.yaml
rules:
- id: dom-based-xss
  mode: taint
  message: dom-xss
  languages:
  - javascript
  - typescript
  severity: ERROR
  pattern-sources:
  - pattern: window.location
  pattern-sinks:
  - pattern-either:
    - pattern: document.write(...)
    - pattern: document.writeln(...)
    - pattern: document.domain = $PAYLOAD
    - pattern: element.innerHTML = $PAYLOAD
    - pattern: element.outerHTML = $PAYLOAD

Test code

dom-based-xss.js
const qs = window.location.search;
const hash = window.location.hash

// ruleid:dom-based-xss
document.write(qs);
document.write(hash);

// ok:dom-based-xss
document.write("<OPTION value=2>English</OPTION>");

const element = document.createElement('p');

// ng
element.innerHTML = qs;

// ng
element.innerHTML = hash;

// ok
element.innerHTML = "test"

element.innerHTML = "test" was flagged as a false positive. Why?

$ semgrep --config dom-based-xss.yaml dom-based-xss.js
Scanning 1 file.

Findings:

  dom-based-xss.js
     dom-based-xss
        dom-xss

          5┆ document.write(qs);
          ⋮┆----------------------------------------
          6┆ document.write(hash);
          ⋮┆----------------------------------------
         15┆ element.innerHTML = qs;
          ⋮┆----------------------------------------
         18┆ element.innerHTML = hash;
          ⋮┆----------------------------------------
         21┆ element.innerHTML = "test"


Ran 1 rule on 1 file: 5 findings.

const qs = window.location.search;
const hash = window.location.hash

// ruleid:dom-based-xss
document.write(qs);
document.write(hash);

// ok:dom-based-xss
document.write("<OPTION value=2>English</OPTION>");

// ng
const element = document.createElement('p');

// ng
element.innerHTML = qs;

// ng
element.innerHTML = hash;

// ok
element.innerHTML = "test"

// ok
element.innerHTML = "test2"
$ semgrep --config dom-based-xss.yaml dom-based-xss.js 
Scanning 1 file.

Findings:

  dom-based-xss.js 
     dom-based-xss
        dom-xss

          5┆ document.write(qs);
          ⋮┆----------------------------------------
          6┆ document.write(hash);
          ⋮┆----------------------------------------
         15┆ element.innerHTML = qs;
          ⋮┆----------------------------------------
         18┆ element.innerHTML = hash;
          ⋮┆----------------------------------------
         21┆ element.innerHTML = "test"


Ran 1 rule on 1 file: 5 findings.

"test2" is not flagged. Is element.innerHTML perhaps still considered tainted as of line 21?


Now the innerHTML cases are no longer detected at all. Why?

$ semgrep --config dom-based-xss.yaml dom-based-xss.js 
Scanning 1 file.

Findings:

  dom-based-xss.js 
     dom-based-xss
        dom-xss

          5┆ document.write(qs);
          ⋮┆----------------------------------------
          6┆ document.write(hash);


Ran 1 rule on 1 file: 2 findings.

The sink patterns used the literal name element instead of a metavariable. Fixed.

rules:
- id: dom-based-xss
  mode: taint
  message: dom-xss
  languages:
  - javascript
  - typescript
  severity: ERROR
  pattern-sources:
  - pattern: window.location
  pattern-sinks:
  - pattern-either:
    - pattern: document.write(...)
    - pattern: document.writeln(...)
    - pattern: document.domain = $PAYLOAD
    - pattern: $ELEMENT.innerHTML = $PAYLOAD
    - pattern: $ELEMENT.outerHTML = $PAYLOAD
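As a rough analogy for why the literal pattern only matched code that happened to use a variable named element, the sketch below compares the two sink patterns using regular expressions. Semgrep actually matches on the syntax tree rather than on text, and `literalSink`, `metavarSink`, and `matchingLines` are hypothetical names made up for this illustration.

```javascript
// Rough analogy only: Semgrep matches syntax trees, not text.
// literalSink mimics the pattern `element.innerHTML = $PAYLOAD`,
// metavarSink mimics `$ELEMENT.innerHTML = $PAYLOAD`.
// All names here are hypothetical helpers for this sketch.
const literalSink = /^element\.innerHTML\s*=\s*.+$/;
const metavarSink = /^[A-Za-z_$][\w$]*\.innerHTML\s*=\s*.+$/;

const lines = [
  'element.innerHTML = qs',
  'e1.innerHTML = qs',
  'e2.innerHTML = hash',
];

// Return the lines that the given "pattern" would treat as a sink.
function matchingLines(regex) {
  return lines.filter((line) => regex.test(line));
}

console.log(matchingLines(literalSink));
console.log(matchingLines(metavarSink));
```

Only the line that uses the variable name element matches the literal form, while the metavariable form matches all three assignments.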

Good.

$ semgrep --config dom-based-xss.yaml dom-based-xss.js 
Scanning 1 file.

Findings:

  dom-based-xss.js 
     dom-based-xss
        dom-xss

          5┆ document.write(qs);
          ⋮┆----------------------------------------
          6┆ document.write(hash);
          ⋮┆----------------------------------------
         13┆ e1.innerHTML = qs;
          ⋮┆----------------------------------------
         17┆ e2.innerHTML = hash;


Ran 1 rule on 1 file: 4 findings.

Let's add a few jQuery sinks as well. For now, add() and html().

https://api.jquery.com/add/#add-html
https://api.jquery.com/html/#html-htmlString


Add test cases

// ok
$("div.test").html("test")

// ng
$("div.test").html(hash)

// ng
$("div.test").add(qs)

Add the sinks to the rule as well

rules:
- id: dom-based-xss
  mode: taint
  message: dom-xss
  languages:
  - javascript
  - typescript
  severity: ERROR
  pattern-sources:
  - pattern: window.location
  pattern-sinks:
  - pattern-either:
    - pattern: document.write(...)
    - pattern: document.writeln(...)
    - pattern: document.domain = $PAYLOAD
    - pattern: $ELEMENT.innerHTML = $PAYLOAD
    - pattern: $ELEMENT.outerHTML = $PAYLOAD
    - pattern: $JQ.add($PAYLOAD)
    - pattern: $JQ.html($PAYLOAD)

These are detected correctly

$ semgrep --config dom-based-xss.yaml dom-based-xss.js 
Scanning 1 file.

Findings:

  dom-based-xss.js 
     dom-based-xss
        dom-xss

          5┆ document.write(qs);
          ⋮┆----------------------------------------
          6┆ document.write(hash);
          ⋮┆----------------------------------------
         13┆ e1.innerHTML = qs;
          ⋮┆----------------------------------------
         17┆ e2.innerHTML = hash;
          ⋮┆----------------------------------------
         27┆ $("div.test").html(hash)
          ⋮┆----------------------------------------
         30┆ $("div.test").add(qs)


Ran 1 rule on 1 file: 6 findings.

Now I want to add more sources

rules:
- id: dom-based-xss
  mode: taint
  message: dom-xss
  languages:
  - javascript
  - typescript
  severity: ERROR
  pattern-sources:
  - pattern-either:
    - pattern: location
    - pattern: document.URL
    - pattern: window.name
    - pattern: document.referrer
    - pattern: document.documentURI
    - pattern: document.baseURI
    - pattern: document.cookie
  pattern-sinks:
  - pattern-either:
    - pattern: document.write(...)
    - pattern: document.writeln(...)
    - pattern: document.domain = $PAYLOAD
    - pattern: $ELEMENT.innerHTML = $PAYLOAD
    - pattern: $ELEMENT.outerHTML = $PAYLOAD
    - pattern: $JQ.add($PAYLOAD)
    - pattern: $JQ.html($PAYLOAD)
const qs = window.location.search;
const hash = window.location.hash;

// ruleid:dom-based-xss
document.write(qs);
document.write(hash);

// ok:dom-based-xss
document.write("<OPTION value=2>English</OPTION>");

// ng
const e1 = document.createElement('p');
e1.innerHTML = qs;

// ng
const e2 = document.createElement('p');
e2.innerHTML = hash;

// ok
const e3 = document.createElement('p');
e3.innerHTML = "test"

// ok
$("div.test").html("test")

// ng
$("div.test").html(hash)

// ng
$("div.test").add(qs)

// ng
const referer = document.referrer
$("div.test").add(referer.substring(1,2))

The newly added test case is detected, but the cases that worked before are no longer detected

$ semgrep --config dom-based-xss.yaml dom-based-xss.js 
Scanning 1 file.

Findings:

  dom-based-xss.js 
     dom-based-xss
        dom-xss

         34┆ $("div.test").add(referer.substring(1,2))


Ran 1 rule on 1 file: 1 finding.

It's because the pattern location does not match window.location. That makes sense.


It worked after fixing the rule

rules:
- id: dom-based-xss
  mode: taint
  message: dom-xss
  languages:
  - javascript
  - typescript
  severity: ERROR
  pattern-sources:
  - pattern-either:
    - pattern: location
    - pattern: window.location
    - pattern: document.location
    - pattern: document.URL
    - pattern: window.name
    - pattern: document.referrer
    - pattern: document.documentURI
    - pattern: document.baseURI
    - pattern: document.cookie
  pattern-sinks:
  - pattern-either:
    - pattern: document.write(...)
    - pattern: document.writeln(...)
    - pattern: document.domain = $PAYLOAD
    - pattern: $ELEMENT.innerHTML = $PAYLOAD
    - pattern: $ELEMENT.outerHTML = $PAYLOAD
    - pattern: $JQ.add($PAYLOAD)
    - pattern: $JQ.html($PAYLOAD)
$ semgrep --config dom-based-xss.yaml dom-based-xss.js 
Scanning 1 file.

Findings:

  dom-based-xss.js 
     dom-based-xss
        dom-xss

          5┆ document.write(qs);
          ⋮┆----------------------------------------
          6┆ document.write(hash);
          ⋮┆----------------------------------------
         13┆ e1.innerHTML = qs;
          ⋮┆----------------------------------------
         17┆ e2.innerHTML = hash;
          ⋮┆----------------------------------------
         27┆ $("div.test").html(hash)
          ⋮┆----------------------------------------
         30┆ $("div.test").add(qs)
          ⋮┆----------------------------------------
         34┆ $("div.test").add(referer.substring(1,2))


Ran 1 rule on 1 file: 7 findings.

Can it still detect the flow when the code looks like this?

const searchParams = new URLSearchParams(window.location.search)
const firstname = searchParams.get('firstname')
$("div.test").add(firstname)

const names = [searchParams.get('firstname'), searchParams.get('lastname')]
$("div.test").add(names.join(' '))
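To see why these flows are worth detecting, note that URLSearchParams.get() returns the percent-decoded value, so markup placed in the query string arrives at the sink intact. The following is a minimal sketch that runs in Node.js without a DOM; the query string is a made-up example of attacker-controlled input.

```javascript
// The query string below is a made-up example of attacker-controlled input.
const search = '?firstname=%3Cimg%20src%3Dx%20onerror%3Dalert(1)%3E&lastname=Doe';

// Same shape as the test cases: read parameters, then join them.
const searchParams = new URLSearchParams(search);
const firstname = searchParams.get('firstname');
const names = [searchParams.get('firstname'), searchParams.get('lastname')];
const joined = names.join(' ');

// Both values still contain the decoded <img ...> markup.
console.log(firstname);
console.log(joined);
```

Passing either value to a sink such as html() would hand the decoded markup to the page, which is why the taint has to survive get() and join().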

Found a pattern that is not detected

const arr = []
arr.push(searchParams.get('firstname'))
arr.push(searchParams.get('lastname'))
$("div.test2").add(arr.join(' '))

I want to be able to detect this pattern

const arr = []
arr.push(searchParams.get('firstname'))
arr.push(searchParams.get('lastname'))
$("div.test2").html(arr.join(' '))

What I ended up with

dom-based-xss.yaml
rules:
- id: dom-based-xss
  mode: taint
  message: dom-xss
  languages:
  - javascript
  - typescript
  severity: ERROR
  pattern-sources:
  - pattern-either:
    - pattern: location
    - pattern: window.location
    - pattern: document.location
    - pattern: document.URL
    - pattern: window.name
    - pattern: document.referrer
    - pattern: document.documentURI
    - pattern: document.baseURI
    - pattern: document.cookie
  pattern-propagators:
  - pattern: $S.push($E)
    from: $E
    to: $S
  pattern-sinks:
  - pattern-either:
    - pattern: document.write(...)
    - pattern: document.writeln(...)
    - pattern: document.domain = $PAYLOAD
    - pattern: $ELEMENT.innerHTML = $PAYLOAD
    - pattern: $ELEMENT.outerHTML = $PAYLOAD
    - pattern: $JQ.add($PAYLOAD)
    - pattern: $JQ.html($PAYLOAD)
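The role of the pattern-propagators entry can be sketched with a toy taint tracker. This is only a simplified model and not how Semgrep is implemented; the statement shapes and the `findTaintedSinks` helper are assumptions made up for this illustration. It shows that arr.push(x) changes arr as a side effect, so taint reaches arr only when a push rule like the one above is present.

```javascript
// Toy model of taint tracking, NOT Semgrep's actual engine.
// Statement shapes and helper names are hypothetical.
function findTaintedSinks(statements, { usePushPropagator }) {
  const tainted = new Set();
  const findings = [];
  for (const stmt of statements) {
    switch (stmt.kind) {
      case 'source': // value read from a source such as window.location
        tainted.add(stmt.target);
        break;
      case 'assign': // target takes over the taint state of `from`
        if (tainted.has(stmt.from)) tainted.add(stmt.target);
        else tainted.delete(stmt.target);
        break;
      case 'push': // receiver.push(arg): taint moves only via the propagator
        if (usePushPropagator && tainted.has(stmt.arg)) tainted.add(stmt.receiver);
        break;
      case 'sink': // report when a tainted value reaches the sink
        if (tainted.has(stmt.arg)) findings.push(stmt.label);
        break;
    }
  }
  return findings;
}

const program = [
  { kind: 'source', target: 'firstname' },               // derived from window.location.search
  { kind: 'assign', target: 'arr', from: 'emptyArray' }, // const arr = []
  { kind: 'push', receiver: 'arr', arg: 'firstname' },   // arr.push(firstname)
  { kind: 'assign', target: 'joined', from: 'arr' },     // arr.join(' ')
  { kind: 'sink', arg: 'joined', label: 'html(arr.join(" "))' },
];

console.log(findTaintedSinks(program, { usePushPropagator: false }));
console.log(findTaintedSinks(program, { usePushPropagator: true }));
```

Without the push rule the toy tracker reports nothing, and with it the html() call is reported, which mirrors the behavior seen before and after adding the propagator.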

Test code

dom-based-xss.js
const qs = window.location.search;
const hash = window.location.hash;

// ruleid:dom-based-xss
document.write(qs);
document.write(hash);

// ok:dom-based-xss
document.write("<OPTION value=2>English</OPTION>");

// ng
const e1 = document.createElement('p');
e1.innerHTML = qs;

// ng
const e2 = document.createElement('p');
e2.innerHTML = hash;

// ok
const e3 = document.createElement('p');
e3.innerHTML = "test"

// ok
$("div.test").html("test")

// ng
$("div.test").html(hash)

// ng
$("div.test").add(qs)

// ng
const referer = document.referrer
$("div.test").add(referer.substring(1,2))

// ng
const searchParams = new URLSearchParams(window.location.search)
const firstname = searchParams.get('firstname')
$("div.test").add(firstname)

// ng
const names = [searchParams.get('firstname'), searchParams.get('lastname')]
$("div.test").add(names.join(' '))

// ng
const arr = []
arr.push(searchParams.get('firstname'))
arr.push(searchParams.get('lastname'))
$("div.test2").html(arr.join(' '))

Execution result

$ semgrep --config dom-based-xss.yaml dom-based-xss.js 
Scanning 1 file.

Findings:

  dom-based-xss.js 
     dom-based-xss
        dom-xss

          5┆ document.write(qs);
          ⋮┆----------------------------------------
          6┆ document.write(hash);
          ⋮┆----------------------------------------
         13┆ e1.innerHTML = qs;
          ⋮┆----------------------------------------
         17┆ e2.innerHTML = hash;
          ⋮┆----------------------------------------
         27┆ $("div.test").html(hash)
          ⋮┆----------------------------------------
         30┆ $("div.test").add(qs)
          ⋮┆----------------------------------------
         34┆ $("div.test").add(referer.substring(1,2))
          ⋮┆----------------------------------------
         39┆ $("div.test").add(firstname)
          ⋮┆----------------------------------------
         43┆ $("div.test").add(names.join(' '))
          ⋮┆----------------------------------------
         49┆ $("div.test2").html(arr.join(' '))


Ran 1 rule on 1 file: 10 findings.

Apparently the rule cannot detect anything when the JavaScript is embedded in HTML

<html>
    <body>
        <script>
const qs = window.location.search;
const hash = window.location.hash;

// ruleid:dom-based-xss
document.write(qs);
document.write(hash);

// ok:dom-based-xss
document.write("<OPTION value=2>English</OPTION>");
        </script>
    </body>
</html>
$ semgrep --config dom-based-xss.yaml dom-based-xss.html
Nothing to scan.


Ran 1 rule on 0 files: 0 findings.
If Semgrep missed a finding, please send us feedback to let us know!
  $ semgrep shouldafound --help

Maybe an extract rule like this?

- id: extract-html-to-javascript
  mode: extract
  languages:
    - html
  pattern: <script ...>$...SCRIPT</script>
  extract: $...SCRIPT
  dest-language: javascript
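Conceptually, an extract rule like this hands the body of each script element to the JavaScript rules. A rough sketch of that idea follows; it uses a regular expression rather than a real HTML parser, and `extractInlineScripts` is a hypothetical helper, not part of Semgrep.

```javascript
// Conceptual sketch only: this hypothetical helper approximates the idea
// of extracting inline scripts with a simple regular expression.
function extractInlineScripts(html) {
  const scripts = [];
  const scriptTag = /<script\b[^>]*>([\s\S]*?)<\/script>/gi;
  for (const match of html.matchAll(scriptTag)) {
    // match[1] is the text between the opening and closing tags.
    scripts.push(match[1].trim());
  }
  return scripts;
}

const page = `<html>
    <body>
        <script>
const qs = window.location.search;
document.write(qs);
        </script>
    </body>
</html>`;

console.log(extractInlineScripts(page));
```

The extracted text is plain JavaScript, so a rule written for javascript can then be applied to it.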

No good.

$ semgrep --config dom-based-xss.yaml dom-based-xss.html
Scanning 1 file.


Ran 2 rules on 1 file: 0 findings.
If Semgrep missed a finding, please send us feedback to let us know!
  $ semgrep shouldafound --help

The pattern itself does not seem to be wrong

test.yaml
rules:
- id: extract-html-to-javascript
  languages:
    - html
  pattern: <script>$...SCRIPT</script>
  message: test
  severity: INFO
$ semgrep --config html.yaml dom-based-xss.html
Scanning 1 file.

Findings:

  dom-based-xss.html 
     extract-html-to-javascript
        test

          3┆ <script>
          4┆ const qs = window.location.search;
          5┆ const hash = window.location.hash;
          6┆
          7┆ // ruleid:dom-based-xss
          8┆ document.write(qs);
          9┆ document.write(hash);
         10┆
         11┆ // ok:dom-based-xss
         12┆ document.write("<OPTION value=2>English</OPTION>");
           [hid 1 additional lines, adjust with --max-lines-per-finding]


Ran 1 rule on 1 file: 1 finding.

Try a set of extract rules for another language, with the taint rule listed before the extract rules:

rules:
  - id: curl-eval
    severity: WARNING
    languages:
      - bash
    message: Evaluating data from a `curl` command is unsafe.
    mode: taint
    pattern-sources:
      - pattern: |
          $(curl ...)
      - pattern: |
          `curl ...`
    pattern-sinks:
      - pattern: eval ...
  - id: extract-docker-run-to-bash
    mode: extract
    languages:
      - dockerfile
    pattern: RUN $...CMD
    extract: $...CMD
    dest-language: bash
  - id: extract-python-os-system-to-bash
    mode: extract
    languages:
      - python
    pattern: os.system("$CMD")
    extract: $CMD
    dest-language: bash
$ semgrep --config _extract-test.yaml _extract-test.py 
Scanning 1 file.

Findings:

  _extract-test.py 
     curl-eval
        Evaluating data from a `curl` command is unsafe.

          3┆ if system('eval `curl -s "http://www.very-secure-website.net"`'):


Ran 3 rules on 1 file: 1 finding.

The same rules, with the extract rules listed first:

rules:
  - id: extract-docker-run-to-bash
    mode: extract
    languages:
      - dockerfile
    pattern: RUN $...CMD
    extract: $...CMD
    dest-language: bash
  - id: extract-python-os-system-to-bash
    mode: extract
    languages:
      - python
    pattern: os.system("$CMD")
    extract: $CMD
    dest-language: bash
  - id: curl-eval
    severity: WARNING
    languages:
      - bash
    message: Evaluating data from a `curl` command is unsafe.
    mode: taint
    pattern-sources:
      - pattern: |
          $(curl ...)
      - pattern: |
          `curl ...`
    pattern-sinks:
      - pattern: eval ...
$ semgrep --config _extract-test.yaml _extract-test.py 
Scanning 1 file.


Ran 3 rules on 1 file: 0 findings.
If Semgrep missed a finding, please send us feedback to let us know!
  $ semgrep shouldafound --help

It works!!

rules:
- id: dom-based-xss
  mode: taint
  message: dom-xss
  languages:
  - javascript
  - typescript
  severity: ERROR
  pattern-sources:
  - pattern-either:
    - pattern: location
    - pattern: window.location
    - pattern: document.location
    - pattern: document.URL
    - pattern: window.name
    - pattern: document.referrer
    - pattern: document.documentURI
    - pattern: document.baseURI
    - pattern: document.cookie
  pattern-propagators:
  - pattern: $S.push($E)
    from: $E
    to: $S
  pattern-sinks:
  - pattern-either:
    - pattern: document.write(...)
    - pattern: document.writeln(...)
    - pattern: document.domain = $PAYLOAD
    - pattern: $ELEMENT.innerHTML = $PAYLOAD
    - pattern: $ELEMENT.outerHTML = $PAYLOAD
    - pattern: $JQ.add($PAYLOAD)
    - pattern: $JQ.html($PAYLOAD)
- id: extract-html-to-javascript
  mode: extract
  languages:
    - html
  pattern: <script>$...SCRIPT</script>
  extract: $...SCRIPT
  dest-language: javascript
$ semgrep --config dom-based-xss.yaml dom-based-xss.html
Scanning 1 file.

Findings:

  dom-based-xss.html 
     dom-based-xss
        dom-xss

          8┆ document.write(qs);
          ⋮┆----------------------------------------
          9┆ document.write(hash);


Ran 2 rules on 1 file: 2 findings.

For real use, it seems best to keep the extract rule in a separate file and specify the order on the command line

$ semgrep --config dom-based-xss.yaml --config extract-html-to-javascript.yaml dom-based-xss.html
Scanning 1 file.

Findings:

  dom-based-xss.html 
     dom-based-xss
        dom-xss

          8┆ document.write(qs);
          ⋮┆----------------------------------------
          9┆ document.write(hash);


Ran 2 rules on 1 file: 2 findings.

Final result

dom-based-xss.yaml
rules:
- id: dom-based-xss
  mode: taint
  message: dom-xss
  languages:
  - javascript
  - typescript
  severity: ERROR
  pattern-sources:
  - pattern-either:
    - pattern: location
    - pattern: window.location
    - pattern: document.location
    - pattern: document.URL
    - pattern: window.name
    - pattern: document.referrer
    - pattern: document.documentURI
    - pattern: document.baseURI
    - pattern: document.cookie
  pattern-propagators:
  - pattern: $S.push($E)
    from: $E
    to: $S
  pattern-sinks:
  - pattern-either:
    - pattern: document.write(...)
    - pattern: document.writeln(...)
    - pattern: document.domain = $PAYLOAD
    - pattern: $ELEMENT.innerHTML = $PAYLOAD
    - pattern: $ELEMENT.outerHTML = $PAYLOAD
    - pattern: $JQ.add($PAYLOAD)
    - pattern: $JQ.html($PAYLOAD)
extract-html-to-javascript.yaml
rules:
- id: extract-html-to-javascript
  mode: extract
  languages:
    - html
  pattern: <script>$...SCRIPT</script>
  extract: $...SCRIPT
  dest-language: javascript

Test data

dom-based-xss.html
<html>
    <body>
        <script>
const qs = window.location.search;
const hash = window.location.hash;

// ruleid:dom-based-xss
document.write(qs);
document.write(hash);

// ok:dom-based-xss
document.write("<OPTION value=2>English</OPTION>");
        </script>
    </body>
</html>

Execution result

$ semgrep --config dom-based-xss.yaml --config extract-html-to-javascript.yaml dom-based-xss.html
Scanning 1 file.

Findings:

  dom-based-xss.html 
     dom-based-xss
        dom-xss

          8┆ document.write(qs);
          ⋮┆----------------------------------------
          9┆ document.write(hash);


Ran 2 rules on 1 file: 2 findings.
This scrap was closed on 2022/12/22