🐤

[Kaggle] Bird Competition Retrospective: Presentation Slides

Published on 2020/09/25

TensorFlow

Kaggle

Keras

idea

🐥🐣🐥 Bird Competition 🐥🐣🐥

🐧🐤🐦 Cornell Birdcall Identification 🐦🐤🐧

In this competition, you will identify a wide variety of bird vocalizations in soundscape recordings.

A competition about identifying birds from their calls

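Retrospective 🦜

Winning a medal this time came down to roughly 80% luck and 20% final submission selection.

My final Public Leaderboard rank was 595th.
I ran out of time feeling that, with some luck, I might still reach the medal zone (139th or better).

Private Scores around the bronze medal borderline were almost a pure game of luck 😇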

Result

Shook up 487 places to finish 108th 🥳

Selected Final Submissions

Public | Private

0.565 | 0.600 (Last Submission!)
0.560 | 0.592

I did not select my best Public Score submission

0.566 | 0.585

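Reasons

It had the same configuration as a high-scoring public kernel, so I suspected it was overfitting.
The score came from a half-hearted attempt, so I did not like it.

My last submission just barely made it into the bronze medal zone.

In the end I never beat the Public Score (0.566) from that half-hearted attempt, which had almost the same configuration as the high-scoring public kernel [Inference] Birdsong Baseline: ResNeSt50-fast (public score 0.568)...

When a high-scoring public kernel exists, many users build on it and submit similar scores, so a cluster forms right on the bronze medal borderline.

If that kernel is overfitting, those users all shake down together.
I suspect that was the case for most of the nearly 500 people who were ranked above me.

I once shook down from the public bronze medal zone to outside it myself, with a submission that was a lightly modified high-scoring public kernel 😢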

Environment

Google Colaboratory TPU & Kaggle Notebook

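I first tried various things on Colab, then used it together with the Kaggle environment in the final stage.
There are various restrictions and hassles, but the TPU environment is recommended for competitions with large data.
It is fast, free, and a very good way to get started quickly.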

108th Place Solution

The code is here:
[108th Place Solution] Birdcall Keras TPU

Single model
No external data

Model

ResNet50 + Attention Block

Data Augmentation

mixup
spec augmentation
pitch & tempo augmentation
melspectrogram cropping
Secondary labels
I used secondary labels only for samples with duration < 40 seconds.
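As an illustration of the first item in the list, here is a minimal mixup sketch in plain Python. It is not taken from the solution notebook: the function name `mixup`, the `alpha=0.4` default, and the nested-list spectrogram layout are all assumptions made for the example.

```python
import random

def mixup(spec_a, label_a, spec_b, label_b, alpha=0.4, rng=random):
    """Blend two spectrograms and their multi-hot labels with one shared weight.

    alpha=0.4 is an illustrative default, not a value from the original solution.
    """
    lam = rng.betavariate(alpha, alpha)  # mixing weight in [0, 1]
    mixed_spec = [
        [lam * a + (1.0 - lam) * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(spec_a, spec_b)
    ]
    mixed_label = [lam * a + (1.0 - lam) * b for a, b in zip(label_a, label_b)]
    return mixed_spec, mixed_label, lam

# Toy example: two 2x2 "spectrograms" belonging to two different classes.
rng = random.Random(0)
spec_a = [[1.0, 1.0], [1.0, 1.0]]
spec_b = [[0.0, 0.0], [0.0, 0.0]]
mixed_spec, mixed_label, lam = mixup(spec_a, [1.0, 0.0], spec_b, [0.0, 1.0], rng=rng)
```

Because the labels are blended with the same weight as the inputs, the mixed label of two one-hot targets still sums to 1.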

Balancing data

For classes with many samples, downsample to 80 samples per class (by removing long-duration samples).

For classes with few samples, upsample to 60 samples per class (by splitting long-duration samples).
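The balancing rule above could be sketched as follows for a single class. This is only one possible reading of it: the function name `balance_class`, the "split the longest segment in half" strategy, and the `min_piece` lower bound are assumptions, not details from the original notebook.

```python
def balance_class(durations, max_count=80, min_count=60, min_piece=5.0):
    """Return (sample_index, start_sec, end_sec) segments for one class.

    max_count/min_count follow the text; min_piece is a hypothetical floor
    that stops pieces from being split below a usable length.
    """
    segments = [(i, 0.0, d) for i, d in enumerate(durations)]
    if len(segments) > max_count:
        # Too many samples: keep the shortest ones, i.e. drop the longest.
        segments.sort(key=lambda s: s[2] - s[1])
        return segments[:max_count]
    while len(segments) < min_count:
        # Too few samples: split the longest remaining segment in half.
        longest = max(segments, key=lambda s: s[2] - s[1])
        i, start, end = longest
        if (end - start) / 2 < min_piece:
            break  # nothing left that is long enough to split
        segments.remove(longest)
        mid = (start + end) / 2
        segments += [(i, start, mid), (i, mid, end)]
    return segments

large_class = balance_class([float(d) for d in range(1, 101)])  # 100 samples
small_class = balance_class([60.0] * 10)                        # 10 samples
```

Classes that already have between 60 and 80 samples pass through unchanged.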

Inference

segmentwise_output

Predicting 5-second periods at 2.5-second intervals (each period overlaps its neighbor by half).
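A small sketch of how such half-overlapping windows might be enumerated. The helper `segment_windows` is hypothetical, and the handling of clips shorter than one period (a single window that would need padding) is an assumption.

```python
def segment_windows(clip_seconds, period=5.0, hop=2.5):
    """Start/end times in seconds of overlapping inference windows."""
    windows = []
    start = 0.0
    while start + period <= clip_seconds:
        windows.append((start, start + period))
        start += hop
    if not windows:
        # Clip shorter than one period: fall back to a single window,
        # which would have to be padded before inference (assumption).
        windows.append((0.0, period))
    return windows

windows_10s = segment_windows(10.0)
```

With a 10-second clip this gives windows starting at 0, 2.5 and 5 seconds, so every instant except the first and last 2.5 seconds is covered by two predictions.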

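Base Model

Arai-san's public kernel (Introduction to Sound Event Detection) contained a lot I did not know and looked worth studying, so I decided to start from it.

Unfortunately there are no PANNs pretrained weights for TensorFlow, so I changed the backbone to ResNet50 (I wanted to use TPU, so I was restricted to TensorFlow...).

I also tried ResNet101 and EfficientNetB0~B3, but ResNet50 worked best.

segmentwise_output was hard to handle.
Its probability values were too high overall to be usable for inference.

After switching to the configuration that the "frog" person recommended in the kernel's comment section, it started to output plausible values.

Score Comparison by Output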

output	threshold	Public Score	Private Score
segmentwise_output	0.9	0.565	0.600
norm_att * segmentwise_output	0.15	0.554	0.604
clipwise_output	0.5	0.557	0.597
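To make the three outputs in the table easier to compare, here is a rough single-class sketch of a PANNs-style attention head, which is my assumption about how these outputs relate: attention logits are softmax-normalized over time (`norm_att`), multiplied with the per-segment probabilities (`segmentwise_output`), and summed to get the clip-level score (`clipwise_output`). The names `att_logits`, `seg_probs`, and `detect` are hypothetical, and the numbers are toy values.

```python
import math

def softmax(values):
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def attention_outputs(att_logits, seg_probs):
    """att_logits, seg_probs: per-time-step values for ONE class."""
    norm_att = softmax(att_logits)                      # sums to 1 over time
    weighted = [a * p for a, p in zip(norm_att, seg_probs)]
    clipwise = sum(weighted)                            # attention-weighted mean
    return norm_att, weighted, clipwise

def detect(scores, threshold):
    """Indices of time steps whose score reaches the threshold."""
    return [t for t, s in enumerate(scores) if s >= threshold]

att_logits = [0.0, 0.0, 2.0, 0.0]
seg_probs = [0.1, 0.2, 0.95, 0.3]
norm_att, weighted, clipwise = attention_outputs(att_logits, seg_probs)

hits_segmentwise = detect(seg_probs, 0.9)   # threshold used for segmentwise_output
hits_weighted = detect(weighted, 0.15)      # threshold used for norm_att * segmentwise_output
clip_positive = clipwise >= 0.5             # threshold used for clipwise_output
```

Under this reading, the attention weights sum to 1 over time, so the weighted per-step values are much smaller than the raw segment probabilities. That may be why a threshold as low as 0.15 was workable for that output.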

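Problems I Tried to Address

Weak Label (depending on how a clip is cut, the target of the assigned label may not be present in it)
Imbalanced Data (number of samples per class, length of each sample)
Domain Shift (the training data differs in nature from the actual evaluation data)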

Weak Label

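I tried training on 30-second crops, but it did not work well.
Maybe image-oriented CNN models cannot handle inputs that are too wide?

I used secondary labels only for data of 30 seconds or less.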

Imbalanced Data

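For classes with many samples I removed long-duration data, and for classes with few samples I split long data, to try to reduce the imbalance.
I also wanted to use external data, but the TensorFlow function available in the TPU environment could only decode 16-bit WAV files.
(The external datasets that kind people had prepared were 32-bit WAV or mp3, which were a hassle to process, so I gave up; I did not have the energy to build one myself...)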
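One way to check up front which files a 16-bit-only decoder could handle is to inspect the sample width with Python's standard `wave` module. This is a sketch that was not part of the original workflow; `is_16bit_pcm` and `make_wav` are hypothetical helpers, and the 32000 Hz rate is an arbitrary value for the demo.

```python
import io
import wave

def is_16bit_pcm(wav_file):
    """True if the file parses as PCM WAV with 2 bytes per sample."""
    try:
        with wave.open(wav_file, "rb") as w:
            return w.getsampwidth() == 2
    except (wave.Error, EOFError):
        # The stdlib reader rejects non-WAV input (e.g. mp3) and
        # non-PCM encodings such as 32-bit float WAV.
        return False

def make_wav(sample_width):
    """Build a tiny in-memory mono WAV for the demo (10 silent frames)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(sample_width)
        w.setframerate(32000)
        w.writeframes(b"\x00" * sample_width * 10)
    buf.seek(0)
    return buf

ok_16bit = is_16bit_pcm(make_wav(2))
ok_32bit = is_16bit_pcm(make_wav(4))
ok_mp3_like = is_16bit_pcm(io.BytesIO(b"ID3" + b"\x00" * 64))
```

Files that fail this check would need to be converted to 16-bit PCM before being fed to such a decoder.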

Domain Shift

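The competition ended while I was still wondering why my Public Leaderboard score would not go up at all.
I added a full set of data augmentations more or less by feel, but I do not know how much they helped.
I also made a rough attempt at denoising, but it did not improve things, so I gave up quickly.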

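Summary

Validation was not trustworthy and only 2 submissions per day were allowed, so I could not compare scores much; apart from the luck and final submission selection mentioned above, I do not really know what worked 😅
The Public Score of my single model was slow to improve, so I had no time for ensembling either.

Still, I learned a lot about things like the Fourier transform and attention, and I ended up with a medal, so it was a very rewarding competition 💯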

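(Bonus) Google Colaboratory Tips for Kaggle Competitions

Preparing to use the kaggle API
Save the kaggle.json file, which contains the user credentials you can obtain on Kaggle, to Google Drive.

# Pin the version: with versions other than 1.5.6, downloads sometimes ignore the directory structure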
!mkdir -p ~/.kaggle
!cp '/content/drive/My Drive/kaggle/kaggle.json' ~/.kaggle/
!pip uninstall -y kaggle
!pip install --upgrade pip
!pip install kaggle==1.5.6
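The same credential setup can also be done from Python. The sketch below additionally restricts the file to owner read/write, since the kaggle CLI warns when kaggle.json is readable by other users. The function `install_kaggle_credentials` and the dummy credential contents are hypothetical, and the demo runs in a throwaway directory instead of the real home directory.

```python
import os
import shutil
import stat
import tempfile

def install_kaggle_credentials(src, home):
    """Copy kaggle.json into <home>/.kaggle and make it owner-only (0o600)."""
    target_dir = os.path.join(home, ".kaggle")
    os.makedirs(target_dir, exist_ok=True)
    target = os.path.join(target_dir, "kaggle.json")
    shutil.copyfile(src, target)
    os.chmod(target, stat.S_IRUSR | stat.S_IWUSR)
    return target

# Demo with temporary paths; on Colab, src would be the copy on Google Drive
# and home would be os.path.expanduser("~").
with tempfile.TemporaryDirectory() as tmp:
    drive_copy = os.path.join(tmp, "kaggle.json")
    with open(drive_copy, "w") as f:
        f.write('{"username": "someone", "key": "dummy"}')
    installed = install_kaggle_credentials(drive_copy, home=os.path.join(tmp, "home"))
    mode = stat.S_IMODE(os.stat(installed).st_mode)
```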

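Downloading competition data to Colab
(In the TPU environment, model training uses the data on GCP, so there is no need to download it to Colab, but having it on Colab makes it easier to inspect casually.)

# For a competition dataset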
!kaggle competitions download -c birdsong-recognition -p /content
!unzip /content/birdsong-recognition.zip -d /content/birdsong-recognition

# For a user-created dataset
!kaggle datasets download ttahara/birdsong-resampled-train-audio-00
!unzip -q '/content/birdsong-resampled-train-audio-00.zip' -d '/content/birdsong-resampled-train-audio-00'

The training data is read from GCP, so you need to specify its URL.
On a Kaggle notebook, the URLs can be obtained like this:

from kaggle_datasets import KaggleDatasets
GCS_PATH0 = KaggleDatasets().get_gcs_path('birdsong-resampled-train-audio-00')
GCS_PATH1 = KaggleDatasets().get_gcs_path('birdsong-resampled-train-audio-01')
GCS_PATH2 = KaggleDatasets().get_gcs_path('birdsong-resampled-train-audio-02')
GCS_PATH3 = KaggleDatasets().get_gcs_path('birdsong-resampled-train-audio-03')
GCS_PATH4 = KaggleDatasets().get_gcs_path('birdsong-resampled-train-audio-04')
print(GCS_PATH0)
print(GCS_PATH1)
print(GCS_PATH2)
print(GCS_PATH3)
print(GCS_PATH4)

On Colab, use those URLs (they change after about a week, which is a bit of a hassle).

GCS_PATH0 = 'gs://kds-b98d76bd5f7834b320a498ec776b03cc18c0f2bd6a2f62768bc537b4'
GCS_PATH1 = 'gs://kds-037d49f407c58e0de22d4d3e0f818339c72d58a70b8b0e18c28ff5c8'
GCS_PATH2 = 'gs://kds-fc6e708a8b93cf352fc5dbb6e1285d8d448c2630d0f196d1884fe32e'
GCS_PATH3 = 'gs://kds-3db6d48b13485226f22c8cfc1161476a502826ce71e2983f51925f1b'
GCS_PATH4 = 'gs://kds-7e5cb611f67dcef1dacbda3eaa568e89a375440bdbaabd0e1540dc0a'

Turn the model weights saved after training (model_XXXX.h5) into a dataset so they can be used on Kaggle.
Keep the dataset-metadata file (which holds the information needed to create a dataset) on Google Drive.

!mkdir /content/kaggle
!cp "/content/model.h5" '/content/kaggle/model_XXXX.h5'
# Also keep a copy on Google Drive
!cp '/content/kaggle/model_XXXX.h5' '/content/drive/My Drive/kaggle/birdsong-recognition/model_XXXX.h5'
!cp '/content/drive/My Drive/kaggle/birdsong-recognition/dataset-metadata-birdsong-recognition-weight.json' '/content/kaggle/dataset-metadata.json'
!kaggle datasets version -p /content/kaggle -m "Updated data XXXX"

# Use this instead when creating the dataset for the first time
# !kaggle datasets create -p /content/kaggle
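For reference, the dataset-metadata.json file that the kaggle CLI expects in the upload folder has a title, an id of the form username/slug, and a list of licenses. A minimal sketch for generating it is shown below. The helper `write_dataset_metadata` and every value passed to it are placeholders, not the author's actual metadata.

```python
import json
import os
import tempfile

def write_dataset_metadata(folder, username, slug, title, license_name="CC0-1.0"):
    """Write a minimal dataset-metadata.json into the upload folder."""
    metadata = {
        "title": title,
        "id": f"{username}/{slug}",
        "licenses": [{"name": license_name}],
    }
    os.makedirs(folder, exist_ok=True)
    path = os.path.join(folder, "dataset-metadata.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(metadata, f, indent=2)
    return path

# Demo in a temporary folder; on Colab the folder would be /content/kaggle.
with tempfile.TemporaryDirectory() as tmp:
    path = write_dataset_metadata(
        tmp, "your-username", "birdsong-recognition-weight", "birdsong recognition weight"
    )
    with open(path, encoding="utf-8") as f:
        loaded = json.load(f)
```

Generating the file this way would avoid copying it from Google Drive on every run, at the cost of keeping the id and title in the notebook.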