【Kaggle】HMS 3rd Solution part2 ~MixNet~


This is a continuation of part 1.
In this article, I'll explain the models and the rest of this solution.

5 Models

The authors built two models:
・2DCNN model using an augmented melspectrogram
・1DCNN with squeezeformer

5.1 2DCNN

Architecture is here:

How to make the augmented melspectrogram

First, stack the EEG signals and extract the center 10 seconds, then apply FFT with various win_length and hop_length values to zoom in on the center. This improves model performance.
Next, apply FFT to the unclipped EEG and keep only both sides (the center information is not needed here).
Lastly, build the input image by sandwiching the center melspectrogram between the two side melspectrograms.
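
The three steps above can be sketched roughly as follows. This is only a dependency-free illustration, not the authors' code: it uses a plain magnitude STFT instead of a mel filterbank, a single channel, and a single window setting for the center. The names `stft_mag` and `sandwich` and all parameter values (`fs`, `n_fft`, window and hop sizes) are hypothetical.

```python
import cmath
import math


def stft_mag(signal, n_fft, win_length, hop_length):
    """Magnitude spectrogram: a list of frames, each with n_fft // 2 + 1 bins."""
    # Hann window; frames shorter than n_fft are implicitly zero-padded,
    # so every setting produces the same number of frequency bins.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / win_length)
              for n in range(win_length)]
    frames = []
    for start in range(0, len(signal) - win_length + 1, hop_length):
        chunk = [signal[start + n] * window[n] for n in range(win_length)]
        frame = []
        for k in range(n_fft // 2 + 1):
            acc = sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / n_fft)
                      for n in range(win_length))
            frame.append(abs(acc))
        frames.append(frame)
    return frames


def sandwich(eeg, fs, center_sec=10, n_fft=16):
    """Return [left side | zoomed center | right side] along the time axis."""
    mid = len(eeg) // 2
    half = center_sec * fs // 2
    lo, hi = mid - half, mid + half
    # Step 1: zoomed view of the center (short window, small hop).
    center = stft_mag(eeg[lo:hi], n_fft, win_length=8, hop_length=2)
    # Step 2: coarse view of the whole signal, dropping frames that touch the center.
    win, hop = 16, 8
    full = stft_mag(eeg, n_fft, win_length=win, hop_length=hop)
    left = [f for i, f in enumerate(full) if i * hop + win <= lo]
    right = [f for i, f in enumerate(full) if i * hop >= hi]
    # Step 3: the center is sandwiched between the two sides.
    return left + center + right


fs = 8  # toy sampling rate (hypothetical)
eeg = [math.sin(2 * math.pi * t / fs) for t in range(50 * fs)]  # 50 s, 1 Hz sine
image = sandwich(eeg, fs)
print(len(image), len(image[0]))  # time frames, frequency bins
```

Because the center uses a smaller hop, it takes up more columns per second than the sides, which is the "zoom" effect described above.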


After making the melspectrograms, they normalize the patchworked result in 2 ways.

  1. x.clip(-1024, 1024) / 32
  2. Subtract the 15 other channel nodes from each single node

Normalization 1 uses fixed constants, so it gives more general values than sample-wise or batch-wise normalization.
Normalization 2 allows the original waveform of each channel to stand out (under the assumption that each node is, of course, affected by the others).
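
A minimal sketch of the two normalizations is below. Normalization 1 follows the formula given above. For normalization 2, the exact operation is not spelled out here, so the sketch assumes one possible reading: subtract from each channel the mean of all the other channels. The function names `clip_scale` and `subtract_others` are hypothetical, and the data is laid out as a list of channels, each a list of values.

```python
def clip_scale(x, limit=1024.0, scale=32.0):
    """Normalization 1: clip to [-limit, limit], then divide by a fixed constant."""
    return [[max(-limit, min(limit, v)) / scale for v in row] for row in x]


def subtract_others(x):
    """Normalization 2 (assumed reading): channel minus the mean of the other channels."""
    n_ch = len(x)
    n_t = len(x[0])
    # Per-timestep sum over all channels, reused for every channel.
    totals = [sum(x[c][t] for c in range(n_ch)) for t in range(n_t)]
    return [
        [x[c][t] - (totals[t] - x[c][t]) / (n_ch - 1) for t in range(n_t)]
        for c in range(n_ch)
    ]


raw = [[2048.0, -64.0], [32.0, 0.0], [-4096.0, 64.0]]  # 3 toy channels, 2 samples
scaled = clip_scale(raw)
referenced = subtract_others([[1.0, 4.0], [2.0, 4.0], [3.0, 4.0]])
print(scaled)
print(referenced)
```

Since the constants in `clip_scale` do not depend on the sample or the batch, the same raw value always maps to the same normalized value.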


Above is how to make the input images. We then get the prediction result by feeding them to an EfficientNet with MixConv and KLLoss.
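
For reference, a KL-divergence loss can be sketched as below. This is a plain-Python illustration under the assumption that the targets are probability distributions over the classes and that the model outputs logits; the names `softmax` and `kl_loss` are hypothetical, and the authors' training code may differ in details such as reduction.

```python
import math


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_loss(target, logits, eps=1e-12):
    """KL(target || softmax(logits)); terms with target == 0 contribute nothing."""
    pred = softmax(logits)
    return sum(
        t * (math.log(t) - math.log(max(p, eps)))
        for t, p in zip(target, pred)
        if t > 0
    )


print(kl_loss([0.5, 0.5], [0.0, 0.0]))  # prediction matches the target
print(kl_loss([1.0, 0.0], [0.0, 0.0]))  # prediction is too uncertain
```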

MixConv: a depthwise convolution with mixed filter sizes. The input channels are divided into several groups, and a different filter size is applied to each group. (Here, the input is divided into 4 groups, which use 3x3, 5x5, 7x7, and 9x9 filters.)
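
The idea can be illustrated with a small pure-Python sketch. It assumes the channels are split into equal-sized groups and uses fixed averaging kernels as a stand-in for learned weights; `depthwise_conv2d` and `mixconv` are hypothetical names, not the API of any library.

```python
def depthwise_conv2d(channel, kernel):
    """Zero-padded 'same' 2D cross-correlation of one channel with one square kernel."""
    k = len(kernel)
    pad = k // 2
    h, w = len(channel), len(channel[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(k):
                for dj in range(k):
                    y, x = i + di - pad, j + dj - pad
                    if 0 <= y < h and 0 <= x < w:
                        acc += channel[y][x] * kernel[di][dj]
            out[i][j] = acc
    return out


def mixconv(x, kernel_sizes=(3, 5, 7, 9)):
    """Apply a different kernel size to each group of channels (depthwise)."""
    n_groups = len(kernel_sizes)
    group_size = len(x) // n_groups
    if group_size == 0:
        raise ValueError("need at least one channel per group")
    out = []
    for c, channel in enumerate(x):
        # Leftover channels (if any) fall into the last group.
        k = kernel_sizes[min(c // group_size, n_groups - 1)]
        # Averaging kernel: a placeholder for weights that would be learned.
        kernel = [[1.0 / (k * k)] * k for _ in range(k)]
        out.append(depthwise_conv2d(channel, kernel))
    return out


x = [[[1.0] * 9 for _ in range(9)] for _ in range(8)]  # 8 channels of 9x9 ones
y = mixconv(x)
print(len(y), len(y[0]), len(y[0][0]))  # channels, height, width
```

Each channel is filtered on its own (depthwise), so the only thing that changes between groups is the receptive field of the filter.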

Sorry, that's all for today.
I'll write the continuation next time.


[1] DIETER, 3rd place solution, kaggle, 2024
[2] darraghdog, kaggle-hms-3rd-place-solution, github, 2024