🐰

【Kaggle】HMS 3rd Solution part1 ~Introduction~

Published on 2024/05/18

Machine Learning

Kaggle

tech

This is a summary of this great work (the HMS 3rd place solution):

Don't forget to upvote the author's post!

1. Competition Overview

The task is to classify the type of activity shown in a patient's brain waves. Experts' evaluations serve as the ground truth that predictions are scored against.

In this competition, both raw electroencephalogram (EEG) signals and spectrograms are provided. Each spectrogram covers 10 minutes and each EEG waveform covers 50 seconds. The middle 50 seconds of the spectrogram cover the same period as the EEG, so that period is presented in two ways.

eeg: Ordinary waveform (time-series) data, similar in form to a stock price chart.
spectrogram: In general, a spectrogram is a three-dimensional representation, usually drawn as an image, obtained by dividing the electrical signals from the electrodes into short time windows and applying the short-time Fourier transform (STFT). It holds time, frequency, and power spectral density. In other words, every spectrogram file in the dataset is EEG data after Fourier transformation. A total of four spectrograms (LL, LP, RP, RR) can be created using the data from the electrodes within the colored frames shown below.
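To make the STFT description concrete, here is a minimal NumPy sketch of how a waveform turns into a time-frequency power array. The 200 Hz sampling rate, the window and hop lengths, and the function name `stft_power` are my own assumptions for illustration. They are not taken from the competition files or from the 3rd place solution.

```python
import numpy as np

def stft_power(signal, fs=200, win_len=200, hop=100):
    """Return (freqs, power); power has shape (n_freqs, n_frames)."""
    window = np.hanning(win_len)
    n_frames = 1 + (len(signal) - win_len) // hop
    # Cut the signal into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop : i * hop + win_len] * window
        for i in range(n_frames)
    ])
    # Fourier transform of each frame, then squared magnitude as power.
    spectrum = np.fft.rfft(frames, axis=1)
    power = (np.abs(spectrum) ** 2).T
    freqs = np.fft.rfftfreq(win_len, d=1.0 / fs)
    return freqs, power

# Toy input: a 10 Hz sine lasting 50 s at the assumed 200 Hz rate.
t = np.arange(10_000) / 200
eeg_like = np.sin(2 * np.pi * 10 * t)
freqs, power = stft_power(eeg_like)
peak_hz = freqs[power.mean(axis=1).argmax()]
```

Each column of `power` is one short time window and each row is one frequency, which is the time, frequency, and power structure described above.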

overview:

2. Solution Overview

The solution is an ensemble that combines two models.

2D CNN with mel spectrograms
1D convolutions that encode the raw EEG data before modeling with Squeezeformer blocks

In this competition, selecting data was important. They used only samples with more than 9 votes (an indicator of data quality).
Additionally, they worked out the augmentation that the data creators had applied, reversed it, and recovered an original dataset of only 6350 rows (the other 100000 rows were discarded).
This greatly improved the correlation between LB and CV.
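The vote filter can be sketched as below. This covers only the "more than 9 votes" rule. The reverse transform of the data creators' augmentation is not shown because this article does not describe it in enough detail. The column names are what I assume the competition's train.csv uses, and the rows are made-up toy data.

```python
import numpy as np

# Assumed vote columns (one per class); the values below are toy data.
VOTE_COLS = ["seizure_vote", "lpd_vote", "gpd_vote",
             "lrda_vote", "grda_vote", "other_vote"]

votes = np.array([
    [3, 0, 0, 0, 0, 0],   # 3 votes in total
    [0, 5, 5, 0, 0, 2],   # 12 votes in total
    [1, 0, 0, 0, 0, 9],   # 10 votes in total
    [0, 0, 4, 5, 0, 0],   # 9 votes in total
])

total_votes = votes.sum(axis=1)
keep = total_votes > 9        # strictly more than 9 votes
filtered = votes[keep]
```

Note that a row with exactly 9 votes is dropped under this reading of "more than 9".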

3. Preprocessing

They didn't write each dataset to disk. Instead, they used torchaudio's GPU implementation to compute mel spectrograms on the fly, which sped up their experiments.

The double banana montage introduced by @cdeotte was used throughout their solution, but stacking the 16 signals next to one another had a limitation: each node would interact more strongly in the model with its neighbours.
To address this, they found a well-performing ordering that essentially places the left and right sides of the brain for each node together.

Fp1>F7 Fp2>F8 F7>T3 F8>T4 T3>T5 T4>T6 T5>O1 T6>O2 Fp1>F3 Fp2>F4 F3>C3 F4>C4 C3>P3 C4>P4 P3>O1 P4>O2
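Here is a small sketch of building the 16 signals in that order. I am assuming that "A>B" means the difference A minus B between two electrodes, and that the raw data comes as one array per electrode. The `electrodes` dict and the `build_montage` helper are hypothetical names for illustration, and the signals are random toy data.

```python
import numpy as np

# Same order as the list above: left-side pair, then its right-side mirror.
PAIRS = [
    ("Fp1", "F7"), ("Fp2", "F8"), ("F7", "T3"), ("F8", "T4"),
    ("T3", "T5"), ("T4", "T6"), ("T5", "O1"), ("T6", "O2"),
    ("Fp1", "F3"), ("Fp2", "F4"), ("F3", "C3"), ("F4", "C4"),
    ("C3", "P3"), ("C4", "P4"), ("P3", "O1"), ("P4", "O2"),
]

def build_montage(electrodes):
    """Stack the 16 differences into an array of shape (16, n_samples)."""
    return np.stack([electrodes[a] - electrodes[b] for a, b in PAIRS])

# Toy data: random signals for every electrode that appears in PAIRS.
rng = np.random.default_rng(0)
names = sorted({name for pair in PAIRS for name in pair})
electrodes = {name: rng.normal(size=10_000) for name in names}
x = build_montage(electrodes)
```

With this layout, rows 0, 2, 4, ... come from the left side (odd electrode numbers) and rows 1, 3, 5, ... are their right-side counterparts.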

Following @medali1992's ResNet 1D GRU implementation, they observed that it was important not to normalize the data batch-wise or sample-wise.
Instead, they used x = x.clip(-1024, 1024) / 32 after the bandpass filter.
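A quick look at what this fixed scaling does to a few values. The bandpass filter itself is left out here, and `scale_eeg` is just a name I made up for the one-liner.

```python
import numpy as np

def scale_eeg(x):
    """Clip to [-1024, 1024], then divide by 32, so outputs lie in [-32, 32]."""
    return x.clip(-1024, 1024) / 32

x = np.array([-5000.0, -64.0, 0.0, 16.0, 2048.0])
scaled = scale_eeg(x)
```

Because the constants are fixed, the same input value always maps to the same output, regardless of what else is in the batch or the sample. Extreme outliers are capped instead of stretching the scale.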

4. Augmentation

In both models, augmentation was highly effective.

The following augmentations were used by both models.
・Overwrite with zeros between 1 to 8 melspec nodes, from the total 16, in 50% of samples.
・Randomly choose a different narrower butter bandpass range in between 1 and 8 melspec nodes, from the total 16, in 20% of samples.
・In 50% of samples, randomly shift the 50s window around the center point by up to 20 seconds.
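The first and third augmentations above could look roughly like this in NumPy (the bandpass one is not covered). This is my own sketch, not the authors' code. It assumes a 200 Hz sampling rate and a recording longer than 50 seconds to shift within, and the function names are hypothetical.

```python
import numpy as np

FS = 200  # assumed sampling rate in Hz

def zero_random_channels(x, rng, p=0.5, max_channels=8):
    """With probability p, overwrite 1 to max_channels rows with zeros."""
    x = x.copy()
    if rng.random() < p:
        k = rng.integers(1, max_channels + 1)          # 1..8 inclusive
        idx = rng.choice(x.shape[0], size=k, replace=False)
        x[idx] = 0.0
    return x

def shifted_window(recording, rng, p=0.5, window_s=50, max_shift_s=20):
    """Take a 50 s window; with probability p, move its center by up to 20 s."""
    win = window_s * FS
    center = recording.shape[1] // 2
    if rng.random() < p:
        center += int(rng.integers(-max_shift_s * FS, max_shift_s * FS + 1))
    # Keep the window inside the recording.
    start = int(np.clip(center - win // 2, 0, recording.shape[1] - win))
    return recording[:, start:start + win]

rng = np.random.default_rng(0)
recording = rng.normal(size=(16, 90 * FS))   # toy 90 s recording, 16 channels
window = shifted_window(recording, rng)
masked = zero_random_channels(window, rng)
forced = zero_random_channels(window, rng, p=1.0)  # always masks, for checking
```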

In the 1D models only.
・In 50% of samples, left-right flip the signal time wise.
・In 50% of samples, switch the sides of the brain left-right.
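Both 1D-only augmentations are simple array operations if the channels are ordered as in the Preprocessing section (left, right, left, right, ...). That ordering is my assumption about how the swap would be done, and the function names are hypothetical.

```python
import numpy as np

def time_flip(x):
    """Reverse each channel along the time axis."""
    return x[:, ::-1]

def swap_sides(x):
    """Swap every left channel with its right neighbour (rows 0<->1, 2<->3, ...)."""
    pairs = x.reshape(x.shape[0] // 2, 2, x.shape[1])
    return pairs[:, ::-1].reshape(x.shape)

def augment_1d(x, rng, p=0.5):
    """Apply each flip independently with probability p."""
    if rng.random() < p:
        x = time_flip(x)
    if rng.random() < p:
        x = swap_sides(x)
    return np.ascontiguousarray(x)

# Toy input: 16 channels with 4 samples each.
x = np.arange(16 * 4, dtype=float).reshape(16, 4)
flipped = time_flip(x)
swapped = swap_sides(x)
augmented = augment_1d(x, np.random.default_rng(0))
```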

In the 2D models only.
・In 50% of samples for the zoomed center, using a static 50s window, randomly shift the 10 second center point within the window by up to 5 seconds.
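A rough sketch of the zoomed-center shift, assuming the crop is taken from the 50 s signal before the mel spectrogram is computed and that the sampling rate is 200 Hz. Both are my assumptions, and `zoomed_center` is a hypothetical name.

```python
import numpy as np

FS = 200  # assumed sampling rate in Hz

def zoomed_center(window, rng, p=0.5, zoom_s=10, max_shift_s=5):
    """Crop 10 s from a fixed 50 s window; with probability p, move the crop center by up to 5 s."""
    center = window.shape[1] // 2
    if rng.random() < p:
        center += int(rng.integers(-max_shift_s * FS, max_shift_s * FS + 1))
    half = zoom_s * FS // 2
    return window[:, center - half : center + half]

rng = np.random.default_rng(0)
window = rng.normal(size=(16, 50 * FS))   # toy 50 s window, 16 channels
zoom = zoomed_center(window, rng)
```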

Summary

That's all for today.
Even at this point, the solution has taught us a lot, for example:
・Using x = x.clip(-1024, 1024) / 32 as normalization instead of batch-wise or sample-wise normalization
・They reduced the data from over 100000 rows to 6350, yet model performance improved. This shows that clean data can matter more than a large volume of data (though of course a minimum amount of data is still needed).

It's a really interesting result.
Next time I will continue from here. Thank you for reading.

Reference

[1] DIETER, 3rd place solution, kaggle, 2024
[2] darraghdog, kaggle-hms-3rd-place-solution, github, 2024