
【ML】List of Pooling Types

Published on 2024/07/17

This article summarizes the main pooling operations used in machine learning.

1. Max Pooling

・Description:
Selects the maximum value within the pooling window.
・Feature:
Captures the most prominent features and is less sensitive to the exact location of features.
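
As a minimal illustration, here is a sketch using PyTorch's nn.MaxPool2d (the 2x2 input tensor is invented for the example):

```python
import torch
import torch.nn as nn

x = torch.tensor([[[[1., 2.],
                    [3., 4.]]]])        # shape (N=1, C=1, H=2, W=2)

pool = nn.MaxPool2d(kernel_size=2, stride=2)
print(pool(x))                          # tensor([[[[4.]]]]) -- max of the window
```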

2. Average (Mean) Pooling

・Description:
Takes the average of the values within the pooling window.
・Feature:
Provides a smoother representation, but can lose sharp features.
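
The same toy input with nn.AvgPool2d, as a sketch:

```python
import torch
import torch.nn as nn

x = torch.tensor([[[[1., 2.],
                    [3., 4.]]]])

pool = nn.AvgPool2d(kernel_size=2, stride=2)
print(pool(x))                          # tensor([[[[2.5000]]]]) -- mean of the window
```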

3. Global Pooling

・Description:
Applies pooling over the entire feature map, so the output is one value per channel.
・Feature:
Typically used at the end of the network to convert a feature map into a single vector, which can be fed into a fully connected layer.
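
A minimal sketch of global average pooling feeding a classification head (the shapes and the 10-class head are made up for illustration):

```python
import torch
import torch.nn as nn

x = torch.randn(1, 64, 7, 7)       # a feature map: 64 channels of 7x7
gap = nn.AdaptiveAvgPool2d(1)      # pool each channel down to a single value
v = gap(x).flatten(1)              # shape (1, 64): one value per channel
fc = nn.Linear(64, 10)             # e.g. a 10-class classification head
print(fc(v).shape)                 # torch.Size([1, 10])
```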

4. Min Pooling

・Description:
Selects the minimum value within the pooling window.
・Feature:
Not commonly used, but it can be beneficial in scenarios where low values carry the important information.
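
PyTorch has no built-in min pooling layer, but it can be expressed through max pooling; a sketch:

```python
import torch
import torch.nn.functional as F

x = torch.tensor([[[[1., 2.],
                    [3., 4.]]]])

# The min over a window equals the negated max of the negated input
out = -F.max_pool2d(-x, kernel_size=2)
print(out)                              # tensor([[[[1.]]]])
```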

5. L2-norm Pooling

・Description:
Computes the L2 norm of the values within the pooling window.
・L2-norm Formula
||X||_2 = \sqrt{\sum^n_{i=1} x^2_i}

where x_i are the elements of the vector X.

・Example
Consider a 2x2 pooling window applied to the following values.

[[1, 2]
 [3, 4]]

The L2-norm of this window is computed as:
\sqrt{1^2 + 2^2 + 3^2 + 4^2} = \sqrt{1 + 4 + 9 + 16} = \sqrt{30} \simeq 5.48

・Feature:
Useful when the overall magnitude of the features matters, or when reduced sensitivity to outliers compared to max pooling is desired.
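
In PyTorch this corresponds to LP pooling with p = 2; a minimal check of the arithmetic above:

```python
import torch
import torch.nn as nn

x = torch.tensor([[[[1., 2.],
                    [3., 4.]]]])

pool = nn.LPPool2d(norm_type=2, kernel_size=2)   # (sum of x^2) ** (1/2)
print(pool(x))                                   # tensor([[[[5.4772]]]]) == sqrt(30)
```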

6. Mixed Pooling

・Description:
Combines max pooling and average pooling by taking a weighted average of both.

・Formula
MixedPool(X) = \alpha \cdot MaxPool(X) + (1 - \alpha) \cdot AvgPool(X)

where \alpha \in [0, 1] is a weighting coefficient, which can be fixed, sampled randomly, or learned.

・Feature:
Can potentially combine the strengths of both max pooling and average pooling.
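
There is no standard layer for this in PyTorch, so here is a minimal sketch combining the two built-in pooling functions (alpha = 0.5 is an arbitrary choice for the example):

```python
import torch
import torch.nn.functional as F

def mixed_pool2d(x, kernel_size=2, alpha=0.5):
    """Weighted blend of max and average pooling over the same window."""
    return (alpha * F.max_pool2d(x, kernel_size)
            + (1 - alpha) * F.avg_pool2d(x, kernel_size))

x = torch.tensor([[[[1., 2.],
                    [3., 4.]]]])
print(mixed_pool2d(x))              # 0.5 * 4.0 + 0.5 * 2.5 = tensor([[[[3.2500]]]])
```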

7. Fractional Max Pooling

・Description:
Performs pooling with non-integer (fractional) window sizes and strides, so the reduction ratio between input and output is not restricted to integers.
・Feature:
Offers finer control over the downsampling ratio than standard fixed-stride pooling.
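
PyTorch provides this directly as nn.FractionalMaxPool2d; a sketch with an arbitrary 0.7 reduction ratio:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 3, 32, 32)

# Shrink each spatial dimension to ~70% of its size -- a non-integer ratio
pool = nn.FractionalMaxPool2d(kernel_size=2, output_ratio=0.7)
print(pool(x).shape)                # torch.Size([1, 3, 22, 22])
```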

8. Adaptive Pooling

・Description:
Adjusts the pooling operation to output a fixed size, regardless of the input size.
・Feature:
Useful when the network needs to handle variable input sizes, often used in fully convolutional networks.
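
A sketch showing that nn.AdaptiveAvgPool2d yields the same output shape for different input sizes (the sizes here are arbitrary):

```python
import torch
import torch.nn as nn

pool = nn.AdaptiveAvgPool2d((7, 7))     # always outputs a 7x7 map per channel
for size in [(32, 32), (60, 45)]:
    x = torch.randn(1, 3, *size)
    print(pool(x).shape)                # torch.Size([1, 3, 7, 7]) both times
```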

9. Stochastic Pooling

・Description:
Randomly selects one value from the pooling window, with probability proportional to each value's magnitude.

・Example
Consider a 2x2 pooling window applied to the following values.

[[1, 2]
 [3, 4]]

Then the selection probabilities of the values are \frac{1}{10}, \frac{2}{10}, \frac{3}{10}, \frac{4}{10}, where 10 is the sum of the values in the window.
For instance, the value 4 has a 40% chance of being selected, while the value 1 has a 10% chance.

・Feature:
Good: Adds a regularization effect, reducing overfitting by introducing randomness.
Bad: Compared to basic pooling, it is less stable (increased variance), more expensive to compute, and sensitive to hyperparameters (window size, stride).
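
PyTorch has no built-in stochastic pooling, so here is a minimal sketch (it assumes non-negative inputs, e.g. after a ReLU, and spatial sizes divisible by the window size):

```python
import torch
import torch.nn.functional as F

def stochastic_pool2d(x, k=2):
    """Sample one activation per k x k window, with probability
    proportional to its value (assumes non-negative inputs)."""
    n, c, h, w = x.shape
    # Flatten every k x k window into one row: shape (N*C*num_windows, k*k)
    cols = F.unfold(x.reshape(n * c, 1, h, w), kernel_size=k, stride=k)
    cols = cols.transpose(1, 2).reshape(-1, k * k)
    probs = cols / cols.sum(dim=1, keepdim=True)
    idx = torch.multinomial(probs, num_samples=1)   # draw one index per window
    return cols.gather(1, idx).reshape(n, c, h // k, w // k)

x = torch.tensor([[[[1., 2.],
                    [3., 4.]]]])
print(stochastic_pool2d(x))   # 4 with prob 0.4, 3 with 0.3, 2 with 0.2, 1 with 0.1
```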

10. Spatial Pyramid Pooling (SPP)

・Description:
Pools the feature map at multiple pyramid levels and concatenates the results into a single fixed-length vector.
See [1] for a more detailed explanation (it is incredibly easy to understand).

・Feature:
Allows the network to maintain spatial information at multiple scales, which is useful for handling variable input sizes without resizing.
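
A minimal SPP sketch built from adaptive pooling (the pyramid levels 1, 2, 4 and the input shapes are arbitrary choices for illustration):

```python
import torch
import torch.nn.functional as F

def spatial_pyramid_pool(x, levels=(1, 2, 4)):
    """Pool to several fixed grid sizes and concatenate into one vector.
    Output length = C * sum(l * l for l in levels), independent of H, W."""
    n = x.shape[0]
    feats = [F.adaptive_max_pool2d(x, l).reshape(n, -1) for l in levels]
    return torch.cat(feats, dim=1)

for size in [(32, 32), (48, 40)]:
    x = torch.randn(1, 256, *size)
    print(spatial_pyramid_pool(x).shape)   # torch.Size([1, 5376]) for both
```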

Reference

[1] 空間ピラミッドプーリング層 (SPP-net, Spatial Pyramid Pooling) とその応用例や発展型
