[Quantization] INT4 vs FP4: which to choose?
INT4 vs FP4
Comparison
Feature | INT4 | FP4 |
---|---|---|
Distribution | Uniform | Non-Uniform |
Dynamic Range | Limited, linear range. | Wide (due to exponent). |
Calibration | Sensitive to outliers. | Robust to outliers. |
Dequant. Cost to FP16 | Multiply-add operation: FP16_value = (INT4_value - ZeroPoint) * ScaleFactor. | Bit-shift conversion (efficient in hardware). |
Power Consumption | Lower | Higher |
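
To make the "Distribution" and "Dequant." rows concrete, here is a minimal sketch that lists the values each format can represent. It assumes signed INT4 in [-8, 7] and FP4 in the E2M1 layout (1 sign bit, 2 exponent bits, 1 mantissa bit, exponent bias 1). The table does not say which FP4 variant is meant, so treat the layout and the function names as assumptions for illustration.

```python
# Sketch only: INT4 affine dequantization vs. decoding a 4-bit float.
# Assumptions (not from the original post): signed INT4 in [-8, 7],
# FP4 = E2M1 (1 sign, 2 exponent, 1 mantissa bit, exponent bias 1).

def dequant_int4(q: int, scale: float, zero_point: int = 0) -> float:
    """Affine dequantization from the table: (q - zero_point) * scale."""
    return (q - zero_point) * scale

def decode_fp4_e2m1(code: int) -> float:
    """Decode a 4-bit E2M1 code (0..15) into a Python float."""
    sign = -1.0 if (code >> 3) & 1 else 1.0
    exp = (code >> 1) & 0b11
    man = code & 1
    if exp == 0:
        # Subnormal: no implicit leading 1, so the magnitude is 0 or 0.5.
        return sign * man * 0.5
    # Normal: (1 + man/2) * 2^(exp - bias), with bias = 1.
    return sign * (1.0 + 0.5 * man) * 2.0 ** (exp - 1)

# With scale 1 and zero point 0, INT4 is an evenly spaced grid.
int4_grid = [dequant_int4(q, scale=1.0) for q in range(-8, 8)]

# FP4 codes 0 and 8 both decode to zero, so 15 distinct values remain.
fp4_grid = sorted({decode_fp4_e2m1(c) for c in range(16)})

print("INT4:", int4_grid)
print("FP4: ", fp4_grid)
```

The INT4 grid has a constant step, while the FP4 grid is dense near zero and sparse toward the extremes, which is where the wider relative dynamic range comes from. The decode above uses plain arithmetic for readability; in hardware the same mapping can be done by moving the exponent and mantissa bits into the wider format's fields.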
Which to choose?
INT4: when the hardware doesn't support FP4 well, or on low-power edge devices.
FP4: basically any model, as long as the hardware supports it well.
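
If you want to sanity-check the choice on your own weights, one rough approach is to quantize and dequantize them in both formats and compare the error. The sketch below is a toy version of that check, assuming symmetric per-tensor absmax scaling, INT4 clamped to [-7, 7], and FP4 in the E2M1 layout. The weights are made up and the helper names are hypothetical, so read it as an illustration of the outlier effect, not a benchmark.

```python
# Toy check: round-trip error of INT4 vs FP4 on weights with one outlier.
# Assumptions (not from the original post): symmetric per-tensor absmax
# scaling, signed INT4 clamped to [-7, 7], FP4 in the E2M1 layout.
# Real schemes usually use per-group scales, which changes the numbers.

FP4_E2M1_MAGNITUDES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
FP4_GRID = sorted({s * m for m in FP4_E2M1_MAGNITUDES for s in (-1.0, 1.0)})

def roundtrip_int4(weights):
    """Quantize to INT4 with an absmax scale, then dequantize."""
    scale = max(abs(w) for w in weights) / 7.0
    restored = []
    for w in weights:
        q = max(-7, min(7, round(w / scale)))
        restored.append(q * scale)
    return restored

def roundtrip_fp4(weights):
    """Snap each scaled weight to the nearest FP4 grid value, then rescale."""
    scale = max(abs(w) for w in weights) / 6.0
    return [min(FP4_GRID, key=lambda g: abs(g - w / scale)) * scale
            for w in weights]

def mean_abs_error(original, restored):
    return sum(abs(a - b) for a, b in zip(original, restored)) / len(original)

weights = [0.1, -0.2, 0.3, -0.4, 0.5, 8.0]  # made-up data; 8.0 is the outlier

int4_err = mean_abs_error(weights, roundtrip_int4(weights))
fp4_err = mean_abs_error(weights, roundtrip_fp4(weights))
print(f"INT4 mean abs error: {int4_err:.3f}")
print(f"FP4  mean abs error: {fp4_err:.3f}")
```

On this made-up input the outlier stretches the INT4 step so far that every small weight rounds to zero, while FP4 still resolves a couple of them. How large the gap is, and whether it holds at all, depends on the weight distribution and the scaling granularity.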
Discussion