👌

[Quantization] INT4 vs FP4: which to choose?


INT4 vs FP4

Comparison

| Feature | INT4 | FP4 |
|---|---|---|
| Distribution of levels | Uniform | Non-uniform |
| Dynamic range | Limited, linear range | Wider (thanks to the exponent bits) |
| Calibration | Sensitive to outliers | More robust to outliers |
| Dequantization cost (to FP16) | Multiply-add: FP16_value = (INT4_value - ZeroPoint) * ScaleFactor | Bit-shift conversion (efficient in hardware) |
| Power consumption | Lower | Higher |
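
To make the "uniform vs non-uniform" and dequantization rows concrete, here is a minimal Python sketch. It assumes FP4 means the E2M1 layout (1 sign bit, 2 exponent bits, 1 mantissa bit, exponent bias 1); the post does not name a specific FP4 format, so treat that layout and the helper names `dequant_int4` and `decode_fp4_e2m1` as illustrative assumptions, not as any library's API.

```python
def dequant_int4(q: int, zero_point: int, scale: float) -> float:
    """Affine INT4 dequantization: (q - zero_point) * scale."""
    if not 0 <= q <= 15:
        raise ValueError("q must be an unsigned 4-bit code (0..15)")
    return (q - zero_point) * scale


def decode_fp4_e2m1(code: int) -> float:
    """Decode a 4-bit code, assuming the E2M1 layout: s | ee | m, bias 1."""
    if not 0 <= code <= 15:
        raise ValueError("code must be a 4-bit value (0..15)")
    sign = -1.0 if code & 0b1000 else 1.0
    exp = (code >> 1) & 0b11  # 2 exponent bits
    man = code & 0b1          # 1 mantissa bit
    if exp == 0:
        # Subnormal: no implicit leading 1, so the magnitude is 0 or 0.5.
        magnitude = 0.5 * man
    else:
        # Normal: implicit leading 1, exponent re-biased by 1.
        magnitude = (1.0 + 0.5 * man) * 2.0 ** (exp - 1)
    return sign * magnitude


# All representable values of each format (INT4 with zero_point=8, scale=1).
INT4_LEVELS = [dequant_int4(q, zero_point=8, scale=1.0) for q in range(16)]
# +0 and -0 compare equal, so the set keeps 15 distinct values.
FP4_LEVELS = sorted({decode_fp4_e2m1(c) for c in range(16)})

print("INT4:", INT4_LEVELS)
print("FP4 :", FP4_LEVELS)
```

Under these assumptions, the INT4 levels are evenly spaced from -8 to 7, while the FP4 levels cluster near zero (steps of 0.5) and spread out toward ±6. Note that the FP4 decode only uses bit fields and a power of two, which is the "bit-shift" row of the table, whereas the INT4 path needs a subtraction and a multiplication per value.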

Which to choose?

INT4: when the hardware does not support FP4 well, or on low-power edge devices.
FP4: otherwise, for basically any model.
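
The preference for FP4 leans on the outlier behaviour noted in the comparison. The toy sketch below illustrates it: a handful of small weights plus one large outlier are scaled by their absolute maximum, snapped to the nearest level of each grid, and scaled back. The E2M1 level set, the symmetric -7..7 INT4 grid, the per-tensor absmax scaling, and the sample values are all assumptions chosen for illustration; real results depend on the weight distribution and on the scaling granularity.

```python
# Magnitudes of each 4-bit grid (signs are added below).
FP4_MAGNITUDES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]  # assumed E2M1
INT4_MAGNITUDES = [float(i) for i in range(8)]              # symmetric -7..7


def fake_quantize(values, magnitudes):
    """Absmax-scale values onto the grid, snap to the nearest level, scale back."""
    grid = sorted({s * m for m in magnitudes for s in (-1.0, 1.0)})
    # Map the largest |value| onto the largest level; fall back to 1.0 for all-zero input.
    scale = max(abs(v) for v in values) / max(magnitudes) or 1.0
    return [scale * min(grid, key=lambda g: abs(v / scale - g)) for v in values]


def mean_abs_error(values, magnitudes):
    """Mean absolute difference between the values and their quantized versions."""
    dequantized = fake_quantize(values, magnitudes)
    return sum(abs(a - b) for a, b in zip(values, dequantized)) / len(values)


# Hypothetical weights: mostly small values plus a single outlier (6.0).
weights = [0.1, -0.2, 0.3, -0.4, 0.5, 6.0]

int4_err = mean_abs_error(weights, INT4_MAGNITUDES)
fp4_err = mean_abs_error(weights, FP4_MAGNITUDES)
print(f"INT4 mean abs error: {int4_err:.3f}")
print(f"FP4  mean abs error: {fp4_err:.3f}")
```

In this example the outlier stretches the INT4 step to about 0.86, so most of the small weights collapse to zero, while the FP4 grid still has levels at 0.5 and 1.0 near zero and ends up with the smaller error. With weights spread evenly across the range and no outliers, the uniform INT4 grid can come out ahead, so this is a sanity check on the intuition rather than a benchmark.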
