When Flat Minima Fail: Characterizing INT4 Quantization Collapse After FP32 Convergence
Summary
This paper reports that neural networks pass through three phases of quantization behavior during training: rapid learning, a meta-stable plateau, and explosive INT4 collapse. Critically, the collapse begins when FP32 perplexity stops improving, not when the learning rate decays, so continued training after convergence actively destroys quantization robustness while providing no FP32 benefit. INT8 quantization remains unaffected throughout, and weight distributions become more uniform (not more outlier-heavy) during collapse. The authors propose monitoring the derivative of validation perplexity, rather than the learning rate schedule, to predict quantization fragility, and they demonstrate that calibrated oscillatory schedules can partially mitigate the problem.
Key findings
- INT4 quantization collapse follows a three-phase structure with a meta-stable plateau lasting ~70,000 steps before explosive divergence
- Divergence begins precisely when FP32 perplexity converges, not when learning rate decays, providing an actionable early stopping signal
- The INT8 quantization gap stays below 1% while the INT4 gap reaches 517%, constraining the mechanism to the resolution of the 16-level INT4 grid
- Weight kurtosis decreases during the collapse phase, directly refuting outlier accumulation as the underlying mechanism
- Oscillatory schedules help only with calibrated amplitude; naive SGDR restarts uniformly worsen quantization robustness
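To make the 16-level constraint and the kurtosis finding concrete, here is a minimal sketch. It assumes symmetric per-tensor round-to-nearest quantization, which this summary does not confirm as the authors' scheme. The helpers `fake_quantize`, `rms_error`, and `excess_kurtosis` are hypothetical names, and the synthetic Gaussian and uniform samples only stand in for real weight tensors.

```python
import math
import random


def fake_quantize(weights, bits):
    """Quantize to a signed `bits`-wide integer grid, then dequantize.

    Assumption: symmetric per-tensor scaling with round-to-nearest.
    """
    qmax = 2 ** (bits - 1) - 1      # 7 for INT4, 127 for INT8
    qmin = -(2 ** (bits - 1))       # -8 for INT4, -128 for INT8
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return list(weights)
    scale = max_abs / qmax
    out = []
    for w in weights:
        q = max(qmin, min(qmax, round(w / scale)))
        out.append(q * scale)
    return out


def rms_error(weights, bits):
    """Root-mean-square error introduced by quantizing `weights`."""
    deq = fake_quantize(weights, bits)
    sq = sum((w - d) ** 2 for w, d in zip(weights, deq))
    return math.sqrt(sq / len(weights))


def excess_kurtosis(values):
    """Fourth standardized moment minus 3 (0 for a Gaussian)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 ** 2) - 3.0


# Synthetic stand-ins for weight tensors (illustrative only).
rng = random.Random(0)
gaussian = [rng.gauss(0.0, 1.0) for _ in range(10_000)]
uniform = [rng.uniform(-1.0, 1.0) for _ in range(10_000)]

int4_levels = len(set(fake_quantize(gaussian, 4)))
int4_err = rms_error(gaussian, 4)
int8_err = rms_error(gaussian, 8)
```

Excess kurtosis is 0 for a Gaussian and -1.2 for a uniform distribution, so a falling value indicates lighter tails rather than accumulating outliers. Tracking `excess_kurtosis` per layer alongside the INT4 and INT8 errors is one way to check whether a given run shows the same pattern.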
How to implement
- Implement perplexity derivative monitoring in training pipelines to trigger early stopping before INT4 compatibility degrades, preserving quantization robustness for mobile/edge deployment
- Modify existing LLM training loops to save checkpoints at the FP32 convergence point rather than at the end of training, ensuring better post-training quantization results
- Deploy calibrated oscillatory learning rate schedules in production training runs to maintain quantization compatibility while avoiding the performance degradation of naive warm restarts
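The perplexity-derivative monitoring described above could be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the class name `PerplexityPlateauMonitor` and the `window`, `rel_tol`, and `patience` defaults are hypothetical, since this summary gives no thresholds. The monitor fits a least-squares slope to recent validation perplexities and reports a plateau once the relative improvement per step stays below `rel_tol` for `patience` consecutive evaluations.

```python
import math
from collections import deque


class PerplexityPlateauMonitor:
    """Flags FP32 convergence when validation perplexity stops improving.

    Hypothetical helper: the default thresholds are illustrative,
    not values taken from the paper.
    """

    def __init__(self, window=5, rel_tol=1e-4, patience=3):
        self.history = deque(maxlen=window)  # (step, perplexity) pairs
        self.rel_tol = rel_tol
        self.patience = patience
        self.flat_evals = 0

    def slope(self):
        """Least-squares slope of perplexity versus step over the window."""
        n = len(self.history)
        mean_x = sum(s for s, _ in self.history) / n
        mean_y = sum(p for _, p in self.history) / n
        var_x = sum((s - mean_x) ** 2 for s, _ in self.history)
        if var_x == 0.0:
            return 0.0
        cov = sum((s - mean_x) * (p - mean_y) for s, p in self.history)
        return cov / var_x

    def update(self, step, perplexity):
        """Record one evaluation; return True once the plateau is confirmed."""
        self.history.append((step, perplexity))
        if len(self.history) < self.history.maxlen:
            return False
        # A negative slope means perplexity is still falling.
        rel_rate = -self.slope() / perplexity
        if rel_rate < self.rel_tol:
            self.flat_evals += 1
        else:
            self.flat_evals = 0
        return self.flat_evals >= self.patience


# Simulated validation curve (synthetic): exponential decay toward 20.
monitor = PerplexityPlateauMonitor()
trigger_step = None
for step in range(0, 20_001, 500):
    ppl = 20.0 + 80.0 * math.exp(-step / 2000.0)
    if monitor.update(step, ppl):
        trigger_step = step  # save the checkpoint intended for INT4 here
        break
```

In a real training loop, `update` would be called after each validation pass, and the first `True` would mark the checkpoint to keep for post-training INT4 quantization. Fitting a slope over a window rather than differencing two consecutive evaluations is a design choice to reduce sensitivity to evaluation noise.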