In this thesis, we propose a generalized product quantization (PQ) algorithm for neural network compression. Compared with scalar quantization, PQ can reach far higher compression rates. However, PQ's block-size constraint makes it difficult to find an appropriate quantization configuration under a restricted storage budget. To overcome this limitation, we propose adaptive padding, an algorithm that enables PQ to be applied with arbitrary block sizes and makes the compression rate of a quantized model more flexible. Adaptive padding is orthogonal to previous PQ approaches, which focus on better optimization. Moreover, we employ a simple approach to determine a suitable block size for each layer. Experimental results demonstrate that our method generalizes PQ without additional accuracy loss and effectively improves performance when combined with existing PQ methods.
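To make the idea concrete, the sketch below illustrates PQ with padding on a single weight matrix: each row is padded so its length becomes divisible by the block size, the resulting sub-vectors are clustered with k-means into a shared codebook, and the padding is dropped after decoding. This is a minimal illustration, not the thesis's method: the zero-padding choice, the function name `pq_quantize_with_padding`, and the use of one codebook per layer are all assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans

def pq_quantize_with_padding(weights, block_size, n_centroids=256, seed=0):
    """Product-quantize a 2-D weight matrix, zero-padding each row so its
    length becomes divisible by the chosen block size.

    NOTE: zero padding is an illustrative choice; the adaptive padding
    proposed in the thesis may select pad values differently.
    """
    out_dim, in_dim = weights.shape
    pad = (-in_dim) % block_size                  # columns needed to round up
    padded = np.pad(weights, ((0, 0), (0, pad)))  # zero-pad each row
    # Reshape rows into contiguous sub-vectors (blocks) of length block_size.
    blocks = padded.reshape(-1, block_size)
    # Learn a shared codebook over all blocks with k-means.
    km = KMeans(n_clusters=n_centroids, n_init=4, random_state=seed).fit(blocks)
    codes = km.labels_                            # one codebook index per block
    # Decode: replace each block with its centroid, then drop the padding.
    recon = km.cluster_centers_[codes].reshape(out_dim, in_dim + pad)
    return codes.astype(np.uint8), km.cluster_centers_, recon[:, :in_dim]

# Example: a 512x300 layer with block size 8 (300 is not divisible by 8).
W = np.random.randn(512, 300).astype(np.float32)
codes, codebook, W_hat = pq_quantize_with_padding(W, block_size=8)
print(codes.shape, float(np.mean((W - W_hat) ** 2)))
```

Storing one 8-bit code per block plus the codebook is what yields the high compression rate; without padding, block size 8 simply could not be applied to this layer, which is the constraint the abstract refers to.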