This repository has been archived by the owner on May 9, 2024. It is now read-only.

QuantizeDequantizeG's backward() function is not executed!!!! #3

Open
wuzhiyang2016 opened this issue Jul 17, 2020 · 7 comments

@wuzhiyang2016

Hello, when I train a quantized model with pytorch-jacinto-ai-devkit, there is a class called QuantizeDequantizeG. When doing loss.backward(), its backward() function is not called. The code is in xnn/layers/function.py.

Looking forward to your reply. Thanks!

@mathmanu
Collaborator

Sorry, I couldn't understand. Please post the exact error that you are getting and describe the situation in further detail.

@mathmanu
Collaborator

mathmanu commented Jul 18, 2020

Hi wuzhiyang2016, after reading your comment more carefully, I now understand it better.

What you are saying is that the backward method of QuantizeDequantizeG is not called during back-propagation (loss.backward()). This is a good question and it shows that you have tried to analyze and understand what is happening. Let me answer in detail:

This is the crux of Straight Through Estimation (STE): the backward pass does not involve any quantization at all. The gradient goes straight through, as though no quantization had happened in the forward pass.
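
To make the STE idea concrete, here is a minimal, self-contained sketch (not the devkit's code) of a toy rounding quantizer where the quantization step is kept out of the autograd graph, so no custom backward is invoked for it:

import torch

# Toy STE sketch: the forward value is the rounded (quantized) tensor, but the
# gradient path is the identity, so loss.backward() never sees the rounding.
x = torch.randn(4, requires_grad=True)
xq = torch.round(x)                # stand-in for quantization
y = x + (xq - x).detach()          # value equals xq; gradient flows through x
loss = y.sum()
loss.backward()
print(x.grad)                      # all ones - as if no quantization happened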

We support three kinds of quantization estimation methods, which you can see defined in quant_base_module.py:

class QuantEstimationType:
    QUANTIZED_THROUGH_ESTIMATION = 0
    STRAIGHT_THROUGH_ESTIMATION = 1
    ALPHA_BLENDING_ESTIMATION = 2

STRAIGHT_THROUGH_ESTIMATION is the default. You can see this being set in quant_train_module.py:
self.quantized_estimation_type = QuantEstimationType.STRAIGHT_THROUGH_ESTIMATION

If you change the above to QUANTIZED_THROUGH_ESTIMATION, you can see that the backward of QuantizeDequantizeG will be called.
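
For contrast, here is a hypothetical toy example (not the devkit's QuantizeDequantizeG) showing that when the quantizer is applied as a torch.autograd.Function, its backward does run during loss.backward():

import torch

class QuantizeDequantizeToy(torch.autograd.Function):
    # Toy quantizer, used only to demonstrate that a custom backward
    # participates in back-propagation when the Function is on the graph.
    @staticmethod
    def forward(ctx, x):
        return torch.round(x)

    @staticmethod
    def backward(ctx, grad_output):
        print('custom backward called')
        return grad_output

x = torch.randn(4, requires_grad=True)
QuantizeDequantizeToy.apply(x).sum().backward()   # prints 'custom backward called'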

I would like to highlight a couple of limitations:
(1) ONNX export may not work if you make the above change, due to what seems like a change in the handling of custom/symbolic functions in PyTorch. You can disable ONNX export if you try this.
(2) If you try ALPHA_BLENDING_ESTIMATION and hit an assertion error, a small fix is required: in the forward function of QuantTrainPAct2, change the relevant lines to:
elif (self.quantized_estimation_type == QuantEstimationType.ALPHA_BLENDING_ESTIMATION):
    if self.training:
        # TODO: vary the alpha blending factor over the epochs
        y = y * (1.0-self.alpha_blending_estimation_factor) + yq * self.alpha_blending_estimation_factor
    else:
        y = yq
    #
elif (self.quantized_estimation_type == QuantEstimationType.QUANTIZED_THROUGH_ESTIMATION):

I hope this helps. Best regards,

@mathmanu mathmanu reopened this Jul 18, 2020
@wuzhiyang2016
Author

Thank you very much. I just saw your reply; let me take some time to understand it.

@wuzhiyang2016
Author

wuzhiyang2016 commented Jul 21, 2020

Hello, I have checked the backward code of the QuantizeDequantizeG class. dx is the same as formula (8) in the paper, but the scale derivative is not the same as formula (6). Could you give me some advice? The paper I am referring to is "Trained Quantization Thresholds for Accurate and Efficient Fixed-Point Inference of Deep Neural Networks".

@mathmanu
Collaborator

mathmanu commented Jul 21, 2020

Hi,

The backward of QuantizeDequantizeG uses a numerical gradient: https://en.wikipedia.org/wiki/Numerical_differentiation
It is not the gradient from that paper.
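
As an illustration of the idea (not the devkit's QuantizeDequantizeG code), a numerical gradient with respect to a scale parameter of a toy quantize-dequantize operation f(x, s) = round(x / s) * s could be estimated by central differences:

import torch

def quant_dequant(x, s):
    # Toy quantize-dequantize, assumed here only for illustration
    return torch.round(x / s) * s

def numerical_dscale(x, s, eps=1e-3):
    # Central difference: df/ds ~ (f(s + eps) - f(s - eps)) / (2 * eps)
    return (quant_dequant(x, s + eps) - quant_dequant(x, s - eps)) / (2 * eps)

x = torch.randn(8)
s = torch.tensor(0.1)
print(numerical_dscale(x, s))   # elementwise estimate of d(output)/d(scale)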

Also our recommended quantized_estimation_type is STE, in which case this gradient is not used at all.

I hope it is clear.

@wuzhiyang2016
Author

I get it, it's really clear! Thank you very much!

@mathmanu
Collaborator

Keeping this open as an FAQ item, so that others can also benefit.

@mathmanu mathmanu reopened this Jul 22, 2020