
QAT model saving bug: KeyError: '__inference_depthwise_conv2d_layer_call_fn_126' #868

Open
peri044 opened this issue Oct 22, 2021 · 19 comments
Labels: bug

peri044 commented Oct 22, 2021

Describe the bug
Please download the scripts to reproduce from: https://drive.google.com/drive/folders/15cajAZ9sAZ2Uyix8sDVSYku6QCqDCec7?usp=sharing

Command to run: python sample_qat.py

I have a simple model with an input layer and a DepthwiseConv2D layer. I quantize this model by adding quantize_and_dequantize nodes at the input of the DepthwiseConv2D layer (commented in the code). When I save the model and load it back, I see the following error:

  File "/home/dperi/Downloads/py3/lib/python3.6/site-packages/tensorflow/python/saved_model/load.py", line 544, in <lambda>
    "function": lambda: self._recreate_function(proto.function),
  File "/home/dperi/Downloads/py3/lib/python3.6/site-packages/tensorflow/python/saved_model/load.py", line 586, in _recreate_function
    proto, self._concrete_functions), setattr
  File "/home/dperi/Downloads/py3/lib/python3.6/site-packages/tensorflow/python/saved_model/function_deserialization.py", line 295, in recreate_function
    concrete_function_objects.append(concrete_functions[concrete_function_name])
KeyError: '__inference_depthwise_conv2d_layer_call_and_return_conditional_losses_117'
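
For reference, a minimal sketch of this kind of repro, using tfmot.quantization.keras.quantize_model as a stand-in for the custom quantize_and_dequantize insertion described above (layer shapes and names are illustrative; the actual scripts are in the linked Drive folder):

import tensorflow as tf
import tensorflow_model_optimization as tfmot

# Tiny model: an input layer followed by a single DepthwiseConv2D layer.
inputs = tf.keras.Input(shape=(32, 32, 3))
outputs = tf.keras.layers.DepthwiseConv2D(kernel_size=3)(inputs)
model = tf.keras.Model(inputs, outputs)

# Insert quantize/dequantize nodes via TFMOT, then round-trip through SavedModel.
q_model = tfmot.quantization.keras.quantize_model(model)
q_model.save('export_dir')  # SavedModel format

# Loading back is where the KeyError surfaces on affected TF versions; the
# quantize wrappers are custom objects, hence quantize_scope.
with tfmot.quantization.keras.quantize_scope():
    restored = tf.keras.models.load_model('export_dir')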

System information

TensorFlow version (installed from source or binary): 2.5 (tried with 2.6 as well)

TensorFlow Model Optimization version (installed from source or binary):

SavedModel loading fails specifically for depthwise convolution; it works fine for a regular convolution.

peri044 added the bug label on Oct 22, 2021
Jia-HongHenryLee commented Oct 25, 2021

Hi @Xhark,
I hit the same bug when trying to quantize MobileNetV2.

System information

TensorFlow 2.5.0 (binary) with TensorFlow Model Optimization 0.6.0
TensorFlow 2.5.1 (binary) with TensorFlow Model Optimization 0.7.0
TensorFlow 2.4.0 (binary) with TensorFlow Model Optimization 0.7.0

Python version: 3.8.12

Jia-HongHenryLee commented:

Hi @Xhark and @peri044,

I worked around the problem with the following environment:
System information
TensorFlow version (installed from binary): tf-nightly-gpu 2.5.0.dev20201202 (https://www.cnpython.com/pypi/tf-nightly-gpu/download)
TensorFlow Model Optimization version (installed from binary): 0.6.0
Python version: 3.8.12
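
For reference, pinning that environment would look roughly like this (a sketch; old nightly wheels may no longer be available from the standard index):

pip install tf-nightly-gpu==2.5.0.dev20201202 tensorflow-model-optimization==0.6.0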

daverim (Collaborator) commented Nov 1, 2021

Hi @peri044 and @Jia-HongHenryLee,

I'm looking into it now, but there are a couple of workarounds.
First, it seems to save correctly if you use

model.save('export_dir', save_format='h5')
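
Loading that H5 file back needs tfmot's quantize_scope, since the quantize wrapper layers are custom Keras objects; roughly:

import tensorflow_model_optimization as tfmot

# Deserialize the quantize wrappers by loading inside quantize_scope.
with tfmot.quantization.keras.quantize_scope():
    restored = tf.keras.models.load_model('export_dir')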

I think this is caused by incorrect shape handling for the depthwise kernel quantization parameters, which results in functions not being traced/merged correctly.

Thanks for reporting this.

peri044 (Author) commented Nov 7, 2021

Thank you @daverim for addressing this.
Can you let me know when this might be resolved, or whether there's an active PR for it?
I haven't tried the H5 format, since I'm using the SavedModel format to pass the model through TF2ONNX (with custom utilities) for processing.

peri044 (Author) commented Nov 15, 2021

Hello @daverim, could you suggest some pointers on how to fix this locally (using the SavedModel format)? Which files/functions should I look at? Thanks!

ChanZou commented Nov 15, 2021

Hey @peri044. If your ultimate goal is to convert the model to TFLite, you can pass a ConcreteFunction around instead; TFLiteConverter.from_concrete_functions works just fine for me.
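
A sketch of that route, assuming q_model is the quantized Keras model (the input TensorSpec is illustrative):

# Trace a ConcreteFunction from the model and hand it to the TFLite converter,
# bypassing the SavedModel serialization path entirely.
concrete_fn = tf.function(q_model).get_concrete_function(
    tf.TensorSpec([1, 32, 32, 3], tf.float32))
converter = tf.lite.TFLiteConverter.from_concrete_functions([concrete_fn])
tflite_model = converter.convert()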

peri044 (Author) commented Nov 15, 2021

Hello @ChanZou. My ultimate goal is to use the SavedModel format (if it works) and pass it through TF2ONNX to convert it into an ONNX graph; TF2ONNX currently accepts the SavedModel format.
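
For context, the usual TF2ONNX invocation on a SavedModel directory looks like this (the opset value is illustrative):

python -m tf2onnx.convert --saved-model export_dir --output model.onnx --opset 13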

peri044 (Author) commented Jan 6, 2022

Hello @daverim, any suggestions on how to resolve this would be appreciated. Thanks!

daverim (Collaborator) commented Jan 6, 2022

Hi, sorry for the delay.

I just tested your sample code and it seems to be resolved now; there are some warnings about untraced functions.

Using tf==2.8.0-dev20210930 and tensorflow-model-optimization==0.7.0.

Please try and see if it works for you.
Thanks,
David

peri044 (Author) commented Jan 26, 2022

Thanks @daverim. That works now.

gcunhase commented:

@daverim I encountered a similar error for SeparableConv2D using TF 2.8.0 (no error with DepthwiseConv2D in that TF version):

...
Traceback (most recent call last):
  File "/home/PycharmProjects/tensorrt_qat/examples/mobilenet/run_qat_workflow.py", line 156, in <module>
    main(verbose=True)
  File "/home/PycharmProjects/tensorrt_qat/examples/mobilenet/run_qat_workflow.py", line 142, in main
    tf.keras.models.save_model(q_model, os.path.join(qat_save_finetuned_weights, "saved_model"))
  File "/home/PycharmProjects/tensorrt_qat/venv38_tf2.8_newPR/lib/python3.8/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/home/PycharmProjects/tensorrt_qat/venv38_tf2.8_newPR/lib/python3.8/site-packages/tensorflow/python/saved_model/save.py", line 403, in map_resources
    raise ValueError(
ValueError: Unable to save function b'__inference_block2_sepconv1_layer_call_fn_670910' because it captures graph tensor Tensor("xception/quant_block2_sepconv1/LastValueQuant_1/QuantizeAndDequantizeV4:0", shape=(3, 3, 64, 1), dtype=float32) from a parent function which cannot be converted to a constant with `tf.get_static_value`.

Do you have any idea what caused the error for DepthwiseConv2D, and whether the same fix would work for SeparableConv2D?
Thank you!

k-w-w (Contributor) commented May 18, 2022

The best way to avoid this issue is to disable layer tracing when creating the SavedModel, but you'll then have to define the serving_default function manually (this is the default signature name that TF2ONNX looks for).

# Wrap the model call in a tf.function so a single signature can be exported.
@tf.function
def predict(*args, **kwargs):
  return model(*args, **kwargs)

# save_spec() returns the (args, kwargs) TensorSpecs the model was called with.
arg_spec, kwarg_spec = model.save_spec()
model.save(path, save_traces=False, signatures={
  "serving_default": predict.get_concrete_function(*arg_spec, **kwarg_spec)
})
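
To sanity-check the exported signature (for example, before handing the SavedModel to TF2ONNX), it can be loaded back directly; roughly:

# The signature registered above should appear under "serving_default".
loaded = tf.saved_model.load(path)
print(loaded.signatures["serving_default"].structured_input_signature)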

gcunhase commented:

Hi @k-w-w, thank you for your feedback! This specific issue (for DepthwiseConv2D) has been solved, as mentioned in the Jan 26 comment above, but the same issue persists for SeparableConv2D here.

I tried your suggestion, but it did not solve my issue, since the problem is not with tf2onnx but with saving the TF model. Do you have any additional suggestions?
Thank you!

k-w-w (Contributor) commented May 19, 2022

@gcunhase Are you getting the same error even with save_traces=False?

gcunhase commented:

@k-w-w yes

k-w-w (Contributor) commented May 19, 2022

@gcunhase can you paste the error trace?

gcunhase commented:

@k-w-w:

...
Traceback (most recent call last):
  File "/home/nvidia/PycharmProjects/nvbugs/internal_filed/tf_key_inference_bug/TF_bug_separableconv2d/sample.py", line 24, in <module>
    model.save(model_save_path)
  File "/home/nvidia/PycharmProjects/nvbugs/venv38_trt_regression/lib/python3.8/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/home/nvidia/PycharmProjects/nvbugs/venv38_trt_regression/lib/python3.8/site-packages/tensorflow/python/saved_model/save.py", line 403, in map_resources
    raise ValueError(
ValueError: Unable to save function b'__inference_separable_conv2d_layer_call_fn_961' because it captures graph tensor Tensor("model/quant_separable_conv2d/LastValueQuant_1/QuantizeAndDequantizeV4:0", shape=(3, 3, 3, 1), dtype=float32) from a parent function which cannot be converted to a constant with `tf.get_static_value`.

gcunhase commented:

This bug also has the reproducible code, so we can move our discussion there if you agree.

gcunhase commented:

This bug can be closed for DepthwiseConv2D.
For Conv2DTranspose and SeparableConv2D, please move the discussion here.
Thank you!
