This repository was archived by the owner on Oct 13, 2021. It is now read-only.

Masking RNN with zeros input #386

Merged: 11 commits, Feb 27, 2020

Conversation

cjermain
Contributor

This PR reproduces a discrepancy between the ONNX and Keras models in handling inputs that are fully masked. It occurs for RNN models that contain both a Masking layer and a non-zero bias. When an input entry is all zeros (i.e. fully masked by the Masking layer), the ONNX model's RNN output is non-zero, whereas Keras outputs zeros. Turning off the bias, or using the default (zero) bias, produces behavior consistent with Keras.

(Pdb) up
> /home/user/keras-onnx/tests/test_layers.py(1817)test_masking_bias()
-> self.assertTrue(run_onnx_runtime(onnx_model.graph.name, onnx_model, x, expected, self.model_files))
(Pdb) x
array([[[365.8103 , 107.39152, 701.68   , 615.55023, 468.88876],
        [831.7788 , 680.243  , 890.5216 , 550.38116, 970.3993 ],
        [134.01295, 771.0398 , 594.99915, 315.76196, 753.1429 ]],

       [[  0.     ,   0.     ,   0.     ,   0.     ,   0.     ],
        [  0.     ,   0.     ,   0.     ,   0.     ,   0.     ],
        [  0.     ,   0.     ,   0.     ,   0.     ,   0.     ]]],
      dtype=float32)
(Pdb) down
> /home/user/keras-onnx/tests/test_utils.py(153)run_onnx_runtime()
-> return res
(Pdb) expected
[array([[ 0.7615942,  0.       , -0.7615942,  0.       ,  0.       ,
         0.       ,  0.7615942,  0.       ],
       [ 0.       ,  0.       ,  0.       ,  0.       ,  0.       ,
         0.       ,  0.       ,  0.       ]], dtype=float32)]
(Pdb) actual
[array([[ 0.76159436,  0.        , -0.76159436,  0.        ,  0.        ,
         0.        ,  0.76159436,  0.        ],
       [ 0.2423832 ,  0.26090473,  0.08070117,  0.30603507,  0.13741694,
         0.24133024,  0.513518  ,  0.37147206]], dtype=float32)]

This is a problem for architectures using multiple RNN layers with masking. Any suggestions or thoughts are appreciated!
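A minimal sketch of the setup described above (layer sizes, the mask value, and the `bias_initializer` choice are illustrative, not taken from the PR's actual test):

```python
import numpy as np
import tensorflow as tf

# A Masking layer followed by an LSTM with a non-zero bias, mirroring
# the failing configuration described in this PR.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(3, 5)),
    tf.keras.layers.Masking(mask_value=0.0),
    tf.keras.layers.LSTM(8, bias_initializer="ones"),
])

x = np.random.rand(2, 3, 5).astype(np.float32)
x[1, :, :] = 0.0  # second sample is all zeros, so fully masked

# Keras skips every timestep of the masked sample and returns the
# initial (zero) state for it; the pre-fix ONNX conversion did not.
y = model.predict(x, verbose=0)
print(y[1])  # all zeros in Keras
```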

@claassistantio

claassistantio commented Feb 17, 2020

CLA assistant check
All committers have signed the CLA.

@cjermain
Contributor Author

The following is the ONNX graph in Netron. It seems like any masked rows are set to zero in the input to the LSTM (at the Mul operator). I would have expected an operator after the LSTM to set any masked values to their corresponding value (i.e. zero in this case).
(screenshot: temp.onnx graph in Netron)
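Zeroing the input rows is not enough, because the bias term still feeds the gates. A schematic single LSTM step in numpy (zero weights for clarity, generic i/f/g/o gate order; all names here are illustrative) shows how an all-zero timestep still yields a non-zero hidden state when the bias is non-zero:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

hidden = 4
h = np.zeros(hidden)           # initial hidden state
c = np.zeros(hidden)           # initial cell state
x = np.zeros(3)                # a "masked" timestep: all-zero input

# Zero weights so only the (non-zero) bias contributes to the gates.
W = np.zeros((4 * hidden, 3))      # input weights
R = np.zeros((4 * hidden, hidden)) # recurrent weights
b = np.ones(4 * hidden)            # non-zero bias

gates = W @ x + R @ h + b
i, f, g, o = np.split(gates, 4)
c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
h = sigmoid(o) * np.tanh(c)

print(h)  # non-zero, even though the input timestep was all zeros
```

This matches the symptom in the transcript above: the fully masked sample produces non-zero values in the ONNX output while Keras, which skips masked timesteps entirely, returns zeros.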

@cjermain
Contributor Author

I've added a section that generates the sequence_lens input for the LSTM operator from the output mask of the Masking layer. This replicates the Keras behavior and fixes the blocking test I added earlier.

(screenshot: temp_sequential.onnx graph in Netron)
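Conceptually, the sequence_lens computation amounts to counting the unmasked timesteps per sample. A numpy sketch of the idea (the actual PR builds this with ONNX operators inside the graph; the array values here are made up):

```python
import numpy as np

# Boolean mask as produced by the Masking layer: shape (batch, timesteps),
# True where the timestep is kept.
mask = np.array([
    [True,  True,  True],
    [False, False, False],   # fully masked sample
])

# sequence_lens for the ONNX LSTM: number of valid steps per sample.
# The LSTM then stops updating state past each sample's length.
sequence_lens = mask.sum(axis=1).astype(np.int32)
print(sequence_lens)  # [3 0]
```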

@cjermain cjermain requested a review from wenbingl February 24, 2020 03:15
@cjermain
Contributor Author

The same strategy should also work for bidirectional.py. I will try that next.

@wenbingl
Member

Ready for review, or still WIP?

@cjermain cjermain changed the title [WIP] Masking RNN with zeros input Masking RNN with zeros input Feb 25, 2020
@cjermain
Contributor Author

@wenbingl it would be great to get a review. I'm planning to hold off on the bidirectional implementation and submit it as a separate PR.

Member

@wenbingl wenbingl left a comment


Thanks for the fix.
