Skip to content
This repository was archived by the owner on Mar 5, 2025. It is now read-only.

fix: web3-validator ajv memory leak #5574

Merged
merged 3 commits into from
Nov 3, 2022

Conversation

mpetrunic
Copy link
Contributor

@mpetrunic mpetrunic commented Oct 28, 2022

Description

Please include a summary of the changes and be sure to follow our Contribution Guidelines.

Ajv from version 7 onwards uses object references to compile and cache schema. Since web3.js is passing a new object/array for every single validation, this causes every object to be compiled and attached to the schema cache which results in a massive memory leak.

This PR removes that leak by stringifying schemas and using the stringified form as the cache key. But since we now store both string, and compiled schema, memory usage is higher.

Also, we still have leak if the user starts using one-time validation schemas. Which we would need to remove or they end up permanently attached to the default validator.

resolves #5515

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Checklist for 1.x:

  • I have selected the correct base branch.
  • I have performed a self-review of my own code.
  • I have commented my code, particularly in hard-to-understand areas.
  • I have made corresponding changes to the documentation.
  • My changes generate no new warnings.
  • Any dependent changes have been merged and published in downstream modules.
  • I ran npm run dtslint with success and extended the tests and types if necessary.
  • I ran npm run test:cov and my test cases cover all the lines and branches of the added code.
  • I ran npm run build with success.
  • I have tested the built dist/web3.min.js in a browser.
  • I have tested my code on the live network.
  • I have checked the Deploy Preview and it looks correct.
  • I have updated the CHANGELOG.md file in the root folder.

Checklist for 4.x:

  • I have selected the correct 4.x base branch.
  • Within the description, the feature or issue is discussed in detail or an issue is linked.
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas.
  • I have added any required tests for the changes I made
  • I have made corresponding changes to the documentation
  • Any dependent changes have been merged and published in downstream modules.
  • I ran yarn successfully
  • I ran yarn lint successfully
  • I ran yarn build:web successfully
  • I ran yarn test:unit successfully
  • I ran yarn test:integration successfully
  • I ran compile:contracts successfully
  • I have tested my code.
  • I have updated the corresponding CHANGELOG.md file in the packages I have edited.

@jdevcs jdevcs added the 4.x 4.0 related label Oct 28, 2022
@jdevcs jdevcs requested a review from a team October 28, 2022 17:37
Copy link
Contributor

@Muhammad-Altabba Muhammad-Altabba left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution.
It seems great!
However, I think it would be better to use a hash for the key instead of the full JSON. And I added a comment regarding this.
Thanks,

@mpetrunic mpetrunic mentioned this pull request Nov 1, 2022
@mpetrunic
Copy link
Contributor Author

I don't think I caused tests to fail, they seem to be flaky but I cannot rerun them^^

@Muhammad-Altabba
Copy link
Contributor

I don't think I caused tests to fail, they seem to be flaky but I cannot rerun them^^

Sorry @mpetrunic but it seems to be caused by the last update as the pipeline issues keep showing when I re-run it 3 times. And it is only shown after the latest update to use the hash function. And so it would be so nice of you to count for the time of the hash function and to count how many times this hash function is called (to check if there is something wrong with the code that causes unneeded calls for example).
Thanks,

@mpetrunic
Copy link
Contributor Author

I don't think I caused tests to fail, they seem to be flaky but I cannot rerun them^^

Sorry @mpetrunic but it seems to be caused by the last update as the pipeline issues keep showing when I re-run it 3 times. And it is only shown after the latest update to use the hash function. And so it would be so nice of you to count for the time of the hash function and to count how many times this hash function is called (to check if there is something wrong with the code that causes unneeded calls for example). Thanks,

Do you suggest increasing the timeout on those tests or reverting the last commit?

I mean converting to bytes, hashing, and converting to hex has some cost and this is quite a hot path as it gets called for every validation.

@mpetrunic mpetrunic force-pushed the fix/from-wei-mem-leak-5515 branch from c58b831 to e1a233e Compare November 2, 2022 19:31
@Muhammad-Altabba
Copy link
Contributor

I don't think I caused tests to fail, they seem to be flaky but I cannot rerun them^^

Sorry @mpetrunic but it seems to be caused by the last update as the pipeline issues keep showing when I re-run it 3 times. And it is only shown after the latest update to use the hash function. And so it would be so nice of you to count for the time of the hash function and to count how many times this hash function is called (to check if there is something wrong with the code that causes unneeded calls for example). Thanks,

Do you suggest increasing the timeout on those tests or reverting the last commit?

I mean converting to bytes, hashing, and converting to hex has some cost and this is quite a hot path as it gets called for every validation.

Many thanks @mpetrunic
And what do you think about using a cheaper hash function like the one called hashCode here:
#5574 (comment)

@mpetrunic mpetrunic force-pushed the fix/from-wei-mem-leak-5515 branch from e1a233e to 68b58a4 Compare November 2, 2022 20:03
@mpetrunic
Copy link
Contributor Author

I don't think I caused tests to fail, they seem to be flaky but I cannot rerun them^^

Sorry @mpetrunic but it seems to be caused by the last update as the pipeline issues keep showing when I re-run it 3 times. And it is only shown after the latest update to use the hash function. And so it would be so nice of you to count for the time of the hash function and to count how many times this hash function is called (to check if there is something wrong with the code that causes unneeded calls for example). Thanks,

Do you suggest increasing the timeout on those tests or reverting the last commit?
I mean converting to bytes, hashing, and converting to hex has some cost and this is quite a hot path as it gets called for every validation.

Many thanks @mpetrunic And what do you think about using a cheaper hash function like the one called hashCode here: #5574 (comment)

It generates a 32bit number as the hash. I don't think that has enough collision resistance. And collision here would be a nightmare to debug.

Ideally, we would use md5 or sha1 but they are not available so I switched now to blake2b which should be the fastest.

@mpetrunic mpetrunic force-pushed the fix/from-wei-mem-leak-5515 branch from 68b58a4 to 0733bc3 Compare November 2, 2022 20:54
@jdevcs jdevcs merged commit ca536a4 into web3:4.x Nov 3, 2022
@jdevcs
Copy link
Contributor

jdevcs commented Nov 3, 2022

Merged this in to 4.x even there is release branch out there as: #5580 (comment)

@mpetrunic mpetrunic deleted the fix/from-wei-mem-leak-5515 branch November 3, 2022 10:03
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
4.x 4.0 related
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants