Use get or create to reduce the number of objects created #58
Conversation
@singh1114 #40 says "use update_or_create", but in this PR you use get_or_create. Can you explain why? Even better would be if you could edit #40 to explain what problem either of these QuerySet methods is meant to solve (never mind, that is already explained by the comment by @pombredanne).
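As a point of reference, here is a minimal sketch of how the two QuerySet methods behave. The Vulnerability model, its fields, and the lookup values are hypothetical placeholders for illustration, not the actual vulnerablecode models.

```python
from django.db import models


class Vulnerability(models.Model):
    # Hypothetical model for illustration only; not the vulnerablecode schema.
    cve_id = models.CharField(max_length=50, unique=True)
    summary = models.TextField(blank=True)


def upsert_examples():
    # get_or_create: returns the existing row when the lookup matches and
    # only creates a new one otherwise; it never updates existing fields.
    vuln, created = Vulnerability.objects.get_or_create(
        cve_id="CVE-2020-0001",
        defaults={"summary": "initial summary"},
    )

    # update_or_create: matches on the same lookup, but also overwrites the
    # fields given in `defaults` on the existing row (or creates it).
    vuln, created = Vulnerability.objects.update_or_create(
        cve_id="CVE-2020-0001",
        defaults={"summary": "refreshed summary"},
    )
```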
@singh1114 Thank you for this!
This is looking good otherwise, though do you think there could be a unit test that would show that there were issues with a bare create before?
@pombredanne Did that!
I think we already had tests that assert the number of objects being created. Reducing that expected count in the tests accounts for the same thing.
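A hedged sketch of the kind of regression test discussed here: run the same import twice and assert that the row count does not grow. The import_record helper, the model, its fields, and the module path are hypothetical stand-ins, not the actual vulnerablecode importer or tests.

```python
from django.test import TestCase

from vulnerabilities.models import Vulnerability  # assumed module path


def import_record(cve_id, summary):
    # Stand-in for the real import step; with a bare create() the second
    # call below would insert a duplicate row instead of reusing the first.
    Vulnerability.objects.get_or_create(cve_id=cve_id, defaults={"summary": summary})


class DuplicateImportTest(TestCase):
    def test_reimport_does_not_duplicate(self):
        import_record("CVE-2020-0001", "some summary")
        import_record("CVE-2020-0001", "some summary")
        self.assertEqual(Vulnerability.objects.count(), 1)
```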
@singh1114 I still don't see how this PR fixes #40. Did you mean to reference #28, which is about duplicate data? Either way, I think the import process needs to be redesigned (and called "import" instead of "data dump", but let's not bikeshed about that just now :). Apart from issues #28 and #40, the current process is also not scalable and already way too slow, and it will only get worse with either of these methods.

We should distinguish between importing historical data once, when an instance of vulnerablecode is provisioned, and continuously importing updates from vulnerability sources. Otherwise the time for the latter process will continue to grow with the amount of data.
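Purely as an illustration of the split proposed above, not a description of how vulnerablecode actually works, a one-time historical load and a continuous update path could look roughly like this; the model, its fields, the module path, and the record shape are assumptions.

```python
from vulnerabilities.models import Vulnerability  # assumed module path


def initial_import(records):
    # One-time historical load when an instance is provisioned: bulk_create
    # batches the INSERTs, which is much faster than saving rows one by one.
    # It does no duplicate checking, so it assumes an empty table.
    Vulnerability.objects.bulk_create(
        [Vulnerability(cve_id=r["cve_id"], summary=r["summary"]) for r in records]
    )


def incremental_import(records):
    # Continuous updates from vulnerability sources: match on the unique key
    # and refresh the mutable fields, so the cost scales with the size of the
    # update batch rather than with the whole database.
    for r in records:
        Vulnerability.objects.update_or_create(
            cve_id=r["cve_id"],
            defaults={"summary": r["summary"]},
        )
```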
@haikoschol Sorry, I meant to refer to #28 for the duplicate data. #40 doesn't seem to be correct and is a duplicate of #28. Agreed with your view of differentiating the two procedures. For improving the speed of the dump, we can use … Anyway, we will have to make sure the data in the DB is unique.
@singh1114 I agree that we should use … However, I don't think that #40 is a duplicate of #28. I suggest we continue the discussion of what #40 is about directly on the issue instead of on this PR.
At this stage, and based on the discussions here and the changes that were applied, I think it is best to close this PR and continue the discussion in #40.
Remove macos 10.14 job from azure-pipelines.yml
Fixes #40
Signed-off-by: Ranvir Singh <[email protected]>