Use get or create to reduce the number of objects created #58
Conversation
@singh1114 #40 says "use update_or_create", but in this PR you use get_or_create. Can you explain why? Even better would be if you could edit #40 to explain what problem either of these QuerySet methods is meant to solve (never mind, that is already explained by the comment by @pombredanne).
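As a point of reference, here is a minimal sketch of how the two QuerySet methods behave. The Vulnerability model, its fields, and the lookup values are hypothetical placeholders for illustration, not the actual vulnerablecode models.

```python
from django.db import models


class Vulnerability(models.Model):
    # Hypothetical model for illustration only; not the vulnerablecode schema.
    cve_id = models.CharField(max_length=50, unique=True)
    summary = models.TextField(blank=True)


def upsert_examples():
    # get_or_create: returns the existing row when the lookup matches and
    # only creates a new one otherwise; it never updates existing fields.
    vuln, created = Vulnerability.objects.get_or_create(
        cve_id="CVE-2020-0001",
        defaults={"summary": "initial summary"},
    )

    # update_or_create: matches on the same lookup, but also overwrites the
    # fields given in `defaults` on the existing row (or creates it).
    vuln, created = Vulnerability.objects.update_or_create(
        cve_id="CVE-2020-0001",
        defaults={"summary": "refreshed summary"},
    )
```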
@singh1114 Thank you for this!
This is looking good otherwise, though do you think there could be a unit test that would show that there were issues with a bare create before?
@pombredanne Did that!
I think we already had tests that assert the number of objects being created. Reducing that expected count in the tests accounts for the same thing.
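A hedged sketch of the kind of regression test discussed here: run the same import twice and assert that the row count does not grow. The import_record helper, the model, its fields, and the module path are hypothetical stand-ins, not the actual vulnerablecode importer or tests.

```python
from django.test import TestCase

from vulnerabilities.models import Vulnerability  # assumed module path


def import_record(cve_id, summary):
    # Stand-in for the real import step; with a bare create() the second
    # call below would insert a duplicate row instead of reusing the first.
    Vulnerability.objects.get_or_create(cve_id=cve_id, defaults={"summary": summary})


class DuplicateImportTest(TestCase):
    def test_reimport_does_not_duplicate(self):
        import_record("CVE-2020-0001", "some summary")
        import_record("CVE-2020-0001", "some summary")
        self.assertEqual(Vulnerability.objects.count(), 1)
```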
@singh1114 I still don't see how this PR fixes #40. Did you mean to reference #28, which is about duplicate data? Either way, I think the import process needs to be redesigned (and called "import" instead of "data dump", but let's not bikeshed about that just now :). Apart from issues #28 and #40, the current process is also not scalable and already way too slow, and it will only get worse with either of these methods.

We should distinguish between importing historical data once, when an instance of vulnerablecode is provisioned, and continuously importing updates from vulnerability sources. Otherwise the time for the latter process will continue to grow with the amount of data.
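Purely as an illustration of the split proposed above, not a description of how vulnerablecode actually works, a one-time historical load and a continuous update path could look roughly like this; the model, its fields, the module path, and the record shape are assumptions.

```python
from vulnerabilities.models import Vulnerability  # assumed module path


def initial_import(records):
    # One-time historical load when an instance is provisioned: bulk_create
    # batches the INSERTs, which is much faster than saving rows one by one.
    # It does no duplicate checking, so it assumes an empty table.
    Vulnerability.objects.bulk_create(
        [Vulnerability(cve_id=r["cve_id"], summary=r["summary"]) for r in records]
    )


def incremental_import(records):
    # Continuous updates from vulnerability sources: match on the unique key
    # and refresh the mutable fields, so the cost scales with the size of the
    # update batch rather than with the whole database.
    for r in records:
        Vulnerability.objects.update_or_create(
            cve_id=r["cve_id"],
            defaults={"summary": r["summary"]},
        )
```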
@haikoschol Sorry, I meant to refer to #28 for the duplicate data. #40 doesn't seem to be correct and is a duplicate of #28. Agreed with your view of differentiating the two procedures. For improving the speed of the dump, we can use … Anyway, we will have to make sure the data in the DB is unique.
@singh1114 I agree that we should use … However, I don't think that #40 is a duplicate of #28. I suggest we continue the discussion of what #40 is about directly on the issue instead of on this PR.
At this stage, and based on the discussions here and the changes that were applied, I think it is best to close this PR and continue the discussion in #40.
Remove macos 10.14 job from azure-pipelines.yml
Fixes #40
Signed-off-by: Ranvir Singh <[email protected]>