feat: added a function to utilize purl integration #4164

Merged 5 commits on Jun 11, 2024

@inosmeet inosmeet commented Jun 7, 2024

This implementation resolves the docutils name collision (#3152).
I have temporarily added the database-related code in the parsers directory for demonstration, until we include it in the cache.
For simplicity's sake, I have only modified the Python parser. Once this one gets the green light, I'll add the others.

@terriko @anthonyharrison

modified python language parser, tailored purls according to the
purl2cpe requirement, resolves docutils name collision

Signed-off-by: Meet Soni <[email protected]>
@inosmeet inosmeet changed the title from "feat: added a function to utilise purl integration" to "feat: added a function to utilize purl integration" Jun 7, 2024
if cpeList != []:
    for item in cpeList:
        vendor, product, version = sbom.decode_cpe23(str(item))
        location = "/usr/local/bin/product"
Contributor Author

I'm not sure what to do about this

Contributor

What is location being used for? How is the path established? As written, this won't work on Windows.

I think the location in the ProductInfo is the file location of the application which is being scanned.

Contributor Author

I also thought it was the location of the product binary being scanned, but I wasn't sure, so I put in a default string similar to the one in the find_vendor() method.

Windows aside, it should work on Linux; however, it isn't able to find the installed database in the cache.
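
One way to read the review comment above is that the location should come from the file being scanned rather than from a hard-coded POSIX path. The sketch below is only an illustration of that idea: `ProductRecord` and `records_for` are hypothetical names, not the project's ProductInfo or any existing function, and the input is assumed to be a list of already-decoded (vendor, product, version) triples.

```python
from pathlib import Path
from typing import Iterable, List, NamedTuple, Tuple


class ProductRecord(NamedTuple):
    """Hypothetical stand-in for ProductInfo; field names are assumptions."""

    vendor: str
    product: str
    version: str
    location: str


def records_for(
    triples: Iterable[Tuple[str, str, str]], scanned_file: str
) -> List[ProductRecord]:
    """Attach the scanned file's path to every decoded CPE triple."""
    # Path() uses the separator conventions of the current OS, so nothing
    # here depends on a Linux-only string such as "/usr/local/bin/product".
    location = str(Path(scanned_file))
    return [ProductRecord(v, p, ver, location) for v, p, ver in triples]
```

With this shape, the caller that knows which file is being parsed supplies the path, and the loop over the CPE list never needs a default location.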

@terriko
terriko commented Jun 7, 2024

From your email, it sounds like we just need to add purl2cpe in as a data source. It's not quite as complicated as the other data sources because right now we aren't parsing data or anything, just grabbing the db file and going, so it's going to be a slightly awkward fit but I think we can make it work.

I'd suggest you do it rather than me, but since I was just in the sources updating stuff, let me lay out what I think needs to happen.

Here's what you'd need to do:

  1. Make the file cve_bin_tool/data_sources/purl2cpe.py
  2. Make a Data_Source for purl2cpe inside that file. You can look at the other data sources for how the __init__ function parameters are set up, but don't bother duplicating much of the rest of it.
    • The only function that actually matters here is the get_cve_data one. That's the one that's called in cvedb when the data sources are updated.
    • The function name becomes slightly nonsensical here since we're not actually getting CVE data but rather purl/cpe data, but I don't think it's worth you going and refactoring it to be get_data() right now in part because it'll break my EPSS PR that's still in progress. But pretend it's called get_data if it makes it feel less weird.
  3. Inside the get_cve_data() function, you want to download the purl2cpe database and stick it in the .cache directory. You should be able to figure that out from the other data sources; make sure to include error checking for failures during the request call so it'll fail gracefully if there's a timeout or something. You don't need to do parsing or anything for this one, just download and make sure it's in the correct place.
  4. Go to cli.py and add the ability to enable/disable the data source.
  5. Go to cvedb.py and edit the populatedb() function so it doesn't try to do anything further with the purl2cpe data.
    • Again, I think we probably need to refactor the sources so this works better eventually, but not today!
  6. If you can think of any tests to go with this, that would be great, but it's ok if we don't have any for now. I'll be writing a "check if it's disabled" type test for the EPSS stuff and we can hook into that when it's ready, at least.
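
Steps 1 to 3 above might look roughly like the following. This is a minimal sketch using only the standard library, not the project's actual data source: the class name `Purl2CpeSource`, the constructor parameters, the cache layout, and the boolean return value are all assumptions, and the download URL is left to the caller because this thread does not give it. Only the method name get_cve_data comes from the comment above.

```python
import shutil
import urllib.request
from pathlib import Path


class Purl2CpeSource:
    """Hypothetical download-only data source (all names are assumptions)."""

    def __init__(self, url: str, cache_dir: Path, timeout: float = 30.0):
        self.url = url  # supplied by the caller; not specified in this thread
        self.timeout = timeout
        # Assumed cache layout: <cache_dir>/purl2cpe/purl2cpe.db
        self.db_path = Path(cache_dir) / "purl2cpe" / "purl2cpe.db"

    def get_cve_data(self) -> bool:
        """Download the database into the cache; return True on success."""
        self.db_path.parent.mkdir(parents=True, exist_ok=True)
        # Download to a side file first so a failed request never leaves
        # a truncated database where the tool expects a complete one.
        partial = self.db_path.with_suffix(".part")
        try:
            with urllib.request.urlopen(self.url, timeout=self.timeout) as response:
                with open(partial, "wb") as out:
                    shutil.copyfileobj(response, out)
        except OSError as error:
            # URLError, HTTPError and socket timeouts are all OSError
            # subclasses, so network failures end up here.
            partial.unlink(missing_ok=True)
            print(f"purl2cpe download failed: {error}")
            return False
        partial.replace(self.db_path)
        return True
```

No parsing happens here, matching the "just download and make sure it's in the correct place" description; whether the real method is asynchronous or returns something else is not stated in this thread.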

Writing this makes me think maybe I should also add some docs on how to write a data source, so I'll do the docs writing, but you should write the code for purl2cpe specifically!

@inosmeet

This should be working now. After manually removing the cache and running the tool, it works fine locally. The previously failing tests pass too.

@anthonyharrison anthonyharrison left a comment

Added some comments.

Need to create some test scripts for PURL2CPE data source.

@inosmeet

> Added some comments.
>
> Need to create some test scripts for PURL2CPE data source.

Yeah, I'll be working on the tests this week, after we're satisfied with the code.

@terriko terriko left a comment

Hm, so... our repo settings won't let me merge this because the tests don't pass, but the tests aren't passing because the cache doesn't have the purl2cpe database yet.

Can we maybe split out the part that adds the database so I can merge that and then update the cache, and then we'll be able to merge the rest?

@terriko
terriko commented Jun 11, 2024

The cache update is running now. I think you can track it here; once it finishes successfully, we can re-run the tests here and hopefully have them all pass:

https://github.com/intel/cve-bin-tool/actions/workflows/update-cache.yml

@terriko terriko added the "awaiting maintainer" label (Need a maintainer to respond / help out) Jun 11, 2024
@terriko
terriko commented Jun 11, 2024

Whoops, didn't mean to mark that as approved before the tests run but whatever.

Cache appears to have updated cleanly, I've resolved the merge conflict, and I believe the tests should pass this time. I've got to run to a meeting but I'll be back to check on them later.

@terriko terriko left a comment

Okay, tests are passing, let's review for real!

I think we're going to need to refactor decode_cpe23 out of here and out of the sbom manager, and make a single function that we call, probably in util for now. Since you've probably gone to bed and this should solve at least the docutils issue, I think I'm going to err on the side of merging as is and I'll file a separate issue for a refactor.
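
For reference, a single shared helper of the kind described above could be as small as the sketch below. It is an assumption about the behaviour, not the project's code: the only thing taken from this PR is that the function is called with a CPE 2.3 string and unpacked into vendor, product, and version. The field positions follow the CPE 2.3 formatted-string layout (cpe:2.3:part:vendor:product:version:...).

```python
import re
from typing import Tuple

# Split on colons that are not escaped with a backslash, since CPE 2.3
# allows "\:" inside a field. (A literal backslash directly before a
# separator is not handled by this simple pattern.)
_UNESCAPED_COLON = re.compile(r"(?<!\\):")


def decode_cpe23(cpe23: str) -> Tuple[str, str, str]:
    """Return (vendor, product, version) from a CPE 2.3 formatted string."""
    parts = _UNESCAPED_COLON.split(cpe23)
    if len(parts) < 6 or parts[0] != "cpe" or parts[1] != "2.3":
        raise ValueError(f"not a CPE 2.3 formatted string: {cpe23!r}")
    # parts[2] is the "part" field (a/o/h); the next three are what we need.
    return parts[3], parts[4], parts[5]
```

Called with a made-up string such as "cpe:2.3:a:example_vendor:example_product:1.0:*:*:*:*:*:*:*", it yields the three fields that the parser loop in this PR unpacks.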

@terriko terriko marked this pull request as ready for review June 11, 2024 20:37
@terriko terriko merged commit e9f1ea8 into intel:main Jun 11, 2024
22 checks passed
@inosmeet inosmeet deleted the purl2cpe branch June 19, 2024 10:00