Dubious code for vendor guessing in OSV source #3200

Closed
gluesmith2021 opened this issue Aug 4, 2023 · 1 comment · Fixed by #3225

@gluesmith2021
Contributor

gluesmith2021 commented Aug 4, 2023

This is probably only a potential issue, as it doesn't seem to have any effect for now.

On the following line, what is "github" doing there? Was it supposed to be part of another condition?

https://github.com/intel/cve-bin-tool/blob/ae66713878e2b53d22134fab6eb34d9462beac71/cve_bin_tool/data_sources/osv_source.py#L300C16-L300C16

For now, only the "/" check is really a condition ("github" is a non-empty string, so it always evaluates to True in the if), so a vendor name is extracted for every product name with a slash in it.
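To illustrate the point, here is a minimal sketch of the pattern. The function names and the exact shape of the condition are assumptions made for illustration; they are not copied from `osv_source.py`, and the "checked" variant is only one guess at what the original intent might have been.

```python
def guess_vendor_always_truthy(product: str):
    """Hypothetical reconstruction of the pattern described above."""
    # "github" is a non-empty string literal, so it is always truthy:
    # the whole condition reduces to `"/" in product`.
    if "/" in product and "github":
        vendor, name = product.split("/")[-2:]
        return vendor, name
    return None, product


def guess_vendor_checked(product: str):
    """One possible intent (an assumption): only guess for GitHub-like names."""
    if "/" in product and "github" in product:
        vendor, name = product.split("/")[-2:]
        return vendor, name
    return None, product


for candidate in ("github.com/intel/cve-bin-tool", "platform/frameworks/base"):
    print(candidate, guess_vendor_always_truthy(candidate), guess_vendor_checked(candidate))
```

With the bare literal, both names yield a guessed vendor; with an actual membership test, only the GitHub-like name does.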

It has no effect for now: when populating the DB, for any given CVE, either

  • the vendor name guessed from the OSV source is replaced with previously ingested vendor names for the CVE product, or
  • the CVE entry is dropped if there is no pre-existing vendor name.
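The two outcomes above can be sketched roughly as follows. This is not the actual DB population code: `resolve_vendors` and the `known_vendors` mapping are hypothetical names used only to show why the guessed vendor never reaches the database.

```python
def resolve_vendors(product, guessed_vendor, known_vendors):
    """Return the (vendor, product) pairs that would be stored.

    known_vendors is a hypothetical mapping of product name to the set of
    vendor names already ingested from other sources.
    """
    existing = known_vendors.get(product)
    if existing:
        # Outcome 1: the guess is discarded in favor of known vendor names.
        return sorted((vendor, product) for vendor in existing)
    # Outcome 2: no pre-existing vendor name, so the entry is dropped.
    return []


known = {"base": {"google"}}
print(resolve_vendors("base", "frameworks", known))
print(resolve_vendors("soong", "build", known))
```

In both branches `guessed_vendor` is never used, which matches the observation that the guess currently has no effect.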

This would become an issue if guessed vendor names were actually used (or are there already cases where they are used?). For instance, many product names for Android-related CVEs from OSV are path-like, and the second-to-last part does not hint at a vendor name, such as "platform/frameworks/base", "platform/build/soong", "platform/external/v8", etc.

This also raises the question of whether the OSV source parser should try to guess vendor names at all if they are not meant to be used.
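As a quick check of the Android examples, this small sketch takes the second-to-last path component of each product name, which is the assumed guessing rule here (the helper name is hypothetical):

```python
android_products = [
    "platform/frameworks/base",
    "platform/build/soong",
    "platform/external/v8",
]


def second_to_last_part(product: str) -> str:
    """Hypothetical helper: the path component that would be taken as the vendor."""
    return product.split("/")[-2]


guesses = {product: second_to_last_part(product) for product in android_products}
for product, vendor in guesses.items():
    print(f"{product!r} -> guessed vendor {vendor!r}")
```

The resulting "vendors" are `frameworks`, `build`, and `external`, none of which names an actual vendor.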

@terriko
Contributor

terriko commented Aug 8, 2023

I'd have to dig through the history, but I think this was started to address a particular issue with how things that looked like github urls were being parsed? I think you're right that it's not the best possible solution.

I think eventually PURL adoption might solve this, and we've been noodling around some ideas about having a set of mappings that uses PURL internally (See #3180 for some discussion), but that's likely a ways out.

I'm very open to ideas on viable heuristics to make this less inaccurate Right Now if anyone's got good ideas, and I'm open to someone working on #3180 to make it happen sooner and/or better.

terriko pushed a commit that referenced this issue Aug 14, 2023
removed use of `ECOSYSTEM` versions, and corrected vendor name guessing from package name.

fixes #3200