-
Notifications
You must be signed in to change notification settings - Fork 229
Manage vulnerability sources in database #152
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Manage vulnerability sources in database #152
Conversation
7a7295c
to
6404723
Compare
4e76c42
to
08ef849
Compare
vulnerabilities/import_runner.py
Outdated
if vuln_info.cve_id: | ||
vuln, created = models.Vulnerability.objects.get_or_create(cve_id=vuln_info.cve_id) | ||
if created: | ||
vuln.summary = vuln_info.summary |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need some way to determine, which summary
is better and then decide to update the existing summary or leave it as it is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not optimistic there is one. But even if there was a way, if all we have is the summary, we can't detect that a new vulnerability is actually an update to an existing vulnerability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking of moving summary
from Vulnerability
model to the ImpactedPackage
model . So the context won't be lost.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In theory it makes sense that the same vulnerability could be described in different terms, depending on the affected package. But in practice, do any of the source we ingest so far do this? Or would we end up copying the same description to every ImpactedPackage
? That wouldn't really be an issue though.
Fundamentally the questions are:
- What is the definition of "vulnerability"?
- What is the minimum number of attributes (and what are those) for identifying a vulnerability?
c9e33a2
to
9c54a5d
Compare
7e322b7
to
63e607d
Compare
87ea90a
to
0eed1b5
Compare
16e4afb
to
6c4c5a1
Compare
6c4c5a1
to
38ad8fe
Compare
We do and it seems I missed that. Good catch!
Also good catch! :) |
a42de02
to
88fbe9f
Compare
Signed-off-by: Shivam Sandbhor <[email protected]>
This change introduces the model Importer, which contains metadata and a pointer to the implementation of the importers DataSource subclass. This subclass knows how to query whatever storage mechanism is used in that case (e.g. git repository with yaml files, a single OVAL XML file, HTTP API returning JSON, etc.) Additionally it defines an API and data model that DataSource subclasses need to provide to ImportRunner. ImportRunner leverages that API and data model to efficiently store vulnerability information in the database. Existing importers have not been migrated yet (only renamed from "scraper"). Signed-off-by: Haiko Schol <[email protected]>
Signed-off-by: Haiko Schol <[email protected]>
Having separate methods added_files() and updated_files() is not ideal since the same file can be returned by both methods. When a file has been added and then later modified within the window of time considered by these methods, the file should only be reported as an added file. Therefore this commit combines the methods added_files() and updated_files() into file_changes(), which returns a tuple of sets for added and updated files, respectively. It also fixes a time-math issue; commit timestamps are converted to UTC before being compared to the cutoff timestamp (that is always in UTC). Signed-off-by: Haiko Schol <[email protected]>
Signed-off-by: Haiko Schol <[email protected]>
Signed-off-by: Haiko Schol <[email protected]>
Signed-off-by: Haiko Schol <[email protected]>
Signed-off-by: Haiko Schol <[email protected]>
Signed-off-by: Haiko Schol <[email protected]>
Signed-off-by: Haiko Schol <[email protected]>
Reduce the amount of converting between PackageURL and string to the bare minimum. Add/improve some type annotations and reduce API surface area. Signed-off-by: Haiko Schol <[email protected]>
Signed-off-by: Haiko Schol <[email protected]>
Signed-off-by: Haiko Schol <[email protected]>
Signed-off-by: Haiko Schol <[email protected]>
Signed-off-by: Haiko Schol <[email protected]>
Many derived classes need a Unix timestamp with the cutoff date as well as a method to split a list of advisories into batches. Signed-off-by: Haiko Schol <[email protected]>
Signed-off-by: Haiko Schol <[email protected]>
Signed-off-by: Haiko Schol <[email protected]>
88fbe9f
to
06c6314
Compare
Signed-off-by: Haiko Schol <[email protected]>
06c6314
to
42c99fb
Compare
Signed-off-by: Haiko Schol <[email protected]>
This is an implementation of #123.
The change adds a new model
Importer
that stores metadata as well as the name of a class (avulnerabilities.data_source.DataSource
) that fetches and transforms vulnerability data into a common, DB schema agnostic format. This format is encoded in the two data classesvulnerabilities.data_source.Package
andvulnerabilities.data_source.VulnerabilityInfo
.Running an
Importer
and storing the results in the database is done byImportRunner
.The unused and redundant model
PackageReference
has been removed.Another minor, but overdue change is renaming the
scraper
module toimporters
.@reviewers:
It might be helpful to start with looking at how the
import
management command uses the above described classes and then jump into the rest of the code from there.