adds dependency analysis #18543

juanvallejo
Contributor

@juanvallejo juanvallejo commented Feb 9, 2018

Depends on #18466
Depends on #18606

Adds dependency graph analysis.
Outputs "yours", "mine", "ours" dependencies.

Usage:

$ ./depcheck analyze --root=github.com/openshift/origin --entry=cmd/... --entry=pkg/... --entry=tools/... --entry=test/... --openshift --dep=github.com/openshift/origin/vendor/k8s.io/kubernetes

Output of the command above

cc @deads2k

@openshift-ci-robot openshift-ci-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Feb 9, 2018
@deads2k
Contributor

deads2k commented Feb 9, 2018

graph still looks like you're cutting at github/org not github/org/repo
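
To make the org-versus-repo distinction concrete, here is a minimal sketch of collapsing an import path to its repository root. `repoRoot` is a hypothetical helper, not code from this PR, and the three-segment rule is an assumption that only suits github.com-style hosts; other hosts are returned unchanged.

```go
package main

import (
	"fmt"
	"strings"
)

// repoRoot is a hypothetical helper: it collapses a github.com import path
// to host/org/repo. Paths on other hosts, or with three or fewer segments,
// are returned unchanged (after the vendor prefix is stripped).
func repoRoot(importPath string) string {
	// A vendored package keeps its real identity after the last "/vendor/".
	if i := strings.LastIndex(importPath, "/vendor/"); i >= 0 {
		importPath = importPath[i+len("/vendor/"):]
	}
	parts := strings.Split(importPath, "/")
	if parts[0] != "github.com" || len(parts) <= 3 {
		return importPath
	}
	return strings.Join(parts[:3], "/")
}

func main() {
	fmt.Println(repoRoot("github.com/openshift/origin/pkg/cmd"))
	fmt.Println(repoRoot("github.com/openshift/origin/vendor/github.com/gonum/graph/simple"))
}
```

Cutting at two segments instead would merge every repository of an organization into one node, which is the symptom described above.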

@juanvallejo juanvallejo force-pushed the jvallejo/analysis-deps-yours branch from 2437936 to 400edcd Compare February 10, 2018 00:26
@juanvallejo
Contributor Author

@deads2k updated graph to reflect latest changes in #18466
Added "mine" and "ours" packages

@deads2k
Contributor

deads2k commented Feb 12, 2018

remove all github.com/openshift/origin/... packages from the yours/mine/ours output

@@ -91,7 +91,10 @@ func BuildGraph(packages *PackageList, excludes []string) (*depgraph.MutableDire

to, exists := g.NodeByName(dependency)
if !exists {
return nil, fmt.Errorf("expected child node for dependency %q was not found in graph", dependency)
// if a package imports a dependency that we did not visit
Contributor

this was supposed to end up in a separate pull. did I miss it?

Contributor Author

Opened a pull here to begin addressing this #18606

Contributor

Opened a pull here to begin addressing this #18606

whoa, that pull is big. I was expecting it to be 5 lines long.

@@ -108,6 +111,55 @@ func BuildGraph(packages *PackageList, excludes []string) (*depgraph.MutableDire
return g, nil
}

func copyGraph(g *depgraph.MutableDirectedGraph) (*depgraph.MutableDirectedGraph, error) {
Contributor

weird

return nil
}

func printYours(g *depgraph.MutableDirectedGraph, roots, nodes []graph.Node, you *depgraph.Node) ([]graph.Node, error) {
Contributor

You have a print method that is mutating its input?? Don't do that.

from := g.From(n)

// found an existing orphaned node to exclude
if len(to) == 0 && len(from) == 0 {
Contributor

When does this happen? I would expect these to be snipped before you call this method.

g.RemoveNode(you)

// find nodes not reachable from our given roots
yours := findOrphans(g, roots, nodes)
Contributor

Given a clean starting graph, the idea of "roots" doesn't exist.

for _, n := range nodes {
	isOrphan := true
	for _, root := range roots {
		if hasPathFromTo(g, root, n) {
Contributor

This doesn't look right. If you started your graph without orphans, then any orphans you find later are the result of removing the "from" side of an inbound link.

return nil
}

func printYours(g *depgraph.MutableDirectedGraph, roots, nodes []graph.Node, you *depgraph.Node) ([]graph.Node, error) {
Contributor

The logic in here looks wrong. I expected you to produce a new graph by filtering out a selected set of nodes. Then, once you've built your new graph, you simply scan all nodes for ones that have no inbound links. You track those, then build a new graph using the nodes without any inbound links as your input. Repeat until you have no changes. Then union all the nodes you removed.

Contributor

Notice that at no point do we discuss "roots". I'm still trying to figure out why that concept is needed.
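
The procedure described in this thread can be sketched as follows. This is a minimal illustration under stated assumptions, not the PR's implementation: the graph is a plain `map[string][]string` (package to imports) rather than `depgraph.MutableDirectedGraph`, the name `findExclusiveDependencies` is borrowed from a rename suggested elsewhere in this review, and the repeat-until-stable loop is written as an equivalent worklist that only considers nodes whose inbound count drops to zero because of a removal, so no notion of "roots" is needed.

```go
package main

import (
	"fmt"
	"sort"
)

// findExclusiveDependencies returns the nodes that are present only because
// the target nodes are present. Hypothetical sketch on a map-based graph.
func findExclusiveDependencies(imports map[string][]string, targets []string) []string {
	// Count inbound edges for every node that something imports.
	inbound := map[string]int{}
	for _, tos := range imports {
		for _, to := range tos {
			inbound[to]++
		}
	}

	// Seed the worklist with the (deduplicated) nodes being filtered out.
	removed := map[string]bool{}
	var queue []string
	for _, t := range targets {
		if !removed[t] {
			removed[t] = true
			queue = append(queue, t)
		}
	}

	var exclusive []string
	for len(queue) > 0 {
		n := queue[0]
		queue = queue[1:]
		// Removing n drops one inbound edge from each of its imports.
		for _, to := range imports[n] {
			if removed[to] {
				continue
			}
			inbound[to]--
			if inbound[to] == 0 {
				// Nothing that remains imports this node.
				removed[to] = true
				exclusive = append(exclusive, to)
				queue = append(queue, to)
			}
		}
	}
	sort.Strings(exclusive)
	return exclusive
}

func main() {
	imports := map[string][]string{
		"origin": {"kube", "glog"},
		"kube":   {"glog", "etcd"},
		"etcd":   {"grpc"},
	}
	fmt.Println(findExclusiveDependencies(imports, []string{"kube"})) // prints [etcd grpc]
}
```

In the example, `glog` survives because `origin` still imports it, while `etcd` and then `grpc` lose their only importer.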

@juanvallejo juanvallejo force-pushed the jvallejo/analysis-deps-yours branch 2 times, most recently from c6fadf7 to 45b85cd Compare February 14, 2018 08:09
@@ -55,7 +55,64 @@ func (g *MutableDirectedGraph) AddNode(n *Node) error {
return nil
}

func (g *MutableDirectedGraph) RemoveNode(n *Node) {
Contributor

godoc whether this leaves dangling edges
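
As an illustration of the kind of contract being asked for, here is a minimal sketch of a `RemoveNode` whose doc comment states what happens to edges. The `Graph` type below is hypothetical and map-based; it is not the PR's `MutableDirectedGraph`, and it makes no claim about how the underlying graph library behaves.

```go
package main

import "fmt"

// Graph is a hypothetical, minimal directed graph keyed by package name.
type Graph struct {
	out map[string]map[string]bool // node -> nodes it points to
	in  map[string]map[string]bool // node -> nodes pointing at it
}

func NewGraph() *Graph {
	return &Graph{out: map[string]map[string]bool{}, in: map[string]map[string]bool{}}
}

// AddEdge adds both endpoints if needed and records the edge from -> to.
func (g *Graph) AddEdge(from, to string) {
	for _, n := range []string{from, to} {
		if g.out[n] == nil {
			g.out[n] = map[string]bool{}
			g.in[n] = map[string]bool{}
		}
	}
	g.out[from][to] = true
	g.in[to][from] = true
}

// RemoveNode deletes n together with every edge that starts or ends at n,
// so no dangling edges remain. Removing an unknown node is a no-op.
func (g *Graph) RemoveNode(n string) {
	for to := range g.out[n] {
		delete(g.in[to], n)
	}
	for from := range g.in[n] {
		delete(g.out[from], n)
	}
	delete(g.out, n)
	delete(g.in, n)
}

// InDegree returns the number of edges ending at n.
func (g *Graph) InDegree(n string) int { return len(g.in[n]) }

func main() {
	g := NewGraph()
	g.AddEdge("origin", "kube")
	g.AddEdge("kube", "etcd")
	g.RemoveNode("kube")
	fmt.Println(g.InDegree("etcd"), len(g.out["origin"])) // prints: 0 0
}
```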

func (g *MutableDirectedGraph) NodeByName(name string) (graph.Node, bool) {
	n, exists := g.nodesByName[name]
	return n, exists && g.DirectedGraph.Has(n)
}

// remove nodes with no inbound edges
func (g *MutableDirectedGraph) PruneOrphans() {
Contributor

Return a list of the nodes you removed.

// - "Yours": a list of every node in the set unique to the dependency tree of a given target node.
// - "Mine": a list of every node in the set unique to the dependency tree of the root nodes (set non-overlapping with given target node)
// - "Ours": a list of every node in the overlapping set between the dependency tree of the root nodes and a given target node
func (o *TraceImportsOpts) analyzeGraph(g *depgraph.MutableDirectedGraph) error {
Contributor

This is really hard to read. Separate your library methods (things operating on input data), from your command wiring (things combining a set of library methods). I should have a small method for identifying "which repos are only present because this set of nodes is present".

// This operation is done recursively until we no longer cause
// any nodes to become orphaned after pruning the removed set.
// The resulting set of total nodes recursively removed is returned.
func calculateOrphans(g *depgraph.MutableDirectedGraph, targetNodes []*depgraph.Node) []*depgraph.Node {
Contributor

I think this is the graph method I want. Peer to the other graph methods I'd think. Unit tests located there to check it.

continue
}

orphans = append(orphans, n.(*depgraph.Node))
Contributor

Isn't this exactly what your PruneOrphans method does? I didn't look, but for that method to work it would have to be recursive. It is recursive, right? If not, write the test that will cause that method to break and then fix it.

// This operation is done recursively until we no longer cause
// any nodes to become orphaned after pruning the removed set.
// The resulting set of total nodes recursively removed is returned.
func calculateOrphans(g *depgraph.MutableDirectedGraph, targetNodes []*depgraph.Node) []*depgraph.Node {
Contributor

Bad name. This is FindExclusiveDependencies(g, targetNodes)

}

type TraceImportsFlags struct {
Roots []string
Contributor

this should be dead, right?

excludes []string
filters []string

Analyze string
Contributor

bad name, what is this?

`

type TraceImportsOpts struct {
Roots []string
Contributor

dead I hope

fmt.Printf("Analyzing a total of %v packages\n", len(g.Nodes()))
fmt.Println()

yours := calculateOrphans(g, []*depgraph.Node{you})
Contributor

Why are you messing with a single repo/node? That doesn't make sense for what this needs to do.

}

cmd.Flags().StringSliceVar(&flags.Roots, "root", flags.Roots, "set of entrypoints for dependency trees used to generate a dependency graph.")
cmd.Flags().StringVarP(&flags.Exclude, "exclude", "e", "", "json file containing a list of import-paths of packages to recursively exclude when traversing the set of given entrypoints specified through --root.")
Contributor

description sounds bad. I'd want to choose recursive exclusion or not based on ... in golang, right?
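
One way to read this suggestion: an exclude entry applies recursively only when it ends in `/...`, and matches a single package otherwise. The sketch below is a simplified assumption of that behavior. `matchesPattern` is a hypothetical helper, and it only handles a trailing `/...`, whereas Go's own package patterns allow the wildcard elsewhere in the path.

```go
package main

import (
	"fmt"
	"strings"
)

// matchesPattern reports whether pkg is selected by pattern. A pattern
// ending in "/..." selects the prefix package and everything beneath it;
// any other pattern selects only the exact import path.
func matchesPattern(pattern, pkg string) bool {
	if strings.HasSuffix(pattern, "/...") {
		prefix := strings.TrimSuffix(pattern, "/...")
		return pkg == prefix || strings.HasPrefix(pkg, prefix+"/")
	}
	return pattern == pkg
}

func main() {
	fmt.Println(matchesPattern("github.com/openshift/origin/pkg/...", "github.com/openshift/origin/pkg/cmd"))
	fmt.Println(matchesPattern("github.com/openshift/origin/pkg", "github.com/openshift/origin/pkg/cmd"))
}
```

The `prefix+"/"` check keeps a sibling such as `.../pkgutil` from matching `.../pkg/...`.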

cmd.Flags().StringVarP(&flags.Exclude, "exclude", "e", "", "json file containing a list of import-paths of packages to recursively exclude when traversing the set of given entrypoints specified through --root.")
cmd.Flags().StringVarP(&flags.OutputFormat, "output", "o", "", "output generated dependency graph in specified format. One of: dot.")
cmd.Flags().StringVarP(&flags.Collapse, "collapse", "c", "", "json file containing a list of import-paths of packages to collapse sub-packages into.")
cmd.Flags().StringVarP(&flags.Analyze, "analyze", "a", "", "output a summary report on the dependency set of a given repository. Mutually exclusive with --output")
Contributor

if it's mutually exclusive, why do we have this lever? Is this just a different output mode or is it (more likely) a different command?

},
}

cmd.Flags().StringSliceVar(&flags.Roots, "root", flags.Roots, "set of entrypoints for dependency trees used to generate a dependency graph.")
Contributor

It really seems like you ought to have a set of common flags for building the graph, then a separate set of flags for analyzing it, and that you'd reuse the graph-building flags across multiple commands.
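
A minimal sketch of that split, using the standard library's `flag` package as a stand-in for the `cmd.Flags()` wiring shown in the diff. The type and method names (`GraphFlags`, `AnalyzeFlags`, `AddFlags`, `parseAnalyze`) and the help strings are assumptions for illustration; the flag names `--root`, `--exclude`, and `--dep` are taken from this PR.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// stringSlice is a repeatable string flag.
type stringSlice []string

func (s *stringSlice) String() string {
	if s == nil {
		return ""
	}
	return strings.Join(*s, ",")
}

func (s *stringSlice) Set(v string) error {
	*s = append(*s, v)
	return nil
}

// GraphFlags holds the flags every graph-building command shares.
type GraphFlags struct {
	Roots   stringSlice
	Exclude string
}

// AddFlags registers the shared graph-building flags on fs.
func (f *GraphFlags) AddFlags(fs *flag.FlagSet) {
	fs.Var(&f.Roots, "root", "entrypoint used to build the dependency graph (repeatable)")
	fs.StringVar(&f.Exclude, "exclude", "", "json file listing import paths to exclude")
}

// AnalyzeFlags adds analysis-only flags on top of the shared ones.
type AnalyzeFlags struct {
	Graph GraphFlags
	Deps  stringSlice
}

// AddFlags registers the shared flags first, then the analysis flags.
func (f *AnalyzeFlags) AddFlags(fs *flag.FlagSet) {
	f.Graph.AddFlags(fs)
	fs.Var(&f.Deps, "dep", "dependency to analyze (repeatable)")
}

// parseAnalyze wires the flags for a hypothetical "analyze" command.
func parseAnalyze(args []string) (*AnalyzeFlags, error) {
	fs := flag.NewFlagSet("analyze", flag.ContinueOnError)
	f := &AnalyzeFlags{}
	f.AddFlags(fs)
	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	return f, nil
}

func main() {
	f, err := parseAnalyze([]string{"--root=github.com/openshift/origin", "--dep=k8s.io/kubernetes"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(f.Graph.Roots, f.Deps)
}
```

A second command would call `GraphFlags.AddFlags` on its own flag set and add nothing else, so the graph-building options stay identical across commands.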

@juanvallejo
Contributor Author

juanvallejo commented Mar 6, 2018

@deads2k Went ahead and rebased with approved changes from #18466.

Added the analysis bits on top.
Switched to gonum's Dijkstra implementation instead of the home-grown breadth-first path traversal :)
Command output is exactly the same as the updated one posted in #18543 (comment)

@juanvallejo
Contributor Author

/retest

@juanvallejo
Contributor Author

/test extended_networking_minimal

@deads2k
Contributor

deads2k commented Mar 8, 2018

prereq merged. rebase for smaller diff

@juanvallejo juanvallejo force-pushed the jvallejo/analysis-deps-yours branch from 36b30fa to 039582b Compare March 8, 2018 15:30
@openshift-ci-robot openshift-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Mar 8, 2018
`

type AnalyzeOptions struct {
*graph.GraphOptions
Contributor

no anonymous inclusion
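
The review does not spell out the reasoning, so the following is only one common motivation, sketched under assumptions: an embedded pointer promotes its fields into the outer struct, which hides where a value comes from, while a named field keeps every access explicit. The field name `Graph`, the `embedded` type, and the `Validate` method are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

type GraphOptions struct {
	Roots []string
}

// embedded mirrors the shape under review: GraphOptions fields are promoted,
// so e.Roots compiles even though it goes through a possibly nil pointer.
type embedded struct {
	*GraphOptions
	Deps []string
}

// AnalyzeOptions uses a named field instead, so each access spells out
// where the value lives and a nil check has an obvious home.
type AnalyzeOptions struct {
	Graph *GraphOptions
	Deps  []string
}

// Validate checks the explicitly named graph options before use.
func (o *AnalyzeOptions) Validate() error {
	if o.Graph == nil {
		return errors.New("graph options are required")
	}
	if len(o.Graph.Roots) == 0 {
		return errors.New("at least one root is required")
	}
	return nil
}

func main() {
	e := embedded{GraphOptions: &GraphOptions{Roots: []string{"github.com/openshift/origin"}}}
	fmt.Println(e.Roots) // promoted field: reads as if Roots belonged to embedded

	o := AnalyzeOptions{Deps: []string{"k8s.io/kubernetes"}}
	fmt.Println(o.Validate())
}
```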

type AnalyzeOptions struct {
	*graph.GraphOptions

	// Packages to analyze against
Contributor

against?

}

type AnalyzeFlags struct {
*graph.GraphFlags
Contributor

no anonymous inclusion

// - "Yours": a list of every node in the set unique to the dependency tree of a given target node.
// - "Mine": a list of every node in the set unique to the dependency tree of the root nodes (set non-overlapping with given target node)
// - "Ours": a list of every node in the overlapping set between the dependency tree of the root nodes and a given target node
func (o *AnalyzeOptions) analyzeGraph(g *graph.MutableDirectedGraph) error {
Contributor

needs unit tests
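
For a sense of what such tests could pin down, here is a minimal sketch of the three sets as defined in the doc comment above, computed by plain reachability on a map-based graph. `reachable` and `classify` are hypothetical names, a single target node is assumed, and this is not the PR's `analyzeGraph`.

```go
package main

import (
	"fmt"
	"sort"
)

// reachable returns every node reachable from starts without passing
// through skip. A start node is included only if some edge leads back to it.
func reachable(imports map[string][]string, starts []string, skip string) map[string]bool {
	seen := map[string]bool{}
	stack := append([]string(nil), starts...)
	for len(stack) > 0 {
		n := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		for _, to := range imports[n] {
			if to == skip || seen[to] {
				continue
			}
			seen[to] = true
			stack = append(stack, to)
		}
	}
	return seen
}

// classify splits dependencies into yours (only under target), mine (only
// under roots once target is cut out), and ours (under both).
func classify(imports map[string][]string, roots []string, target string) (yours, mine, ours []string) {
	fromTarget := reachable(imports, []string{target}, "")
	fromRoots := reachable(imports, roots, target)
	for n := range fromTarget {
		if fromRoots[n] {
			ours = append(ours, n)
		} else {
			yours = append(yours, n)
		}
	}
	for n := range fromRoots {
		if !fromTarget[n] {
			mine = append(mine, n)
		}
	}
	sort.Strings(yours)
	sort.Strings(mine)
	sort.Strings(ours)
	return yours, mine, ours
}

func main() {
	imports := map[string][]string{
		"origin": {"kube", "glog", "cobra"},
		"kube":   {"glog", "etcd"},
	}
	yours, mine, ours := classify(imports, []string{"origin"}, "kube")
	fmt.Println(yours, mine, ours) // prints [etcd] [cobra] [glog]
}
```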

@deads2k
Contributor

deads2k commented Mar 8, 2018

Add some tests. Looks pretty good.

@juanvallejo juanvallejo force-pushed the jvallejo/analysis-deps-yours branch 2 times, most recently from d84f0e7 to 98802b7 Compare March 12, 2018 18:27
@juanvallejo
Contributor Author

@deads2k thanks, added tests

@juanvallejo juanvallejo force-pushed the jvallejo/analysis-deps-yours branch from 98802b7 to f389dcf Compare March 19, 2018 14:03
@juanvallejo juanvallejo force-pushed the jvallejo/analysis-deps-yours branch from f389dcf to 4792756 Compare March 19, 2018 14:15
@juanvallejo
Contributor Author

juanvallejo commented Mar 19, 2018

@deads2k addressed in-person feedback regarding calculateYours, calculateMineOurs methods. Updated tests as well. Commits squashed.

Approving per our conversation.

@juanvallejo juanvallejo added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 19, 2018
@juanvallejo
Contributor Author

/test unit
/test gcp

@juanvallejo
Contributor Author

/test gcp

@juanvallejo
Contributor Author

@deads2k tests passing, adding /lgtm
@soltysh fyi

/lgtm

@openshift-ci-robot

@juanvallejo: you cannot LGTM your own PR.

In response to this:

@deads2k tests passing, adding /lgtm
@soltysh fyi

/lgtm

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

Approval requirements bypassed by manually added approval.

This pull-request has been approved by: juanvallejo

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@juanvallejo juanvallejo added the lgtm Indicates that a PR is ready to be merged. label Mar 19, 2018
@openshift-merge-robot
Contributor

/test all [submit-queue is verifying that this PR is safe to merge]

@openshift-ci-robot

openshift-ci-robot commented Mar 20, 2018

@juanvallejo: The following test failed, say /retest to rerun them all:

Test name | Commit | Details | Rerun command
ci/openshift-jenkins/gcp | 4792756 | link | /test gcp

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.


@openshift-merge-robot
Contributor

Automatic merge from submit-queue (batch tested with PRs 18999, 18543).

@openshift-merge-robot openshift-merge-robot merged commit 08506eb into openshift:master Mar 20, 2018
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

5 participants