Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unregister views to avoid slow oom issue during meter cleanup #2005

Merged
merged 13 commits into from
Feb 17, 2021

Conversation

skonto
Copy link
Contributor

@skonto skonto commented Feb 1, 2021

@knative-prow-robot knative-prow-robot added the kind/bug Categorizes issue or PR as related to a bug. label Feb 1, 2021
@google-cla google-cla bot added the cla: yes Indicates the PR's author has signed the CLA. label Feb 1, 2021
@knative-prow-robot knative-prow-robot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Feb 1, 2021
@knative-prow-robot knative-prow-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Feb 1, 2021
@knative-prow-robot
Copy link
Contributor

Hi @skonto. Thanks for your PR.

I'm waiting for a knative member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@skonto skonto changed the title Unregister views to avoid slow oom issue during meter cleanup [wip]Unregister views to avoid slow oom issue during meter cleanup Feb 1, 2021
@knative-prow-robot knative-prow-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 1, 2021
@knative-prow-robot
Copy link
Contributor

@skonto: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@skonto skonto changed the title [wip]Unregister views to avoid slow oom issue during meter cleanup Unregister views to avoid slow oom issue during meter cleanup Feb 1, 2021
@knative-prow-robot knative-prow-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 1, 2021
@skonto
Copy link
Contributor Author

skonto commented Feb 1, 2021

Unit tests pass locally on my side, need to retest.

@markusthoemmes
Copy link
Contributor

Do we have a reproducer that allows us to more confidently test that we are no longer leaking memory? Can we potentially write a unit(ish) test that makes sure we don't leak memory?

@codecov
Copy link

codecov bot commented Feb 1, 2021

Codecov Report

Merging #2005 (0758c06) into master (ca02ef7) will increase coverage by 0.05%.
The diff coverage is 73.68%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2005      +/-   ##
==========================================
+ Coverage   67.15%   67.20%   +0.05%     
==========================================
  Files         214      214              
  Lines        9091     9102      +11     
==========================================
+ Hits         6105     6117      +12     
- Misses       2711     2712       +1     
+ Partials      275      273       -2     
Impacted Files Coverage Δ
network/transports.go 86.04% <42.85%> (-6.27%) ⬇️
network/h2c.go 25.00% <66.66%> (+5.00%) ⬆️
controller/controller.go 86.41% <100.00%> (ø)
metrics/resource_view.go 88.63% <100.00%> (+2.08%) ⬆️
webhook/psbinding/psbinding.go 84.61% <100.00%> (ø)
webhook/resourcesemantics/defaulting/defaulting.go 80.86% <100.00%> (ø)
...k/resourcesemantics/validation/reconcile_config.go 89.23% <100.00%> (ø)
test/gcs/mock/mock.go 91.39% <0.00%> (+1.07%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ca02ef7...5000a74. Read the comment docs.

@knative-prow-robot knative-prow-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Feb 1, 2021
@skonto
Copy link
Contributor Author

skonto commented Feb 1, 2021

@markusthoemmes I added a unit test hope that helps. Not sure if there is way to check this better eg. compare actual mem allocations.

@skonto
Copy link
Contributor Author

skonto commented Feb 1, 2021

/retest

@knative-prow-robot
Copy link
Contributor

@skonto: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@skonto
Copy link
Contributor Author

skonto commented Feb 1, 2021

@markusthoemmes could you add /ok-to-test?

@markusthoemmes
Copy link
Contributor

/ok-to-test

@knative-prow-robot knative-prow-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Feb 1, 2021
@skonto
Copy link
Contributor Author

skonto commented Feb 1, 2021

/retest

@skonto
Copy link
Contributor Author

skonto commented Feb 1, 2021

Not sure what is wrong here with the unit tests but they pass locally.

@skonto
Copy link
Contributor Author

skonto commented Feb 9, 2021

Now I see:

--- FAIL: TestDefaultObserved (0.00s)
    informed_watcher_test.go:392: foo1.count = 1, want 2

@skonto
Copy link
Contributor Author

skonto commented Feb 9, 2021

/retest

@skonto
Copy link
Contributor Author

skonto commented Feb 9, 2021

--- FAIL: TestMetricsExport (6.07s)
    --- FAIL: TestMetricsExport/OpenCensus (5.03s)
        e2e_test.go:235: Created exporter at localhost:12345
        logger.go:130: 2021-02-09T19:31:22.391Z	INFO	metrics/exporter.go:161	Flushing the existing exporter before setting up the new exporter.
        logger.go:130: 2021-02-09T19:31:22.398Z	INFO	metrics/opencensus_exporter.go:56	Created OpenCensus exporter with config:	{"config": {}}
        logger.go:130: 2021-02-09T19:31:22.398Z	INFO	metrics/exporter.go:174	Successfully updated the metrics exporter; old config: &{knative.dev/serving testComponent prometheus 1000000000 <nil> <nil>  false 19090 0.0.0.0 false   {   false}}; new config &{knative.dev/serving testComponent opencensus 1000000000 <nil> <nil> localhost:12345 false 0  false   {   false}}
        e2e_test.go:273: Timeout reading input
        e2e_test.go:279: Unexpected OpenCensus exports (-want +got):
              []metrics.metricExtract(Inverse(Sort, []string{
              	"knative.dev/serving/testComponent/global_export_counts<>:2",
              	"knative.dev/serving/testComponent/resource_global_export_count<>:2",
              	`knative.dev/serving/testComponent/testing/value<project="p1",rev`...,
            - 	`knative.dev/serving/testComponent/testing/value<project="p1",revision="r2">:1`,
              }))
{"level":"info","ts":1612899093.698988,"logger":"fallback","caller":"metrics/prometheus_exporter.go:51","msg":"Created Prometheus exporter with config: &{test test prometheus 0 <nil> <nil>  false 0  false   {   false}}. Start the server for Prometheus exporter."}
FAIL 

@skonto
Copy link
Contributor Author

skonto commented Feb 10, 2021

@julz @vagababov @markusthoemmes should this be merged now? I think the failing test should be handled on its own separately.

Copy link
Contributor

@vagababov vagababov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

/assign @evankanderson
For Evan to stamp approval.

@knative-prow-robot knative-prow-robot added the lgtm Indicates that a PR is ready to be merged. label Feb 10, 2021
@skonto
Copy link
Contributor Author

skonto commented Feb 11, 2021

@evankanderson gentle ping.

@skonto
Copy link
Contributor Author

skonto commented Feb 15, 2021

@evankanderson @vagababov should we merge? wdyth?

@evankanderson
Copy link
Member

/approve
/lgtm

@knative-prow-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: evankanderson, skonto

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@knative-prow-robot knative-prow-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 16, 2021
@skonto
Copy link
Contributor Author

skonto commented Feb 17, 2021

/retest

@knative-prow-robot knative-prow-robot merged commit 6c2f94a into knative:master Feb 17, 2021
skonto added a commit to skonto/pkg that referenced this pull request Feb 17, 2021
…e#2005)

* unregister views

* add a test

* fix string issue

* fix

* fixes

* fixe races in tests

* fix exporter issue

* stop meter in cleanup

* fixes

* typo

* revert timeout change

* fixes

* change msg
knative-prow-robot pushed a commit that referenced this pull request Feb 17, 2021
…#2027)

* unregister views

* add a test

* fix string issue

* fix

* fixes

* fixe races in tests

* fix exporter issue

* stop meter in cleanup

* fixes

* typo

* revert timeout change

* fixes

* change msg
@skonto skonto mentioned this pull request Apr 8, 2021
skonto added a commit to skonto/pkg that referenced this pull request Apr 23, 2021
…e#2005)

* unregister views

* add a test

* fix string issue

* fix

* fixes

* fixe races in tests

* fix exporter issue

* stop meter in cleanup

* fixes

* typo

* revert timeout change

* fixes

* change msg
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cla: yes Indicates the PR's author has signed the CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm Indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants