Skip to content

FR: Support a WithMirror option #1200

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
mattmoor opened this issue Dec 7, 2021 · 29 comments · May be fixed by #2010
Open

FR: Support a WithMirror option #1200

mattmoor opened this issue Dec 7, 2021 · 29 comments · May be fixed by #2010
Labels
good first issue Good for newcomers lifecycle/frozen question Further information is requested

Comments

@mattmoor
Copy link
Collaborator

mattmoor commented Dec 7, 2021

... and default it to mirror.gcr.io 😎

@mattmoor mattmoor added the question Further information is requested label Dec 7, 2021
@github-actions
Copy link

github-actions bot commented Mar 8, 2022

This issue is stale because it has been open for 90 days with no
activity. It will automatically close after 30 more days of
inactivity. Keep fresh with the 'lifecycle/frozen' label.

@cognifloyd
Copy link

I would like a mirror option as well.

@oursland
Copy link

skaffold uses go-containerregistry, and consequently ignores the many mirrors I have configured. This results in many unnecessary requests upstream, and frequently encountering the Docker Hub pull rate limiter.

I would like to see this issue re-opened and mirror support added to go-containerregistry and skaffold.

@imjasonh
Copy link
Collaborator

Seems simple enough. I agree we should do it.

Do we want WithMirror to enable reading the normal Docker config location for the mirror(s) to use? I somewhat disagree that callers should specify mirrors themselves in Go code, especially if there's already a place this may be configured system-wide.

I also disagree that it should default to mirror.gcr.io, but could be convinced.

@mattmoor
Copy link
Collaborator Author

I think it probably makes sense to have WithMirror supported in Go code, but to default to a behavior consistent with clients that read this from ~/.docker/config.json 🤔

I would not hardcode mirror.gcr.io

@imjasonh
Copy link
Collaborator

Docker configures the mirror at /etc/docker/daemon.json, and there can be multiple mirrors (I believe it checks in order, not concurrently and fastest-wins)

https://docs.docker.com/registry/recipes/mirror/#configure-the-docker-daemon

Having WithMirrors(mirrors ...string) and WithMirrorsFromConfig(bool) (default true?) sounds like a reasonable balance. If both are set, the union of the lists can be checked, with config'ed mirrors first. WDYT?

@mattmoor
Copy link
Collaborator Author

Do we have any precedent for reading system-wide config files like /etc/docker/daemon.json? Perhaps we do something for Docker's system cert logic today? I thought maybe there was an obscure way to specify mirrors in ~/.docker/config.json, but perhaps that's wrong.

I think I'm more mixed on /etc/docker/daemon.json vs. ~/.docker/config.json. Would it be fair to say that providing something at the Go level (e.g. WithMirror) is fine for someone to PR while we continue to discuss better defaults?

Mostly just trying to figure out what parts we find contentious vs. have agreement on.

@imjasonh
Copy link
Collaborator

Yeah, a Go-level WithMirrors sgtm while we hash out reading the system config. 👍

@imjasonh imjasonh added the good first issue Good for newcomers label Jul 26, 2022
@jonjohnsonjr
Copy link
Collaborator

Perhaps we do something for Docker's system cert logic today?

Nope: #211

Would it be fair to say that providing something at the Go level (e.g. WithMirror) is fine for someone to PR while we continue to discuss better defaults?

SGTM. There are config files for docker, podman (et al), and containerd that all have slightly different behavior here. We may want to provide something to make it simple to use any of these, kind of like authn.DefaultKeychain.

It would be really great if we could have the flexibility or configuration options to allow an "untrusted" mirror to be used only for content-addressable objects and always consult upstream for tags. I think most people want the default of hitting the mirror for tags, too, but I'd love to set a precedent of the option if we can.

Do you expect the go-level WithMirror[s] to just be a url? Or like an interface so you can have non-http mirrors and support more configuration?

@imjasonh
Copy link
Collaborator

SGTM. There are config files for docker, podman (et al), and containerd that all have slightly different behavior here. We may want to provide something to make it simple to use any of these, kind of like authn.DefaultKeychain.

Yeah, I love this.

It would be really great if we could have the flexibility or configuration options to allow an "untrusted" mirror to be used only for content-addressable objects and always consult upstream for tags. I think most people want the default of hitting the mirror for tags, too, but I'd love to set a precedent of the option if we can.

Do you expect the go-level WithMirror[s] to just be a url? Or like an interface so you can have non-http mirrors and support more configuration?

I think these questions are related, which might have been your point.

We could have WithMirrors take an interface like:

type Mirror interface {
  GetManifest(string) (something, error)
  GetManifestByDigest(v1.Hash) (something, error)
  GetBlob(v1.Hash) (io.ReadCloser, error)
}

...then we could have a method to construct a Mirror from a URL string, like remote.NewMirror("mirror.example.com"), that did the "normal" thing of going to the same URL for all three.

If you wanted to implement something that only mirrored blobs, you could have GetManifest* always return remote.ErrMirrorNotFound. If you wanted something that only mirrored contents by digest, only implement GetManifestByDigest and GetBlob.

@ingwonsong
Copy link
Contributor

ingwonsong commented Aug 5, 2022

Do you think that it makes sense to have a kind of mapping between original registries and mirror registries?

e.g.)
1 .example.com -> mirror.example.com
2. docker.io -> mirror.gcr.io
3. * -> blackhole.foo.bar ( '*' is not the scope of this issue, but let's assume that we have a mirror site supporting multiple upstream registries)

@oursland
Copy link

oursland commented Aug 5, 2022

For informational purposes, I'll describe my team's configuration.

We're using skaffold for local development with k3d. All of our custom containers derive from DockerHub. The Kubernetes environment, however, is dependent upon many images delivered from quay and GCR.

Consequently, in my configuration skaffold (and therefore go-containerregistry) needs to be able to hit the local DockerHub registry mirror in order to minimize network bandwidth and quota utilization. It is not necessary for skaffold to be configured for other registries in my application.

All of the registries are configured in the k3d-config.yaml, which does honor the registry mirrors.

@ingwonsong k3d's configuration supports multiple-registry mapping, which I have found quite useful, particularly when performing development in absence of a reliable Internet connection.

Docker Registry Instances

    docker run -d -p 6000:5000 \
        -e REGISTRY_PROXY_REMOTEURL=https://registry-1.docker.io \
        --restart always \
        --name registry-docker.io registry
    docker run -d -p 6001:5000 \
        -e REGISTRY_PROXY_REMOTEURL=https://quay.io \
        --restart always \
        --name registry-quay.io registry
    docker run -d -p 6002:5000 \
        -e REGISTRY_PROXY_REMOTEURL=https://gcr.io \
        --restart always \
        --name registry-gcr.io registry
    docker run -d -p 6003:5000 \
        -e REGISTRY_PROXY_REMOTEURL=https://k8s.gcr.io \
        --restart always \
        --name registry-k8s.gcr.io registry

Docker Configuration

    {
        "registry-mirrors": [
            "http://localhost:6000"
        ],
        "features": {
            "buildkit": true
        },
        "insecure-registries": [
            "localhost:6000"
        ],
        "experimental": false,
        "builder": {
            "gc": {
                "enabled": true,
                "defaultKeepStorage": "20GB"
            }
        }
    }

k3d-config.yaml Configuration

kind: Simple
apiVersion: k3d.io/v1alpha4
metadata:
  name: k3s-default
servers: 1
agents: 3
image: rancher/k3s:v1.24.3-k3s1
ports:
- port: 8080:80
  nodeFilters:
  - loadbalancer
options:
  k3d:
    wait: true
    timeout: 1m0s
    disableLoadbalancer: false
    disableImageVolume: false
    disableRollback: false
  k3s:
    extraArgs:
    - arg: --tls-san=172.17.0.1
      nodeFilters:
      - server:*
    - arg: --disable=traefik
      nodeFilters:
      - server:*
    - arg: --kubelet-arg=eviction-hard=imagefs.available<1%,nodefs.available<1%
      nodeFilters:
      - server:*
      - agent:*
    - arg: --kubelet-arg=eviction-minimum-reclaim=imagefs.available=1%,nodefs.available=1%
      nodeFilters:
      - server:*
      - agent:*
  kubeconfig:
    updateDefaultKubeconfig: true
    switchCurrentContext: true
  runtime: {}
registries:
  create:
    name: registry.localhost
    hostPort: "8000"
  config: |
    mirrors:
      docker.io:
        endpoint:
        - "http://host.k3d.internal:6000"
      quay.io:
        endpoint:
        - "http://host.k3d.internal:6001"
      gcr.io:
        endpoint:
        - "http://host.k3d.internal:6002"
      k8s.gcr.io:
        endpoint:
        - "http://host.k3d.internal:6003"

@ingwonsong
Copy link
Contributor

ingwonsong commented Aug 5, 2022

What do you think about fallback to the original registry when mirror registry is failing?

More generally, if we have a list of mirrors: mirror1, mirror2, ...,mirrorN,

Try mirror1, mirror2, .., mirrorN, originalRegistry in order until it succeeds.

WDYT?

@ingwonsong
Copy link
Contributor

ingwonsong commented Aug 5, 2022

On the other hand, I suspect that such fallback may cause unexpected behaviors.
How does Dockerd deal with this?

@imjasonh @jonjohnsonjr

@imjasonh
Copy link
Collaborator

imjasonh commented Aug 6, 2022

What do you think about fallback to the original registry when mirror registry is failing?

More generally, if we have a list of mirrors: mirror1, mirror2, ...,mirrorN,

Try mirror1, mirror2, .., mirrorN, originalRegistry in order until it succeeds.

WDYT?

This is basically a requirement. Mirrors are allowed to return 404 etc if an image isn't mirrored, which signals to the client that it should go to the original registry.

You can simulate this by setting the mirror to a fake server or a non-mirror registry, where the requested image won't exist. You'll just always try the mirror and fallback to the original registry when it fails.

@iusetabs
Copy link

iusetabs commented Aug 10, 2022

Hey guys, coming from the Skaffold issue thread. I see this issue has been noted as "good first issue". I have been digging into this issue as we are getting rate limited because of the requests going to docker rather than the mirrors I specify in my daemon.

This is the first time I really dive deep into Go code, but I ended up at the pkg/name/registry.go:NewRegistry function which is maybe where changes can be made. I would be interested in trying to tackle this issue, but I've no experience with this repository - any advice? 😉

@iusetabs
Copy link

As a quick PoC, I made some small modifications to the NewRegistry function and hard coded a registry mirror I'm using. I also needed to change pkg/name/repository.go:hasImplicitNamespace so it added "library" in the url to fetch the image.

I then pointed skaffold at my local version of this repo and I was able to pull images using the mirror, whilst using the release version of skaffold produced rate limits 😉

@imjasonh
Copy link
Collaborator

Hey thanks for digging into this!

I had been imagining this as an option on methods in pkg/v1/remote, e.g.:

img, err := remote.Image(ref,
  remote.WithMirrors(
    remote.NewMirror("mirror.example.com"),
    remote.NewMirror("mirror2.example.com")))

(based lightly on the design outlined in #1200 (comment))

The reason being that lots of other things in go-containerregistry deal with name.References and their embedded name.Registrys, but don't do any remote operations at all (like tarball, daemon, layout), and so don't need to have any concept of a "mirror".

If you want to arrange a time to join a video chat and talk through this, maybe write some code together, that would be great. I'm especially free Fridays, including tomorrow.

@iusetabs
Copy link

Sounds good! 😄 Let's discuss a time for a call about this, I'll drop you a msg on your Twitter 😄 Thx again for your help!

@ingwonsong
Copy link
Contributor

was there any progress regarding this issue?

@imjasonh
Copy link
Collaborator

Nope, not really.

I think we'd still like this, and if anybody's interested in discussing it more I'd be happy to help.

@iosifnicolae2
Copy link

we also need this as it's blocking GoogleContainerTools/skaffold#7368

@phillebaba
Copy link

So my suggestion to avoid reinventing the wheel would be to copy how containerd has implemented mirror configuration. It seems to cover the majority of use cases without being overly verbose.
https://github.com/containerd/containerd/blob/main/docs/cri/config.md#registry-configuration

Has there been any consensus on how this would be implemented. Otherwise I would be interested in implementing support for a containerd like mirror configuration. The reason I think that a file based configuration approach would be good is because it would allow other projects to support it without too much configuration.

@iusetabs
Copy link

iusetabs commented Feb 23, 2023

What would be better? Supporting multiple mirrors, or just one, which gets created from an environmental variable or a file as @phillebaba suggests with the containerd approach?

I think there is some sort of design mentioned in the comment above, but would be interested in joining you guys to discuss how one would approach this as I would like to make a change to skaffold so it can make use of whatever change is made here.

@phillebaba
Copy link

My personal opinion regarding this is that if you enable the simple use case you are immediately going to get request for the advanced use case. Containerd has gone through a couple of iterations for this design, which itself is partially based on what docker has done.

A benefit is the fact that with the Containerd configuration it is possible to configure headers, CA certs, and multiple mirros in a specific order.

@sherine-k
Copy link

Hi folks,
I've started working on a possible implementation.
Is this still relevant?
Anybody available for a short chat?
I could share the small bricks I have so far, and if it fits, I can work towards opening a PR.
Is there any slack/discord/mailing-list/other to get in touch?

@oursland
Copy link

This is definitely a needed feature!

It is way too easy to hit a pull limit using tools like skaffold that use the go-containerregistry because it has no concept of local mirrors that provide cached objects.

I think previous discussion got mired down into multiple mirrors, mirror ordering, etc. At least in my use case, a single local mirror that caches upstream objects is desired. If that could be possible, it would go a long way to cutting down on network and pull request budgets.

@sherine-k sherine-k linked a pull request Sep 17, 2024 that will close this issue
@tylerwilliams
Copy link

Another vote for this feature!

@alexwilson1
Copy link

Given the news that Docker Hub is enforcing stricter rate limits (10 pulls/hour/IP for unauthenticated users starting March 1), I wanted to check if this issue has been reprioritized.

Tools like Skaffold continue to hit these limits because go-containerregistry lacks proper support for registry mirrors. The inability to respect daemon-configured mirrors is a major blocker for many teams relying on cached artifacts to avoid unnecessary Docker Hub throttling.

I saw that PR #2010 is open as a POC for WithMirror support. Could we get an update on its status? Is there a roadmap for integrating this into a future release?

Thanks for all the work on this!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue Good for newcomers lifecycle/frozen question Further information is requested
Projects
None yet
Development

Successfully merging a pull request may close this issue.