-
Notifications
You must be signed in to change notification settings - Fork 3.1k
Add cache command #3968
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add cache command #3968
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like the use of a flag for this. I think it'd be better off being a command of some sort. I'm not sure what exactly that command would look like though. Off the top of my head I can think of two different ways of going with this.
- We make a dedicated
pip cache
command that allows people to interact with the cache to do things like see the location, clear the cache, etc. - We make a dedicated
pip config
command that allows people to set and read configuration values, so that they can see what the cache dir, and other settings would be and also modify them via a command.
I kind of like the idea of |
Yea that could be real good. Wouldn't need to start off with everything either. I think |
+1 from me, too. That sounds like the right way to expose this type of functionality. |
9358a45
to
89778fa
Compare
Added a I see this reinvents part of #3734; any feelings about what to do there? |
There's a PEP8 issue, and I've just restarted the python 3.6 nightly run, as the failures there seem unrelated. You should probably also have a docs page for the new command. But otherwise it's looking OK to me. Regarding #3734 let's see what @xavfernandez thinks. I'm inclined to think that this PR is a nice quick win, and the extra functionality from #3734 could be added later if needed - but if #3734 is close to merging, it might be easier to get the whole lot at once. |
89778fa
to
a586fe4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking a lot better! Two small comments and one major one.
|
||
|
||
class CacheCommand(Command): | ||
"""\ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the trailing slash here for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better auto-dedenting behavior; IIRC if the first line of the block quote doesn't start with whitespace the rest of the docstring won't be dedented.
"ERROR: Please provide one of these subcommands: %s" % | ||
", ".join(self.actions)) | ||
return ERROR | ||
method = self.__getattribute__("action_%s" % args[0]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be better written as method = getattr(self, "action_%s" % args[0])
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
return method(options, args[1:]) | ||
|
||
def action_location(self, options, args): | ||
logger.info(os.path.join(options.cache_dir, "wheels")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably want this to be a bit more structured yea? I imagine something that outputs something more like:
Cache Locations:
HTTP = /Users/dstufft/Library/Caches/pip/http
Wheels = /Users/dstufft/Library/Caches/pip/wheels
Or something like that? Maybe It'd be better to switch pip cache locations
for pip cache info
and have somehting like:
HTTP Cache:
Location = /Users/dstufft/Library/Caches/pip/http
Size = 284M
Items = 1045
Wheel Cache
Location = /Users/dstufft/Library/Caches/pip/wheels
Size = 95M
Items = 391
I guess the key point here is that we currently have at least two caches, might add more in the future and arbitrarily picking one of them to return as the location feels like the wrong thing to do so we should at least support something that returns them all. Adding more information like Size and number of items might be interesting too to give people an idea of the scope of their caches.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm hoping this is the output from cache info
. :) I'd like $(pip cache location)
to be possible. Refraining from adding /wheels
is probably wise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are you hoping to do with $(pip cache location)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ask because I'm trying to think of a use case where you don't need to be able to explicitly ask for http versus wheel cache OR bake in an assumption about the names of those caches relative to the overall cache directory that isn't rm -rf $(pip cache location)
or du -h -d 0 $(pip cache location)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see your point; I think it follows that any verb that operates on a cache will have to understand which cache it wants to operate on, right?
Is it worth exposing anything about the http cache? It's less mysterious than the wheel cache; I haven't seen people having trouble with it. Should cache
just deal with the wheel cache?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the name I'd expect to be able to deal with any of pip's caches with pip cache
. Maybe we could add a --type [wheel|http]
with the default being wheel. That's explicit, and clearly documents that you get wheel by default, but maybe it's a bit over-engineered?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea I definitely think everything should encompass all of the caches we have. The reason I was asking what your use case was for that command is trying to determine if it makes sense to have pip cache location
return the top level cache directory, or if it made sense to have flags to return specific cache directories or if it made sense to just omit the command all together and roll whatever specific functionality you were looking at using that command for into it's own command (e.g. if you were looking to do rm -rf $(pip cache location)
then just add pip cache clear
).
However, I think the way to do it, assuming we can't just roll whatever functionality up, is maybe by default have it return the top level cache directory, and then add flags for --wheel
and --http
(or --type [wheel|http]
) that let you scope that down to a specific type instead of the entire thing. Could easily apply this pattern to other commands like pip cache info
(only show info for wheel or http) and pip cache clear
(only clear one type).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sense to make this an option on the cache
command or on each subcommand? Similarly, I guess, does a pip cache list
command make sense for the http cache?
I added an implementation of an Output looks like:
|
return result | ||
|
||
|
||
def human_size(n_bytes): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, duplicates pip.utils.format_size
; should use that instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
0247b30
to
7dd3422
Compare
A) B) I'd suggest that C) If given my first point, then I'd suggest that rather than implementing a whole command this could fit neatly under the existing informational command in pip, I make most of these comments, under a vest of trying to keep feature-creep out of pip's existing UI. Not that I want to see anything added ever, but always very solid reasons when done so. |
wrt Sort of in that vein, I think |
One thing to be careful of is anything that bakes in an assumption about how our caches are laid out internally. For example, right now the wheel cache you can kind of figure out what's cached just by doing an exhaustive search of all of the filenames. However that is not currently true for the HTTP cache where filenames are sha256 (IIRC?) hashes of the URLs they are caching. For these, it's not currently possible to see what items are currently cached, only to ask "is this particular URL cached?". While adding a dedicated location command doesn't on it's own tell people that hey you can depend on the structure of our cache, it feels like it's paving a cowpath towards people doing just that (outside of cases like Given that, I think that it probably makes sense to not implement I think that Does that all sound reasonable? |
On reflection, I also think we should strongly enforce YAGNI here. ISTM that we've ended up at a point where we're implementing a feature without exactly remembering why we wanted it. As @dstufft said, the internals of the cache are undocumented and private, so we should be very clear on intended use cases before implementing features, as otherwise there's a risk that we open up the internals in a way that means people end up relying on them. Sorry if this feels like backing out of what seemed like an almost-complete PR, but sometimes it takes getting to this sort of stage before the implications are really clear. For example, re-reading the the original use case, it was "This is useful if users want to know where the cache is so that they can delete stale or poisoned wheels." Is the best way to clear out invalid wheels from the cache to let people locate the cache and go hunting for the relevant wheels themselves? If there's a need to delete wheels would an explicit command not be better? Something like There's also a reasonable use case for "blow away the whole cache" - something like But what other use case is there for "tell me where the cache is"? Or indeed for any other cache related commands? |
Looking good! I'm still on the fence about the I think it makes sense for the I kind of don't like the |
what about using |
This might be a bridge too far, but what if it's treated like a glob if it contains a metacharacter or ends in .whl and otherwise it's treated like a project name?
I wouldn't be mad. Thoughts about |
My thinking about restricting it to just project names also means it's not tied to wheel caching either. If we rejigger the HTTP cache to tag cached responses with a project name, we can make it just work with that. If we add caching of VCS repositories, we can make it just work with that as well. By keeping it higher level we eliminate some amounts of tying it to a specific implementation. Of course the flipside of that is with your proposed change, that is still somewhat true, just that glob characters/.whl would restrict it to just deleting stuff from the wheel cache in my hypothetical above. Overall, it feels somewhat magical to do that and I'm trying to think of a use case where it is vitally important to just delete a specific wheel from the cache and no others. Worst case scenario it seems like it'd just trigger the slow path again (once) but also probably save some space by cleaning up old wheels for versions you're no longer installing?
I'd make it act like As an aside, I don't really like the |
cd53d04
to
05a953a
Compare
If you want to extend this to cover arbitrary cache types I'm wondering if it makes sense to retain I almost wonder if
|
Hello, I was in holidays :) Note that https://github.com/pypa/pip/pull/3734/files#diff-2695f32c4432acd141c3dbe7e7e3a6b0R803 was also added to track the origin of the wheel cache. |
269a505
to
a11f61a
Compare
Hello! I am an automated bot and I have noticed that this pull request is not currently able to be merged. If you are able to either merge the |
@tdsmith @xavfernandez Is one of you going to look into this in the near future? |
@pradyunsg Well, seeing your comment, I rebased #3734 which still works fine but which contained less functionnality than @tdsmith's PR. I won't have much time to work on it though. |
@tdsmith May I take this forward? |
Closing in favour of #4685. |
Adds an option that prints the configured location of the cache directory and
exits. This is useful if users want to know where the cache is so that they can
delete stale or poisoned wheels.
This change is