@@ -15,17 +15,16 @@ Overview
 Here is a rough description of the process that pip uses to choose what
 file to download for a package, given a requirement:
 
-1. Access the various network and file system locations configured for pip
-   that contain package files. These locations can include, for example,
-   pip's :ref:`--index-url <--index-url>` (with default
-   https://pypi.org/simple/ ) and any configured
-   :ref:`--extra-index-url <--extra-index-url>` locations.
-   Each of these locations is a `PEP 503`_ "simple repository" page, which
-   is an HTML page of anchor links.
-2. Collect together all of the links (e.g. by parsing the anchor links
-   from the HTML pages) and create ``Link`` objects from each of these.
-   The :ref:`LinkCollector <link-collector-class>` class is responsible
-   for both this step and the previous.
+1. Collect together the various network and file system locations containing
+   project package files. These locations are derived, for example, from pip's
+   :ref:`--index-url <--index-url>` (with default https://pypi.org/simple/ )
+   setting and any configured :ref:`--extra-index-url <--extra-index-url>`
+   locations. Each of the project page URL's is an HTML page of anchor links,
+   as defined in `PEP 503`_, the "Simple Repository API."
+2. For each project page URL, fetch the HTML and parse out the anchor links,
+   creating a ``Link`` object from each one. The :ref:`LinkCollector
+   <link-collector-class>` class is responsible for both the previous step
+   and fetching the HTML over the network.
 3. Determine which of the links are minimally relevant, using the
    :ref:`LinkEvaluator <link-evaluator-class>` class. Create an
    ``InstallationCandidate`` object (aka candidate for install) for each
@@ -111,6 +110,12 @@ One of ``PackageFinder``'s main top-level methods is
 class's ``compute_best_candidate()`` method on the return value of
 ``find_all_candidates()``. This corresponds to steps 4-5 of the Overview.
 
+``PackageFinder`` also has a ``process_project_url()`` method (called by
+``find_best_candidate()``) to process a `PEP 503`_ "simple repository"
+project page. This method fetches and parses the HTML from a PEP 503 project
+page URL, extracts the anchor elements and creates ``Link`` objects from
+them, and then evaluates those links.
+
 
 .. _link-collector-class:
 
@@ -119,12 +124,8 @@ The ``LinkCollector`` class
 
 The :ref:`LinkCollector <link-collector-class>` class is the class
 responsible for collecting the raw list of "links" to package files
-(represented as ``Link`` objects). An instance of the class accesses the
-various `PEP 503`_ HTML "simple repository" pages, parses their HTML,
-extracts the links from the anchor elements, and creates ``Link`` objects
-from that information. The ``LinkCollector`` class is "unintelligent" in that
-it doesn't do any evaluation of whether the links are relevant to the
-original requirement; it just collects them.
+(represented as ``Link`` objects) from file system locations, as well as the
+`PEP 503`_ project page URL's that ``PackageFinder`` should access.
 
 The ``LinkCollector`` class takes into account the user's :ref:`--find-links
 <--find-links>`, :ref:`--extra-index-url <--extra-index-url>`, and related
@@ -133,6 +134,10 @@ method is the ``collect_links()`` method. The :ref:`PackageFinder
 <package-finder-class>` class invokes this method as the first step of its
 ``find_all_candidates()`` method.
 
+``LinkCollector`` also has a ``fetch_page()`` method to fetch the HTML from a
+project page URL. This method is "unintelligent" in that it doesn't parse the
+HTML.
+
 The ``LinkCollector`` class is the only class in the ``index.py`` module that
 makes network requests and is the only class in the module that depends
 directly on ``PipSession``, which stores pip's configuration options and
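
For readers following the fetch/parse split this change describes, here is a minimal sketch of the parsing half: turning the anchor elements of an already-fetched PEP 503 project page into link records. It is an illustration only and does not reproduce pip's code. ``SimpleLink``, ``_AnchorParser`` and ``parse_links`` are hypothetical names, the ``example.invalid`` URLs are placeholders, and only the standard library is used. Fetching is deliberately left out, mirroring the idea that the page is fetched in one place and parsed in another.

```python
from dataclasses import dataclass
from html.parser import HTMLParser
from typing import List, Optional
from urllib.parse import urljoin


@dataclass
class SimpleLink:
    """Hypothetical stand-in for pip's ``Link``; not pip's real class."""

    url: str
    requires_python: Optional[str] = None


class _AnchorParser(HTMLParser):
    """Collect one SimpleLink per <a href=...> element on a project page."""

    def __init__(self, page_url: str) -> None:
        super().__init__()
        self.page_url = page_url
        self.links: List[SimpleLink] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if not href:
            # Anchors without an href carry no file location; skip them.
            return
        self.links.append(
            SimpleLink(
                # PEP 503 allows relative hrefs, so resolve against the page URL.
                url=urljoin(self.page_url, href),
                # Optional PEP 503 attribute; None when the anchor lacks it.
                requires_python=attr_map.get("data-requires-python"),
            )
        )


def parse_links(html: str, page_url: str) -> List[SimpleLink]:
    """Parse already-fetched project page HTML into link records."""
    parser = _AnchorParser(page_url)
    parser.feed(html)
    parser.close()
    return parser.links
```

Because ``parse_links()`` takes the HTML as a string, it can be exercised without any network access, for example by calling it with a small literal page and the project page URL it was supposedly fetched from.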