DOC: Doc build for a single doc made much faster, and clean up #24428

datapythonista · 2018-12-26T00:00:54Z

The main goal of this PR is to be able to build single pages faster and without building anything else, so detecting and fixing the warnings (and other problems) in doc pages should be much more efficient.

Additionally, the following has been addressed:

Addressed technical debt in the doc build
Moved logic (like rendering index.rst.template) from doc/make.py to doc/source/conf.py, which makes things simpler and makes it possible to call sphinx-build directly
When --single is used, nothing other than the required page is built (before, all the whatsnew pages were additionally built), which makes it much faster
Added an option to fail the doc build if warnings are found (cancelling the build immediately)
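
As a rough illustration of the "build only the required page" idea, here is a minimal sketch of how conf.py could compute Sphinx's `exclude_patterns` from an environment variable. The commit history mentions `SPHINX_PATTERN`; the helper name `build_exclude_patterns`, its arguments, and the rule of always keeping `index.rst` are assumptions made for this example, not the code in this PR.

```python
import os


def build_exclude_patterns(source_path, pattern):
    """Hypothetical helper: list the .rst files Sphinx should skip.

    `pattern` is the single page to keep, relative to `source_path`
    (for example 'whatsnew/v0.24.0.rst'). A falsy value means "build
    everything", so nothing is excluded.
    """
    if not pattern:
        return []
    excluded = []
    for dirname, _dirs, fnames in os.walk(source_path):
        for fname in fnames:
            if os.path.splitext(fname)[-1] != '.rst':
                continue  # only reStructuredText sources are considered
            rel = os.path.relpath(os.path.join(dirname, fname), source_path)
            rel = rel.replace(os.sep, '/')  # Sphinx patterns use forward slashes
            # Assumption: the index page is always kept next to the requested page
            if rel not in ('index.rst', pattern):
                excluded.append(rel)
    return sorted(excluded)


# Assumed usage inside conf.py:
# exclude_patterns = build_exclude_patterns(
#     os.path.dirname(os.path.abspath(__file__)),
#     os.environ.get('SPHINX_PATTERN'))
```

Because the exclusion list lives in conf.py rather than in doc/make.py, calling sphinx-build directly would respect it as well.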

@jorisvandenbossche can you take a look please?


jreback · 2018-12-26T00:11:11Z

doc/source/conf.py

 import inspect
 import importlib
 import logging
 import warnings
-
+import jinja2
 from sphinx.ext.autosummary import _import_by_name
 from numpydoc.docscrape import NumpyDocString
 from numpydoc.docscrape_sphinx import SphinxDocString


You can remove the raw_input, yes? We only build on py3 now.

Good point, and actually it wasn't used, so it should still run in py2.

Right, it's actually totally fine to have only py3 doc building now.

codecov · 2018-12-26T00:20:01Z

Codecov Report

Merging #24428 into master will decrease coverage by 49.29%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #24428      +/-   ##
==========================================
- Coverage    92.3%      43%   -49.3%     
==========================================
  Files         163      163              
  Lines       51966    51966              
==========================================
- Hits        47967    22349   -25618     
- Misses       3999    29617   +25618

Flag	Coverage Δ
#multiple	`?`
#single	`43% <ø> (ø)`	⬆️

Impacted Files	Coverage Δ
pandas/io/formats/latex.py	`0% <0%> (-100%)`	⬇️
pandas/core/categorical.py	`0% <0%> (-100%)`	⬇️
pandas/io/sas/sas_constants.py	`0% <0%> (-100%)`	⬇️
pandas/tseries/plotting.py	`0% <0%> (-100%)`	⬇️
pandas/tseries/converter.py	`0% <0%> (-100%)`	⬇️
pandas/io/formats/html.py	`0% <0%> (-98.65%)`	⬇️
pandas/core/groupby/categorical.py	`0% <0%> (-95.46%)`	⬇️
pandas/io/sas/sas7bdat.py	`0% <0%> (-91.17%)`	⬇️
pandas/io/sas/sas_xport.py	`0% <0%> (-90.15%)`	⬇️
pandas/core/tools/numeric.py	`10.44% <0%> (-89.56%)`	⬇️
... and 121 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 1905485...c489aa6.

codecov · 2018-12-26T00:20:02Z

Codecov Report

Merging #24428 into master will increase coverage by 0.02%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #24428      +/-   ##
==========================================
+ Coverage    92.3%   92.32%   +0.02%     
==========================================
  Files         163      166       +3     
  Lines       51969    52328     +359     
==========================================
+ Hits        47968    48310     +342     
- Misses       4001     4018      +17

Flag	Coverage Δ
#multiple	`90.74% <ø> (+0.03%)`	⬆️
#single	`43.04% <ø> (+0.03%)`	⬆️

Impacted Files	Coverage Δ
pandas/core/dtypes/cast.py	`88.72% <0%> (-0.67%)`	⬇️
pandas/core/arrays/datetimelike.py	`95.66% <0%> (-0.28%)`	⬇️
pandas/core/indexes/datetimes.py	`96.14% <0%> (-0.18%)`	⬇️
pandas/core/nanops.py	`94.9% <0%> (-0.16%)`	⬇️
pandas/core/indexes/datetimelike.py	`97.59% <0%> (-0.11%)`	⬇️
pandas/util/testing.py	`87.75% <0%> (-0.1%)`	⬇️
pandas/core/arrays/period.py	`98.39% <0%> (-0.09%)`	⬇️
pandas/core/indexes/period.py	`92.69% <0%> (-0.07%)`	⬇️
pandas/core/generic.py	`96.62% <0%> (-0.01%)`	⬇️
pandas/core/internals/blocks.py	`93.81% <0%> (-0.01%)`	⬇️
... and 24 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update aa1549f...4406988.

datapythonista · 2018-12-29T12:59:00Z

@TomAugspurger could you take a look at this if you have time? Besides simplifying the code to build the docs, this will allow building a single page much faster. That will be useful for fixing the warnings, and I'd like to time the build of each page and see if we can reduce the time of the slowest examples.

TomAugspurger

Just a couple small comments / questions. LGTM though.

TomAugspurger · 2018-12-29T13:05:26Z

doc/make.py

-                pass
+                return single_doc[len('pandas.'):]
+        else:
+            raise ValueError('--single value should be a valid path to a '


Could you print out single_doc here, so the user sees what they passed?
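
To make the request above concrete, here is a minimal sketch of --single validation that echoes the offending value back to the user. The function name `process_single_doc`, the `source_path` argument, and the wording of the error messages are assumptions for illustration; the diff above is truncated, so this is not the exact code in doc/make.py.

```python
import os


def process_single_doc(single_doc, source_path):
    """Hypothetical sketch: normalise the value passed to --single.

    Accepts either a path to a .rst file (relative to `source_path`)
    or the dotted name of a pandas object such as 'pandas.DataFrame.head'.
    """
    base_name, extension = os.path.splitext(single_doc)
    if extension == '.rst':
        if os.path.exists(os.path.join(source_path, single_doc)):
            return base_name
        raise FileNotFoundError('File {} not found'.format(single_doc))
    if single_doc.startswith('pandas.'):
        # Assumption: a real script would also check that the object can
        # be imported; that check is left out to keep the sketch
        # independent of pandas.
        return single_doc[len('pandas.'):]
    # Include the value in the message so the user sees what they passed
    raise ValueError('--single={} is not a valid path to a .rst file '
                     'or a pandas object'.format(single_doc))
```

With this shape, a typo in the argument produces an error that names the value instead of only describing the expected format.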

TomAugspurger · 2018-12-29T13:06:32Z

doc/make.py

@@ -326,7 +214,7 @@ def main():
                           help='command to run: {}'.format(', '.join(cmds)))
    argparser.add_argument('--num-jobs',
                           type=int,
-                           default=1,
+                           default=0,


In the DocBuilder init, the default is 1. Make them both the same?

What does a value of 0 do?

Good point. With 0, I don't pass a -j value to sphinx-build, while with 1, I pass -j 1, which is actually the same (unless sphinx changes its default).

It's a bit odd: the -j parameter is supposed to make sphinx-build run on multiple cores, but it doesn't actually work, and the build takes exactly the same time with -j 1 and -j 4. It could make sense to remove this option, but I guess sphinx will be fixed at some point, so it's worth leaving it.

I'm happy setting our default to 0 (use sphinx default), 1, or auto, just let me know if you have a preference.

I addressed your other comment.

Hmm I think multiple cores speeds things up for me, at least for a full doc build.

I checked it with time on Linux, 1 vs 4 cores, and it took 17m30s in both cases (with a difference of less than 5 seconds between them). I tried it in the past on a different computer (also Linux) and it was the same.
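
The num_jobs convention discussed in this thread (0 means "do not pass -j, use the sphinx default") can be sketched as below. The function name `sphinx_build_cmd` and its parameters are hypothetical; only the `sphinx-build` options `-b`, `-j` and `-W` are taken from the Sphinx command line, where `-W` turns warnings into errors.

```python
def sphinx_build_cmd(kind, source_path, build_path,
                     num_jobs=0, warnings_are_errors=False):
    """Hypothetical sketch: assemble the sphinx-build argument list."""
    cmd = ['sphinx-build', '-b', kind]
    if num_jobs:
        # Only pass -j when a value was requested; 0 leaves the
        # decision to sphinx-build itself
        cmd += ['-j', str(num_jobs)]
    if warnings_are_errors:
        # -W makes sphinx-build treat warnings as errors
        cmd.append('-W')
    cmd += [source_path, build_path]
    return cmd


# Example: print the command for a 4-job HTML build
print(' '.join(sphinx_build_cmd('html', 'doc/source', 'doc/build/html',
                                num_jobs=4)))
```

Returning a list rather than a string keeps the command suitable for `subprocess.call` without shell quoting.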


datapythonista added 9 commits December 14, 2018 03:34

WIP: Major refactoring and clean up, still some problems on building the api pages

2571572

building only api pages when needed, based on SPHINX_PATTERN, and not regex

83c3b45

Merge from master

a5e7ad2

Merge remote-tracking branch 'upstream/master' into single_doc

0fda07e

WIP simplifying doc build (pending not generate rst files from excluded api.rst)

e851cae

Not generating all the api pages or the intersphinx db when a single page is built

fc9c359

Merge remote-tracking branch 'upstream/master' into single_doc

8e1e9e4

Restoring python path env

ade46e6

Fixing comment with wrong info

648da45

datapythonista added Docs Clean labels Dec 26, 2018

jreback reviewed Dec 26, 2018

Removing not needed raw_input

c489aa6

datapythonista mentioned this pull request Dec 28, 2018

DOC: Splitting api.rst in several files #24462

Merged

4 tasks

Merging from master

2be1d61

TomAugspurger approved these changes Dec 29, 2018

datapythonista added 2 commits December 29, 2018 15:46

Adding value to --single error message

55e47a2

Changing num_jobs default value to 0 everywhere, to be consistent

4406988

datapythonista merged commit d5e5bf7 into pandas-dev:master Dec 30, 2018

jorisvandenbossche mentioned this pull request Jan 11, 2019

Doc make.py clean #24727

Closed

Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019

DOC: Doc build for a single doc made much faster, and clean up (pandas-dev#24428)

8880ded

Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019

DOC: Doc build for a single doc made much faster, and clean up (pandas-dev#24428)

b21a8f3
