Skip to content

Speedup sphinx-build using parallelization #991

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Mar 9, 2021
Merged

Conversation

weiji14
Copy link
Member

@weiji14 weiji14 commented Mar 2, 2021

Description of proposed changes

Distribute sphinx-build over several processes in parallel to make building on multiprocessor machines faster. Using 'auto' which equals the number of CPUs available. This should speed up running make html, make linkcheck and make doctest in the 'doc/' folder and cut down on time waiting for the Vercel Continuous Documentation and Github Actions Continuous Integration steps to run.

References:

Addresses #584.

Reminders

  • Run make format and make check to make sure the code follows the style guide.
  • Add tests for new features or tests that would have caught the bug that you're fixing.
  • Add new public functions/methods/classes to doc/api/index.rst.
  • Write detailed docstrings for all functions/methods.
  • If adding new functionality, add an example to docstrings or tutorials.

Slash Commands

You can write slash commands (/command) in the first line of a comment to perform
specific operations. Supported slash commands are:

  • /format: automatically format and lint the code
  • /test-gmt-dev: run full tests on the latest GMT development version

Distribute sphinx-build over several processes in parallel
to make building on multiprocessor machines faster.
Using 'auto' which equals the number of CPUs available.
See docs at https://www.sphinx-doc.org/en/3.x/man/sphinx-build.html#cmdoption-sphinx-build-j.
@weiji14 weiji14 added maintenance Boring but important stuff for the core devs skip-changelog Skip adding Pull Request to changelog labels Mar 2, 2021
@weiji14 weiji14 added this to the 0.3.1 milestone Mar 2, 2021
@weiji14 weiji14 marked this pull request as ready for review March 3, 2021 00:37
@weiji14 weiji14 marked this pull request as draft March 3, 2021 00:38
@weiji14
Copy link
Member Author

weiji14 commented Mar 3, 2021

/test-gmt-dev

@seisman
Copy link
Member

seisman commented Mar 3, 2021

Perhaps you have to change a python file to trigger the workflows.

Copy link
Member Author

@weiji14 weiji14 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parallelization doesn't seem to speed things up much on Github Actions CI. Scheduled test on master was 3m 47s while doc build on this PR was 3m 53s (a bit slower, but about the same). Maybe the Github Actions CI doesn't have that many CPU cores allocated? Vercel doc build time is still 7-8min.

Testing locally (8 CPU cores) does show a bit of speedup but not very significant, maybe a few tens of seconds.

@seisman
Copy link
Member

seisman commented Mar 3, 2021

The parallelization doesn't seem to speed things up much on Github Actions CI. Scheduled test on master was 3m 47s while doc build on this PR was 3m 53s (a bit slower, but about the same). Maybe the Github Actions CI doesn't have that many CPU cores allocated? Vercel doc build time is still 7-8min.

Testing locally (8 CPU cores) does show a bit of speedup but not very significant, maybe a few tens of seconds.

Most of the time is spent on running Python scripts. Maybe -j auto doesn't work for sphinx-gallery?

@weiji14
Copy link
Member Author

weiji14 commented Mar 3, 2021

The parallelization doesn't seem to speed things up much on Github Actions CI. Scheduled test on master was 3m 47s while doc build on this PR was 3m 53s (a bit slower, but about the same). Maybe the Github Actions CI doesn't have that many CPU cores allocated? Vercel doc build time is still 7-8min.
Testing locally (8 CPU cores) does show a bit of speedup but not very significant, maybe a few tens of seconds.

Most of the time is spent on running Python scripts. Maybe -j auto doesn't work for sphinx-gallery?

Possibly related to sphinx-doc/sphinx#4575.

P.S. Might need to run full tests to check, as I saw some issues with running parallel sphinx-build on Windows.

@weiji14 weiji14 marked this pull request as ready for review March 3, 2021 23:39
@weiji14
Copy link
Member Author

weiji14 commented Mar 4, 2021

Hmm, 5 test failures on Windows GMT dev tests again at https://github.com/GenericMappingTools/pygmt/runs/2026668071?check_suite_focus=true#step:15:1245. I think they're unrelated but it does looks scary!

```python-traceback ________________________ test_gmt_compat_6_is_applied _________________________

capsys = <_pytest.capture.CaptureFixture object at 0x0000020835D65E80>

def test_gmt_compat_6_is_applied(capsys):
    """

Error: Ensure that users with old gmt.conf files won't get pygmt-session [ERROR]:

    GMT_COMPATIBILITY: Expects values from 6 to 6; reset to 6.
    """
    end()  # Kill the global session
    try:
        # Generate a gmt.conf file in the currenty directory
        # with GMT_COMPATIBILITY = 5
        with Session() as lib:
            lib.call_module("gmtset", "GMT_COMPATIBILITY 5")
        begin()
        with Session() as lib:
            lib.call_module("basemap", "-R10/70/-3/8 -JX4i/3i -Ba")
            out, err = capsys.readouterr()  # capture stdout and stderr
            assert out == ""
            assert err != (

Error: "pygmt-session [ERROR]: GMT_COMPATIBILITY:"
" Expects values from 6 to 6; reset to 6.\n"
)

          assert err == ""  # double check that there are no other errors

Error: AssertionError: assert 'end [ERROR]:...rmissions?]\n' == ''
Error: + end [ERROR]: Failed to remove gmt_201.ps-! [remove error: Permission denied]
Warning: + end [WARNING]: Unable to remove gmt_201.ps- [permissions?]
Warning: + end [WARNING]: Unable to remove directory C:/Users/runneradmin/.gmt/sessions/gmt_session.4504 [permissions?]
Warning: + gmtset [WARNING]: GMT_COMPATIBILITY: Expects values from 6 to 6; reset to 6.
Warning: + pygmt-session [WARNING]: GMT_COMPATIBILITY: Expects values from 6 to 6; reset to 6.
Error: + begin [ERROR]: Failed to remove gmt_201.ps-! [remove error: Permission denied]
Warning: + begin [WARNING]: Unable to remove gmt_201.ps- [permissions?]

C:\Miniconda3\envs\test\lib\site-packages\pygmt\tests\test_session_management.py:47: AssertionError

During handling of the above exception, another exception occurred:

capsys = <_pytest.capture.CaptureFixture object at 0x0000020835D65E80>

def test_gmt_compat_6_is_applied(capsys):
    """

Error: Ensure that users with old gmt.conf files won't get pygmt-session [ERROR]:

    GMT_COMPATIBILITY: Expects values from 6 to 6; reset to 6.
    """
    end()  # Kill the global session
    try:
        # Generate a gmt.conf file in the currenty directory
        # with GMT_COMPATIBILITY = 5
        with Session() as lib:
            lib.call_module("gmtset", "GMT_COMPATIBILITY 5")
        begin()
        with Session() as lib:
            lib.call_module("basemap", "-R10/70/-3/8 -JX4i/3i -Ba")
            out, err = capsys.readouterr()  # capture stdout and stderr
            assert out == ""
            assert err != (

Error: "pygmt-session [ERROR]: GMT_COMPATIBILITY:"
" Expects values from 6 to 6; reset to 6.\n"
)
assert err == "" # double check that there are no other errors
finally:
end()
# Clean up the global "gmt.conf" in the current directory

      assert os.path.exists("gmt.conf")

E AssertionError: assert False
E + where False = <function exists at 0x0000020827DB35E0>('gmt.conf')
E + where <function exists at 0x0000020827DB35E0> = <module 'ntpath' from 'C:\Miniconda3\envs\test\lib\ntpath.py'>.exists
E + where <module 'ntpath' from 'C:\Miniconda3\envs\test\lib\ntpath.py'> = os.path

C:\Miniconda3\envs\test\lib\site-packages\pygmt\tests\test_session_management.py:51: AssertionError

</details>

@weiji14 weiji14 changed the title Parallel sphinx-build to speed up documentation build Speedup sphinx-build using parallelization and remove make linkcheck/doctest Mar 4, 2021
@seisman
Copy link
Member

seisman commented Mar 4, 2021

It seems adding -j auto doesn't speed up at all.

@seisman
Copy link
Member

seisman commented Mar 7, 2021

As -j auto doesn't work as we expected, I think we can just rename the issue title and only remove the linkcheck and doctest targets in this PR.

@weiji14
Copy link
Member Author

weiji14 commented Mar 7, 2021

As -j auto doesn't work as we expected, I think we can just rename the issue title and only remove the linkcheck and doctest targets in this PR.

There's no speedup on the Github Actions CI, but I think it helps a bit locally, shaving off a few tens of seconds on my 8-core machine when I tested, so perhaps ok to keep -j auto since it's harmless? Do you notice anything on your local machine?

@seisman
Copy link
Member

seisman commented Mar 7, 2021

As -j auto doesn't work as we expected, I think we can just rename the issue title and only remove the linkcheck and doctest targets in this PR.

There's no speedup on the Github Actions CI, but I think it helps a bit locally, shaving off a few tens of seconds on my 8-core machine when I tested, so perhaps ok to keep -j auto since it's harmless? Do you notice anything on your local machine?

Building the documentation takes 130 s in the master branch, and 120 s in this branch. It's even faster if you move -j auto to the first line, i.e.,

SPHINXOPTS    = -j auto

@seisman
Copy link
Member

seisman commented Mar 7, 2021

It's even faster if you move -j auto to the first line, i.e.,

SPHINXOPTS    = -j auto

It's not true, but still adding -j auto to SPHINXOPTS is better.

@seisman seisman mentioned this pull request Mar 8, 2021
8 tasks
Co-Authored-By: Dongdong Tian <[email protected]>
@weiji14 weiji14 changed the title Speedup sphinx-build using parallelization and remove make linkcheck/doctest Speedup sphinx-build using parallelization Mar 9, 2021
@weiji14 weiji14 merged commit b498f61 into master Mar 9, 2021
@weiji14 weiji14 deleted the parallel-sphinx-build branch March 9, 2021 00:34
sixy6e pushed a commit to sixy6e/pygmt that referenced this pull request Dec 21, 2022
Distribute sphinx-build over several processes in parallel
to make building on multiprocessor machines faster.
Using 'auto' which equals the number of CPUs available.
See docs at https://www.sphinx-doc.org/en/3.x/man/sphinx-build.html#cmdoption-sphinx-build-j.

* Move `-j auto` to SPHINXOPTS

Co-authored-by: Dongdong Tian <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
maintenance Boring but important stuff for the core devs skip-changelog Skip adding Pull Request to changelog
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants