Skip to content

MAINT - Ensure Playwright tests use test sites and are run in CI #2133

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 17 commits into from
Mar 20, 2025

Conversation

trallard
Copy link
Collaborator

Ensures playwright is always available for tests following conversations in #2119 (comment)

Note, however, that the test_version_switcher_highlighting test is currently failing due to changes to the version switcher component (as part of recent a11y improvements) and due to RTD version switcher no longer being flushed into the sidebar (#2034).

I could try and add an alternative test that checks perhaps that we have at least a latest and a dev version in the version switcher?

@trallard trallard added the tag: testing Issues related to PST testing label Feb 17, 2025
@peytondmurray
Copy link
Contributor

Tested this locally with tox run -ve compile-assets,i18n-compile,py312-sphinx61-tests. Looks like it did run the playwright tests 👍

Local testing replicated only one of these failures - the version switcher - could this be a test that should have been failing before (but wasn't, because it wasn't being run)?

FAILED tests/test_playwright.py::test_version_switcher_highlighting[chromium] - playwright._impl._errors.TimeoutError: Locator.get_attribute: Timeout 30000ms exceeded.

@trallard
Copy link
Collaborator Author

Definitely, that is the same test that I saw failing. Since that test was added the version switcher has changed a bit so nobody noticed this failing until now.

trallard pushed a commit that referenced this pull request Feb 26, 2025
This PR fixes an issue with the locator in the
`test_version_switcher_highlighting` test, which should allow #2133 to
pass automated tests.

Note that this test is not run by CI, but that should be addressed in
#2133. To test, please run this manually.

(Sorry about the extra PR - I don't have permissions to push to the
branch in #2133). cc @trallard
@peytondmurray
Copy link
Contributor

Looks like #2137 doesn't resolve the broken test here, though looking at https://pydata-sphinx-theme.readthedocs.io/en/latest/ it seems like the version picker is now correctly pointing to dev, at least. When I build locally, the version picker also points to dev, so that's good too.

I was thinking about the test a little more, and I don't like that it requires "dev" to be selected for the test to pass...that means that when a new release is cut, you're always going to have a failing test - because we won't be on a dev version for that period of time. I propose changing the locator in that test to use some other way of finding the version picker.

I wasn't able to reproduce the failing test_version_switcher_highlighting test - everything works fine locally, as far as I can see. Since the test is invoked from the CI job with

python -Im tox run -e compile-assets,i18n-compile,py312-tests

I'd expect the docs to be built with the "dev" version selected. Unless there's some weird caching thing going on... but I don't think we're caching the docs anywhere?

@peytondmurray
Copy link
Contributor

peytondmurray commented Feb 26, 2025

Some further investigation:

I added some error reporting in conftest.py to help debug what's going on, and was also looking at CI.yml. The run-pytest job does the following:

          # this will compile the assets and translations then run the tests
          # check if there is a specific Sphinx version to test with
          # example substitution: tox run -e compile-assets,i18n-compile,py39-sphinx61-tests
          if [ -n "${{matrix.sphinx-version}}" ]; then
            python -Im tox run -e compile-assets,i18n-compile,py$(echo ${{ matrix.python-version }} | tr -d .)-sphinx$(echo ${{ matrix.sphinx-version }} | tr -d .)-tests
          # if not we use the default version
          # example substitution: tox run -e compile-assets,i18n-compile,py39-tests
          else
            python -Im tox run -e compile-assets,i18n-compile,py$(echo ${{ matrix.python-version }} | tr -d .)-tests
          fi

From tox.ini, it looks like

  • compile-assets: stb compile (bundles js/css assets)
  • testenv:i18n-compile -> pybabel compile -d src/pydata_sphinx_theme/locale -D sphinx {posargs} (compiles translation catalogues to binary MO files)
  • py<whatever>-tests:
bash -c 'if [[ "{env:GITHUB_ACTIONS:}" == "true" ]]; then playwright install --with-deps; else playwright install; fi'
    py3{9,10,11,12}{,-sphinx61,-sphinxdev,}-tests: coverage run -m pytest -m "not a11y" {posargs}
    py3{9,10,11,12}{,-sphinx61,-sphinxdev,}-tests-no-cov: pytest -m "not a11y" {posargs}

It doesn't seem to include a sphinx-build step by default! This is confirmed by running git clean -fdx ./ to clear out any existing build artifacts and then attempting to run the tests with tox, which fails. I can make a PR that adds some more robust error handling in conftest.py when setting up the http server that serves up the built docs, and adds a step to build the docs before running the tests, if that works for everyone?

@drammock
Copy link
Collaborator

It doesn't seem to include a sphinx-build step by default!

⁉️

I can make a PR that adds some more robust error handling in conftest.py when setting up the http server that serves up the built docs, and adds a step to build the docs before running the tests, if that works for everyone?

sounds reasonable. @trallard is the tox expert around here though, so let's see what she has to say. I 🙈 tox.

@trallard
Copy link
Collaborator Author

trallard commented Feb 27, 2025

That kind of makes sense as the other tests we have been running with pytest do not need a separate sphinx-build. But it would seem these playwright tests were written with the expectation of having a site built already.
That can be easily fixed by calling sphinx-build within the tests command. Or in CI before calling the tests. So I am good with whatever y'all prefer.

Also more robust config/errors are always good.

@drammock
Copy link
Collaborator

actually now that I think harder about it, I might know what's going on. Originally, all of our tests acted on test sites rather than our own built site. This changed when we started to test a11y with playwright, and at that point only the a11y tests required our built site. Later, we started adding playwright tests that work on the test sites too. It sounds like things got muddled (?). If that's correct, then IMO we should try to restore and maintain this distinction:

  • a11y tests are all defined in a dedicated a11y pytest file, and all operate on the built version of our dev site
  • non-a11y tests are defined in other pytest files, and only operate on the test sites built with sphinx_build_factory

If that separation is maintained, then pytest shouldn't need a call to sphinx_build for the non-a11y tests.

@trallard
Copy link
Collaborator Author

You are right there. Seems that since these playwright tests were being skipped we never noticed this behaviour.

So I am +1 on making this distinction clearer and keep a11y tests separate from the rest.

@peytondmurray
Copy link
Contributor

@trallard Let me know if I can help in any way with this - I have time to push forward on this if that would be helpful.

@trallard trallard changed the title MAINT - Install playwright for all of our tests MAINT - Ensure Playwright tests use test sites and are run in CI Feb 28, 2025
@trallard
Copy link
Collaborator Author

Thanks @peytondmurray I fixed the failing test by adding a test site. But it seems my last push made another test to fail now (they were both working locally for me but seems I somehow messed paths in the CI).
If you could have a look it would be fab as I might otherwise only get back to this next week

@trallard
Copy link
Collaborator Author

Edit: @peytondmurray I had missed one test change needed 🤦🏽‍♀️ (of course I did). This should be good for a review.

Copy link

Coverage report

Click to see where and how coverage changed

FileStatementsMissingCoverageCoverage
(new stmts)
Lines missing
  src/pydata_sphinx_theme
  short_link.py
Project Total  

This report was generated by python-coverage-comment-action

@@ -42,7 +47,8 @@ def _build_test_site(site_name: str, sphinx_build_factory: Callable) -> None:

def _check_test_site(site_name: str, site_path: Path, test_func: Callable):
"""Make the built test site available to Playwright, then run `test_func` on it."""
test_sites_dir.mkdir(exist_ok=True)
# Need to ensure parent directories exist in CI
test_sites_dir.mkdir(exist_ok=True, parents=True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran into this as well while debugging; strictly speaking parent=True should not be necessary if the docs have been built. So I think in the event that the parent directories of test_sites_dir don't exist, you've got a build problem that will make the tests fail anyway.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

strictly speaking parent=True should not be necessary if the docs have been built [...] in the event that the parent directories of test_sites_dir don't exist, you've got a build problem that will make the tests fail anyway

Is the reason for keeping it that site_path isn't a child of test_sites_dir? That would explain the symlinking a few lines below. (I know I wrote that code but I don't recall exactly why I did it that way)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought there might be a reason for the symlink 🤷🏽‍♀️ but I do not see a reason why we would have to keep this as is. I can go and refactor this.

@trallard
Copy link
Collaborator Author

I did not mean to resolve that comment and since I am on mobile before I forget my thoughts here I go.

For the switcher - we could remove stable and only point to concrete versions, since it's a test site it does not matter tbh as long as we have at least two items.

For the "dev" only problem, maybe can do something similar to what we do in our docs config to match for example "RC" https://github.com/pydata/pydata-sphinx-theme/blob/main/docs%2Fconf.py#L140

Wdyt @peytondmurray ?

@peytondmurray
Copy link
Contributor

peytondmurray commented Feb 28, 2025

For the "dev" only problem, maybe can do something similar to what we do in our docs config to match for example "RC" https://github.com/pydata/pydata-sphinx-theme/blob/main/docs%2Fconf.py#L140

Yep, I bet that would work, or I bet you could use a regex to look for text that looks like a version string. Maybe the cleanest way IMO would be to uses a CSS attribute selector, though:

        active_version_name = page.locator(
            "button[data-active-version-name]"
        ).get_attribute("data-active-version-name")

This locates the version switcher button and gets whatever the active version is no matter what it is. I tested locally and it seems to work. Thoughts?

Copy link
Contributor

@peytondmurray peytondmurray left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the version switcher test being self contained, I think this looks good!

drammock
drammock previously approved these changes Mar 6, 2025
Copy link
Collaborator

@drammock drammock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

approving, as I think things would be fine if merged as-is, but see below for a few questions/suggestions/nitpicks (which could be addressed here or in separate PRs)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] having all this link-shortening-related content here is a bit confusing; a future maintainer looking at this site might mistake the test site's purpose. I'm assuming it's a copy-paste job; although it's a bit more work I would tend toward making test sites as bare-bones as possible, with all content reinforcing the test site's purpose. For example, this page might say

Page 1
======

Meaningless content; the point of this test site is the version switcher.

That said, I'm also not opposed to grouping several tests into a single test site (to reduce the ratio of build time to test time), and have 1 page on the site for each test (e.g., a page for link shortening, a page for code blocks, a page for breadcrumbs, etc). But I wouldn't expect that to happen in a test site with the name version_switcher.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I basically copied the test site from the base one and made some adjustments.
I think we could simplify it somehow or even fold this into the base test site but that can be done in a separate pr.

@@ -0,0 +1,9 @@
:html_theme.sidebar_secondary.remove: true
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto the prior [nitpick] comment; a site that only tests the version switcher probably only needs a single page, and if it needs multiple pages, they shouldn't be doing unrelated things like hiding sidebars.

"""
Build minimal test sites with sphinx_build_factory and test them with Playwright.
When adding new tests to this file, remember to also add the corresponding test site
to `tests/sites/` or use an existing one.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fine for now, but the advice "or use an existing one" might change depending on what others think of my (admittedly opinionated) suggestions to whittle down to a bare-bones one-purpose-per-test-site approach.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBF I like your proposal of simplifying the tests by reducing the number of test sites.
I added this comment for now as we struggled at first to identify why the site was not being built or on what sites we were running tests

@@ -42,7 +47,8 @@ def _build_test_site(site_name: str, sphinx_build_factory: Callable) -> None:

def _check_test_site(site_name: str, site_path: Path, test_func: Callable):
"""Make the built test site available to Playwright, then run `test_func` on it."""
test_sites_dir.mkdir(exist_ok=True)
# Need to ensure parent directories exist in CI
test_sites_dir.mkdir(exist_ok=True, parents=True)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

strictly speaking parent=True should not be necessary if the docs have been built [...] in the event that the parent directories of test_sites_dir don't exist, you've got a build problem that will make the tests fail anyway

Is the reason for keeping it that site_path isn't a child of test_sites_dir? That would explain the symlinking a few lines below. (I know I wrote that code but I don't recall exactly why I did it that way)

@trallard
Copy link
Collaborator Author

trallard commented Mar 6, 2025

I also keep dismissing stuff on mobile and can't seem to unresolve stuff.
The breadcrumb tests were not changed I only added the headings and moved some tests so they were grouped by function.

gabalafou pushed a commit to gabalafou/pydata-sphinx-theme that referenced this pull request Mar 10, 2025
This PR fixes an issue with the locator in the
`test_version_switcher_highlighting` test, which should allow pydata#2133 to
pass automated tests.

Note that this test is not run by CI, but that should be addressed in
pydata#2133. To test, please run this manually.

(Sorry about the extra PR - I don't have permissions to push to the
branch in pydata#2133). cc @trallard
@trallard
Copy link
Collaborator Author

@drammock can you give this a last review/approval.
I just opened a new issue to address the nits/improvements you suggested #2171 so I can sort this in a subsequent PR.

@drammock drammock merged commit 83e7e51 into pydata:main Mar 20, 2025
30 of 31 checks passed
@trallard trallard deleted the trallard/patch-test-deps branch March 20, 2025 16:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
tag: testing Issues related to PST testing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants