Add Hudson estimator to Fst, and make it the default #302

Merged
merged 5 commits into from
Oct 13, 2020

Conversation

tomwhite
Collaborator

@tomwhite tomwhite commented Oct 5, 2020

Also, compute Fst from a single divergence matrix that has diversity values on the diagonal. This idea is from tskit, which does the same thing. The advantage of this approach will come with windowing, since only one array (divergence) will need to be windowed, rather than two (diversity and divergence).

Fixes #292
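
For readers following along, here is a minimal NumPy sketch of the idea described above: a single cohort-by-cohort matrix holds between-cohort divergence off the diagonal and within-cohort diversity on the diagonal, and pairwise Fst is read straight from it. The function name `hudson_fst_from_divergence` and the example values are hypothetical, and the formula is one common statement of the Hudson estimator (1 minus mean within-cohort diversity over between-cohort divergence); it is not taken from this PR's diff.

```python
import numpy as np


def hudson_fst_from_divergence(d):
    """Pairwise Fst from a square cohort-by-cohort matrix (illustrative only).

    Assumes d[i, j] for i != j is the divergence between cohorts i and j,
    and d[i, i] is the diversity within cohort i.
    """
    d = np.asarray(d, dtype=float)
    within = np.diag(d)
    # Mean within-cohort diversity for every pair (i, j).
    mean_within = (within[:, None] + within[None, :]) / 2.0
    # Suppress warnings where divergence is zero; those entries become inf/nan.
    with np.errstate(divide="ignore", invalid="ignore"):
        fst = 1.0 - mean_within / d
    # Fst of a cohort with itself is not meaningful, so mask the diagonal.
    np.fill_diagonal(fst, np.nan)
    return fst


# Made-up values: diversity 0.2 and 0.3 on the diagonal, divergence 0.5 between.
d = np.array([[0.2, 0.5], [0.5, 0.3]])
fst = hudson_fst_from_divergence(d)
```

Because both ingredients live in one array, a windowed version would only need to window `d` before applying the same function.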

Compute Fst from single divergence matrix which has diversity values on the diagonal.

Fixes sgkit-dev#292
@tomwhite tomwhite mentioned this pull request Oct 5, 2020
@tomwhite
Collaborator Author

@eric-czech any chance you can take a look at this one?

Collaborator

@eric-czech eric-czech left a comment

sure thing @tomwhite, LGTM

np.testing.assert_allclose(div, ts_div)


@pytest.mark.parametrize(
    "size, n_cohorts",
    [(2, 2), (3, 2), (10, 2), (100, 2)],
Collaborator

Why is the set of size / n_cohorts pairs so different for the Hudson vs Nei tests?

Collaborator Author

Thanks for taking a look @eric-czech!

Hudson is tested by comparing it to scikit-allel, which only allows pairs of cohorts (populations), whereas Nei is compared to tskit, which allows any number of cohorts (and considers them in pairs). I've added a note to the test saying this.

pystatgen/sgkit@7cc6493

    "size, n_cohorts",
    [(2, 2), (3, 2), (10, 2), (100, 2)],
)
def test_Fst__Hudson(size, n_cohorts):
Collaborator

Worth making chunking a part of the tests?

Collaborator Author

Yes, definitely. Added a chunks parameter to all popgen tests. This exposed an issue in the divergence code, which I've now fixed.
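
As a rough illustration of what a chunking check guards against, the sketch below verifies that reducing per-variant values chunk by chunk gives the same cohort-by-cohort matrix as reducing them in one pass. It uses plain NumPy rather than the project's actual test fixtures; the names `divergence_sum` and `chunked_divergence_sum`, the array shape, and the chunk sizes are all assumptions made for the example.

```python
import numpy as np


def divergence_sum(per_variant):
    # per_variant has shape (variants, cohorts, cohorts); reduce over variants.
    return per_variant.sum(axis=0)


def chunked_divergence_sum(per_variant, chunk_size):
    # Reduce each block of variants separately, then combine the partial sums.
    total = np.zeros(per_variant.shape[1:])
    for start in range(0, per_variant.shape[0], chunk_size):
        total += per_variant[start:start + chunk_size].sum(axis=0)
    return total


rng = np.random.default_rng(0)
per_variant = rng.random((10, 2, 2))  # synthetic per-variant values
expected = divergence_sum(per_variant)

# Chunk sizes that divide the variant axis evenly, unevenly, and not at all.
results = {
    size: chunked_divergence_sum(per_variant, size) for size in (1, 3, 10)
}
for size, result in results.items():
    np.testing.assert_allclose(result, expected)
```

Including a chunk size that does not divide the variant axis evenly is a cheap way to exercise the ragged final block.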

pystatgen/sgkit@c9338ff

@tomwhite tomwhite added the auto-merge label (Auto merge label for mergify test flight) Oct 13, 2020
@codecov-io

Codecov Report

Merging #302 into master will decrease coverage by 1.25%.
The diff coverage is 39.02%.

@@            Coverage Diff             @@
##           master     #302      +/-   ##
==========================================
- Coverage   97.61%   96.35%   -1.26%     
==========================================
  Files          26       26              
  Lines        1843     1866      +23     
==========================================
- Hits         1799     1798       -1     
- Misses         44       68      +24     
| Impacted Files | Coverage Δ |
|---|---|
| sgkit/stats/popgen.py | 65.60% <39.02%> (-15.78%) ⬇️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 045acad...ee34283.

@mergify mergify bot merged commit b83ca1b into sgkit-dev:master Oct 13, 2020
Labels
auto-merge Auto merge label for mergify test flight
Development

Successfully merging this pull request may close these issues.

Implement Hudson estimator for Fst
3 participants