Skip to content

[WIP] Popgen stats #100

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Sep 14, 2020
Merged

[WIP] Popgen stats #100

merged 3 commits into from
Sep 14, 2020

Conversation

daletovar
Copy link
Collaborator

  1. adds @jeromekelleher's code for converting from tskit trees to sgkit datasets, diversity, and divergence.
  2. adds some minimal tests for the popgen stats.

Going forward in this PR I'd like to add:

  • a test for ts_to_dataset
  • windowing for popgen stats

I might save Fst and the others for a different PR.

Copy link
Collaborator

@jeromekelleher jeromekelleher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @daletovar! I think your plan looks great. For the parameter tree sequences, I think having sample sizes of (say) [2, 3, 10, 100] would be a good start. I can follow up with another PR with some more examples that'll exercise various corner cases afterwards.

@hammer
Copy link
Contributor

hammer commented Aug 19, 2020

@daletovar thanks for your work on this PR so far! In the future, could you be sure to mention the issue in the PR? In this case, https://github.com/pystatgen/sgkit/issues/94. We should update our contributing docs with this request...

@daletovar
Copy link
Collaborator Author

@hammer, thanks. Yes, I'll be sure to do that in the future.

@jeromekelleher
Copy link
Collaborator

@daletovar, can you rebase this so that it's up to date with upstream/master please?

@mergify
Copy link
Contributor

mergify bot commented Aug 20, 2020

This PR has conflicts, @daletovar please rebase and push updated version 🙏

@mergify mergify bot added the conflict PR conflict label Aug 20, 2020
@jeromekelleher
Copy link
Collaborator

This branch looks pretty messed up @daletovar - I think it would be simplest to rebase against upstream/master and force push.

@daletovar
Copy link
Collaborator Author

@jeromekelleher, sorry for the radio silence. I'll see what I can do about that. I think for now I'll leave the Patterson F stats for another time. I've had a hard time finding where they're implemented in tskit and there are differences between the tskit and scikit allele outputs.

@jeromekelleher
Copy link
Collaborator

@jeromekelleher, sorry for the radio silence. I'll see what I can do about that. I think for now I'll leave the Patterson F stats for another time. I've had a hard time finding where they're implemented in tskit and there are differences between the tskit and scikit allele outputs.

I'm still not able to see a clean diff here @daletovar making it very difficult to see what's being done - can you rebase against upstream/master please?

@mergify
Copy link
Contributor

mergify bot commented Aug 31, 2020

This PR has conflicts, @daletovar please rebase and push updated version 🙏

@jeromekelleher
Copy link
Collaborator

@daletovar - there's more issues with linting here. Can you clean up please?

@tomwhite tomwhite self-requested a review September 1, 2020 16:18
@daletovar
Copy link
Collaborator Author

@jeromekelleher, when it's installing dependencies. it looks like there's an issue building msprime. I'm not sure what's causing it. Do you know of a fix or is there something else we can use to test against tskit?

@jd
Copy link

jd commented Sep 10, 2020

The simulator should work fine — at least it does here. I've sent you an email so we can figure this out. :)

@ravwojdyla
Copy link
Collaborator

ravwojdyla commented Sep 11, 2020

Got an update from @jd about the mergify's simulator issue:

Your user in particular seems to trigger a 403 on GitHub API side when we try to check user permission. We don't know why exactly yet, but we've a bug opened on our side to track that down and fix it. :)

Soooo, if someone could please check the configuration from #244 on this PR and report back, that would be helpful. Simulator link. @jeromekelleher ?

@jeromekelleher
Copy link
Collaborator

@Mergifyio update

@mergify
Copy link
Contributor

mergify bot commented Sep 14, 2020

Command update: success

Branch has been successfully updated

@jeromekelleher
Copy link
Collaborator

Looks like this is breaking because of #245

@tomwhite
Copy link
Collaborator

I just merged #246 which should fix the build.

@Mergifyio update

@mergify
Copy link
Contributor

mergify bot commented Sep 14, 2020

Command update: success

Branch has been successfully updated

@tomwhite
Copy link
Collaborator

@jeromekelleher @ravwojdyla how do we get mergify to actually merge this?!

@ravwojdyla
Copy link
Collaborator

@tomwhite we need to merge #244 first, but I can't validate that configuration because https://github.com/pystatgen/sgkit/pull/100#issuecomment-691276250.

@jeromekelleher
Copy link
Collaborator

Trying another update here, to see if the changes in #244 will unplug things.

@jeromekelleher
Copy link
Collaborator

@Mergifyio update

@mergify
Copy link
Contributor

mergify bot commented Sep 14, 2020

Command update: success

Branch has been successfully updated

@mergify
Copy link
Contributor

mergify bot commented Sep 14, 2020

This PR has conflicts, @daletovar please rebase and push updated version 🙏

@mergify mergify bot added the conflict PR conflict label Sep 14, 2020
@jeromekelleher
Copy link
Collaborator

@Mergifyio rebase

@mergify
Copy link
Contributor

mergify bot commented Sep 14, 2020

Command rebase: success

Branch already up to date

@mergify mergify bot dismissed stale reviews from jeromekelleher and tomwhite September 14, 2020 09:25

Pull request has been modified.

@mergify mergify bot removed the conflict PR conflict label Sep 14, 2020
jeromekelleher
jeromekelleher previously approved these changes Sep 14, 2020
remove ts_to_dataset from public api

make divergence take in two datasets

add minimal fst

Add read_vcfzarr (sgkit-dev#40)

add ts_to_dataset

add minimal diversity and divergence

remove ts_to_dataset from public api

add tajimas d

add ts_to_dataset

add minimal diversity and divergence

remove ts_to_dataset from public api

make divergence take in two datasets

add minimal fst

add tajimas d

add ts_to_dataset

add minimal diversity and divergence

remove ts_to_dataset from public api

add minimal fst

add tajimas d

fix allele count

update cfg

remove spaces

add msprime and use np.testing

add libgsl-dev dependency

add docstrings

ignore dep warning

add ts_to_dataset

add minimal diversity and divergence

remove ts_to_dataset from public api

make divergence take in two datasets

add minimal fst

Add read_vcfzarr (sgkit-dev#40)

add ts_to_dataset

add minimal diversity and divergence

remove ts_to_dataset from public api

add tajimas d

add ts_to_dataset

add minimal diversity and divergence

remove ts_to_dataset from public api

make divergence take in two datasets

add minimal fst

add tajimas d

add ts_to_dataset

add minimal diversity and divergence

remove ts_to_dataset from public api

add minimal fst

add tajimas d

fix allele count

update cfg

remove spaces

add msprime and use np.testing

add libgsl-dev dependency

add docstrings

fix divide by zero
@mergify mergify bot dismissed jeromekelleher’s stale review September 14, 2020 09:52

Pull request has been modified.

@jeromekelleher
Copy link
Collaborator

Github workflows have been modified, so this has be to be merged manually.

@jeromekelleher jeromekelleher merged commit 660c694 into sgkit-dev:master Sep 14, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto-merge Auto merge label for mergify test flight
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants