You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
created by: Quansight Bot [[[[@quansight-bot](https://github.com/quansight-bot)](https://github.com/quansight-bot)](https://github.com/quansight-bot)](https://github.com/quansight-bot)
created by: Tom White [[[[[@tomwhite](https://github.com/tomwhite)](https://github.com/tomwhite)](https://github.com/tomwhite)](https://github.com/tomwhite)](https://github.com/tomwhite)
labels:
assigned:
Pull request description
This is a draft implementation of #29 (based on [[[[[@eric-czech](https://github.com/eric-czech)](https://github.com/eric-czech)](https://github.com/eric-czech)](https://github.com/eric-czech)](https://github.com/eric-czech)'s code there) for discussion. I'm quite happy for someone else to take if over if interested (e.g. [[[[[@daletovar](https://github.com/daletovar)](https://github.com/daletovar)](https://github.com/daletovar)](https://github.com/daletovar)](https://github.com/daletovar), [[[[[@jerowe](https://github.com/jerowe)](https://github.com/jerowe)](https://github.com/jerowe)](https://github.com/jerowe)](https://github.com/jerowe)).
API docs need to be added still. Also sample summary stats.
A couple of points for discussion:
Method naming. We currently have count_alleles (verb), but some of the new methods don't easily fit into that pattern: e.g. call_rate, variant_stats. Perhaps insisting the function name for each method is a verb is going too far, and we should just allow the name that we think sounds best (this seems to be the approach taken by Hail and Glow which both have a mixture of both styles).
Conditionally computing variables, and merging. The allele_frequency function calls count_alleles, but perhaps it should check first if it has already been computed (I think [[[[[@eric-czech](https://github.com/eric-czech)](https://github.com/eric-czech)](https://github.com/eric-czech)](https://github.com/eric-czech)](https://github.com/eric-czech) does something similar in LD prune, for example). In that case, should it return the variant_allele_count variable in the returned dataset? I'm wondering whether we should actually merge the new variables with the original dataset as a general rule. The original dataset would be unchanged, and this would mean the user no longer had to call merge themselves, which might work better in pipelines, for example.
The text was updated successfully, but these errors were encountered:
Issue metadata
Issue description
Issue metadata
Issue description
Issue metadata
Issue description
Issue metadata
Issue description
Issue metadata
Issue description
Pull request metadata
Pull request description
This is a draft implementation of #29 (based on [[[[[@eric-czech](https://github.com/eric-czech)](https://github.com/eric-czech)](https://github.com/eric-czech)](https://github.com/eric-czech)](https://github.com/eric-czech)'s code there) for discussion. I'm quite happy for someone else to take if over if interested (e.g. [[[[[@daletovar](https://github.com/daletovar)](https://github.com/daletovar)](https://github.com/daletovar)](https://github.com/daletovar)](https://github.com/daletovar), [[[[[@jerowe](https://github.com/jerowe)](https://github.com/jerowe)](https://github.com/jerowe)](https://github.com/jerowe)](https://github.com/jerowe)).
API docs need to be added still. Also sample summary stats.
A couple of points for discussion:
count_alleles
(verb), but some of the new methods don't easily fit into that pattern: e.g.call_rate
,variant_stats
. Perhaps insisting the function name for each method is a verb is going too far, and we should just allow the name that we think sounds best (this seems to be the approach taken by Hail and Glow which both have a mixture of both styles).allele_frequency
function callscount_alleles
, but perhaps it should check first if it has already been computed (I think [[[[[@eric-czech](https://github.com/eric-czech)](https://github.com/eric-czech)](https://github.com/eric-czech)](https://github.com/eric-czech)](https://github.com/eric-czech) does something similar in LD prune, for example). In that case, should it return thevariant_allele_count
variable in the returned dataset? I'm wondering whether we should actually merge the new variables with the original dataset as a general rule. The original dataset would be unchanged, and this would mean the user no longer had to call merge themselves, which might work better in pipelines, for example.The text was updated successfully, but these errors were encountered: