Add variant annotation functions #112
Comments
Given the approach in #227 to not vendor a PCA implementation, do you think we might want to just document how to use an external library to annotate variants, or do you think we will need some code inside
🤔 The hail solution is to:
Assuming cloud storage, I think it would actually be easier (from a user's perspective) for us to create a Docker image with VEP installed and then have something like a Snakemake pipeline on GKE read exported variant data from Xarray/sgkit and produce results you can read back in easily. Distributing variant data and running VEP on it isn't the hard part IMO; managing the installation on a cluster is what will be a pain for users. I'm not sure how to make that go away without Docker, so there is perhaps some advantage to us having an sgkit Docker image that descends from the Dask image used by Helm, with this extra stuff installed. That would certainly make it easier to avoid needing an external pipeline tool. I would classify it a little differently than PCA, though, since the external library is so much harder to apply in this case.
This feels to me like it's outside our remit, which is integrating with the PyData ecosystem. If we start front-ending VEP for users, where do we stop? Certainly we should support processing VEP annotations, but I think running VEP should be outside our scope.
Good point. I can see there being some satellite pystatgen repos that are specific to putting some kind of compatible front end on hard-to-scale CLI tools.
In quantitative genetics it is common not to treat alleles at a locus as equal. The "functional consequence" of each allele is important and the process for determining these consequences is well standardized in VEP (in coding regions at least).
Providing access to annotations like this, ideally using the LOFTEE plugin, would be very useful since it is a common task and not necessarily an easy one. Hail's vep and nirvana functions could be a good guide.
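To make "processing VEP annotations" a bit more concrete, here is a minimal sketch of grouping consequence terms by allele from a VEP-style `CSQ` string (comma-separated entries, pipe-separated fields, multiple consequence terms joined with `&`). The field order in `CSQ_FIELDS`, the helper names `parse_csq` and `consequences_by_allele`, and the example values are all assumptions for illustration. None of this is sgkit or Hail API, and a real VCF declares the actual field order in its header.

```python
from typing import Dict, List, Set

# Assumed field order, for illustration only. A real VEP-annotated VCF
# declares the actual order in the CSQ INFO header line.
CSQ_FIELDS = ["Allele", "Consequence", "IMPACT", "SYMBOL"]


def parse_csq(csq: str, fields: List[str] = CSQ_FIELDS) -> List[Dict[str, str]]:
    """Split a CSQ string into one dict per annotation entry."""
    records = []
    for entry in csq.split(","):  # one entry per allele/transcript pair
        values = entry.split("|")  # fields within an entry
        records.append(dict(zip(fields, values)))
    return records


def consequences_by_allele(csq: str) -> Dict[str, Set[str]]:
    """Collect the set of consequence terms reported for each allele."""
    result: Dict[str, Set[str]] = {}
    for record in parse_csq(csq):
        terms = record["Consequence"].split("&")  # '&' joins multiple terms
        result.setdefault(record["Allele"], set()).update(terms)
    return result


# Made-up example values (hypothetical gene symbol), not real annotations.
example = (
    "T|missense_variant&splice_region_variant|MODERATE|GENE1,"
    "G|synonymous_variant|LOW|GENE1"
)
by_allele = consequences_by_allele(example)
```

Something along these lines would let per-allele consequences be attached to a dataset once VEP has been run externally, which fits the view that running VEP itself can stay out of scope.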