| | gwas2vcf | sgkit |
|---|---|---|
| Mentions | 1 | 1 |
| Stars | 39 | 0 |
| Growth | - | - |
| Activity | 0.0 | 5.9 |
| Latest commit | about 1 year ago | about 1 month ago |
| Language | Python | HTML |
| License | MIT License | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
gwas2vcf
Nuitka: An extremely compatible Python compiler
Here is the original repo that I have been trying to speed up, using:
python -m nuitka --clang --follow-imports main.py
repo: https://github.com/MRCIEU/gwas2vcf
If someone can make the program run faster by whatever means, it will make a bunch of people quite happy.
sgkit
Has anyone stored/queried VCFs and their variant records in a relational database?
I once built on top of a library that used the zarr format for storing the VCF data. It let me iterate over the data as if it were a pandas DataFrame, and it was incredibly fast. sgkit is the successor library and seems to be growing really well, but I haven't played with it. I do have high hopes for the zarr format, but I am biased.
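For the relational side of the question, here is a minimal sketch using Python's built-in `sqlite3` module. It is only an illustration, not how sgkit or any zarr-based library works: the table layout, the column names (`variant_id`, `filter_status`), the helper functions, and the three variant records are all made up for the example, and it assumes every record carries a numeric QUAL value.

```python
import sqlite3

# Made-up VCF lines for illustration only (tab-separated fixed columns:
# CHROM POS ID REF ALT QUAL FILTER INFO).
VCF_LINES = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t10177\trs1\tA\tAC\t100\tPASS\tAF=0.42",
    "1\t10352\trs2\tT\tTA\t90\tPASS\tAF=0.44",
    "2\t20500\trs3\tG\tA\t50\tLowQual\tAF=0.01",
]


def parse_vcf_lines(lines):
    """Yield one tuple per variant record, skipping '#' header lines."""
    for line in lines:
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")[:8]
        chrom, pos, vid, ref, alt, qual, flt, info = fields
        # Assumption: QUAL is numeric (a real VCF may use "." for missing).
        yield chrom, int(pos), vid, ref, alt, float(qual), flt, info


def load_variants(conn, lines):
    """Create a hypothetical 'variants' table and insert the parsed records."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS variants (
               chrom TEXT NOT NULL,
               pos INTEGER NOT NULL,
               variant_id TEXT,
               ref TEXT NOT NULL,
               alt TEXT NOT NULL,
               qual REAL,
               filter_status TEXT,
               info TEXT)"""
    )
    # An index on (chrom, pos) supports region lookups.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_variants_region ON variants (chrom, pos)"
    )
    conn.executemany(
        "INSERT INTO variants VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
        parse_vcf_lines(lines),
    )
    conn.commit()


def query_region(conn, chrom, start, end):
    """Return PASS variants on one chromosome within [start, end], by position."""
    cur = conn.execute(
        "SELECT variant_id, pos, ref, alt FROM variants "
        "WHERE chrom = ? AND pos BETWEEN ? AND ? AND filter_status = 'PASS' "
        "ORDER BY pos",
        (chrom, start, end),
    )
    return cur.fetchall()


conn = sqlite3.connect(":memory:")
load_variants(conn, VCF_LINES)
rows = query_region(conn, "1", 10000, 11000)
```

This keeps only the fixed VCF columns in a single table. Per-sample genotype data, which is where array formats such as zarr are usually attractive, is deliberately left out of the sketch.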
What are some alternatives?
TileDB-VCF - Efficient variant-call data storage and retrieval library using the TileDB storage library.
truvari - Structural variant toolkit for VCFs
Dask - Parallel computing with task scheduling
graalvm-ten-things - Top 10 Things To Do With GraalVM
Hail - Cloud-native genomic dataframes and batch computing
pybenchmarks - Python Interpreters Benchmarks
Scoary - Pan-genome wide association studies
py2many - Transpiler of Python to many other languages
Nuitka - Nuitka is a Python compiler written in Python. It's fully compatible with Python 2.6, 2.7, 3.4, 3.5, 3.6, 3.7, 3.8, 3.9, 3.10, and 3.11. You feed it your Python app, it does a lot of clever things, and spits out an executable or extension module.