snippy vs TileDB-VCF
Metric | snippy | TileDB-VCF
---|---|---
Mentions | 6 | 4
Stars | 437 | 80
Growth | - | -
Activity | 3.9 | 8.6
Last commit | 7 months ago | 2 days ago
Language | Perl | C++
License | GNU General Public License v3.0 only | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
snippy
-
Help/recommendations for whole genome sequence analysis
This should help if you think you can get a decent (closely related) reference genome from NCBI: https://github.com/tseemann/snippy
-
Software to automatically detect SNPs between 2 or more genomes?
snippy is a nice tool for this!
-
Microbiome - would it be reasonable to assume that a bacterial contaminant (eg during library prep) would be the same isolate across contaminated samples?
So, let's say I have a batch of 20 samples, and I found evidence for my genus of interest in 5. What I've done is used Snippy (https://github.com/tseemann/snippy) with mapping based on a reference species in the genus. Snippy finds shared "core" regions of the reference species that are present in all samples and compares the SNPs between the samples. I find that each sample has distinct SNPs, which I believe indicates that the bacteria in each sample are not the same isolate. Thus, I am trying to argue that the bacteria could not have come from a shared contamination event during library prep. Does this make things more clear?
-
How to find sequence variations from contigs
My first idea was to assemble the contigs into scaffolds (e.g. using RagTag) and then look for a tool that can identify these mutations. However, I've also come across snippy, which is a tool that can identify mutations from reads. I don't know if I can just apply it to contigs though (probably not).
-
Sorting reads to references with high identity
I would use snippy for this if the references are haploid genomes: https://github.com/tseemann/snippy
-
Inquiry regarding Snippy (by Seemann). My job has been running for more than 5 days with no update. Can anyone help me?
[18:37:12] Obtained from https://github.com/tseemann/snippy
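The contamination thread above rests on one comparison: if samples carry distinct SNPs across the shared core regions, they are unlikely to be the same isolate. A minimal sketch of that comparison follows. It assumes a core alignment has already been loaded into a dict mapping sample name to aligned sequence; the sample names and sequences are made-up toy data, the function names are hypothetical, and reading the alignment file is left out.

```python
from itertools import combinations

# Only unambiguous bases are compared; gaps and N are skipped.
VALID_BASES = set("ACGT")


def snp_distance(seq_a, seq_b):
    """Count aligned positions where both sequences have a called base and differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in VALID_BASES and b in VALID_BASES and a != b
    )


def pairwise_snp_distances(alignment):
    """Return {(sample_1, sample_2): SNP count} for every pair of samples."""
    return {
        (s1, s2): snp_distance(alignment[s1], alignment[s2])
        for s1, s2 in combinations(sorted(alignment), 2)
    }


# Toy alignment (hypothetical data, not real output from any tool).
toy_alignment = {
    "sample_A": "ACGTACGTAC",
    "sample_B": "ACGTACGTAC",
    "sample_C": "ACCTACGTTC",
    "sample_D": "ACGTNCGTAG",
}

if __name__ == "__main__":
    for pair, dist in pairwise_snp_distances(toy_alignment).items():
        print(pair, dist)
```

A pair with zero differences is consistent with a shared isolate, while consistently nonzero distances argue against a single contamination source. What counts as "different enough" depends on the organism and sequencing error rate, so any threshold would be an assumption of the analysis.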
TileDB-VCF
-
Has anyone stored/queried VCFs and their variant records in a relational database?
Perhaps of interest https://github.com/TileDB-Inc/TileDB-VCF
-
[TileDB webinar] Population genomics is a data management problem
Here are the docs for the open-source TileDB-VCF storage engine: https://docs.tiledb.com/main/integrations-and-extensions/population-genomics
What are some alternatives?
vcf2maf - Convert a VCF into a MAF, where each variant is annotated to only one of all possible gene isoforms
Hail - Cloud-native genomic dataframes and batch computing
dipcall - Reference-based variant calling pipeline for a pair of phased haplotype assemblies
gwas2vcf - Convert GWAS summary statistics to VCF
tronflow-vcf-postprocessing - A Nextflow variant normalization pipeline based on vt and bcftools
truvari - Structural variant toolkit for VCFs
RagTag - Tools for fast and flexible genome assembly scaffolding and improvement
Vcflib - C++ library and cmdline tools for parsing and manipulating VCF files with python and zig bindings
TileDB - The Universal Storage Engine
sgkit - We've moved to https://github.com/sgkit-dev/sgkit
gw - Genome browser and variant annotation
octopus - Bayesian haplotype-based mutation calling