Top 23 Python Bioinformatic Projects
-
Project mention: NiceGUI: Let any browser be the frontend for your Python code | reddit.com/r/Python | 2023-01-15
Of course there are valid use cases for splitting frontend and backend technologies. NiceGUI is for those who don’t want to leave the Python ecosystem and like to reap the benefits of having all code in one place. There are other options like Streamlit, Dash, Anvil, JustPy, and Pynecone. But we initially created NiceGUI to easily handle the state of external hardware like LEDs, motors, and cameras. Additionally, we wanted to offer a gentle learning curve while still providing the ability to go all the way down to HTML, CSS, and JavaScript if needed.
-
Project mention: Biology related exercices and "challenges" to train by myself | reddit.com/r/learnpython | 2023-02-01
I think you might find something of a community around BioPython, which could be helpful. Just looking at its capabilities will probably be instructive as well.
-
deepvariant
DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data.
Project mention: Give me your suggestions for papers with a Convolutional Neural Network in Bioinformatics | reddit.com/r/bioinformatics | 2022-07-12
See https://www.nature.com/articles/nbt.4235 for the paper and https://github.com/google/deepvariant for the code.
-
Project mention: Guidance needed: Extracting diseases and symptoms from medical text | reddit.com/r/LanguageTechnology | 2022-11-05
https://github.com/medspacy/medspacy and https://allenai.github.io/scispacy/ should get you most of the way there
-
deep_gcns_torch
Pytorch Repo for DeepGCNs (ICCV'2019 Oral, TPAMI'2021), DeeperGCN (arXiv'2020) and GNN1000(ICML'2021): https://www.deepgcns.org
-
I would recommend looking at the pages for FastQC and MultiQC. I run FastQC on my fastq files, then MultiQC on the results to collect all that individual data into one report. You can also use MultiQC to analyze the quality of your alignments, at least after using the STAR aligner (probably others too, but I have only used STAR).
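The FastQC-then-MultiQC workflow described above can be sketched in Python as follows. This is a minimal sketch, assuming both the `fastqc` and `multiqc` executables are installed and on the PATH; the function names and the `qc` output directory are hypothetical and chosen only for illustration.

```python
import subprocess
from pathlib import Path


def build_qc_commands(fastq_files, out_dir="qc"):
    """Return the FastQC and MultiQC command lines as lists of arguments."""
    out = str(Path(out_dir))
    # FastQC writes one report per input file into the output directory.
    fastqc_cmd = ["fastqc", "--outdir", out, *map(str, fastq_files)]
    # MultiQC scans that directory and aggregates the reports into one.
    multiqc_cmd = ["multiqc", out, "--outdir", out]
    return [fastqc_cmd, multiqc_cmd]


def run_qc(fastq_files, out_dir="qc"):
    """Run FastQC on every file, then MultiQC on the results (assumes both tools exist)."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for cmd in build_qc_commands(fastq_files, out_dir):
        subprocess.run(cmd, check=True)
```

Calling `run_qc(["sample_R1.fastq.gz", "sample_R2.fastq.gz"])` would run both steps in order; the file names here are placeholders.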
-
Project mention: We're wasting money by only supporting gzip for raw DNA files | news.ycombinator.com | 2023-01-09
-
Project mention: I have a question about the FTP of annotation files from NCBI's Genbank and RefSeq | reddit.com/r/bioinformatics | 2022-08-02
If you have the taxonomic IDs of your organisms of interest, there are existing parallelized download tools that are more efficient like https://github.com/kblin/ncbi-genome-download or bit-dl-ncbi-assemblies from https://github.com/AstrobioMike/bit
-
also consider posting bioinformatics questions over at https://www.biostars.org/ instead of reddit
-
Project mention: New CRISPR-based map ties every human gene to its function | news.ycombinator.com | 2022-06-10
> Where are the polished, powerful design tools for biology
User interfaces for biology have drastically improved over the last 10 years.
Domain-specific tools like genome browsers, protein viewers, or phylogenetic explorers [1-3] almost all look and feel a lot better than they did in 2012.
The biggest exception here is UCSC Genome Browser, which has an old-school design and web technology stack. That said, it's steadily added features over the years, has substantially sleekened UX in its periphery, and remains widely used.
There are also bespoke visual design resources for biology applications that are good and getting better, like BioRender and PhyloPic [4-5]. There are multi-tiered packages like Dash Bio that wrap biology components together. There's a Blender biology community, too!
---
1. Genome browsers and components: https://jbrowse.org/jb2/, https://www.ncbi.nlm.nih.gov/genome/gdv, https://igv.org/app, https://eweitz.github.io/ideogram
2. Protein viewers: https://pymol.org/, https://nglviewer.org/ngl/
3. Phylogenetic explorers: https://clades.nextstrain.org/
6. https://github.com/plotly/dash-bio, https://dash.gallery/Portal/?search=[Pharma]
-
Project mention: Software to make in-scale illustrations of genomic locations | reddit.com/r/bioinformatics | 2022-09-10
If you can build a Python environment and do a little coding, DnaFeaturesViewer or pyGenomeViz would be good choices. You can generate the following figure from a Genbank file with about 10 lines of code. Of course, you can specify the range of coordinates to be plotted.
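As a rough illustration of the "about 10 lines" claim, here is a minimal sketch using DnaFeaturesViewer. It assumes `dna_features_viewer`, Biopython, and matplotlib are installed; the function name, the default output file name, and any GenBank path passed in are hypothetical.

```python
def plot_genbank_region(genbank_path, start, end, out_png="region.png"):
    """Plot the features of a GenBank record between start and end, save as PNG."""
    # Imported lazily so this module loads even without the plotting stack.
    from dna_features_viewer import BiopythonTranslator

    # Convert the GenBank annotations into a drawable record.
    record = BiopythonTranslator().translate_record(genbank_path)
    # Restrict the plot to the requested coordinate range.
    region = record.crop((start, end))
    ax, _ = region.plot(figure_width=10)
    ax.figure.savefig(out_png, bbox_inches="tight")
    return out_png
```

For example, `plot_genbank_region("example.gb", 1000, 5000)` would draw only the features in that coordinate window, where `example.gb` stands in for your own file.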
-
have you seen https://www.nature.com/articles/s41587-019-0209-9?ref=https://githubhelp.com and https://github.com/sourmash-bio/sourmash ?
-
If you need more accurate ORF (CDS) prediction including functional annotation, I recommend using CLI tools such as prokka, bakta, or DFAST (DFAST is also available in a web version).
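For contrast with those dedicated tools, a naive ORF scan fits in a few lines of plain Python. The sketch below is an illustration only, not what prokka, bakta, or DFAST do: it assumes ATG as the only start codon, searches the forward strand only, and ignores everything a real gene predictor models. The function name and `min_codons` parameter are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=30):
    """Return (start, end, frame) tuples for forward-strand ORFs.

    An ORF runs from an ATG to the first in-frame stop codon (inclusive);
    min_codons counts every codon in that span, including the stop.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                end = i + 3  # first in-frame stop closes it
                if (end - start) // 3 >= min_codons:
                    orfs.append((start, end, frame))
                start = None
    return orfs
```

A scan like this reports many spurious ORFs on real genomes and performs no functional annotation, which is why the dedicated tools are the better choice for anything beyond a quick look.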
-
If you are okay with looking at already processed data, check out https://dee2.io/. Otherwise there is https://github.com/ncbi/sra-tools for getting fastq files (a CLI tool) or https://github.com/saketkc/pysradb (Python).
-
hgvs
Python library to parse, format, validate, normalize, and map sequence variants. `pip install hgvs`
Project mention: [Question] How to transform HGVSg to HGVSp? | reddit.com/r/bioinformatics | 2022-08-12
Python > https://github.com/biocommons/hgvs
-
Python Bioinformatics related posts
- Biology related exercices and "challenges" to train by myself
- Joining the Open Source Development Course
- Why bother reconstructing MAGs ?
- Circular visualization in Python (Circos Plot, Chord Diagram)
- Any good meta-transcriptomics pipelines
- first stop codon
- RNA-seq analysis
Index
What are some of the best open-source Bioinformatic projects in Python? This list will help you:
# | Project | Stars |
---|---|---|
1 | dash | 18,034 |
2 | Biopython | 3,422 |
3 | deepvariant | 2,702 |
4 | scanpy | 1,376 |
5 | scispacy | 1,315 |
6 | galaxy | 1,050 |
7 | deep_gcns_torch | 1,003 |
8 | MultiQC | 927 |
9 | Hail | 854 |
10 | ncbi-genome-download | 696 |
11 | biostar-central | 549 |
12 | dash-cytoscape | 485 |
13 | dash-bio | 457 |
14 | DnaFeaturesViewer | 449 |
15 | clinker | 416 |
16 | Sniffles | 387 |
17 | pyfaidx | 384 |
18 | sourmash | 344 |
19 | biotite | 340 |
20 | bakta | 257 |
21 | pysradb | 239 |
22 | truvari | 200 |
23 | hgvs | 192 |