data-science-development-project-template

A logical, reasonably standardized, but flexible project structure for doing and sharing data science research work while developing a software tool. (by michael-ford)


data-science-development-project-template reviews and mentions

Posts with mentions or reviews of data-science-development-project-template. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2022-11-17.
  • How do you manage results, plots, etc.?
    4 projects | /r/bioinformatics | 17 Nov 2022
    Bioinf has a lot of biologists who have transitioned into more technical/coding-focused roles, so you'll find there aren't many engineering workflow standards out there compared to DS or SWE. As others have said, snakemake is the most common, but that's just a pipeline management tool; it doesn't manage data or outputs. I personally use DVC for data and pipeline management (and include jupyter and papermill to make it all work), although I haven't yet gotten on board with their experiments feature (which is what would manage different parameters and figures/results beyond versioning). I looked into MLflow and some other options when I was getting started (I do tool development and bioinf analysis), but I wanted data versioning to ensure experiment reproducibility (kind of a critical part of science IMO), and many of the other solutions like Airflow (common in DS industry) seemed to be overkill for smaller bioinfo projects. DVC meets the requirements and I like it in concept, although in practice there have been many updates that have been a bit of a pain to keep up with/integrate. I've got a bioinfo/ds project template on github that rolls together git, conda, DVC, jupyter and papermill to ensure experiment reproducibility, and is set up as a template that can be deployed with cookiecutter - check it out if you like.
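
To make the "data versioning for reproducibility" point in the post above concrete, here is a rough sketch of the general idea: fingerprint the input data together with the run parameters, and only treat results as current if that fingerprint matches the one recorded at the last run. This is not DVC's implementation and not code from the template; the function names (`fingerprint`, `needs_rerun`, `record_run`) and the JSON lock-file layout are assumptions made up for illustration, using only the Python standard library.

```python
import hashlib
import json
from pathlib import Path


def fingerprint(data_path: Path, params: dict) -> str:
    """Hash the input file's bytes together with the run parameters."""
    digest = hashlib.sha256()
    digest.update(data_path.read_bytes())
    # sort_keys keeps the hash independent of dict insertion order
    digest.update(json.dumps(params, sort_keys=True).encode("utf-8"))
    return digest.hexdigest()


def needs_rerun(data_path: Path, params: dict, lock_path: Path) -> bool:
    """Return True if data or parameters differ from the last recorded run."""
    if not lock_path.exists():
        return True  # nothing recorded yet, so results cannot be trusted
    recorded = json.loads(lock_path.read_text())["fingerprint"]
    return recorded != fingerprint(data_path, params)


def record_run(data_path: Path, params: dict, lock_path: Path) -> None:
    """Store the fingerprint of the inputs that produced the current results."""
    lock = {"fingerprint": fingerprint(data_path, params)}
    lock_path.write_text(json.dumps(lock))
```

In a layout like this, the small lock file would be committed to git next to the code, so checking out an old commit also says which data and parameters its figures came from. A real tool handles much more (remote storage, multi-stage pipelines, output tracking), which is the reason to use one rather than a hand-rolled script.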

Stats

Basic data-science-development-project-template repo stats
1
4
5.4
28 days ago

michael-ford/data-science-development-project-template is an open source project licensed under the MIT License, which is an OSI-approved license.

The primary programming language of data-science-development-project-template is Jupyter Notebook.

