reproducible-research

Open-source projects categorized as reproducible-research

Top 23 reproducible-research Open-Source Projects

  • metaflow

    🚀 Build and manage real-life ML, AI, and data science projects with ease!

  • Project mention: FLaNK Stack 05 Feb 2024 | dev.to | 2024-02-05
  • PyTorch-VAE

    A Collection of Variational Autoencoders (VAE) in PyTorch.

  • Sacred

    Sacred is a tool to help you configure, organize, log and reproduce experiments developed at IDSIA.

  • Project mention: Sacred VS cascade - a user suggested alternative | libhunt.com/r/sacred | 2023-12-05
  • nextflow

    A DSL for data-driven computational pipelines

  • Project mention: Nextflow: Data-Driven Computational Pipelines | news.ycombinator.com | 2023-08-10

    > It's been a while since you can rerun/resume Nextflow pipelines

    Yes, you can resume, but you need your whole upstream DAG to be present. Snakemake can rerun a job when only the dependencies of that job are present, which lets you neatly manage disk usage, or archive an intermediate state of a project and rerun things from there.

    > and yes, you can have dry runs in Nextflow

    You have stubs, which really isn't the same thing.

    > I have no idea what you're referring to with the 'arbitrary limit of 1000 parallel jobs' though

    I was referring to this issue: https://github.com/nextflow-io/nextflow/issues/1871. However, the discussion doesn't do the issue full justice. Nextflow spawns each job in a separate thread, and when it tries to spawn 1000+ condor jobs it dies with a cryptic error message. Using the options -Dnxf.pool.type=sync and -Dnxf.pool.maxThreads=N removes the ability to resume, and Nextflow attempts to rerun the pipeline instead.

    > As for deleting temporary files, there are features that allow you to do a few things related to that, and other features being implemented.

    There are some hacks for this - but nothing I would feel safe to integrate into a production tool. They are implementing something - you're right - and it's been the case for several years now, so we'll see.

    Snakemake has all that out of the box.
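
The distinction drawn in the comment above (needing the whole upstream DAG versus only a job's direct dependencies) can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not how Nextflow or Snakemake actually decide what to run: the pipeline `DAG`, the job names, and both `can_run_*` helpers are hypothetical and exist only for illustration.

```python
from typing import Dict, List, Set

# Hypothetical toy pipeline: each job maps to the jobs whose outputs it consumes.
DAG: Dict[str, List[str]] = {
    "align": [],
    "sort": ["align"],
    "call": ["sort"],
}


def upstream(job: str, dag: Dict[str, List[str]]) -> Set[str]:
    """Return every transitive dependency of `job`."""
    seen: Set[str] = set()
    stack = list(dag[job])
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(dag[dep])
    return seen


def can_run_with_direct_deps(job: str, dag: Dict[str, List[str]], present: Set[str]) -> bool:
    """Policy A (toy): only the job's direct inputs must still exist on disk."""
    return all(dep in present for dep in dag[job])


def can_run_with_full_upstream(job: str, dag: Dict[str, List[str]], present: Set[str]) -> bool:
    """Policy B (toy): every transitive upstream output must still exist."""
    return upstream(job, dag) <= present


# Suppose the output of "align" was deleted or archived to save disk space.
present_outputs = {"sort"}
print(can_run_with_direct_deps("call", DAG, present_outputs))    # True
print(can_run_with_full_upstream("call", DAG, present_outputs))  # False
```

Under policy A, intermediate outputs that are no longer direct inputs of any pending job can be removed without blocking a rerun, which is the disk-usage benefit the commenter describes.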

  • fma

    FMA: A Dataset For Music Analysis

  • benchmark_VAE

    Unifying Variational Autoencoder (VAE) implementations in PyTorch (NeurIPS 2022)

  • EvalAI

    ☁️ 🚀 📊 📈 Evaluating state of the art in AI

  • ITK

    Insight Toolkit (ITK) -- Official Repository. ITK builds on a proven, spatially-oriented architecture for processing, segmentation, and registration of scientific images in two, three, or more dimensions.

  • drake

    An R-focused pipeline toolkit for reproducibility and high-performance computing (by ropensci)

  • torch-fidelity

    High-fidelity performance metrics for generative models in PyTorch

  • targets

    Function-oriented Make-like declarative workflows for R

  • Weave.jl

    Scientific reports/literate programming for Julia

  • Project mention: GitHub - JunoLab/Weave.jl: Scientific reports/literate programming for Julia | /r/LitProg | 2023-05-31
  • disentangling-vae

    Experiments for understanding disentanglement in VAE latent representations

  • gpu-jupyter

    GPU-Jupyter: Leverage the flexibility of JupyterLab through the power of your NVIDIA GPU to run your code from TensorFlow and PyTorch in collaborative notebooks on the GPU.

  • papaja

    papaja (Preparing APA Journal Articles) is an R package that provides document formats to produce complete APA manuscripts from RMarkdown-files (PDF and Word documents) and helper functions that facilitate reporting statistics, tables, and plots.

  • codebraid

    Live code in Pandoc Markdown

  • funflow

    Functional workflows

  • sarek

    Analysis pipeline to detect germline or somatic variants (pre-processing, variant calling and annotation) from WGS / targeted sequencing

  • Project mention: Recommendations for software or online resources | /r/bioinformatics | 2023-04-29
  • huxtable

    An R package to create styled tables in multiple output formats, with a friendly, modern interface.

  • Project mention: What type of table is this, and is there a way to do this in R? | /r/RStudio | 2023-12-06

    As for styling, I highly recommend the huxtable package. You can style rows, columns, and individual cells however you want. It uses dplyr pipelining, if you’re familiar with that, so it’s super intuitive to use too.

  • trackdown

    R package for collaborative writing and editing of R Markdown (or Sweave) documents in Google Docs.

  • example-get-started

    Get started DVC project

  • shournal

    Log shell-commands and used files. Snapshot executed scripts. Fully automatic.

  • htm.core

    Actively developed Hierarchical Temporal Memory (HTM) community fork (continuation) of NuPIC. Implementation for C++ and Python

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

What are some of the best open-source reproducible-research projects? This list will help you:

Rank  Project              Stars
   1  metaflow             7,586
   2  PyTorch-VAE          5,989
   3  Sacred               4,157
   4  nextflow             2,538
   5  fma                  2,108
   6  benchmark_VAE        1,680
   7  EvalAI               1,677
   8  ITK                  1,339
   9  drake                1,330
  10  torch-fidelity         870
  11  targets                866
  12  Weave.jl               814
  13  disentangling-vae      753
  14  gpu-jupyter            661
  15  papaja                 626
  16  codebraid              361
  17  funflow                360
  18  sarek                  333
  19  huxtable               311
  20  trackdown              209
  21  example-get-started    167
  22  shournal               159
  23  htm.core               144
