Workflow for early research projects in your organization?

This page summarizes the projects mentioned and recommended in the original post on /r/datascience

  • cookiecutter-data-science

    A logical, reasonably standardized, but flexible project structure for doing and sharing data science work.

  • While data science is not software engineering, it's fundamental to have some structure in your projects, since you want the work to be reasonably reproducible. I recommend you start with cookiecutter-data-science (https://github.com/drivendata/cookiecutter-data-science). Since it's a cookiecutter template, it will be easy to adopt at first, because your team can create the structure by running a short command; after some time you can tailor it to your company's specific needs :) Notebooks are harder: they can't be peer reviewed that easily, since cells remain editable after the code has been run while still showing the old result... I recommend looking at tools like Deepnote, but I'm not sure how well they work for collaboration in notebooks because I haven't used them yet; I just know they are working on solving these problems. I hope this helps!

  • shournal

    Log shell-commands and used files. Snapshot executed scripts. Fully automatic.

  • At the institute where I work, intermediate results are often not documented either. It is sometimes even worse than you described, to the point that researchers have difficulty reproducing their own work. That is why they hired me to develop shournal, so there is at least a safety net for reconstructing command-line work in the shell. Maybe it can help your team too? https://github.com/tycho-kirchner/shournal

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.
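
To make the "create the structure by running a short command" idea from the cookiecutter-data-science comment concrete, here is a minimal sketch in plain Python. It is not the real template: the `LAYOUT` list and the `create_skeleton` helper are hypothetical names, and the directory set is only loosely modeled on that project's layout, which contains more files and may differ between versions. The real template is generated with the cookiecutter command-line tool pointed at the repository URL.

```python
from pathlib import Path
import tempfile

# Hypothetical, simplified layout loosely modeled on cookiecutter-data-science.
# The real template has more files and may differ between versions.
LAYOUT = [
    "data/raw",         # original, immutable data
    "data/interim",     # intermediate, transformed data
    "data/processed",   # final data sets used for modeling
    "notebooks",        # exploratory notebooks
    "reports/figures",  # generated figures for reports
    "src",              # reusable project code
]


def create_skeleton(root, layout=LAYOUT):
    """Create the directories in `layout` under `root` and return their paths."""
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)
    created = []
    for rel in layout:
        path = root / rel
        path.mkdir(parents=True, exist_ok=True)
        # A .gitkeep placeholder lets otherwise empty directories be committed.
        (path / ".gitkeep").touch()
        created.append(path)
    readme = root / "README.md"
    if not readme.exists():
        readme.write_text(f"# {root.name}\n\nDescribe the project here.\n")
    return created


if __name__ == "__main__":
    # Demonstrate in a temporary directory so nothing is left behind.
    with tempfile.TemporaryDirectory() as tmp:
        for p in create_skeleton(Path(tmp) / "my-project"):
            print(p.relative_to(tmp))
```

Running the function twice on the same folder is harmless, which makes it easy to extend `LAYOUT` later as the team tailors the structure to its own needs.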

Related posts

  • Questions about Cookiecutter and Anaconda.

    1 project | /r/datascience | 30 Dec 2022
  • What should the folder structure of my Python projects be?

    1 project | /r/learnpython | 16 Dec 2022
  • How to keep a project organized?

    1 project | /r/learnpython | 15 Jun 2022
  • Can anyone share how they structure their folder for data engineer project?

    1 project | /r/dataengineering | 22 Jan 2022
  • Personal Projects that are original

    1 project | /r/datascience | 17 Oct 2021