Metric | dslp | govcookiecutter
---|---|---
Mentions | 4 | 1
Stars | 389 | 132
Growth | 2.6% | 0.0%
Activity | 1.8 | 4.3
Last commit | about 3 years ago | 3 months ago
Language | Python | |
License | MIT License | GNU General Public License v3.0 or later
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
dslp
- Data Science Project Documentation
  One great resource that I’ve found if you leverage GitHub is this DSLP process. Really succinctly ties everything to the code and it’s pretty quick to pick up in my experience. https://github.com/dslp/dslp
- New DS here. Where can I learn best practices for organizing a project, folder structure, BASH scripting/scheduling, etc?
  As an addendum to u/GryffinLoL I’d add this resource if you’re using any kind of VCS tooling. It has some solid suggestions.
- Does anyone know of comprehensive refresher material for a once Senior Data Scientist?
- What is the best structured ds project you have seen?
govcookiecutter
- New DS here. Where can I learn best practices for organizing a project, folder structure, BASH scripting/scheduling, etc?
  For testing your statistical assumptions and performance, govcookiecutter is worth a look for its integration of agile and baked in unit testing, see: https://github.com/best-practice-and-impact/govcookiecutter
What are some alternatives?
- ploomber - The fastest ⚡️ way to build data pipelines. Develop iteratively, deploy anywhere. ☁️
- workspace-dotfiles-ansible - Automated and periodically configure workspace based on cronjob.
- projects - Sample projects using Ploomber.
- cookiecutter-data-science - A logical, reasonably standardized, but flexible project structure for doing and sharing data science work.
- awesome-datascience - 📝 An awesome Data Science repository to learn and apply for real world problems.
- robs_awesome_python_template - A Highly Configurable Python Project Template for Modern Python Projects
- Kedro - Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.
- missing-semester - The Missing Semester of Your CS Education 📚