applied-ml
pipebase
| | applied-ml | pipebase |
|---|---|---|
| Mentions | 13 | 6 |
| Stars | 25,984 | 9 |
| Growth | - | - |
| Activity | 3.0 | 0.0 |
| Latest commit | 4 days ago | about 2 years ago |
| Language | - | Rust |
| License | MIT License | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
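The site does not publish its exact formula, so the following is only a minimal sketch of how such a score could be computed. It assumes an exponential decay with a 30-day half-life for commit recency and a plain percentile rank for the 0-10 scale; the function names and the half-life are hypothetical.

```python
from bisect import bisect_left
from typing import Sequence


def recency_weighted_commits(commit_ages_days: Sequence[float],
                             half_life_days: float = 30.0) -> float:
    """Sum commits so that recent ones count more than older ones.

    Assumption: a commit's weight halves every `half_life_days`.
    """
    return sum(0.5 ** (age / half_life_days) for age in commit_ages_days)


def activity_score(raw_score: float, all_raw_scores: Sequence[float]) -> float:
    """Map a raw activity value to a relative 0-10 score by percentile rank."""
    ranked = sorted(all_raw_scores)
    # Count tracked projects that are strictly less active than this one.
    below = bisect_left(ranked, raw_score)
    return round(10 * below / len(ranked), 1)


# With 100 tracked projects, one that is more active than 90 of them scores 9.0,
# i.e. it sits in the top 10%.
tracked = list(range(100))
print(activity_score(90, tracked))  # -> 9.0
```

Under this reading the score is relative: it changes when other tracked projects become more or less active, even if the project itself does not.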
applied-ml
- [D] Favorite ML Youtube Channels/Blogs/Newsletters
  Also, have any of you stumbled across any cool GitHub repos like this one: https://github.com/eugeneyan/applied-ml ?
- Curated Papers on Machine Learning in Production
- Top Github repo trends in 2021
  The second repo I LOVE is Eugene Yan’s Applied ML repository. This is a brilliant idea to create and actually something I was planning on sort of casually doing in my non-existent free time… Anyhow, it is a curated list of technical posts from top engineering teams (Netflix, Amazon, Pinterest, LinkedIn, etc.) detailing how they built out different types of AI/ML systems (e.g. forecasting, recommenders, search and ranking, etc.). Ofc, it focuses on AI/ML, but something similar could be made for the traditional or BI-oriented analytics stack, as well as the streaming world. Super high value for practitioners! Btw, one of my favorite things at BCG used to be looking at our IT architecture team’s reference architecture diagrams… the best way to understand technologies is to look at how a ton of stuff is architected… and it's fun!
- Curated papers, articles, & blogs on data science and ML in production
- Messed up my career by pivoting to DS. Wondering if it's too late to switch to MLE
  Applied ML: A collection of papers, articles, and blogs on ML in production by different companies (Netflix, Uber, Facebook, LinkedIn, etc)
- [D] A dilemma of an ML guy in industry
  Eugene Yan's applied-ml has tons of case studies.
- Papers & tech blogs by companies sharing their work on data science & machine learning in production.
- My information dump for people trying to break into data science/interview notes
  https://github.com/eugeneyan/applied-ml You may find some of his links interesting. I would avoid anything that refers to scaling up a platform, as these have more of a backend engr focus. The more relevant posts to you are probably the product-oriented blog posts like the ones I listed in section 4 (e.g. we wanted to solve X for our users and this is how we scoped and defined it). The technical aspects should take a backseat to the business aspects. There's def a lot of companies/blog posts that he missed, but the internet is huge.
- [D] Can anyone point me to resources/case studies of companies/business creating infrastructure for their data needs?
  Check the resources mentioned in applied-ml. It includes blog posts/papers from many companies describing how they built some ML product X.
- What content would be useful to intermediate Data Scientist
  Check out this repo. They collect hundreds of case studies, broken down by dozens of methodologies, from large real-world companies such as Airbnb, Nvidia, Uber, Netflix, etc.
pipebase
- pipebase 0.2.0 released!
  In general, the framework lets developers customize a data pipeline through a manifest definition and wire together a variety of systems through pipeware plugins to sync and transform data.
- pipebase 0.2.0 released
  pipebase is a low-code data integration framework.
- pipebuilder release! - CI for low-code data integration apps
  If you want to build the app locally, here is a tutorial for a timer (your first 'Hello World' app).
- pipebuilder release! - CI server for low-code data integration apps
  If you want to learn more about how to define and compose your pipeline manifest (the YAML file), see [`doc`](https://github.com/pipebase/pipebase/blob/main/pipegen/README.md)
  pipebuilder released! pipebuilder is a CI for pipebase applications. pipebase is a framework that lets developers compose and build low-code data integration apps (e.g. ETL) in a YAML manifest. Here is a list of supported connection pipeware. pipebuilder lets developers submit build tasks and download app binaries through the command-line tool pbctl, run against the CI server.
What are some alternatives?
awesome-mlops - A curated list of references for MLOps
pipeline-model-definition-plugin
awesome-ml-blogs - Curated list of technical blogs on machine learning · AI/ML/DL/CV/NLP/MLOps
ansilo - Unlocking the power of SQL/MED to create data ecosystems from disparate data sources
machine-learning-roadmap - A roadmap connecting many of the most important concepts in machine learning, how to learn them and what tools to use to perform them.
paradedb - Postgres for Search and Analytics
Cookbook - The Data Engineering Cookbook
ml-surveys - 📋 Survey papers summarizing advances in deep learning, NLP, CV, graphs, reinforcement learning, recommendations, graphs, etc.
data-engineering-book - Accumulated knowledge and experience in the field of Data Engineering
PowerToys - Windows system utilities to maximize productivity
stanford-cs-229-machine-learning - VIP cheatsheets for Stanford's CS 229 Machine Learning
awesome-artificial-intelligence-research - A curated list of Artificial Intelligence (AI) research that tracks cutting-edge trends in AI, including recommender systems, computer vision, machine learning, etc.